Interfacing the ESP32c3 RUST board with an Arducam 2MP camera for image processing and computer vision. The RTL is designed and tested on an Icebreaker V1.1a FPGA for hardware acceleration.
Computer vision is the process by which a machine learning model analyzes novel data and extracts meaningful information from its environment. Training these models incurs a large initial cost, but once training is complete they can be deployed in a static configuration. When these models are mapped to Application-Specific Integrated Circuits (ASICs), they can perform inference much faster than in software due to specialized architecture and faster clock speed.
My goal for this project is to implement a dynamic system that captures live data and extracts meaningful information from the real world. The current system bridges the gap between software and hardware, using the strengths of each domain. The ESP32c3 captures image data through its Arducam 2MP camera, compresses each frame to one bit per pixel, and transmits it to the Icebreaker FPGA. There, the image data flows through a pretrained Convolutional Neural Network (CNN), which classifies the image subject. The ESP32c3 receives the result and publishes it over the local Wi-Fi network.
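To make the one-bit-per-pixel compression concrete, here is a minimal Python sketch of thresholding and bit-packing a grayscale frame. It is not the project's firmware: the function name `pack_1bpp`, the fixed threshold of 128, the MSB-first bit order, and the zero padding of the last byte are all assumptions made for illustration.

```python
def pack_1bpp(pixels, threshold=128):
    """Threshold 8-bit grayscale pixels to 1 bit each and pack 8 pixels per byte.

    Assumptions (not taken from the project): fixed threshold, MSB-first
    bit order, and zero padding of a partially filled final byte.
    """
    packed = bytearray()
    byte = 0
    for i, p in enumerate(pixels):
        bit = 1 if p >= threshold else 0
        byte = (byte << 1) | bit
        if i % 8 == 7:          # eight bits collected, emit one byte
            packed.append(byte)
            byte = 0
    remainder = len(pixels) % 8
    if remainder:               # left-align the leftover bits
        packed.append(byte << (8 - remainder))
    return bytes(packed)


# An 80x60 frame of alternating dark/bright pixels
frame = [0, 255] * (80 * 60 // 2)
payload = pack_1bpp(frame)
print(len(payload))  # 600 bytes, i.e. 4800 pixels / 8
```

At 80x60 and one bit per pixel, a frame shrinks from 4800 bytes of 8-bit grayscale to 600 bytes, which is what makes streaming frames over a UART link practical.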
- Frontend image compression
  - YUV422 Luma values interpreted as grayscale
  - Quantized from 256 values (8 bits) to as low as 2 (1 bit)
  - Down-sampling from default 320x240 down to 80x60
- Periodic auto-exposure to adjust to changing light levels
- Fault detection and recovery via reset-driven SYN/ACK polling
- Hardware Acceleration of Image processing via FPGA
  - Decoupled interface via UART for non-blocking transmission
  - Packet framing for proper data alignment
  - UART RTS line to ensure data integrity
  - Kernel-stationary sliding-window convolution architecture
  - Parameterized kernel size, stride, input and weight widths for model flexibility
  - Convolution layer supporting N Output Channels with single RAM buffer
- Unit Testing of all hardware components using CocoTB
- Dynamic HTML viewing of image over ESP32 Wi-Fi
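The frontend compression steps listed above (luma extraction, down-sampling, quantization) can be sketched in Python as follows. This is an illustrative model, not the firmware itself. It assumes a YUYV byte order (luma at even byte offsets), down-sampling by plain decimation rather than averaging, and quantization by keeping the most significant bits; the function names are hypothetical.

```python
def luma_from_yuyv(raw):
    """Return the luma samples of a YUV422 stream.

    Assumes YUYV ordering (Y0 U Y1 V ...), so luma sits at even offsets.
    A UYVY stream would need raw[1::2] instead.
    """
    return raw[0::2]


def downsample(gray, width, height, factor):
    """Keep every `factor`-th pixel in both directions (simple decimation)."""
    out = bytearray()
    for y in range(0, height, factor):
        row = y * width
        out.extend(gray[row + x] for x in range(0, width, factor))
    return bytes(out)


def quantize(gray, bits):
    """Keep the `bits` most significant bits of each 8-bit sample."""
    shift = 8 - bits
    return bytes(p >> shift for p in gray)


W, H = 320, 240
raw = bytearray()
for i in range(W * H):
    raw += bytes([i % 256, 128])      # synthetic luma, then a chroma placeholder

gray = luma_from_yuyv(bytes(raw))     # 320x240 grayscale
small = downsample(gray, W, H, 4)     # 80x60
binary = quantize(small, 1)           # each pixel is now 0 or 1
print(len(gray), len(small))          # 76800 4800
```

Decimation is the cheapest option on a microcontroller; averaging each 4x4 block would reduce aliasing at the cost of extra arithmetic per frame.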
- 2026-02-05 — Convolution Layer with N Output Channels
- 2026-01-27 — FPGA packet data framing
- 2026-01-13 — Streaming raw grayscale image
- 2026-01-03 — Bit-packing hardware to support quantized bitstream
- 2025-12-18 — ESP–FPGA integration via UART loopback
- 2025-12-10 — Edge detection on FPGA using sliding-window filters
- 2025-08-22 — Quantization and run-length encoding
- 2025-08-17 — Convolution and image differencing (software)
- 2025-08-12 — Web server integration
- 2025-07-27 — JPEG image decoding from Arducam
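As a software reference for the sliding-window convolution and edge detection entries above, the following Python sketch computes one output feature map per kernel with a configurable stride. It is a behavioral model only and says nothing about the RTL's buffering or timing. The function name `conv2d`, the absence of padding, and the Sobel kernels used in the example are assumptions for illustration. Like most CNN layers, it does not flip the kernel, so strictly speaking it is a cross-correlation.

```python
def conv2d(image, kernels, stride=1):
    """Valid (no padding) sliding-window convolution.

    image:   2-D list of integers (rows of pixels)
    kernels: list of K x K integer kernels, one per output channel
    Returns a list with one 2-D feature map per kernel.
    """
    k = len(kernels[0])
    out_h = (len(image) - k) // stride + 1
    out_w = (len(image[0]) - k) // stride + 1
    outputs = []
    for kernel in kernels:              # weights stay fixed while the window slides
        fmap = []
        for oy in range(out_h):
            row = []
            for ox in range(out_w):
                acc = 0
                for ky in range(k):
                    for kx in range(k):
                        pixel = image[oy * stride + ky][ox * stride + kx]
                        acc += pixel * kernel[ky][kx]
                row.append(acc)
            fmap.append(row)
        outputs.append(fmap)
    return outputs


sobel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
sobel_y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

# 1-bit image with a vertical edge between columns 1 and 2
image = [[0, 0, 1, 1, 1] for _ in range(4)]

edges_x, edges_y = conv2d(image, [sobel_x, sobel_y])
print(edges_x)  # [[4, 4, 0], [4, 4, 0]]
print(edges_y)  # [[0, 0, 0], [0, 0, 0]]
```

A model like this can serve as the golden reference in a testbench: drive the same pixels and weights into the hardware and compare its output stream against the returned feature maps.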
- 1 ESP32c3 RUST Dev Board
- 1 Arducam Mini 2MP Plus - OV2640 SPI Camera Module
- 1 Icebreaker FPGA V1.1a
- 1 Tactile Switch
- 1 10KΩ Resistor
- Wire Jumpers
| Component | Version |
|---|---|
| Espressif toolchain | v6.1 (dev-1280-gb33c9cd7ce) |
# Clone the repository
git clone git@github.com:Brmanzo/esp-computer-vision.git
cd esp-computer-vision
# Wiring
CS - GPIO_NUM_7
MOSI - GPIO_NUM_6
MISO - GPIO_NUM_5
SCK - GPIO_NUM_4
GND - GND
VCC - 3v3/5V
SDA - GPIO_NUM_10
SCL - GPIO_NUM_8
Button - GPIO_NUM_3

# Source the ESP-IDF Toolchain
source ~/esp/export.sh
# Open the ESP Project Directory
cd ~/esp/esp-computer-vision/firmware
# Target the ESP32c3 Board, then build and flash
idf.py set-target esp32c3
idf.py build flash monitor

| Component | Version |
|---|---|
| Yosys | 0.57 (git 3aca86049) |
| nextpnr-ice40 | 0.6-3build5 |
| Verilator | 5.020 |
| Icarus Verilog | 12.0 |
| Verible | v0.0-4051-g9fdb |
| cocotb | 1.9.1 |
| Python | 3.12.3 |
| Netlistsvg | 1.0.2 |
| librsvg (rsvg-convert) | 2.58.0 |
# Within a Python virtual environment run
pip install -r requirements.txt
# Then add utilities.py to path
export PYTHONPATH="$(git rev-parse --show-toplevel)/sim/util/:$PYTHONPATH"
# Wiring
ESP32c3 icebreakerV1.1a
GPIO_NUM_21 - GPIO 4 (PMOD1A)
GPIO_NUM_20 - GPIO 2 (PMOD1A)
GPIO_NUM_1 - GPIO 47 (PMOD1A)
GPIO_NUM_2 - GPIO 45 (PMOD1A)
GND - GND (PMOD1B)

# within repo root run
make bitstream
# then flash the resulting ice40.bin using
iceprog ice40.bin

# within sim/unit_testing/ open the module you'd like to test, then run
make test

| Author | Source |
|---|---|
| Arducam | RPI Pico Cam Project |
| Alex Forencich | Verilog-Uart Interface |
| Dustin Richmond | CSE 225 ASIC Design Course |
This project is licensed under the MIT License. See esp-computer-vision/LICENSE.md for details.