Quantized CNN Accelerator IP for Edge AI

Open-source, vendor-agnostic FPGA IP core for high-throughput INT8 3×3 CNN acceleration, optimized for edge AI inference on 128×128 images.

Description

Quantized CNN Accelerator IP for Edge AI is an open-source, vendor-agnostic FPGA IP core designed to accelerate INT8 convolution for real-time edge AI applications. The project implements a fully parameterized 3×3 pipelined convolution engine optimized for 128×128 image processing. Built entirely in Verilog RTL, the accelerator focuses on achieving high throughput with low resource utilization while maintaining portability across FPGA platforms.
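The datapath described above can be captured in a short behavioral reference model, useful as a golden model when verifying the RTL. This is a minimal sketch in plain Python; the function and parameter names (`conv3x3_int8`, `image`, `kernel`) are illustrative and do not correspond to actual RTL ports. It models only the raw INT8-multiply / INT32-accumulate arithmetic, with no requantization stage.

```python
def conv3x3_int8(image, kernel):
    """Behavioral golden model of a valid-region 3x3 INT8 convolution.

    `image` is a list of rows of INT8 values (-128..127); `kernel` is a
    3x3 list of INT8 weights. Returns raw INT32 accumulator values
    (no requantization). Names are illustrative, not the RTL ports.
    """
    h, w = len(image), len(image[0])
    out = []
    for r in range(h - 2):
        row = []
        for c in range(w - 2):
            # INT32 accumulator: worst case 9 * 128 * 128 = 147456,
            # far inside the signed 32-bit range
            acc = 0
            for kr in range(3):
                for kc in range(3):
                    acc += image[r + kr][c + kc] * kernel[kr][kc]
            row.append(acc)
        out.append(row)
    return out
```

For a 128×128 input with no padding, the valid output is 126×126; whether the RTL pads the borders or emits only the valid region is a design choice not specified here.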

The architecture uses BRAM-based line buffers, a sliding window generator, and a parallel MAC array to achieve a target throughput of one pixel per clock cycle. The design supports fixed-point arithmetic (INT8 inputs and weights with INT32 accumulation), making it suitable for efficient deployment in embedded and industrial edge systems.
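The line-buffer and window-generator stages can be sketched in software to show how one 3×3 window emerges per incoming pixel once the pipeline is primed. The sketch below is a hedged approximation of the streaming behavior, assuming row-major pixel order and two full-row buffers; the names (`sliding_windows`, `pixels`, `width`) are illustrative, and details such as reset and border handling are omitted.

```python
def sliding_windows(pixels, width):
    """Model of the BRAM line buffers plus 3x3 window former.

    Two row buffers hold rows y-1 and y-2; a 3x3 shift-register window
    slides across them. After two full rows and two pixels have streamed
    in, every new pixel yields one window -- i.e. one window per clock
    in hardware. Names are illustrative, not the actual RTL ports.
    """
    line_prev = [0] * width    # row y-1 (first BRAM line buffer)
    line_prev2 = [0] * width   # row y-2 (second BRAM line buffer)
    win = [[0] * 3 for _ in range(3)]  # 3x3 shift-register window
    for i, px in enumerate(pixels):
        row, col = i // width, i % width
        # shift the window left and insert the new column, tapping both
        # line buffers plus the live pixel (read-before-write)
        for r in range(3):
            win[r][0], win[r][1] = win[r][1], win[r][2]
        win[0][2] = line_prev2[col]
        win[1][2] = line_prev[col]
        win[2][2] = px
        # update the line buffers after reading them
        line_prev2[col] = line_prev[col]
        line_prev[col] = px
        # a window is valid once the pipeline is primed
        if row >= 2 and col >= 2:
            yield [r[:] for r in win]
```

In hardware, the nine window taps would feed the parallel MAC array directly, so the window former and the multiply-accumulate tree together sustain the one-pixel-per-clock target.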

This project promotes open hardware development by leveraging open-source FPGA toolchains and providing reusable, modular RTL components. It aims to enable scalable, hardware-accelerated AI inference without vendor lock-in.
