IEEE Open Journal of Circuits and Systems (Jan 2022)

A 0.61-μJ/Frame Pipelined Wired-logic DNN Processor in 16-nm FPGA Using Convolutional Non-Linear Neural Network

  • Atsutake Kosuge,
  • Yao-Chung Hsu,
  • Mototsugu Hamada,
  • Tadahiro Kuroda

DOI
https://doi.org/10.1109/OJCAS.2021.3137263
Journal volume & issue
Vol. 3
pp. 4–14

Abstract


A pipelined wired-logic deep neural network (DNN) processor implemented in a 16-nm field-programmable gate array (FPGA) is presented. The wired-logic architecture minimizes the latency and power required for memory access, enabling low-power, high-throughput operation. One technical issue with the wired-logic architecture is that it requires a large amount of hardware resources. To reduce this requirement, two core technologies are developed: (1) a convolutional non-linear neural network (CNNN) and (2) a pipeline-type neuron cell. The CNNN optimizes both the network structure and the non-linear activation function of each neuron by using a newly developed back-propagation-based training method. Whereas conventional reinforcement learning can train only a small network, limiting its application to handwritten-digit recognition, the proposed CNNN enables a larger network size, making it applicable to object recognition. The pipeline-type neuron cell uses a small look-up table (LUT) to process non-linear functions with only a small amount of hardware resources. These two technologies enable the entire network to be implemented on a single FPGA chip with the wired-logic architecture. Three types of CNNN trained on the CIFAR-10 dataset are implemented in 16-nm FPGAs. Energy efficiencies of 0.09, 0.12, and $0.61~\mu\text{J}$/frame are achieved with 70%, 75%, and 82% accuracy, respectively. Compared with a state-of-the-art accelerator using a binary neural network (BNN), the energy efficiency is improved by more than two orders of magnitude.
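The idea behind the pipeline-type neuron cell — replacing runtime evaluation of a non-linear activation with a small precomputed LUT indexed by a quantized input — can be illustrated in software. The sketch below is a minimal illustration, not the authors' implementation; the LUT depth (16 entries), input range, and choice of tanh are all assumptions made for the example.

```python
import numpy as np

def build_activation_lut(fn, n_bits=4, x_min=-4.0, x_max=4.0):
    """Precompute a small LUT approximating a non-linear activation.
    The 2**n_bits entries stand in for the per-neuron LUT of a
    pipeline-type neuron cell (depth and range are assumptions)."""
    levels = 2 ** n_bits
    xs = np.linspace(x_min, x_max, levels)
    return fn(xs)

def lut_activation(x, lut, x_min=-4.0, x_max=4.0):
    """Quantize x to a LUT index and return the stored value,
    replacing a runtime evaluation of the non-linear function."""
    levels = len(lut)
    idx = (x - x_min) / (x_max - x_min) * (levels - 1)
    idx = np.clip(np.round(idx).astype(int), 0, levels - 1)
    return lut[idx]

# Example: a 16-entry tanh LUT applied to a batch of pre-activations.
lut = build_activation_lut(np.tanh, n_bits=4)
y = lut_activation(np.array([-5.0, 0.0, 1.0, 5.0]), lut)
```

In hardware, the division and rounding above collapse into a fixed-point bit-slice of the accumulator output, so each activation costs only one small table read per pipeline stage rather than an arithmetic function unit.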

Keywords