In this article, we implement fast and power-efficient training hardware for convolutional neural networks (CNNs) based on CMOS invertible logic. The backpropagation algorithm is generally hard to implement in hardware because it requires high-precision floating-point arithmetic. Although the parameters of a CNN can be represented in fixed-point or even binary form during inference, they must still be represented in floating point during training. Our hardware uses a low-precision data representation for both inference and training. For the hardware implementation, we exploit CMOS invertible logic for training. Invertible logic enables logic circuits to perform probabilistic bidirectional operations (forward and backward modes) and can be realized with stochastic computing. The proposed hardware obtains the parameters of a neural network, such as its weights, directly from the given data (an input feature map and a true label) without backpropagation. For performance evaluation, the proposed hardware is implemented on an FPGA and trains a binarized two-layer CNN model on a modified MNIST dataset. This implementation improves energy efficiency by approximately 134x over a CPU implementation that trains the same model. Training on the proposed hardware is also approximately 40x faster than training on the CPU with the backpropagation algorithm, while maintaining almost the same recognition accuracy.
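The abstract notes that invertible logic can be realized with stochastic computing, in which a value is encoded as the probability of observing a 1 in a random bitstream. As a minimal illustrative sketch (not the paper's hardware; the function names here are ours), the following shows the classic stochastic-computing identity that a bitwise AND of two independent bitstreams multiplies the probabilities they encode:

```python
import random


def to_bitstream(p, n, rng):
    """Encode probability p as a length-n random bitstream of 0s and 1s."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


def estimate(bits):
    """Decode a bitstream back to a probability: the fraction of 1s."""
    return sum(bits) / len(bits)


rng = random.Random(0)
n = 100_000  # longer streams give lower-variance estimates

a = to_bitstream(0.5, n, rng)
b = to_bitstream(0.4, n, rng)

# Bitwise AND of independent streams encodes the product 0.5 * 0.4 = 0.2,
# since P(a_i = 1 and b_i = 1) = P(a_i = 1) * P(b_i = 1).
prod = [x & y for x, y in zip(a, b)]
print(estimate(prod))  # close to 0.2
```

This kind of single-gate arithmetic on bitstreams is what makes stochastic computing attractive for compact, low-precision hardware such as the accelerator described here.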