IEEE Access (Jan 2022)

Deep Neural Networks-Based Weight Approximation and Computation Reuse for 2-D Image Classification

  • Mohammed F. Tolba,
  • Huruy Tekle Tesfai,
  • Hani Saleh,
  • Baker Mohammad,
  • Mahmoud Al-Qutayri

DOI
https://doi.org/10.1109/ACCESS.2022.3161738
Journal volume & issue
Vol. 10
pp. 41551 – 41563

Abstract

Deep Neural Networks (DNNs) are computationally and memory intensive, which presents a major challenge for hardware, especially for resource-constrained devices such as Internet-of-Things (IoT) nodes. This paper introduces a new method to improve DNN performance by fusing approximate computing with data reuse techniques for image recognition applications. First, starting from a pre-trained network, the DNN weights are approximated using linear and quadratic approximation methods during the retraining phase to reduce the DNN model size and the number of arithmetic operations. Then, the DNN weights are replaced with the linear/quadratic coefficients to execute inference, so that different DNN weights can be computed using the same coefficients. This leads to a repetition of the weights, which enables the reuse of DNN sub-computations (computational reuse) and of the same data (data reuse) to reduce DNN computations and memory accesses and to improve energy efficiency, albeit at the cost of increased training time. A complete analysis for the MNIST, Fashion MNIST, CIFAR-10, CIFAR-100, and Tiny ImageNet datasets is presented for image recognition, where different DNN models are used, including LeNet, ResNet, AlexNet, and VGG16. Our results show that the linear approximation achieves $1211.3\times$, $21.8\times$, $700\times$, and $19.3\times$ on LeNet-5 MNIST, LeNet Fashion MNIST, VGG16, and ResNet-20, respectively, with small accuracy loss. Compared to the state-of-the-art Row Stationary (RS) method, the proposed architecture saves 54% of the total number of adders and multipliers needed. Overall, the proposed approach is suitable for IoT edge devices as it reduces computing complexity, memory size, and memory access with a small impact on accuracy.
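
As an illustration of the general idea only, the following Python sketch fits a line to each segment of a weight vector, regenerates the weights from the fitted coefficients, and then computes a dot product with one multiplication per distinct weight value. It is a minimal sketch under stated assumptions and not the authors' implementation: the function names (`fit_line`, `approximate_weights`, `dot_with_reuse`), the fixed segment length, the fit over the index within a segment, and the rounding used to detect equal weights are all hypothetical choices made for this example.

```python
from collections import defaultdict


def fit_line(values):
    """Least-squares fit values[i] ~ slope * i + intercept for i = 0..n-1."""
    n = len(values)
    if n == 1:
        return 0.0, float(values[0])
    mean_x = (n - 1) / 2.0
    mean_y = sum(values) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def approximate_weights(weights, segment_len):
    """Replace each segment of weights by the values of its fitted line.

    Returns the regenerated weights and the (slope, intercept) pair per
    segment. Storing only the pairs is what shrinks the model in this sketch.
    """
    approx, coeffs = [], []
    for start in range(0, len(weights), segment_len):
        segment = weights[start:start + segment_len]
        slope, intercept = fit_line(segment)
        coeffs.append((slope, intercept))
        approx.extend(slope * i + intercept for i in range(len(segment)))
    return approx, coeffs


def dot_with_reuse(weights, inputs, decimals=6):
    """Dot product that multiplies once per distinct weight value.

    Inputs that share a weight are summed first (assumed grouping by
    rounded value), so repeated weights cost additions, not multiplications.
    Returns the result and the number of multiplications performed.
    """
    grouped = defaultdict(float)
    for w, x in zip(weights, inputs):
        grouped[round(w, decimals)] += x
    result = sum(w * total for w, total in grouped.items())
    return result, len(grouped)


if __name__ == "__main__":
    # Two segments that follow the same line share coefficients,
    # so the regenerated weights repeat across segments.
    weights = [0.1, 0.2, 0.3, 0.4] * 2
    inputs = [float(i) for i in range(1, 9)]
    approx, coeffs = approximate_weights(weights, segment_len=4)
    result, n_mult = dot_with_reuse(approx, inputs)
    print("coefficients per segment:", coeffs)
    print("dot product:", result, "multiplications:", n_mult)
```

In this toy case both segments yield the same coefficient pair, so the eight weights collapse to four distinct values and the dot product needs four multiplications instead of eight. How the paper selects segments, shares coefficients, and maps the reuse onto hardware is not specified in the abstract and is not modeled here.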

Keywords