IEEE Access (Jan 2021)

A Scalable System-on-Chip Acceleration for Deep Neural Networks

  • Faisal Shehzad,
  • Muhammad Rashid,
  • Mohammed H. Sinky,
  • Saud S. Alotaibi,
  • Muhammad Yousuf Irfan Zia

DOI
https://doi.org/10.1109/ACCESS.2021.3094675
Journal volume & issue
Vol. 9
pp. 95412–95426

Abstract


The size of neural networks in deep learning techniques is increasing and varies significantly according to the requirements of real-life applications. The increasing network size and scalability requirements pose significant challenges for a high-performance implementation of deep neural networks (DNN). Conventional implementations, such as graphics processing units and application-specific integrated circuits, are either less efficient or less flexible. Consequently, this article presents a system-on-chip (SoC) solution for the acceleration of DNNs, in which an ARM processor controls the overall execution and off-loads computationally intensive operations to a hardware accelerator. The system is implemented on an SoC development board. Experimental results show that the proposed system achieves a speed-up of 22.3, for a network architecture size of $64\times 64$, in comparison with the native implementation on a dual-core ARM Cortex-A9 processor. To generalize the performance of the complete system, a mathematical formula is presented that allows the total execution time to be computed for any architecture size. Validation is performed using Epileptic Seizure Recognition as the target case study. Finally, the results of the proposed solution are compared with various state-of-the-art solutions in terms of execution time, scalability, and clock frequency.
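The abstract does not reproduce the execution-time formula itself. A minimal sketch of how such a model is typically decomposed in processor-plus-accelerator designs, assuming the standard offload structure described above (all symbols here are illustrative and are not the paper's notation), is:

$$T_{\text{total}}(n) \;=\; T_{\text{sw}} \;+\; T_{\text{xfer}}(n) \;+\; T_{\text{hw}}(n), \qquad \text{speed-up}(n) \;=\; \frac{T_{\text{ARM}}(n)}{T_{\text{total}}(n)},$$

where $T_{\text{sw}}$ is the control overhead on the ARM processor, $T_{\text{xfer}}(n)$ the data-transfer time between processor and accelerator, $T_{\text{hw}}(n)$ the accelerator compute time, and $T_{\text{ARM}}(n)$ the software-only baseline, each as a function of the $n\times n$ architecture size.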

Keywords