IEEE Access (Jan 2020)

APNAS: Accuracy-and-Performance-Aware Neural Architecture Search for Neural Hardware Accelerators

  • Paniti Achararit,
  • Muhammad Abdullah Hanif,
  • Rachmad Vidya Wicaksana Putra,
  • Muhammad Shafique,
  • Yuko Hara-Azumi

DOI
https://doi.org/10.1109/ACCESS.2020.3022327
Journal volume & issue
Vol. 8
pp. 165319 – 165334

Abstract

Designing resource-efficient deep neural networks (DNNs) is a challenging task due to the enormous diversity of applications as well as their time-consuming design, training, optimization, and evaluation cycles, especially for resource-constrained embedded systems. To address these challenges, we propose a novel DNN design framework called accuracy-and-performance-aware neural architecture search (APNAS), which generates DNNs efficiently because it requires neither hardware devices nor simulators while searching for optimized DNN model configurations that offer both high inference accuracy and high execution performance. In addition, to accelerate DNN generation, APNAS is built on a weight-sharing and reinforcement-learning-based exploration methodology, with a recurrent neural network controller at its core to generate sample DNN configurations. The reward in reinforcement learning is formulated as a configurable function that considers both a sample DNN's accuracy and the cycle count required to run it on a target hardware architecture. To further expedite the DNN generation process, we devise analytical models for cycle count estimation instead of running millions of DNN configurations on real hardware. We demonstrate that these analytical models are highly accurate, providing cycle count estimates identical to those of a cycle-accurate hardware simulator. Experiments that involve quantitatively varying hardware constraints demonstrate that APNAS requires only 0.55 graphics processing unit (GPU) days on a single Nvidia GTX 1080Ti GPU to generate DNNs that offer, on average, 53% fewer cycles with negligible accuracy degradation (3% on average) for image classification compared to state-of-the-art techniques.
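The two components most amenable to a concrete illustration are the configurable reward and the analytical cycle-count models. The Python sketch below shows one plausible way such a reward could combine a sampled DNN's validation accuracy with an analytical cycle estimate for a processing-element (PE) array; the specific reward form, the PE-array model, and all function and parameter names are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumed formulation, not the APNAS implementation):
# combine a sampled DNN's accuracy with an analytically estimated cycle
# count into a single configurable reward for the RL controller.
import math

def conv_layer_cycles(out_h, out_w, out_ch, in_ch, k, pe_rows=16, pe_cols=16):
    """Hypothetical analytical model: estimate cycles for one convolution
    layer on a pe_rows x pe_cols PE array, tiling output channels across
    columns and the per-output MAC operations across rows."""
    macs_per_output = in_ch * k * k
    passes = math.ceil(out_ch / pe_cols) * math.ceil(macs_per_output / pe_rows)
    return out_h * out_w * passes

def estimate_cycles(layers):
    """Sum the per-layer analytical estimates for a sampled configuration."""
    return sum(conv_layer_cycles(**layer) for layer in layers)

def reward(accuracy, cycles, cycles_ref, alpha=0.5):
    """Configurable reward: alpha weights accuracy against a normalized
    performance term; alpha = 1.0 recovers accuracy-only search."""
    perf_term = min(1.0, cycles_ref / max(cycles, 1))  # 1.0 when within budget
    return alpha * accuracy + (1.0 - alpha) * perf_term

# Example: score a sampled two-layer configuration against a cycle budget.
sample = [
    dict(out_h=32, out_w=32, out_ch=64,  in_ch=3,  k=3),
    dict(out_h=16, out_w=16, out_ch=128, in_ch=64, k=3),
]
cycles = estimate_cycles(sample)
print(cycles, reward(accuracy=0.91, cycles=cycles, cycles_ref=100_000))
```

Because the cycle estimate is purely analytical, the controller can score every sampled configuration without touching hardware or a cycle-accurate simulator, which is what keeps the reported search cost to a fraction of a GPU day.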

Keywords