Efficacy of Topology Scaling for Temperature and Latency Constrained Embedded ConvNets

Valentino Peluso; Roberto Giorgio Rizzo; Andrea Calimera

doi:10.3390/jlpea10010010

Journal of Low Power Electronics and Applications (Mar 2020)

Affiliations

Valentino Peluso: Interuniversity Department of Regional and Urban Studies and Planning, Politecnico di Torino, 10129 Torino, Italy
Roberto Giorgio Rizzo: Department of Control and Computer Engineering, Politecnico di Torino, 10129 Torino, Italy
Andrea Calimera: Department of Control and Computer Engineering, Politecnico di Torino, 10129 Torino, Italy

DOI: https://doi.org/10.3390/jlpea10010010
Journal volume & issue: Vol. 10, no. 1, p. 10

Abstract

Embedded Convolutional Neural Networks (ConvNets) are driving the evolution of ubiquitous systems that can sense and understand the environment autonomously. Due to their high complexity, aggressive compression is needed to meet the specifications of portable end-nodes. A variety of algorithmic optimizations are available today, from custom quantization and filter pruning to modular topology scaling, which enable fine-tuning of the hyperparameters to strike the right balance between quality, performance and resource usage. Nonetheless, implementing systems capable of sustaining continuous inference over a long period remains a primary concern, since the limited thermal design power of general-purpose embedded CPUs prevents execution at maximum speed. Neglecting this aspect may result in substantial mismatches and violations of the design constraints. The objective of this work was to assess topology scaling as a design knob to control the performance and the thermal stability of inference engines for image classification. To this end, we built a characterization framework to inspect both the functional (accuracy) and non-functional (latency and temperature) metrics of two ConvNet models, MobileNet and MnasNet, ported onto a commercial low-power CPU, the ARM Cortex-A15. Our investigation reveals that different latency constraints can be met even under continuous inference, yet with a severe accuracy penalty forced by thermal constraints. Moreover, we empirically demonstrate that thermal behavior does not benefit from topology scaling, as the on-chip temperature still reaches critical values that affect reliability and user satisfaction.

Published in Journal of Low Power Electronics and Applications

ISSN: 2079-9268 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Applications of electric power
Website: http://www.mdpi.com/journal/jlpea