IEEE Access (Jan 2019)
Very Low Power Neural Network FPGA Accelerators for Tag-Less Remote Person Identification Using Capacitive Sensors
Abstract
Human detection, identification, and monitoring are essential for many applications aiming to make indoor environments smarter, where most people spend much of their time (such as homes, offices, transportation, and public spaces). Capacitive sensors can meet stringent privacy, power, cost, and unobtrusiveness requirements and do not rely on wearables or specific human interactions, but they may need significant on-board data processing to increase their performance. We comparatively analyze, in terms of overall processing time and energy, several data processing implementations of multilayer perceptron neural networks (NNs) on board capacitive sensors. The NN architecture, optimized using augmented experimental data, consists of six 17-bit inputs, two hidden layers with eight neurons each, and one four-bit output. For the software (SW) NN implementation, we use two STMicroelectronics STM32 low-power ARM microcontrollers (MCUs): one MCU optimized for power and one for performance. For hardware (HW) implementations, we use four ultralow-power field-programmable gate arrays (FPGAs), with different sizes, dedicated computation blocks, and data communication interfaces (one FPGA from the Lattice iCE40 family and three FPGAs from the Microsemi IGLOO family). Our shortest SW implementation latency is 54.4 μs and the lowest energy per inference is 990 nJ, while the shortest HW implementation latency is 1.99 μs and the lowest energy is 39 nJ (including the data transfer between MCU and FPGA). The FPGAs' active power ranges between 6.24 and 34.7 mW, while their static power is between 79 and 277 μW. These compare very favorably with the static power consumption of the Xilinx and Altera low-power device families, which is around 40 mW. The experimental results show that NN inferences offloaded to external FPGAs have lower latency and energy than SW ones (even when using HW multipliers), and that the FPGAs with dedicated computational blocks (multiply-accumulate) perform best.
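To make the benchmarked workload concrete, the following is a minimal fixed-point C sketch of the 6-8-8-1 multilayer perceptron inference described above. The Q-format scaling, ReLU activation, and placeholder weight arrays are illustrative assumptions, not details taken from the paper; the actual quantization and activation choices are described in the body of the article.

```c
/* Hypothetical fixed-point sketch of the 6-8-8-1 MLP inference.
 * Q-format, ReLU, and weight values are illustrative assumptions. */
#include <stdint.h>

#define N_IN 6   /* six 17-bit inputs (abstract) */
#define N_H1 8   /* first hidden layer (abstract) */
#define N_H2 8   /* second hidden layer (abstract) */
#define Q    8   /* assumed fractional bits, hypothetical */

/* Placeholder weights/biases; real values come from training. */
static const int16_t w1[N_H1 * N_IN] = {0}, b1[N_H1] = {0};
static const int16_t w2[N_H2 * N_H1] = {0}, b2[N_H2] = {0};
static const int16_t w3[1 * N_H2]    = {0}, b3[1]    = {0};

static int32_t relu(int32_t x) { return x > 0 ? x : 0; }

/* One dense layer: out[j] = relu(sum_i w[j][i]*in[i] + b[j]).
 * A 64-bit accumulator avoids overflow with 17-bit inputs. */
static void dense(const int32_t *in, int n_in,
                  const int16_t *w, const int16_t *b,
                  int32_t *out, int n_out)
{
    for (int j = 0; j < n_out; j++) {
        int64_t acc = (int64_t)b[j] * (1 << Q);
        for (int i = 0; i < n_in; i++)
            acc += (int64_t)w[j * n_in + i] * in[i]; /* the MAC op */
        out[j] = relu((int32_t)(acc >> Q));          /* rescale */
    }
}

int32_t nn_infer(const int32_t x[N_IN])
{
    int32_t h1[N_H1], h2[N_H2], y[1];
    dense(x,  N_IN, w1, b1, h1, N_H1);
    dense(h1, N_H1, w2, b2, h2, N_H2);
    dense(h2, N_H2, w3, b3, y,  1);
    return y[0]; /* caller quantizes to the four-bit output */
}
```

The inner multiply-accumulate loop of `dense` is the operation that maps onto the dedicated MAC blocks of the FPGAs, which is consistent with the abstract's finding that devices with such blocks perform best.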
Keywords