EERA-KWS: A 163 TOPS/W Always-on Keyword Spotting Accelerator in 28nm CMOS Using Binary Weight Network and Precision Self-Adaptive Approximate Computing

BO Liu; ZHEN Wang; HU Fan; JING Yang; BO Liu; WENTAO Zhu; LEPENG Huang; YU Gong; WEI Ge; LONGXING Shi

doi:10.1109/ACCESS.2019.2924340

IEEE Access (Jan 2019)

Affiliations

BO Liu: National ASIC System Engineering Technology Research Center, Southeast University, Nanjing, China
ZHEN Wang: Nanjing Prochip Electronic Technology Company Ltd., Nanjing, China
HU Fan: National ASIC System Engineering Technology Research Center, Southeast University, Nanjing, China
JING Yang: National ASIC System Engineering Technology Research Center, Southeast University, Nanjing, China
BO Liu: National ASIC System Engineering Technology Research Center, Southeast University, Nanjing, China
WENTAO Zhu: National ASIC System Engineering Technology Research Center, Southeast University, Nanjing, China
LEPENG Huang: National ASIC System Engineering Technology Research Center, Southeast University, Nanjing, China
YU Gong: National ASIC System Engineering Technology Research Center, Southeast University, Nanjing, China
WEI Ge: National ASIC System Engineering Technology Research Center, Southeast University, Nanjing, China
LONGXING Shi: National ASIC System Engineering Technology Research Center, Southeast University, Nanjing, China

DOI: https://doi.org/10.1109/ACCESS.2019.2924340
Journal volume & issue: Vol. 7
pp. 82453 – 82465

Abstract

This paper proposes an energy-efficient reconfigurable accelerator for keyword spotting (EERA-KWS) based on a binary weight network (BWN) and fabricated in 28-nm CMOS technology. The keyword spotting system consists of two parts: feature extraction based on mel-scale frequency cepstral coefficients (MFCC), and keyword classification based on a BWN model, which is trained on Google's Speech Commands database and deployed on our custom accelerator. To reduce power consumption while maintaining recognition accuracy, we first optimize the MFCC implementation with approximate computing techniques, including pre-emphasis coefficient transformation, rectangular Mel filtering, framing, and FFT optimization. We then propose a precision self-adaptive reconfigurable accelerator with digital-analog mixed approximate computing units to process the BWN efficiently. Based on SNR prediction of the background noise and post-detection of the network output confidence, the BWN accelerator data path can be dynamically and adaptively reconfigured as 4, 8, or 16 bits. For the BWN accelerator, we propose a time-delay-based addition unit that performs bit-wise approximate computing for the convolution and fully connected layers, and a LUT-based unit for the activation layers. Implemented in TSMC 28 nm HPC+ process technology, the estimated power is 77.8 μW to 115.9 μW and the energy efficiency reaches 163 TOPS/W, which is over 1.8× better than the state-of-the-art architecture.
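As a rough software illustration of two ideas named in the abstract, the sketch below shows why a binary weight network needs only additions and subtractions, and how a 4-, 8-, or 16-bit data path trades precision for cost. It is not the paper's hardware design. The uniform fixed-point quantizer, the function names `quantize` and `bwn_dot`, and the sample values are all assumptions made for illustration.

```python
def quantize(values, bits):
    """Map floats in [-1, 1) to signed fixed-point integers of the given width.

    Assumption: a plain uniform quantizer with saturation; the abstract does
    not specify the number format used by the accelerator.
    """
    scale = 1 << (bits - 1)
    lo, hi = -scale, scale - 1
    return [max(lo, min(hi, round(v * scale))) for v in values]


def bwn_dot(activations_q, weight_signs):
    """Dot product with weights restricted to +1 / -1.

    No multiplier is needed: each term is either added or subtracted.
    """
    acc = 0
    for a, s in zip(activations_q, weight_signs):
        acc += a if s > 0 else -a
    return acc


# Hypothetical activations and binary weights, chosen only for the demo.
activations = [0.3, -0.62, 0.9, 0.07]
weight_signs = [1, -1, 1, -1]
exact = sum(a * s for a, s in zip(activations, weight_signs))

for bits in (4, 8, 16):  # the three data-path widths named in the abstract
    acc = bwn_dot(quantize(activations, bits), weight_signs)
    approx = acc / (1 << (bits - 1))  # convert back to a real value
    print(f"{bits:2d}-bit: accumulator={acc}, value={approx:.6f}, "
          f"abs error={abs(approx - exact):.6f}")
```

Running the loop shows the approximation error shrinking as the width grows. A narrower data path is acceptable only when the lost precision does not change the classification result, which is the kind of trade-off the adaptive reconfiguration addresses.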

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
