Evaluating the Performance of Pre-Trained Convolutional Neural Network for Audio Classification on Embedded Systems for Anomaly Detection in Smart Cities

Mimoun Lamrini; Mohamed Yassin Chkouri; Abdellah Touhafi

doi:10.3390/s23136227

Sensors (Jul 2023)

Evaluating the Performance of Pre-Trained Convolutional Neural Network for Audio Classification on Embedded Systems for Anomaly Detection in Smart Cities

Mimoun Lamrini,
Mohamed Yassin Chkouri,
Abdellah Touhafi

Affiliations

Mimoun Lamrini: Department of Engineering Sciences and Technology (INDI), Vrije Universiteit Brussel (VUB), 1050 Brussels, Belgium
Mohamed Yassin Chkouri: SIGL Laboratory, National School of Applied Sciences of Tetuan, Abdelmalek Essaadi University, Tetuan 93000, Morocco
Abdellah Touhafi: Department of Engineering Sciences and Technology (INDI), Vrije Universiteit Brussel (VUB), 1050 Brussels, Belgium

DOI: https://doi.org/10.3390/s23136227
Journal volume & issue: Vol. 23, no. 13
p. 6227

Abstract

Read online

Environmental Sound Recognition (ESR) plays a crucial role in smart cities by accurately categorizing audio using well-trained Machine Learning (ML) classifiers. This application is particularly valuable for cities that analyzed environmental sounds to gain insight and data. However, deploying deep learning (DL) models on resource-constrained embedded devices, such as Raspberry Pi (RPi) or Tensor Processing Units (TPUs), poses challenges. In this work, an evaluation of an existing pre-trained model for deployment on Raspberry Pi (RPi) and TPU platforms other than a laptop is proposed. We explored the impact of the retraining parameters and compared the sound classification performance across three datasets: ESC-10, BDLib, and Urban Sound. Our results demonstrate the effectiveness of the pre-trained model for transfer learning in embedded systems. On laptops, the accuracy rates reached 96.6% for ESC-10, 100% for BDLib, and 99% for Urban Sound. On RPi, the accuracy rates were 96.4% for ESC-10, 100% for BDLib, and 95.3% for Urban Sound, while on RPi with Coral TPU, the rates were 95.7% for ESC-10, 100% for BDLib and 95.4% for the Urban Sound. Utilizing pre-trained models reduces the computational requirements, enabling faster inference. Leveraging pre-trained models in embedded systems accelerates the development, deployment, and performance of various real-time applications.

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors

About the journal

Abstract

Keywords