LimitAccess: on-device TinyML based robust speech recognition and age classification

Marina Maayah; Ahlam Abunada; Khawla Al-Janahi; Muhammad Ejaz Ahmed; Junaid Qadir

doi:10.1007/s44163-023-00051-x

Discover Artificial Intelligence (Feb 2023)

Affiliations

Marina Maayah: Department of Computer Science and Engineering, Qatar University
Ahlam Abunada: Department of Computer Science and Engineering, Qatar University
Khawla Al-Janahi: Department of Computer Science and Engineering, Qatar University
Muhammad Ejaz Ahmed: CSIRO’s Data61
Junaid Qadir: Department of Computer Science and Engineering, Qatar University

DOI: https://doi.org/10.1007/s44163-023-00051-x
Journal volume & issue: Vol. 3, no. 1
pp. 1 – 19

Abstract

Automakers from Honda to Lamborghini are incorporating voice interaction technology into their vehicles to improve the user experience and offer value-added services. Speech recognition systems are a key component of smart cars, enhancing convenience and safety for drivers and passengers. In the future, safety-critical features may rely on speech recognition, which raises concerns about children accessing such services. To address this issue, we propose LimitAccess, a system that uses TinyML for age classification and helps parents limit children’s access to critical speech recognition services. The study employs a lite convolutional neural network (CNN) model for two reasons. First, CNNs showed superior accuracy compared to other audio classification models on age classification problems. Second, a lite model can be deployed on a microcontroller within its limited resources. To train and evaluate the model, we created a dataset of child and adult voices speaking the keyword “open”. The system classifies each voice into an age group (child or adult) and uses that classification to decide whether to grant access to the car. To make the model more robust, we added a third class (recordings) to the dataset, which enables the system to detect replay and synthetic voice attacks. If an adult voice is detected, access to start the car is granted. If a child’s voice or a recording is detected, the system displays a warning message that educates the child about the dangers and consequences of improper use of a car. We deployed the trained, optimized model on an Arduino Nano 33 BLE Sense, our embedded device of choice. The system achieved an overall F1 score of 87.7% and an accuracy of 85.89%. LimitAccess detected replay and synthetic voice attacks with an F1 score of 88%.
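
The access rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `decide_access`, the label strings, and the confidence `threshold` are hypothetical choices made for illustration, and the abstract does not state whether any confidence threshold is used.

```python
from typing import Dict

# The three classes named in the abstract; the exact strings are assumptions.
LABELS = ("adult", "child", "recording")


def decide_access(scores: Dict[str, float], threshold: float = 0.6) -> str:
    """Map per-class scores to an access decision.

    Returns "grant" only when "adult" is the top-scoring class and its
    score reaches `threshold` (an assumed parameter, not from the paper).
    A child voice, a recording, or a low-confidence adult returns "warn".
    """
    missing = set(LABELS) - set(scores)
    if missing:
        raise ValueError(f"missing class scores: {sorted(missing)}")
    top = max(LABELS, key=lambda label: scores[label])
    if top == "adult" and scores[top] >= threshold:
        return "grant"
    return "warn"
```

In this sketch, a child voice and a replayed recording lead to the same outcome, which matches the abstract's statement that both trigger the warning message rather than access.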

Published in Discover Artificial Intelligence

ISSN: 2731-0809 (Online)
Publisher: Springer
Country of publisher: Switzerland
LCC subjects: Language and Literature: Philology. Linguistics: Computational linguistics. Natural language processing; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.springer.com/journal/44163
