Voice Frequency-Based Gender Classification Using Convolutional Neural Network for Smart Home

Nasaruddin Nasaruddin; Muhammad Agung P. Pratama Tresma; Masduki Khamdan Muchamad; Zahrul Fuadi

doi:10.1109/ACCESS.2024.3434547

IEEE Access (Jan 2024)

Affiliations

Nasaruddin Nasaruddin: Department of Electrical and Computer Engineering, Universitas Syiah Kuala, Banda Aceh, Indonesia
Muhammad Agung P. Pratama Tresma: Department of Electrical and Computer Engineering, Universitas Syiah Kuala, Banda Aceh, Indonesia
Masduki Khamdan Muchamad: Department of Electrical and Computer Engineering, Universitas Syiah Kuala, Banda Aceh, Indonesia
Zahrul Fuadi: Department of Mechanical and Industrial Engineering, Universitas Syiah Kuala, Banda Aceh, Indonesia

DOI: https://doi.org/10.1109/ACCESS.2024.3434547
Journal volume & issue: Vol. 12
pp. 104190 – 104203

Abstract

A smart home’s functional requirements should include the capability to differentiate between user categories, such as gender, through voice recognition. The data-driven Internet of Things (IoT) can present challenges for the elderly and people with disabilities, and voice recognition technology could offer an effective solution. However, developing an accurate gender prediction model for voice recognition remains challenging because of the large time variation and randomness of voice signals. We therefore propose gender classification and detection models based on voice frequency, using Convolutional Neural Networks (CNNs) with ResNet50 and ResNet101 architectures, to enhance smart home functionality. We also introduce an algorithm that converts voice frequencies into images to speed up the recognition and detection processes. The CNN models were trained and tested on audio datasets with various learning rates. Performance was evaluated through simulations that measured training accuracy, validation accuracy, recall, precision, and F1 score. The simulation results show high training accuracy: 99.67% for ResNet50 and 99.82% for ResNet101. The validation accuracy of the models also exceeded that of traditional CNN models in previous studies. For each proposed model, the recall, precision, and F1 score are 99.3%, 100%, and 99.65%, respectively. Finally, we used the ResNet50 model to build a low-latency smart home prototype. This paper thus contributes to practical applications of voice-based gender recognition in smart home environments, with high detection accuracy and efficiency.
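The reported recall, precision, and F1 score can be cross-checked with the standard definition of F1 as the harmonic mean of precision and recall. The short sketch below is an illustration only, not code from the paper; the function name `f1_score` is a hypothetical helper, and the inputs are the percentages quoted in the abstract converted to fractions.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # conventional value when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract, converted from percentages to fractions.
precision = 1.000  # 100%
recall = 0.993     # 99.3%

f1 = f1_score(precision, recall)
print(f"F1 = {f1 * 100:.2f}%")  # prints "F1 = 99.65%"
```

With a precision of 100% and a recall of 99.3%, the harmonic mean works out to about 99.65%, which matches the F1 score stated in the abstract.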

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
