StutterNet: Stuttering Disfluencies Detection in Synthetic Speech Signals via Mel Frequency Cepstral Coefficients Features Using Deep Learning

Muhammad Abubakar; Muhammad Mujahid; Khadija Kanwal; Sajid Iqbal; Muhammad Nabeel Asghar; Abdullah Alaulamie

doi:10.1109/ACCESS.2024.3429343

IEEE Access (Jan 2024)

Affiliations

Muhammad Abubakar: Department of Computer Science, Khwaja Fareed University of Engineering and Information Technology, Rahim Yar Khan, Pakistan
Muhammad Mujahid: Artificial Intelligence and Data Analytics (AIDA) Lab, CCIS, Prince Sultan University, Riyadh, Saudi Arabia
Khadija Kanwal: Institute of Computer Science and Information Technology, The Women University, Multan, Pakistan
Sajid Iqbal: Department of Information Systems, College of Computer Science and Information Technology, King Faisal University, Al-Ahsa, Saudi Arabia
Muhammad Nabeel Asghar: Department of Information Systems, College of Computer Science and Information Technology, King Faisal University, Al-Ahsa, Saudi Arabia
Abdullah Alaulamie: Department of Information Systems, College of Computer Science and Information Technology, King Faisal University, Al-Ahsa, Saudi Arabia

DOI: https://doi.org/10.1109/ACCESS.2024.3429343
Journal volume & issue: Vol. 12
pp. 99308 – 99320

Abstract

Stuttering is a speech disorder characterised by the repetition, prolongation, or blocking of sounds, syllables, or words, which can cause significant social and emotional difficulties for those who experience it. To detect and diagnose stuttering early, clinicians and researchers need to distinguish stuttered speech from normal speech accurately. Doing so helps them understand the disorder, identify possible causes, and develop effective interventions and treatments. This study uses the UCLASS dataset, from which 40 MFCC features were extracted to evaluate how well different machine learning classifiers distinguish normal from stuttering speech. The main difficulty is that the UCLASS dataset is imbalanced, which we address with the synthetic minority oversampling technique (SMOTE). After the machine learning experiments, we propose a novel hybrid model that performs better than the individual machine learning classifiers. In the hybrid model, the lightweight StutterNet model selects the best features from the data and then makes the prediction. The results indicate that the evaluated classifiers showed varying levels of performance. Overall, the results suggest that hybrid classifiers have the potential to classify normal and stuttering speech accurately, which could have important implications for the early identification and diagnosis of stuttering, as well as for the development of assistive technologies and effective interventions and treatments.
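The oversampling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain-Python version of the basic idea behind the synthetic minority oversampling technique, in which each synthetic sample is interpolated between a minority-class sample and one of its nearest minority-class neighbours. The function name `smote`, the parameter `k`, the class sizes, the random toy vectors standing in for 40-dimensional MFCC features, and the choice of stuttering as the minority class are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper: each synthetic point lies on the segment joining a
    randomly chosen minority sample and one of its k nearest minority-class
    neighbours (Euclidean distance).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base sample.
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: math.dist(base, m))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy stand-ins for 40-dimensional MFCC vectors (random values, not real audio).
# Treating stuttering as the minority class is an assumption of this sketch.
toy_rng = random.Random(1)
fluent = [[toy_rng.gauss(0.0, 1.0) for _ in range(40)] for _ in range(30)]
stutter = [[toy_rng.gauss(2.0, 1.0) for _ in range(40)] for _ in range(6)]

# Oversample the minority class until both classes have the same size.
extra = smote(stutter, n_new=len(fluent) - len(stutter), k=3)
balanced_stutter = stutter + extra
print(len(fluent), len(balanced_stutter))
```

In practice, oversampling of this kind is usually applied to the training split only, so that synthetic samples do not leak into the evaluation data.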

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
