IEEE Access (Jan 2024)

StutterNet: Stuttering Disfluencies Detection in Synthetic Speech Signals via Mel Frequency Cepstral Coefficients Features Using Deep Learning

  • Muhammad Abubakar,
  • Muhammad Mujahid,
  • Khadija Kanwal,
  • Sajid Iqbal,
  • Muhammad Nabeel Asghar,
  • Abdullah Alaulamie

DOI
https://doi.org/10.1109/ACCESS.2024.3429343
Journal volume & issue
Vol. 12
pp. 99308 – 99320

Abstract

Stuttering is a speech disorder characterised by the repetition, prolongation, or blocking of sounds, syllables, or words, which can cause significant social and emotional difficulties for those who experience it. To detect and diagnose stuttering early, clinicians and researchers need to distinguish stuttered speech from normal speech accurately. Doing so helps them understand the disorder, identify possible causes, and develop effective interventions and treatments. This study uses the UCLASS dataset, from which 40 MFCC features were extracted to evaluate how well different machine learning classifiers distinguish normal from stuttering speech. A major problem is that the UCLASS dataset is imbalanced; we address this with the synthetic minority oversampling technique (SMOTE). After the machine learning experiments, we propose a novel hybrid model that performs better than the individual machine learning classifiers. In the hybrid model, the lightweight StutterNet model takes the best features from the data and then makes the prediction. The results indicate that the evaluated classifiers showed varying levels of performance. Overall, the results suggest that hybrid classifiers have the potential to accurately classify normal and stuttering speech, which could have important implications for the early identification and diagnosis of stuttering, as well as the development of assistive technologies and effective interventions and treatments.
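
The oversampling step named in the abstract can be illustrated with a minimal sketch of the general SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is not the authors' implementation. The function name `smote`, the parameters `k` and `seed`, the class sizes, and the randomly generated 40-dimensional vectors (standing in for real MFCC features) are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between minority
    samples and their k nearest minority-class neighbours (illustrative)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical stand-in data: random 40-dimensional vectors, not real MFCCs.
data_rng = random.Random(1)
minority_class = [[data_rng.gauss(1.0, 0.5) for _ in range(40)] for _ in range(12)]
majority_class = [[data_rng.gauss(0.0, 0.5) for _ in range(40)] for _ in range(60)]

# Oversample the smaller class until both classes have the same size.
n_needed = len(majority_class) - len(minority_class)
balanced_minority = minority_class + smote(minority_class, n_needed)
```

Because every synthetic vector is a convex combination of two existing minority samples, the new points stay inside the region already occupied by that class. In practice, oversampling of this kind is usually applied to the training split only, so that synthetic points do not leak into the evaluation data.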

Keywords