IEEE Access (Jan 2024)

Advances in Automated Voice Pathology Detection: A Comprehensive Review of Speech Signal Analysis Techniques

  • Anitha Sankaran
  • Lakshmi Sutha Kumar

DOI
https://doi.org/10.1109/ACCESS.2024.3508884
Journal volume & issue
Vol. 12
pp. 181127 – 181148

Abstract


Speech has been the principal means of communication among humans for centuries. It arises when the voice tone produced by the vocal cords is modulated by the articulators, giving rise to meaningful content. Voice pathologies are disorders affecting the voice, for which the prevailing medical procedures are invasive, painful, and require physical visits to a healthcare center. The Artificial Intelligence era provides an opportunity to look for alternative modes of treatment, for which the automated voice pathology detection systems discussed in this paper are a promising solution. This paper begins with a discussion of the significant milestones achieved in the field of speech processing, followed by an explanation of the voice production mechanism. The types of voice pathologies, existing medical procedures, and some of the available voice pathology databases are then discussed. An in-depth review of the pre-processing steps for a speech signal, the features that parameterize the speech, and the feature extraction process is presented, along with an extensive literature review on the application of machine learning and deep learning techniques to detect voice pathologies. Finally, an automated voice pathology detection system is built using 300 speech signals from the Saarbrucken Voice database. Mel-frequency cepstral coefficient (MFCC) features are extracted and fed to four different machine learning algorithms. The speech signals are also fed to a deep network, a 2D-Convolutional Neural Network-Long Short-Term Memory model, and a comparative analysis of the performance metrics is performed.

Keywords