Investigation and Evaluation of Glottal Flow Waveform for Voice Pathology Detection

Yuanbo Wu; Changwei Zhou; Ziqi Fan; Di Wu; Xiaojun Zhang; Zhi Tao

doi:10.1109/ACCESS.2020.3046767

IEEE Access (Jan 2021)

Affiliations

Yuanbo Wu: School of Optoelectronic Science and Engineering, Soochow University, Suzhou, China
Changwei Zhou: School of Optoelectronic Science and Engineering, Soochow University, Suzhou, China
Ziqi Fan: School of Optoelectronic Science and Engineering, Soochow University, Suzhou, China
Di Wu: School of Optoelectronic Science and Engineering, Soochow University, Suzhou, China
Xiaojun Zhang: School of Optoelectronic Science and Engineering, Soochow University, Suzhou, China
Zhi Tao: School of Optoelectronic Science and Engineering, Soochow University, Suzhou, China

DOI: https://doi.org/10.1109/ACCESS.2020.3046767
Journal volume & issue: Vol. 9
pp. 30 – 44

Abstract

Automatic voice pathology detection can provide objective assessment and support prevention in the early stages of voice diseases. The glottal flow waveform directly reflects the state of glottal excitation, so extracting acoustic features from glottal source signals may contribute to the detection of pathological voice. To improve the performance of voice pathology detection, this article investigates the contribution of the glottal flow waveform by comparing classification results obtained with features extracted from raw speech utterances and from the corresponding glottal flow waveforms. The individual feature sets are extracted from raw or glottal voice utterances with identical parameter settings, namely openSMILE acoustic features, audio features computed according to the Moving Picture Experts Group-7 (MPEG-7) standard, and classical glottal source features. In addition, a wrapper-based feature selection method is used to combine the individual features, which are ranked by the Fisher discrimination ratio. Voice pathology detection experiments were carried out using Random Forest. The best accuracies of 88.52% for the Saarbrücken Voice database and 100.00% for the Massachusetts Eye and Ear Infirmary database are achieved using the combined feature set extracted from the glottal source signal, an improvement of 0.44-3.13% over the accuracies obtained using raw speech utterances. Compared to state-of-the-art methods, the proposed method achieves the highest accuracy for the Massachusetts Eye and Ear Infirmary database and an increase of 2.75-17.16% in detection accuracy over other conventional pipeline systems for the Saarbrücken Voice database. The experimental results demonstrate that using the glottal flow waveform as the source signal can improve the performance of pathological voice detection.
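
The abstract does not spell out how the Fisher-ranked features are combined, so the following is only a minimal sketch of one plausible reading: features are ranked by a two-class Fisher discrimination ratio, (difference of class means)^2 / (sum of class variances), and then added in rank order while a classifier scores each prefix on held-out data. All function names are hypothetical, the data are synthetic, and a nearest-centroid classifier stands in for the Random Forest used in the article so that the example needs only NumPy.

```python
import numpy as np

def fisher_ratio(X, y):
    """Per-feature Fisher discrimination ratio for two classes labelled 0 and 1."""
    a, b = X[y == 0], X[y == 1]
    num = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    den = a.var(axis=0) + b.var(axis=0) + 1e-12  # guard against zero variance
    return num / den

def centroid_accuracy(X_train, y_train, X_test, y_test):
    """Stand-in scorer (nearest class centroid); NOT the article's Random Forest."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = np.linalg.norm(X_test - c0, axis=1)
    d1 = np.linalg.norm(X_test - c1, axis=1)
    pred = (d1 < d0).astype(int)
    return float((pred == y_test).mean())

def ranked_wrapper_selection(X_train, y_train, X_val, y_val, score=centroid_accuracy):
    """Add features in descending Fisher-ratio order; keep the best-scoring prefix."""
    order = np.argsort(fisher_ratio(X_train, y_train))[::-1]
    best_score, best_k = -1.0, 0
    for k in range(1, len(order) + 1):
        cols = order[:k]
        s = score(X_train[:, cols], y_train, X_val[:, cols], y_val)
        if s > best_score:
            best_score, best_k = s, k
    return order[:best_k], best_score

# Synthetic demonstration: five features, only feature 2 separates the classes.
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 5))
X[:, 2] += 3.0 * y
selected, acc = ranked_wrapper_selection(X[:100], y[:100], X[100:], y[100:])
print("selected feature indices:", selected, "validation accuracy:", acc)
```

Passing a different `score` callable, for example one that wraps a Random Forest, would bring the sketch closer to the setup described in the abstract; the ranking and prefix search stay the same.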

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
