Segmentation of Glottal Images from High-Speed Videoendoscopy Optimized by Synchronous Acoustic Recordings

Bartosz Kopczynski; Ewa Niebudek-Bogusz; Wioletta Pietruszewska; Pawel Strumillo

doi:10.3390/s22051751

Sensors (Feb 2022)

Affiliations

Bartosz Kopczynski: Institute of Electronics, Lodz University of Technology, 90-924 Lodz, Poland
Ewa Niebudek-Bogusz: Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, 90-001 Lodz, Poland
Wioletta Pietruszewska: Department of Otolaryngology, Head and Neck Oncology, Medical University of Lodz, 90-001 Lodz, Poland
Pawel Strumillo: Institute of Electronics, Lodz University of Technology, 90-924 Lodz, Poland

DOI: https://doi.org/10.3390/s22051751
Journal volume & issue: Vol. 22, no. 5, p. 1751

Abstract

Laryngeal high-speed videoendoscopy (LHSV) is an imaging technique that visualizes the vibratory activity of the vocal folds in novel detail. However, most image analysis methods require intervention by medical personnel and access to ground truth annotations to detect the vocal fold edges accurately. Our fully automatic method combines video and acoustic data recorded synchronously during laryngeal endoscopy. We show that the segmentation of the glottal area can be optimized by matching the Fourier spectrum of the pre-processed video to the spectrum of the acoustic recording made during phonation of the sustained vowel /i:/. We verify the method on a set of LHSV recordings from subjects with normophonic voices and from patients with voice disorders due to glottal insufficiency. The computed geometric indices of the glottal area make it possible to discriminate between normal and pathological voices: the median Open Quotient and Minimal Relative Glottal Area were 0.69 and 0.06, respectively, for healthy subjects, and 1 and 0.35, respectively, for dysphonic subjects. These results were also validated by independent expert phoniatricians.
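
To make the idea in the abstract concrete, the following is a minimal Python sketch of spectrum matching between a glottal area waveform and an acoustic signal, followed by two geometric indices. It is not the authors' implementation. The function names, the intensity-threshold segmentation, the cosine-similarity score, the 50-1000 Hz comparison band, and the `closed_level` tolerance are all assumptions made for illustration, and the index definitions used here may differ from those in the paper.

```python
import numpy as np


def magnitude_spectrum(signal, sample_rate, freq_grid):
    """Unit-norm magnitude spectrum of a signal, resampled onto freq_grid (Hz)."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the DC component before the FFT
    spec = np.abs(np.fft.rfft(x * np.hanning(x.size)))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / sample_rate)
    spec = np.interp(freq_grid, freqs, spec)
    norm = np.linalg.norm(spec)
    return spec / norm if norm > 0 else spec


def glottal_area_waveform(frames, threshold):
    """Per-frame count of pixels darker than threshold.

    Assumption: the glottis is the dark region of each grayscale frame.
    frames has shape (n_frames, height, width).
    """
    return (np.asarray(frames) < threshold).sum(axis=(1, 2)).astype(float)


def select_threshold(frames, frame_rate, audio, audio_rate, candidates, f_max=1000.0):
    """Pick the candidate threshold whose area-waveform spectrum is most
    similar (cosine similarity, an assumed criterion) to the audio spectrum.

    f_max must stay below half the video frame rate.
    """
    freq_grid = np.linspace(50.0, f_max, 400)
    audio_spec = magnitude_spectrum(audio, audio_rate, freq_grid)
    scores = []
    for t in candidates:
        area = glottal_area_waveform(frames, t)
        area_spec = magnitude_spectrum(area, frame_rate, freq_grid)
        scores.append(float(area_spec @ audio_spec))
    return candidates[int(np.argmax(scores))], scores


def open_quotient(area, closed_level=0.1):
    """Fraction of frames whose area exceeds closed_level * max(area).

    closed_level is a hypothetical tolerance for treating the glottis as closed.
    """
    area = np.asarray(area, dtype=float)
    peak = area.max()
    if peak <= 0:
        return 0.0
    return float(np.mean(area > closed_level * peak))


def minimal_relative_glottal_area(area):
    """Minimum glottal area divided by maximum glottal area."""
    area = np.asarray(area, dtype=float)
    peak = area.max()
    return float(area.min() / peak) if peak > 0 else 0.0
```

In this sketch, a threshold that admits non-glottal regions vibrating at other rates, or that misses the glottis entirely, yields an area waveform whose spectrum diverges from the acoustic one and therefore scores lower. An Open Quotient of 1 together with a large minimal relative area corresponds to a glottis that never closes, which is consistent with the values the abstract reports for glottal insufficiency.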

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
