Deep Learning for Neuromuscular Control of Vocal Source for Voice Production

Anil Palaparthi; Rishi K. Alluri; Ingo R. Titze

doi:10.3390/app14020769

Applied Sciences (Jan 2024)

Affiliations

Anil Palaparthi: Utah Center for Vocology, University of Utah, Salt Lake City, UT 84112, USA
Rishi K. Alluri: School of Biological Sciences, University of Utah, Salt Lake City, UT 84112, USA
Ingo R. Titze: Utah Center for Vocology, University of Utah, Salt Lake City, UT 84112, USA

DOI: https://doi.org/10.3390/app14020769
Journal volume & issue: Vol. 14, no. 2, p. 769

Abstract

A computational neuromuscular control system that generates lung pressure and three intrinsic laryngeal muscle activations (cricothyroid, thyroarytenoid, and lateral cricoarytenoid) to control the vocal source was developed. In the current study, LeTalker, a biophysical computational model of the vocal system, was used as the physical plant. In the LeTalker, a three-mass vocal fold model was used to simulate self-sustained vocal fold oscillation. A constant /ə/ vowel was used for the vocal tract shape. The trachea was modeled after MRI measurements. The neuromuscular control system generates control parameters to achieve four acoustic targets (fundamental frequency, sound pressure level, normalized spectral centroid, and signal-to-noise ratio) and four somatosensory targets (vocal fold length and the longitudinal fiber stress in each of the three vocal fold layers). The deep-learning-based control system comprises one acoustic feedforward controller and two feedback (acoustic and somatosensory) controllers. Fifty thousand steady speech signals were generated using the LeTalker for training the control system. The results demonstrated that the control system was able to generate the lung pressure and the three muscle activations such that the four acoustic and four somatosensory targets were reached with high accuracy. After training, the motor command corrections from the feedback controllers were minimal compared to those from the feedforward controller, except for the thyroarytenoid muscle activation.

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
