PLoS ONE (Jan 2013)

Modeling consonant-vowel coarticulation for articulatory speech synthesis.

  • Peter Birkholz

DOI
https://doi.org/10.1371/journal.pone.0060603
Journal volume & issue
Vol. 8, no. 4
p. e60603

Abstract

Read online

A central challenge for articulatory speech synthesis is the simulation of realistic articulatory movements, which is critical for the generation of highly natural and intelligible speech. This includes modeling coarticulation, i.e., the context-dependent variation of the articulatory and acoustic realization of phonemes, especially of consonants. Here we propose a method to simulate the context-sensitive articulation of consonants in consonant-vowel syllables. To achieve this, the vocal tract target shape of a consonant in the context of a given vowel is derived as the weighted average of three measured and acoustically-optimized reference vocal tract shapes for that consonant in the context of the corner vowels /a/, /i/, and /u/. The weights are determined by mapping the target shape of the given context vowel into the vowel subspace spanned by the corner vowels. The model was applied for the synthesis of consonant-vowel syllables with the consonants /b/, /d/, /g/, /l/, /r/, /m/, /n/ in all combinations with the eight long German vowels. In a perception test, the mean recognition rate for the consonants in the isolated syllables was 82.4%. This demonstrates the potential of the approach for highly intelligible articulatory speech synthesis.