EURASIP Journal on Audio, Speech, and Music Processing (Jan 2009)

Lip-Synching Using Speaker-Specific Articulation, Shape and Appearance Models

  • Gaspard Breton,
  • Frédéric Elisei,
  • Oxana Govokhina,
  • Gérard Bailly

DOI: https://doi.org/10.1155/2009/769494
Journal volume & issue: Vol. 2009

Abstract


We describe the control, shape, and appearance models that are built using an original photogrammetric method to capture characteristics of speaker-specific facial articulation, anatomy, and texture. Two original contributions are put forward: a trainable trajectory formation model that predicts articulatory trajectories of a talking face from phonetic input, and a texture model that computes a texture for each 3D facial shape according to articulation. Using motion capture data from different speakers and module-specific evaluation procedures, we show that this cloning system restores detailed idiosyncrasies and the global coherence of visible articulation. Results of a subjective evaluation of the global system against competing trajectory formation models are also presented and discussed.
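To make the pipeline described in the abstract concrete, the following is a minimal sketch, assuming a linear articulation-to-shape model, a linear articulation-to-texture model, and a crude target-plus-smoothing stand-in for the trainable trajectory formation model. All names, dimensions, phoneme targets, and the smoothing scheme are illustrative assumptions, not the authors' actual speaker-specific models.

```python
# Hypothetical lip-synching pipeline sketch: phonetic input -> articulatory
# trajectory -> 3D shape + texture per frame. Models here are random stand-ins
# for ones that would be learned from motion-capture / photogrammetric data.
import numpy as np

N_ART = 6            # number of articulatory parameters (assumed)
N_VERT = 300         # number of 3D mesh vertices (assumed)
TEX_DIM = 64 * 64 * 3  # flattened texture size (assumed)

rng = np.random.default_rng(0)

# Speaker-specific linear shape and appearance models (placeholders).
mean_shape = rng.normal(size=3 * N_VERT)
shape_basis = rng.normal(size=(3 * N_VERT, N_ART))            # articulation -> shape
mean_texture = rng.uniform(size=TEX_DIM)
texture_basis = rng.normal(scale=0.01, size=(TEX_DIM, N_ART))  # articulation -> texture

# Per-phoneme articulatory targets (illustrative).
phone_targets = {p: rng.normal(scale=0.5, size=N_ART) for p in ["sil", "b", "a", "m"]}

def trajectory(phones, durations_ms, frame_ms=40, smooth=5):
    """Piecewise-constant targets per phoneme followed by moving-average
    smoothing: a simplistic stand-in for a trainable trajectory formation model."""
    frames = []
    for p, dur in zip(phones, durations_ms):
        frames += [phone_targets[p]] * max(1, round(dur / frame_ms))
    traj = np.array(frames)                      # (n_frames, N_ART)
    kernel = np.ones(smooth) / smooth
    return np.apply_along_axis(
        lambda col: np.convolve(col, kernel, mode="same"), 0, traj)

def render(articulation):
    """Map one articulatory frame to a 3D shape and a texture via linear models."""
    shape = (mean_shape + shape_basis @ articulation).reshape(N_VERT, 3)
    texture = np.clip(mean_texture + texture_basis @ articulation, 0.0, 1.0)
    return shape, texture

if __name__ == "__main__":
    art = trajectory(["sil", "b", "a", "m", "sil"], [120, 80, 160, 100, 120])
    for frame in art:
        shape, tex = render(frame)   # a real system would rasterize shape + texture
    print(f"{len(art)} frames, shape {shape.shape}, texture {tex.shape}")
```

The sketch only illustrates the data flow (phonetic input to per-frame shape and texture); the paper's contribution lies in training these modules on speaker-specific motion-capture and photogrammetric data so that idiosyncratic articulation and appearance are preserved.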