Imposing temporal consistency on deep monocular body shape and pose estimation

Alexandra Zimmer; Anna Hilsmann; Wieland Morgenstern; Peter Eisert

doi:10.1007/s41095-022-0272-x

Computational Visual Media (Oct 2022)

Affiliations

Alexandra Zimmer: Fraunhofer Heinrich-Hertz-Institut
Anna Hilsmann: Fraunhofer Heinrich-Hertz-Institut
Wieland Morgenstern: Fraunhofer Heinrich-Hertz-Institut
Peter Eisert: Fraunhofer Heinrich-Hertz-Institut

DOI: https://doi.org/10.1007/s41095-022-0272-x
Journal volume & issue: Vol. 9, no. 1, pp. 123–139

Abstract

Accurate and temporally consistent modeling of human bodies is essential for a wide range of applications, including character animation, understanding human social behavior, and AR/VR interfaces. Capturing human motion accurately from a monocular image sequence remains challenging, and modeling quality is strongly influenced by the temporal consistency of the captured body motion. Our work presents an approach to integrating temporal constraints during fitting, which increases both temporal consistency and robustness during optimization. Specifically, we derive the parameters of a sequence of body models that represent the shape and motion of a person. We optimize these parameters over the complete image sequence, fitting a single consistent body shape while imposing temporal consistency on the body motion, under the assumption that body joint trajectories are linear over short time intervals. Our approach enables the derivation of realistic 3D body models from image sequences, including jaw pose, facial expression, and articulated hands. Our experiments show that our approach accurately estimates body shape and motion, even for challenging movements and poses. Further, we apply it to sign language analysis, where accurate and temporally consistent motion modeling is essential, and show that the approach is well suited to this kind of application.
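The abstract does not give the exact form of the temporal term. As a rough illustration only, the assumption that joint trajectories are linear over short time intervals can be expressed as a penalty on second-order finite differences of joint positions, which vanish for constant-velocity motion. The sketch below is an assumed formulation, not the authors' implementation; the function name `linear_trajectory_penalty` and the nested-list data layout are hypothetical.

```python
from typing import List, Sequence

Vec3 = Sequence[float]


def linear_trajectory_penalty(joints: List[List[Vec3]]) -> float:
    """Sum of squared second differences of joint positions over time.

    joints[t][j] is the 3D position of joint j in frame t (assumed layout).
    The penalty is zero when every joint moves with constant velocity
    across each window of three consecutive frames.
    """
    total = 0.0
    for t in range(1, len(joints) - 1):
        for prev, cur, nxt in zip(joints[t - 1], joints[t], joints[t + 1]):
            for a, b, c in zip(prev, cur, nxt):
                # Discrete acceleration; zero for a locally linear trajectory.
                accel = a - 2.0 * b + c
                total += accel * accel
    return total


# One joint moving at constant velocity over five frames.
linear = [[(float(t), 2.0 * t, 0.0)] for t in range(5)]
# The same trajectory with a one-frame jitter in x at frame 2.
jittered = [[(float(t) + (0.5 if t == 2 else 0.0), 2.0 * t, 0.0)] for t in range(5)]

penalty_linear = linear_trajectory_penalty(linear)
penalty_jittered = linear_trajectory_penalty(jittered)
```

In a fitting pipeline, a term of this kind would typically be added with a weight to the per-frame data terms, so that per-frame jitter is penalized while smooth motion is left unchanged. How the published method weights or windows such a term is not stated in the abstract.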

Published in Computational Visual Media

ISSN: 2096-0433 (Print); 2096-0662 (Online)
Publisher: SpringerOpen
Country of publisher: China
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: http://www.springer.com/41095
