Time Coherent Full-Body Poses Estimated Using Only Five Inertial Sensors: Deep versus Shallow Learning

Frank J. Wouda; Matteo Giuberti; Nina Rudigkeit; Bert-Jan F. van Beijnum; Mannes Poel; Peter H. Veltink

doi:10.3390/s19173716

Sensors (Aug 2019)

Affiliations

Frank J. Wouda: Department of Biomedical Signals & Systems, Technical Medical Centre, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
Matteo Giuberti: RADiCAL Solutions, LLC. 125 West 31st Street, New York, NY 10001, USA
Nina Rudigkeit: Xsens Technologies B.V., Pantheon 6a, 7521 PR Enschede, The Netherlands
Bert-Jan F. van Beijnum: Department of Biomedical Signals & Systems, Technical Medical Centre, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
Mannes Poel: Department of Computer Science, Faculty of Electrical Engineering, Mathematics & Computer Science, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
Peter H. Veltink: Department of Biomedical Signals & Systems, Technical Medical Centre, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands

Journal volume & issue: Vol. 19, no. 17, p. 3716

Abstract

Full-body motion capture typically requires sensors/markers to be placed on each rigid body segment, which results in long setup times and is obtrusive. The number of sensors/markers can be reduced using deep learning or offline methods. However, these methods require large training datasets and/or sufficient computational resources. Therefore, we investigate the following research question: “What is the performance of a shallow approach, compared to a deep learning one, for estimating time coherent full-body poses using only five inertial sensors?” We propose to incorporate past/future inertial sensor information into a stacked input vector, which is fed to a shallow neural network for estimating full-body poses. Shallow and deep learning approaches are compared using the same input vector configurations. Additionally, the inclusion of acceleration input is evaluated. The results show that a shallow learning approach can estimate full-body poses with an accuracy (~6 cm) similar to that of a deep learning approach (~7 cm). However, the jerk errors are smaller with the deep learning approach, which may be an effect of its explicit recurrent modelling. Furthermore, the delay of the shallow learning approach (72 ms) is shown to be smaller than that of the deep learning approach (117 ms).
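
The stacked input vector mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the window sizes, the per-frame feature layout, or how sequence boundaries are handled, so the function name `stack_frames`, the parameters `n_past` and `n_future`, and the edge padding below are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def stack_frames(features, n_past, n_future):
    """Stack each frame with its neighbours into one input vector.

    features: array of shape (T, D), one row of sensor features per time step.
    Returns an array of shape (T, (n_past + 1 + n_future) * D).
    Boundaries are padded by repeating the first/last frame (an assumption;
    the abstract does not say how edges are treated).
    """
    features = np.asarray(features, dtype=float)
    if features.ndim != 2:
        raise ValueError("features must have shape (T, D)")
    n_steps = features.shape[0]
    # Repeat the first frame n_past times and the last frame n_future times.
    padded = np.pad(features, ((n_past, n_future), (0, 0)), mode="edge")
    # Column block k holds the frame at offset (k - n_past) from the current one.
    windows = [padded[k:k + n_steps] for k in range(n_past + 1 + n_future)]
    return np.concatenate(windows, axis=1)


# Hypothetical example: 4 time steps, 2 features per step.
x = np.arange(8.0).reshape(4, 2)
stacked = stack_frames(x, n_past=1, n_future=1)
```

Each row of `stacked` could then serve as the input to a shallow feed-forward network. In a sketch like this, any `n_future` greater than zero means that the estimate for a frame can only be computed once the following frames have arrived, which is one way such an approach incurs a delay.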

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
