3D Capsule Hand Pose Estimation Network Based on Structural Relationship Information

Yiqi Wu; Shichao Ma; Dejun Zhang; Jun Sun

doi:10.3390/sym12101636

Symmetry (Oct 2020)

Affiliations

Yiqi Wu: School of Computer, China University of Geosciences, Wuhan 430074, China
Shichao Ma: School of Computer, China University of Geosciences, Wuhan 430074, China
Dejun Zhang: School of Geography and Information Engineering, China University of Geosciences, Wuhan 430074, China
Jun Sun: College of Information and Engineering, Sichuan Agricultural University, Ya’an 625014, China

DOI: https://doi.org/10.3390/sym12101636
Journal volume & issue: Vol. 12, no. 10, p. 1636

Abstract


Hand pose estimation from 3D data is a key challenge in computer vision and an essential step for human–computer interaction. Many deep learning-based hand pose estimation methods have made significant progress but give little consideration to the internal interactions of the input data, especially when consuming hand point clouds. This paper therefore proposes an end-to-end capsule-based hand pose estimation network (Capsule-HandNet), which processes hand point clouds directly while taking into account structural relationships among local parts, such as symmetry, junction, and relative location. First, an encoder in Capsule-HandNet extracts multi-level features into a latent capsule by dynamic routing. The latent capsule explicitly represents the structural relationship information of the hand point cloud. Then, a decoder recovers a point cloud from the latent capsule to fit the input hand point cloud. This auto-encoder procedure is designed to ensure the effectiveness of the latent capsule. Finally, the hand pose is regressed from a combined feature consisting of the global feature and the latent capsule. Capsule-HandNet is evaluated on public hand pose datasets under the mean error and fraction of frames metrics. Its mean joint errors on the MSRA and ICVL datasets are 8.85 mm and 7.49 mm, respectively, and it outperforms state-of-the-art methods at most thresholds under the fraction of frames metric. The experimental results demonstrate the effectiveness of Capsule-HandNet for 3D hand pose estimation.
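The abstract names two evaluation metrics without defining them. The sketch below illustrates the definitions commonly used in the hand pose literature, which are assumed here and not taken from the paper: the mean joint error is the average Euclidean distance between predicted and ground-truth joints, and the fraction of frames is the share of frames whose worst joint error stays within a threshold. The function names, the inclusive threshold comparison, and the toy data are all hypothetical.

```python
import math


def joint_errors(pred, gt):
    """Per-joint Euclidean distances (in mm) for a single frame."""
    return [math.dist(p, g) for p, g in zip(pred, gt)]


def mean_joint_error(pred_frames, gt_frames):
    """Average joint error over all joints and all frames."""
    errors = [
        e
        for pred, gt in zip(pred_frames, gt_frames)
        for e in joint_errors(pred, gt)
    ]
    return sum(errors) / len(errors)


def fraction_of_frames(pred_frames, gt_frames, threshold_mm):
    """Share of frames whose maximum joint error is within the threshold.

    Assumption: the comparison is inclusive (<=); some evaluations use a
    strict inequality instead.
    """
    worst = [
        max(joint_errors(pred, gt))
        for pred, gt in zip(pred_frames, gt_frames)
    ]
    return sum(1 for w in worst if w <= threshold_mm) / len(worst)


# Toy data (hypothetical): 2 frames, 2 joints each, coordinates in mm.
gt_frames = [
    [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 10.0, 0.0)],
]
pred_frames = [
    [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0)],   # joint errors: 5 mm, 0 mm
    [(0.0, 0.0, 0.0), (0.0, 10.0, 12.0)],  # joint errors: 0 mm, 12 mm
]

print(mean_joint_error(pred_frames, gt_frames))          # 4.25
print(fraction_of_frames(pred_frames, gt_frames, 10.0))  # 0.5
```

Under these definitions the fraction of frames is the stricter metric: a single badly estimated joint fails the whole frame, which is why it is usually reported as a curve over a range of thresholds instead of a single number.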

Published in Symmetry

ISSN: 2073-8994 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics
Website: http://www.mdpi.com/journal/symmetry/
