Applied Sciences (Sep 2020)

Gesture Recognition Based on 3D Human Pose Estimation and Body Part Segmentation for RGB Data Input

  • Ngoc-Hoang Nguyen,
  • Tran-Dac-Thinh Phan,
  • Guee-Sang Lee,
  • Soo-Hyung Kim,
  • Hyung-Jeong Yang

DOI
https://doi.org/10.3390/app10186188
Journal volume & issue
Vol. 10, no. 18
p. 6188

Abstract

This paper presents a novel approach for dynamic gesture recognition using multiple features extracted from RGB data input. Most challenges in gesture recognition stem from the presence of multiple actors in the scene, occlusions, and viewpoint variations. In this paper, we develop a hybrid deep learning approach to gesture recognition in which RGB frames, 3D skeleton joint information, and body part segmentation are combined to overcome such problems. The multimodal input observations are extracted from the RGB images and processed by stream networks suited to the different input modalities: residual 3D convolutional neural networks based on the ResNet architecture (3DCNN_ResNet) for the RGB image and color body part segmentation modalities, and a long short-term memory network (LSTM) for the 3D skeleton joint modality. We evaluated the proposed model on four public datasets: the UTD multimodal human action dataset, the gaming 3D dataset, the NTU RGB+D dataset, and the MSRDailyActivity3D dataset. The experimental results on these datasets demonstrate the effectiveness of our approach.
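The abstract describes combining per-modality stream networks (3DCNN_ResNet for RGB and body-part segmentation, LSTM for skeleton joints) into a single prediction. As an illustration only, the sketch below shows one common way such streams can be combined: score-level (late) fusion, where each stream's class scores are turned into probabilities and averaged. The function and weights are hypothetical; the paper's actual fusion scheme is not specified in the abstract.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over class scores.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def fuse_streams(rgb_scores, seg_scores, skel_scores, weights=(1.0, 1.0, 1.0)):
    # Illustrative score-level (late) fusion of three modality streams:
    # convert each stream's class scores to probabilities, then take a
    # weighted average. Weights are hypothetical tuning parameters.
    probs = [softmax(s) for s in (rgb_scores, seg_scores, skel_scores)]
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * p for wi, p in zip(w, probs))

# Toy example: class scores for 5 gesture classes from each stream.
rgb = np.array([2.0, 0.1, 0.3, -1.0, 0.5])   # e.g. 3DCNN_ResNet on RGB frames
seg = np.array([1.5, 0.2, 0.1, -0.5, 0.4])   # e.g. 3DCNN_ResNet on segmentation
skel = np.array([0.8, 0.0, 1.9, -0.2, 0.3])  # e.g. LSTM on skeleton joints
fused = fuse_streams(rgb, seg, skel)
pred = int(np.argmax(fused))  # index of the predicted gesture class
```

In this toy case the RGB and segmentation streams agree on class 0 and outvote the skeleton stream, which favors class 2; weighting the streams differently would shift that balance.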

Keywords