Human Action Recognition of Spatiotemporal Parameters for Skeleton Sequences Using MTLN Feature Learning Framework

Faisal Mehmood; Enqing Chen; Muhammad Azeem Akbar; Abeer Abdulaziz Alsanad

doi:10.3390/electronics10212708

Electronics (Nov 2021)

Affiliations

Faisal Mehmood: School of Information Engineering, Zhengzhou University, No. 100 Science Avenue, Zhengzhou 450001, China
Enqing Chen: School of Information Engineering, Zhengzhou University, No. 100 Science Avenue, Zhengzhou 450001, China
Muhammad Azeem Akbar: Department of Software Engineering, Lappeenranta-Lahti University of Technology, 53851 Lappeenranta, Finland
Abeer Abdulaziz Alsanad: College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11623, Saudi Arabia

DOI: https://doi.org/10.3390/electronics10212708
Journal volume & issue: Vol. 10, no. 21, p. 2708

Abstract

Human action recognition (HAR) from skeleton data is considered a promising research area in computer vision. Three-dimensional HAR with skeleton data is widely used because it is both effective and efficient. Several models have been developed for learning spatiotemporal parameters from skeleton sequences. However, two critical problems remain: (1) earlier methods built skeleton sequences by connecting the joints in a static order; (2) earlier methods were not efficient enough at focusing on the most valuable joints. Specifically, this study aimed to (1) demonstrate the ability of convolutional neural networks to learn spatiotemporal parameters of skeleton sequences from different frames of a human action, and (2) process all frames generated from an action jointly, incorporating the spatial structure information necessary for action recognition, using multi-task learning networks (MTLNs). Evaluated on the NTU RGB+D, SYSU, and SBU Kinect Interaction datasets, the proposed model significantly improved on existing models. We further applied the model to noisy estimated poses from subsets of the Kinetics and UCF101 datasets, where it also showed significant improvement.

Published in Electronics

ISSN: 2079-9292 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics
Website: http://www.mdpi.com/journal/electronics
