Skeleton‐based attention‐aware spatial–temporal model for action detection and recognition

Ran Cui; Aichun Zhu; Jingran Wu; Gang Hua

doi:10.1049/iet-cvi.2019.0751

IET Computer Vision (Aug 2020)

Affiliations

Ran Cui: School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221008, People's Republic of China
Aichun Zhu: School of Computer Science and Technology, Nanjing Tech University, Nanjing 211800, People's Republic of China
Jingran Wu: Department of Information and Electrical Engineering, Xuhai College, China University of Mining and Technology, Xuzhou 221008, People's Republic of China
Gang Hua: School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221008, People's Republic of China

DOI: https://doi.org/10.1049/iet-cvi.2019.0751
Journal volume & issue: Vol. 14, no. 5
pp. 177 – 184

Abstract

Action detection and recognition are popular research subjects in computer vision. Action detection can be regarded as the combination of action location and recognition. Action features derived from the human skeleton are robust to external factors and require little computation. This study proposes a skeleton‐based action analysis model built on a recurrent neural network framework. The model learns action features by modelling the static and dynamic features of skeleton joints, and it learns the importance of different video frames through an attention module. For action location, a conditional random field loss function is introduced to establish the context dependency of output labels. For action recognition, a hierarchical training mechanism with triple loss models action features at coarse‐grained and fine‐grained levels. The authors’ proposed method delivers state‐of‐the‐art results on action location and recognition tasks.

Published in IET Computer Vision

ISSN: 1751-9632 (Print); 1751-9640 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: https://ietresearch.onlinelibrary.wiley.com/journal/17519640
