Jisuanji kexue yu tansuo (Mar 2021)

Review of Human Action Recognition Based on Deep Learning

  • QIAN Huifang, YI Jianping, FU Yunhu

DOI
https://doi.org/10.3778/j.issn.1673-9418.2009095
Journal volume & issue
Vol. 15, no. 3
pp. 438 – 455

Abstract

Human action recognition is an important topic in video understanding, with wide application in video surveillance, human-computer interaction, motion analysis, and video information retrieval. Organized by the characteristics of the backbone network, this paper reviews the latest research results in action recognition from three perspectives: 2D convolutional neural networks, 3D convolutional neural networks, and spatiotemporal decomposition networks, and qualitatively analyzes and compares their advantages and disadvantages. It then summarizes the commonly used action video datasets in two groups, scene-related and temporal-related, with emphasis on the characteristics and usage of each dataset. Next, it introduces the common pre-training strategies in action recognition tasks and analyzes the influence of pre-training techniques on model performance. Finally, starting from the latest research trends, it discusses future directions for action recognition from six perspectives: fine-grained action recognition, streamlined models, few-shot learning, unsupervised learning, adaptive networks, and video super-resolution action recognition.

Keywords