Video behavior recognition based on actional-structural graph convolution and temporal extension module

Hui Xu; Jun Kong; Mengyao Liang; Hui Sun; Miao Qi

doi:10.3934/era.2022210

Electronic Research Archive (Sep 2022)

Affiliations

Hui Xu: 1. College of Information Science and Technology, Northeast Normal University, Changchun 130117, China
Jun Kong: 1. College of Information Science and Technology, Northeast Normal University, Changchun 130117, China 2. Key Laboratory of Applied Statistics of MOE, Northeast Normal University, Changchun 130024, China
Mengyao Liang: 1. College of Information Science and Technology, Northeast Normal University, Changchun 130117, China
Hui Sun: 3. Institute for Intelligent Elderly Care, Changchun Humanities and Sciences College, Changchun 130117, China
Miao Qi: 1. College of Information Science and Technology, Northeast Normal University, Changchun 130117, China 2. Key Laboratory of Applied Statistics of MOE, Northeast Normal University, Changchun 130024, China

DOI: https://doi.org/10.3934/era.2022210
Journal volume & issue: Vol. 30, no. 11, pp. 4157–4177

Abstract

Human behavior recognition has long been an active research topic in computer vision. In this paper, we propose a novel video behavior recognition method based on Actional-Structural Graph Convolution and a Temporal Extension Module, built on the framework of a Spatio-Temporal Graph Convolutional Neural Network, which optimizes spatial and temporal features simultaneously. The network consists of three parts: a spatial graph convolution module, a temporal extension module and an attention mechanism module. In the spatial dimension, the actional graph convolution captures correlations between distant joints to obtain rich spatial features, while the structural graph convolution expands the existing skeleton graph to acquire the spatial features of adjacent joints. In the temporal dimension, the sampling range of the temporal graph is expanded so that features are extracted not only from the same joint but also from its adjacent joints in adjacent frames. Furthermore, attention mechanisms are introduced to improve the performance of the method. To verify its effectiveness and accuracy, extensive experiments were carried out on two standard behavior recognition datasets: NTU-RGB+D and Kinetics. Comparative experimental results show that the proposed method achieves better performance.
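The expanded temporal sampling range described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `window` parameter and the 4-joint chain skeleton are hypothetical, and the abstract does not specify how the sampled features are weighted or combined. The sketch only enumerates which (frame, joint) pairs would be sampled for a given joint when both the same joint and its skeleton-adjacent joints are taken from adjacent frames.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Return {joint: set of directly connected joints} from undirected bone edges."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def extended_temporal_neighbors(joint, frame, adj, num_frames, window=1):
    """List the (frame, joint) pairs sampled for `joint` at `frame`.

    Assumption: the extended temporal neighborhood covers the same joint and
    its skeleton-adjacent joints in every frame within `window` steps of
    `frame`, excluding `frame` itself and frames outside the sequence.
    """
    candidates = {joint} | adj[joint]
    neighbors = []
    for dt in range(-window, window + 1):
        t = frame + dt
        if dt == 0 or t < 0 or t >= num_frames:
            continue  # skip the current frame and out-of-range frames
        for j in sorted(candidates):
            neighbors.append((t, j))
    return neighbors


# Hypothetical skeleton: a chain of four joints, 0-1-2-3.
edges = [(0, 1), (1, 2), (2, 3)]
adj = build_adjacency(edges)
print(extended_temporal_neighbors(1, 2, adj, num_frames=5))
```

With a plain temporal connection, joint 1 at frame 2 would be linked only to joint 1 at frames 1 and 3. Under the assumptions above, it is additionally linked to joints 0 and 2 in those frames.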

Published in Electronic Research Archive

ISSN: 2688-1594 (Online)
Publisher: AIMS Press
Country of publisher: United States
LCC subjects: Science: Mathematics; Technology: Technology (General): Industrial engineering. Management engineering: Applied mathematics. Quantitative methods
Website: https://www.aimspress.com/journal/era
