Jisuanji kexue (Jan 2022)

Multi-scale Gated Graph Convolutional Network for Skeleton-based Action Recognition

  • GAN Chuang, WU Gui-xing, ZHAN Qing-yuan, WANG Peng-kun, PENG Zhi-lei

DOI
https://doi.org/10.11896/jsjkx.201100164
Journal volume & issue
Vol. 49, no. 1
pp. 181 – 186

Abstract

Skeleton-based human action recognition is attracting increasing attention in computer vision. Recently, graph convolutional networks (GCNs), which are powerful at modeling non-Euclidean structured data, have achieved promising performance and enabled a new paradigm for action recognition. Existing approaches mostly model spatial dependencies with an emphasis mechanism, since the huge pre-defined graph contains a large amount of noise. However, simply emphasizing subsets is not optimal for reflecting the dynamic underlying correlations between vertices in a global manner. Furthermore, these methods are ineffective at capturing temporal dependencies, as CNNs and RNNs are not capable of modeling the intricate multi-range temporal relations. To address these issues, a multi-scale gated graph convolutional network (MSG-GCN) is proposed for skeleton-based action recognition. Specifically, a gated temporal convolution module (G-TCM) is presented to capture both consecutive short-term and interval long-term dependencies between vertices in the temporal domain. In addition, a multi-dimensional attention module covering the spatial, temporal, and channel dimensions, which enhances the expressiveness of the spatial graph, is integrated into the GCNs with negligible overhead. Extensive experiments on two large-scale benchmark datasets, NTU-RGB+D and Kinetics, demonstrate that the proposed approach outperforms state-of-the-art baselines.
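
The abstract does not specify how the G-TCM is built, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that "consecutive short-term" dependencies correspond to a temporal convolution with dilation 1, that "interval long-term" dependencies correspond to a dilated temporal convolution, and that the gate is a tanh/sigmoid product. All function names, shapes, and the dilation values are hypothetical.

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """Same-length 1D convolution along the time axis.

    x: (T, C_in) feature sequence of a single joint.
    w: (K, C_in, C_out) kernel; K is assumed to be odd.
    Zero padding keeps the output length equal to T.
    """
    T = x.shape[0]
    K = w.shape[0]
    pad = dilation * (K - 1) // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros((T, w.shape[2]))
    for k in range(K):
        # Tap k reads the frames offset by k * dilation in the padded input.
        out += xp[k * dilation : k * dilation + T] @ w[k]
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_temporal_block(x, params, dilations=(1, 4)):
    """Hypothetical gated multi-scale temporal block (an assumption).

    Each branch applies a filter convolution and a gate convolution at one
    dilation, multiplies tanh(filter) by sigmoid(gate), and the branches are
    concatenated along the channel axis.
    """
    outs = []
    for d, (w_f, w_g) in zip(dilations, params):
        f = np.tanh(dilated_conv1d(x, w_f, d))   # candidate features
        g = sigmoid(dilated_conv1d(x, w_g, d))   # gate values in (0, 1)
        outs.append(f * g)
    return np.concatenate(outs, axis=1)

# Toy usage: 30 frames, 3 input channels (e.g. x, y, z joint coordinates).
rng = np.random.default_rng(0)
T, C_in, C_out, K = 30, 3, 8, 3
x = rng.standard_normal((T, C_in))
params = [
    (0.1 * rng.standard_normal((K, C_in, C_out)),
     0.1 * rng.standard_normal((K, C_in, C_out)))
    for _ in range(2)
]
y = gated_temporal_block(x, params)
print(y.shape)  # (30, 16)
```

With a kernel size of 3, the dilation-1 branch looks at adjacent frames, while the dilation-4 branch looks at frames four steps apart, which is one simple way to mix short-range and long-range temporal context in a single block.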

Keywords