Jisuanji kexue (Jan 2022)
Multi-scale Gated Graph Convolutional Network for Skeleton-based Action Recognition
Abstract
Skeleton-based human action recognition is attracting increasing attention in computer vision. Recently, graph convolutional networks (GCNs), which are powerful for modeling non-Euclidean structured data, have achieved promising performance and enabled a new paradigm for action recognition. Existing approaches mostly model spatial dependencies with emphasis mechanisms, because the large pre-defined graph contains a great deal of noise. However, simply emphasizing subsets is not optimal for reflecting the dynamic underlying correlations between vertices in a global manner. Furthermore, these methods are ineffective at capturing temporal dependencies, as CNNs and RNNs are not capable of modeling the intricate multi-range temporal relations. To address these issues, a multi-scale gated graph convolutional network (MSG-GCN) is proposed for skeleton-based action recognition. Specifically, a gated temporal convolution module (G-TCM) is presented to capture both the consecutive short-term and the interval long-term dependencies between vertices in the temporal domain. In addition, a multi-dimensional attention module covering the spatial, temporal, and channel dimensions, which enhances the expressiveness of the spatial graph, is integrated into the GCNs with negligible overhead. Extensive experiments on two large-scale benchmark datasets, NTU-RGB+D and Kinetics, demonstrate that our approach outperforms the state-of-the-art baselines.
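The abstract does not give the internals of the G-TCM, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes a tanh/sigmoid gate (a common choice in gated temporal convolutions) and uses dilation to stand in for "interval" long-term dependencies, with dilation 1 for consecutive short-term ones. All function names, the kernel size, and the dilation values are hypothetical.

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """Temporal convolution with zero padding that preserves sequence length.

    x: (T, C_in) features of one joint over T frames.
    w: (K, C_in, C_out) kernel with odd K (assumed layout).
    """
    K = w.shape[0]
    assert K % 2 == 1, "this sketch assumes an odd kernel size"
    pad = dilation * (K - 1) // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    T = x.shape[0]
    out = np.zeros((T, w.shape[2]))
    for k in range(K):
        # Tap k looks at frames spaced `dilation` apart.
        out += xp[k * dilation : k * dilation + T] @ w[k]
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_temporal_conv(x, w_filter, w_gate, dilation):
    # Assumed gating form: a tanh "filter" branch scaled by a sigmoid "gate".
    f = np.tanh(dilated_conv1d(x, w_filter, dilation))
    g = sigmoid(dilated_conv1d(x, w_gate, dilation))
    return f * g

def multi_scale_gated(x, params, dilations=(1, 4)):
    # dilation=1: consecutive frames; dilation>1: frames at an interval.
    branches = [
        gated_temporal_conv(x, wf, wg, d)
        for (wf, wg), d in zip(params, dilations)
    ]
    return np.concatenate(branches, axis=-1)

rng = np.random.default_rng(0)
T, C_in, C_out, K = 30, 8, 4, 3
x = rng.standard_normal((T, C_in))
params = [
    (rng.standard_normal((K, C_in, C_out)), rng.standard_normal((K, C_in, C_out)))
    for _ in range(2)
]
y = multi_scale_gated(x, params)
print(y.shape)  # (30, 8): two branches of 4 channels each
```

Concatenating the branches is one simple way to combine scales; summation or a learned fusion would be equally plausible under the abstract's description.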
Keywords