Cross-Channel Graph Convolutional Networks for Skeleton-Based Action Recognition

Jun Xie; Wentian Xin; Ruyi Liu; Lijie Sheng; Xiangzeng Liu; Xuesong Gao; Sheng Zhong; Lei Tang; Qiguang Miao

doi:10.1109/ACCESS.2021.3049808

IEEE Access (Jan 2021)

Affiliations

Jun Xie: ORCiD; School of Computer Science and Technology, Xidian University, Xi’an, China
Wentian Xin: ORCiD; School of Computer Science and Technology, Xidian University, Xi’an, China
Ruyi Liu: ORCiD; School of Computer Science and Technology, Xidian University, Xi’an, China
Lijie Sheng: School of Computer Science and Technology, Xidian University, Xi’an, China
Xiangzeng Liu: ORCiD; School of Computer Science and Technology, Xidian University, Xi’an, China
Xuesong Gao: State Key Laboratory of Digital Multimedia Technology, Hisense Company Ltd., Qingdao, China
Sheng Zhong: School of Information Science and Technology, Northwest University of China, Xi’an, China
Lei Tang: Xi’an Microelectronics Technology Institute, Xi’an, China
Qiguang Miao: ORCiD; School of Computer Science and Technology, Xidian University, Xi’an, China

DOI: https://doi.org/10.1109/ACCESS.2021.3049808
Journal volume & issue: Vol. 9
pp. 9055 – 9065

Abstract

In recent years, graph convolutional networks have achieved remarkable performance in skeleton-based action recognition. In existing works, the features of all nodes in the neighbor set are aggregated into the updated features of the root node, but these features all lie in the same feature channel, determined by the same 1 × 1 convolution filter. This may not be optimal for capturing spatial features among adjacent vertices. In addition, existing methods rarely investigate how feature channels that are independent of the current action affect model performance. In this paper, we propose cross-channel graph convolutional networks for skeleton-based action recognition. The feature fusion mechanism in our network is cross-channel, i.e., the updated feature of the root node is derived from different feature channels. Because different feature channels come from different 1 × 1 convolution filters, the cross-channel fusion mechanism significantly improves the ability of the model to capture local features among adjacent vertices. Moreover, by introducing a channel attention mechanism into our model, we suppress the influence of feature channels unrelated to the current action, which improves the robustness of the model against such channels. Extensive experiments on two large-scale datasets, NTU-RGB+D and Kinetics-Skeleton, demonstrate that the performance of our model exceeds that of current mainstream methods.
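
The contrast the abstract draws between same-channel and cross-channel aggregation, and the idea of gating channels with attention, can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the abstract does not give the exact fusion rule, so the rotating channel offset in `cross_channel_aggregate`, the single-layer sigmoid gate in `channel_attention`, and all function and parameter names are assumptions chosen only for illustration.

```python
import math

def conv1x1(x, w):
    """1x1 convolution on per-joint features.
    x: [V][C_in] joint features, w: [C_out][C_in] filters -> [V][C_out]."""
    return [[sum(wi * xi for wi, xi in zip(filt, feat)) for filt in w]
            for feat in x]

def same_channel_aggregate(h, neighbors):
    """Baseline: channel c of the root node averages channel c of each
    node in its neighbor set (all contributions share one filter)."""
    channels = len(h[0])
    return [[sum(h[u][c] for u in neighbors[v]) / len(neighbors[v])
             for c in range(channels)]
            for v in range(len(h))]

def cross_channel_aggregate(h, neighbors):
    """Hypothetical cross-channel rule (assumption, not from the paper):
    the k-th neighbor contributes channel (c + k) mod C, so the root's
    channel c mixes responses of different 1x1 filters."""
    channels = len(h[0])
    return [[sum(h[u][(c + k) % channels]
                 for k, u in enumerate(neighbors[v])) / len(neighbors[v])
             for c in range(channels)]
            for v in range(len(h))]

def channel_attention(h, w_gate):
    """Assumed squeeze-and-excitation-style gate: average each channel over
    all joints, map through one linear layer and a sigmoid, then rescale."""
    num_nodes, channels = len(h), len(h[0])
    squeeze = [sum(h[v][c] for v in range(num_nodes)) / num_nodes
               for c in range(channels)]
    gate = [1.0 / (1.0 + math.exp(-sum(w * s for w, s in zip(row, squeeze))))
            for row in w_gate]
    return [[gate[c] * h[v][c] for c in range(channels)]
            for v in range(num_nodes)]

# Toy skeleton: a 3-joint chain with self-loops, 2 feature channels.
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]   # identity filters keep the toy readable
h = conv1x1(x, identity)

baseline = same_channel_aggregate(h, neighbors)
crossed = cross_channel_aggregate(h, neighbors)
gated = channel_attention(crossed, [[0.0, 0.0], [0.0, 0.0]])
```

With the identity filters above, joint 0 receives `[0.5, 0.5]` from the baseline but `[1.0, 0.0]` from the cross-channel variant, because its second neighbor contributes a different channel. A zero gate matrix yields a uniform gate of 0.5; in a trained model the gate weights would be learned so that action-irrelevant channels are scaled toward zero.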

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
