IEEE Access (Jan 2020)

Context-Aware Cross-Attention for Skeleton-Based Human Action Recognition

  • Yanbo Fan,
  • Shuchen Weng,
  • Yong Zhang,
  • Boxin Shi,
  • Yi Zhang

DOI
https://doi.org/10.1109/ACCESS.2020.2968054
Journal volume & issue
Vol. 8
pp. 15280–15290

Abstract


Skeleton-based human action recognition is becoming popular due to its computational efficiency and robustness. Since not all skeleton joints are informative for action recognition, attention mechanisms are adopted to extract informative joints and suppress the influence of irrelevant ones. However, existing attention frameworks usually ignore helpful scenario context information. In this paper, we propose a cross-attention module for skeleton-based action recognition that consists of a self-attention branch and a cross-attention branch. It helps to extract joints that are not only more informative but also highly correlated with the corresponding scenario context information. Moreover, the cross-attention module preserves the size of its inputs and can be flexibly incorporated into many existing frameworks without altering their behavior. To facilitate end-to-end training, we further develop a scenario context extraction branch that extracts context information directly from raw RGB video. We conduct comprehensive experiments on the NTU RGB+D and Kinetics databases, and the results demonstrate the correctness and effectiveness of the proposed model.
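To make the described module structure concrete, below is a minimal PyTorch sketch of a size-preserving attention block with a self-attention branch and a cross-attention branch conditioned on a scenario context vector, following only what the abstract states. The layer choices, the bilinear scoring, the summation fusion, and all dimension names are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a self-/cross-attention module whose output keeps the
# input's shape, so it can be dropped into an existing skeleton pipeline.
# All design details below are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossAttentionModule(nn.Module):
    def __init__(self, joint_dim: int, context_dim: int):
        super().__init__()
        # Self-attention branch: scores each joint from its own feature.
        self.self_score = nn.Linear(joint_dim, 1)
        # Cross-attention branch: scores each joint against the scenario
        # context vector (e.g. extracted from raw RGB video) via a bilinear form.
        self.cross_score = nn.Bilinear(joint_dim, context_dim, 1)

    def forward(self, joints: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # joints:  (batch, num_joints, joint_dim) skeleton joint features
        # context: (batch, context_dim) scenario context feature
        b, n, _ = joints.shape

        # Self-attention weights over the joints.
        self_w = F.softmax(self.self_score(joints).squeeze(-1), dim=1)          # (b, n)

        # Cross-attention weights conditioned on the context vector.
        ctx = context.unsqueeze(1).expand(b, n, -1).contiguous()                # (b, n, context_dim)
        cross_w = F.softmax(self.cross_score(joints, ctx).squeeze(-1), dim=1)   # (b, n)

        # Fuse the two branches and re-weight the joints; the output keeps the
        # (batch, num_joints, joint_dim) shape of the input.
        weights = 0.5 * (self_w + cross_w)                                      # (b, n)
        return joints * weights.unsqueeze(-1)


# Usage sketch with hypothetical dimensions:
# module = CrossAttentionModule(joint_dim=256, context_dim=512)
# out = module(torch.randn(4, 25, 256), torch.randn(4, 512))  # out: (4, 25, 256)
```

Because the fused weights only re-scale the joint features, the module's output matches its input size, which is the property the abstract relies on for flexible incorporation into existing frameworks.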

Keywords