IEEE Access (Jan 2023)

Attention Relational Network for Skeleton-Based Group Activity Recognition

  • Chuanchuan Wang,
  • Ahmad Sufril Azlan Mohamed

DOI
https://doi.org/10.1109/ACCESS.2023.3332651
Journal volume & issue
Vol. 11
pp. 129230 – 129239

Abstract

Read online

Group activity recognition is a significant and challenging task in computer vision. The solution of group activity prediction can be classified with traditional hand-crafted features, RGB video features, and skeleton data-based deep learning architectures, such as Graph Convolutional Networks (GCNs), Recurrent Neural Networks (RNNs), and Long Short-Term Memory (LSTMs). However, they rarely explore pose information and rarely use relational networks to reason about group activity behavior. In this work, we leverage minimal prior knowledge about the skeleton information to reason about the interactions from group activity. The objective is to obtain discriminative representations and filter out some ambiguous actions to enhance the performance of group activity recognition. Our contribution is a proposed Attention Relation Network (ARN) that fuses the attention mechanisms and joint vector sequences into the relation network. The skeleton joints vector sequences are previously unexplored pose information and assign greater significance attributed to individuals who are more relevant for distinguishing the group activity behavior. First, our model focuses on the specified edge-level information (encompassing both edge and edge motion data) within the skeleton dataset, considering directionality, to analyze the spatiotemporal aspects of the action. Second, recognizing the inherent motion directionality, we establish diverse directions for skeleton edges and extract distinct motion features (including translation and rotation information) aligned with these various orientations, thereby augmenting the utilization of motion attributes related to the action. We also introduce a representation of human motion achieved by combining relational networks and examining their integrated characteristics. Extensive experiments were tested in the Hockey and UT-interaction datasets to evaluate our method, obtaining competitive performance to the state-of-the-art. Results demonstrate the modeling potential of a skeleton-based method for group activity recognition.

Keywords