IEEE Access (Jan 2024)
DMA-SGCN for Video Motion Recognition: A Tool for Advanced Sports Analysis
Abstract
Video motion recognition plays a crucial role in advanced sports analysis. With video motion recognition, sports analytics has become more data-driven and result-oriented, significantly enhancing professionalism and efficiency in the sports domain. Over the years, the accuracy of skeleton-based motion recognition algorithms using Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (ConvNets) has plateaued on the relevant datasets, struggling to achieve further breakthroughs. This is partly because RNNs lack sufficient capability to model spatial structural features, and while ConvNets can alleviate the difficulty of modeling spatial structure, converting skeleton sequences into RGB pseudo-images inherently causes some information loss. Moreover, ConvNets are not particularly adept at modeling temporal features. Since the relationships between joints in the human skeleton are better represented as a graph, Graph Convolutional Network (GCN)-based skeleton motion recognition methods have attracted growing attention. Recent advances in the Shift Graph Convolutional Network (Shift-GCN) have enhanced the expressiveness of spatial graphs while improving the flexibility of spatial and temporal graph receptive fields. To further enhance this flexibility, we propose the Dynamic Motion-Aware Shift Graph Convolutional Network (DMA-SGCN) for video motion recognition. Specifically, we introduce a data-driven method to represent the relationships between joints. By analyzing the attributes of different motions and combining them with the natural connectivity of the human skeleton, we compute joint affinities through representation learning. This approach not only improves the accuracy of the defined joint relationships but also increases awareness of the joint associations relevant to each motion behavior.
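The learned joint-affinity idea described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: it assumes each joint has a learned embedding vector, builds a row-stochastic affinity matrix from pairwise similarities, and blends it with the (row-normalized) natural skeleton adjacency via a hypothetical mixing weight `alpha`.

```python
import numpy as np

def learned_affinity(embeddings, skeleton_adj, alpha=0.5):
    """Blend a learned joint-affinity matrix with the natural skeleton
    adjacency (illustrative sketch, not the published DMA-SGCN code).

    embeddings:   (J, d) array, one learned vector per joint.
    skeleton_adj: (J, J) row-normalized natural adjacency matrix.
    alpha:        hypothetical mixing weight between the two sources.
    """
    logits = embeddings @ embeddings.T               # (J, J) pairwise similarity
    logits -= logits.max(axis=1, keepdims=True)      # softmax numerical stability
    affinity = np.exp(logits)
    affinity /= affinity.sum(axis=1, keepdims=True)  # row-stochastic affinities
    return alpha * affinity + (1 - alpha) * skeleton_adj
```

Because both inputs are row-stochastic, the blended matrix remains row-stochastic, so it can directly replace a fixed adjacency in a graph convolution.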
Furthermore, we dynamically use the adjacency matrix of the skeletal data to guide feature transfer between joints. This topology-aware approach enables more effective shift transformations based on joint associations, addressing the rigid receptive fields of previous GCN-based motion recognition methods. Comparison and ablation experiments on the largest 3D motion recognition dataset demonstrate that starting from the skeletal data and the motions themselves enables more accurate mining of the dynamic joint associations involved in individual motion recognition.
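The adjacency-guided shift described above can be sketched as follows. This is a hypothetical variant of the shift operation, not the paper's exact implementation: each joint's channels are split into groups, and every group after the first is replaced by the same channel group taken from one of that joint's strongest neighbors under the adjacency matrix, so the receptive field follows the graph topology rather than a fixed channel-shift pattern.

```python
import numpy as np

def graph_shift(x, adj):
    """Adjacency-guided channel shift over joint features (illustrative
    sketch; the grouping scheme and neighbor count are assumptions).

    x:   (J, C) feature matrix, one row per joint.
    adj: (J, J) joint-association matrix; larger value = stronger link.
    """
    J, C = x.shape
    k = min(J - 1, 3)                       # neighbor groups (assumed fixed)
    group = C // (k + 1)                    # channels per group
    out = x.copy()
    for j in range(J):
        order = np.argsort(-adj[j])         # strongest associations first
        neighbors = order[order != j][:k]   # top-k neighbors, excluding self
        for g, n in enumerate(neighbors, start=1):
            s = g * group
            out[j, s:s + group] = x[n, s:s + group]  # pull neighbor's channels
    return out
```

Group 0 keeps the joint's own features, so local information is preserved while the remaining channels aggregate information along the learned topology.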
Keywords