IEEE Access (Jan 2019)
Robust Multi-Feature Learning for Skeleton-Based Action Recognition
Abstract
Skeleton-based action recognition has advanced significantly in the past decade. Among deep learning-based action recognition methods, one of the most commonly used structures is a two-stream network. This type of network extracts high-level spatial and temporal features from skeleton coordinates and optical flows, respectively. However, other features, such as the structure of the skeleton or the relations of specific joint pairs, are sometimes ignored, even though these features can also improve action recognition performance. To learn additional low-level skeleton features robustly, this paper introduces an efficient fully convolutional network that processes multiple input features. The network has multiple streams, each with the same encoder-decoder structure: a temporal convolutional network and a co-occurrence convolutional network encode the local and global features, and a convolutional classifier decodes the high-level features to classify the action. Moreover, a novel fusion strategy is proposed to combine independent feature learning with dependent feature relating. Detailed ablation studies confirm the network's robustness to all feature inputs, and combining more features by increasing the number of streams further improves performance. The proposed network is evaluated on three skeleton datasets: NTU RGB+D, Kinetics, and UTKinect. The experimental results demonstrate its effectiveness and its superiority over state-of-the-art methods.
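To make the described architecture concrete, the following is a minimal PyTorch-style sketch of a multi-stream encoder-decoder over skeleton features. The layer sizes, the number of streams, and the score-averaging fusion are illustrative assumptions for exposition only, not the paper's exact configuration.

```python
# Minimal sketch of a multi-stream encoder-decoder for skeleton features.
# All layer sizes, the stream count, and the averaging fusion are assumptions.
import torch
import torch.nn as nn

class StreamEncoderDecoder(nn.Module):
    """One stream: temporal conv encoder -> joint-mixing conv -> conv classifier."""
    def __init__(self, in_channels, num_joints, num_classes):
        super().__init__()
        # Temporal convolution over frames, applied per joint (local features).
        self.temporal = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=(9, 1), padding=(4, 0)),
            nn.ReLU(inplace=True),
        )
        # Convolution across the joint dimension (global, co-occurrence-style features).
        self.cooccurrence = nn.Sequential(
            nn.Conv2d(64, 128, kernel_size=(1, num_joints)),
            nn.ReLU(inplace=True),
        )
        # Convolutional classifier that decodes high-level features into class scores.
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(128, num_classes, kernel_size=1),
        )

    def forward(self, x):                     # x: (batch, channels, frames, joints)
        x = self.temporal(x)
        x = self.cooccurrence(x)
        return self.classifier(x).flatten(1)  # (batch, num_classes)

class MultiStreamNet(nn.Module):
    """Independent streams, one per input feature, fused by averaging class scores."""
    def __init__(self, feature_channels, num_joints=25, num_classes=60):
        super().__init__()
        self.streams = nn.ModuleList(
            StreamEncoderDecoder(c, num_joints, num_classes) for c in feature_channels
        )

    def forward(self, features):              # list of tensors, one per stream
        scores = [stream(f) for stream, f in zip(self.streams, features)]
        return torch.stack(scores).mean(dim=0)

# Usage: two input features, e.g. joint coordinates and frame-to-frame motion.
net = MultiStreamNet(feature_channels=[3, 3])
joints = torch.randn(8, 3, 64, 25)            # (batch, xyz, frames, joints)
motion = torch.randn(8, 3, 64, 25)
logits = net([joints, motion])                # (8, 60)
```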
Keywords