Virtual Reality & Intelligent Hardware (Jun 2021)

Adaptive cross-fusion learning for multi-modal gesture recognition

  • Benjia Zhou
  • Jun Wan
  • Yanyan Liang
  • Guodong Guo

Journal volume & issue
Vol. 3, no. 3
pp. 235–247

Abstract


Background: Gesture recognition has attracted significant attention because of its wide range of potential applications. Although multi-modal gesture recognition has made significant progress in recent years, a popular approach is still to simply fuse prediction scores at the end of each branch, which often ignores complementary features among the different modalities in the early stages and does not fuse them into a more discriminative feature. Methods: This paper proposes an Adaptive Cross-modal Weighting (ACmW) scheme to exploit complementary features from RGB-D data. The scheme learns the relations among different modalities by combining the features of different data streams. The proposed ACmW module contains two key functions: (1) fusing complementary features from multiple streams through an adaptive one-dimensional convolution; and (2) modeling the correlation of multi-stream complementary features in the time dimension. Through the effective combination of these two functions, the proposed ACmW can automatically analyze the relationship between the complementary features from different streams and fuse them in the spatial and temporal dimensions. Results: Extensive experiments validate the effectiveness of the proposed method and show that it outperforms state-of-the-art methods on the IsoGD and NVGesture datasets.
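
The abstract describes the ACmW module's two functions only at a high level. Below is a minimal PyTorch-style sketch of what such a block could look like, assuming two per-frame feature streams (e.g., RGB and depth) of equal dimension; the class name, tensor shapes, and layer choices are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class ACmWSketch(nn.Module):
    """Hypothetical sketch of an adaptive cross-modal weighting block.

    (1) Fuses the two stacked modality streams with a 1-D convolution
        applied across the stream axis (the "adaptive" fusion step).
    (2) Weights the fused per-frame features by a softmax over learned
        temporal scores (the temporal-correlation step).
    """

    def __init__(self, feat_dim: int, kernel_size: int = 3):
        super().__init__()
        # Mixes the 2 stacked streams into 1 fused channel per frame.
        self.fuse = nn.Conv1d(2, 1, kernel_size, padding=kernel_size // 2)
        # Scores each fused frame feature for temporal weighting.
        self.score = nn.Linear(feat_dim, 1)

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        # rgb, depth: (batch, time, feat_dim) per-frame features
        b, t, d = rgb.shape
        x = torch.stack([rgb, depth], dim=2)            # (b, t, 2, d)
        x = self.fuse(x.reshape(b * t, 2, d))           # (b*t, 1, d)
        fused = x.reshape(b, t, d)
        w = torch.softmax(self.score(fused), dim=1)     # (b, t, 1) weights
        return (w * fused).sum(dim=1)                   # (b, d) clip feature

# Example usage with assumed shapes: 4 clips, 16 frames, 512-d features.
block = ACmWSketch(feat_dim=512)
out = block(torch.randn(4, 16, 512), torch.randn(4, 16, 512))  # (4, 512)

A block like this would sit between the per-modality backbones and the classifier, replacing late score fusion with a single fused clip-level feature.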

Keywords