Scientific Reports (May 2024)

Multiscale knowledge distillation with attention based fusion for robust human activity recognition

  • Zhaohui Yuan,
  • Zhengzhe Yang,
  • Hao Ning,
  • Xiangyang Tang

DOI
https://doi.org/10.1038/s41598-024-63195-5
Journal volume & issue
Vol. 14, no. 1
pp. 1–16

Abstract

Knowledge distillation is an effective approach for training robust multimodal machine learning models when synchronous multimodal data are unavailable. However, traditional knowledge distillation techniques transfer knowledge across modalities and models only partially. This paper proposes a multiscale knowledge distillation framework to address these limitations. Specifically, we introduce a multiscale semantic graph mapping (SGM) loss function that enables more comprehensive knowledge transfer between teacher and student networks at multiple feature scales. We also design a fusion and tuning (FT) module that fully exploits correlations within and between different data types of the same modality when training teacher networks. Furthermore, we adopt transformer-based backbones, which improve feature learning compared with traditional convolutional neural networks. We apply the proposed techniques to multimodal human activity recognition; compared with the baseline method, accuracy improves by 2.31% on the MMAct dataset and 0.29% on the UTD-MHAD dataset. Ablation studies validate the necessity of each component.
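The abstract describes the approach only at a high level. As an illustration, the sketch below shows one plausible way a multiscale, graph-based distillation loss could be written in PyTorch: teacher and student features at several scales are turned into batch-level cosine-similarity graphs, and the student's graphs are matched to the teacher's with an MSE penalty. The function names (`similarity_graph`, `multiscale_sgm_loss`), the choice of cosine-similarity graphs, and the MSE matching criterion are all assumptions for illustration; the paper's actual SGM loss and FT module are defined in the full text.

```python
import torch
import torch.nn.functional as F


def similarity_graph(feats: torch.Tensor) -> torch.Tensor:
    """Batch-level cosine-similarity graph: (B, ...) -> (B, B).

    Hypothetical stand-in for the paper's semantic graph construction.
    """
    z = F.normalize(feats.flatten(1), dim=1)
    return z @ z.t()


def multiscale_sgm_loss(teacher_feats, student_feats, weights=None):
    """Sum of per-scale graph-matching losses (assumed formulation).

    teacher_feats, student_feats: lists of tensors, one per feature
    scale (e.g. intermediate transformer-block outputs), each shaped
    (batch, ...).
    """
    if weights is None:
        weights = [1.0] * len(teacher_feats)
    loss = torch.zeros((), device=student_feats[0].device)
    for w, t, s in zip(weights, teacher_feats, student_feats):
        g_t = similarity_graph(t).detach()  # teacher graph is a fixed target
        g_s = similarity_graph(s)
        loss = loss + w * F.mse_loss(g_s, g_t)
    return loss


# Toy usage with three feature scales and a batch of 8 samples.
t_feats = [torch.randn(8, 256), torch.randn(8, 512), torch.randn(8, 768)]
s_feats = [f.clone().requires_grad_() for f in t_feats]
print(multiscale_sgm_loss(t_feats, s_feats))  # ~0, since student == teacher here
```

Matching relational graphs rather than raw activations is one common way to let teacher and student differ in feature dimensionality while still transferring structure; whether the paper's SGM loss takes exactly this form is not determinable from the abstract alone.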

Keywords