IEEE Access (Jan 2022)

Multi-View CNN-LSTM Architecture for Radar-Based Human Activity Recognition

  • Habib-Ur-Rehman Khalid,
  • Ali Gorji,
  • Andre Bourdoux,
  • Sofie Pollin,
  • Hichem Sahli

DOI
https://doi.org/10.1109/ACCESS.2022.3150838
Journal volume & issue
Vol. 10
pp. 24509 – 24519

Abstract

In this paper, we propose a Multi-View Convolutional Neural Network and Long Short-Term Memory (CNN-LSTM) network which fuses multiple “views” of the time-range-Doppler radar data-cube for human activity recognition. It adopts the structure of convolutional neural networks to extract optimal frame-based features from the time-range, time-Doppler and range-Doppler projections of the radar data-cube. The CNN models are trained using an unsupervised Convolutional Auto-Encoder (CAE) topology. Afterwards, the pre-trained parameters of the encoder are fine-tuned to extract intermediate frame-based representations, which are subsequently aggregated via LSTM networks for sequence classification. The temporal correlation among the views is explicitly learned by sharing the LSTM network weights across the different views. Moreover, we propose range and Doppler energy dispersion and temporal-difference-based features as inputs to the CNN-LSTM models. Furthermore, we investigate the use of target tracking features as auxiliary side information. The proposed model is trained on datasets collected in both cluttered and uncluttered environments. For validation, an independent test dataset, with unseen participants, was collected in a cluttered environment. Fusion with auxiliary features improves the generalization by 5%, yielding an overall Macro F1-score of 74.7%.
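
The three "views" named in the abstract can be illustrated with a minimal NumPy sketch. It assumes the data-cube is a real-valued power array indexed as (time, range, Doppler) and that each projection is formed by summing along the collapsed axis. Both choices are assumptions made for illustration; the abstract does not state how the projections are computed, and the function name `project_views` is hypothetical.

```python
import numpy as np

def project_views(cube):
    """Collapse a (time, range, Doppler) power cube into three 2-D views.

    Assumption: each view is obtained by summing along the removed axis.
    The paper's actual projection operator may differ.
    """
    cube = np.asarray(cube, dtype=float)
    if cube.ndim != 3:
        raise ValueError("expected a 3-D (time, range, Doppler) array")
    return {
        "time_range": cube.sum(axis=2),     # collapse Doppler -> (T, R)
        "time_doppler": cube.sum(axis=1),   # collapse range   -> (T, D)
        "range_doppler": cube.sum(axis=0),  # collapse time    -> (R, D)
    }

# Synthetic cube: 8 time frames, 16 range bins, 32 Doppler bins.
rng = np.random.default_rng(0)
cube = rng.random((8, 16, 32))
views = project_views(cube)
for name, view in views.items():
    print(name, view.shape)
```

In a multi-view pipeline of the kind described, each of these 2-D maps (or frame-wise slices of them) would feed its own CNN encoder before the LSTM stage.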

Keywords