IEEE Access (Jan 2022)
A Comparative Study: Toward an Effective Convolutional Neural Network Architecture for Sensor-Based Human Activity Recognition
Abstract
Feature extraction for sensor-based human activity recognition (HAR) has traditionally relied on hand-crafted methods. The ability to extract significant features is a key factor in improving HAR accuracy. Recently, deep learning methods have been employed for feature extraction. In this paper, we review previous studies on deep learning methods in HAR and discuss suitable models for feature extraction. First, we applied various convolutional neural networks to clarify which architecture is effective for HAR. Afterward, we developed advanced models by embedding submodules often adopted in recent studies, such as self-attention and recurrent neural networks. Comparative experiments on the HASC, UCI, and WISDM public datasets showed that Inception-V3, which uses cross-channel multi-size convolution transformation, outperformed the other backbones. Comparative experiments after embedding submodules showed that submodules do not always have a positive effect on accuracy. Compared with the other submodules, SENet had a positive effect. We conclude that it is essential to select an appropriate backbone model before applying submodules, and that submodules are unnecessary in some cases.
Keywords