IEEE Access (Jan 2022)
Ensembled Transfer Learning Based Multichannel Attention Networks for Human Activity Recognition in Still Images
Abstract
Human activity recognition is one of the most difficult tasks in computer vision. Because still images carry no temporal information, recognizing human activities from them is harder than with sensor-based or video-based techniques. Deep learning based solutions have recently been proposed in quick succession, and their performance keeps improving. In this paper, we propose a convolutional neural architecture that ensembles transfer learning based multi-channel attention networks. Four CNN branches perform feature fusion based ensembling, and in each branch an attention module extracts contextual information from the feature map produced by an existing pre-trained model. The feature maps extracted from the four branches are then concatenated and fed to a fully connected network to produce the final recognition output. We evaluated our system on three datasets: Stanford 40 Actions, BU-101, and Willow human actions. Experimental analysis showed that the proposed ensembled convolutional architecture outperformed previous works by a noteworthy margin.
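The data flow described above (four branches, per-branch attention, concatenation, fully connected classifier) can be illustrated with a minimal, dependency-free sketch. Everything specific here is an assumption made for illustration only: random values stand in for the feature maps of the pre-trained backbones, a sigmoid gate on globally pooled channels stands in for the attention module, and a single dense layer stands in for the fully connected network. The names `channel_attention` and `ensemble_forward` and all sizes are hypothetical, not the configuration used in the paper.

```python
import math
import random


def channel_attention(feature_map):
    """Reweight a feature map given as [channel][position]; return one value per channel."""
    # Squeeze: one descriptor per channel via global average pooling.
    pooled = [sum(channel) / len(channel) for channel in feature_map]
    # Gate: a sigmoid of each descriptor stands in for a learned attention layer (assumption).
    gates = [1.0 / (1.0 + math.exp(-p)) for p in pooled]
    # Scaling a channel by its gate and pooling again equals gate * pooled mean.
    return [g * p for g, p in zip(gates, pooled)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_forward(branch_maps, weights, biases):
    """Fuse attention-refined branch features and classify them."""
    # Feature fusion: concatenate one attention-refined vector per branch.
    fused = [v for fmap in branch_maps for v in channel_attention(fmap)]
    # A single dense layer stands in for the fully connected network (assumption).
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, biases)]
    return fused, softmax(logits)


# Hypothetical sizes: 4 branches, 8 channels of 7x7 activations, 40 classes.
rng = random.Random(0)
NUM_BRANCHES, CHANNELS, POSITIONS, NUM_CLASSES = 4, 8, 49, 40

# Random stand-ins for the feature maps a pre-trained backbone would produce.
branch_maps = [[[rng.random() for _ in range(POSITIONS)]
                for _ in range(CHANNELS)]
               for _ in range(NUM_BRANCHES)]
weights = [[rng.uniform(-0.1, 0.1) for _ in range(NUM_BRANCHES * CHANNELS)]
           for _ in range(NUM_CLASSES)]
biases = [0.0] * NUM_CLASSES

fused, probs = ensemble_forward(branch_maps, weights, biases)
```

In this sketch the fused vector has one entry per channel per branch, so its length grows linearly with the number of branches, and the classifier sees all branches jointly rather than averaging separate per-branch predictions.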
Keywords