Scientific African (Sep 2023)

Advancing human action recognition: A hybrid approach using attention-based LSTM and 3D CNN

  • El Mehdi Saoudi,
  • Jaafar Jaafari,
  • Said Jai Andaloussi

Journal volume & issue
Vol. 21
p. e01796

Abstract

In this paper, we propose a novel approach to video action recognition that integrates a modified and optimized 3D Convolutional Neural Network, a Long Short-Term Memory network, and attention mechanisms. This synergy enhances the overall performance, offering an advantage over existing methods in managing the intricacies of real-world scenarios. The uniqueness of our approach lies in its capacity to capture both spatial and temporal information from video sequences and the incorporation of an attention mechanism that selectively emphasizes key areas within the sequences, thereby enhancing recognition accuracy. The model is particularly tailored to handle complex scenarios, such as those with multiple actors or objects, or instances of occlusion. It effectively addresses the subjectivity and variability inherent in action annotations within datasets. We also apply an array of preprocessing techniques to further optimize model performance. Through rigorous experimental evaluations on benchmark datasets, namely UCF101 and HMDB51, we demonstrate that our proposed approach significantly outperforms existing state-of-the-art methods in action recognition. These results underscore the potential of our approach for further advancements in video action recognition research.
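The attention step described above, which weights some parts of a sequence more heavily than others before classification, can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so this assumes simple dot-product scoring against a single query vector followed by a softmax. The names `attention_pool`, `features`, and `query` are hypothetical; in the full model the per-step features would presumably come from the 3D CNN and LSTM stages rather than being hand-written lists.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, query):
    """Pool T per-step feature vectors into one vector.

    Assumption (not from the paper): each step is scored by its dot
    product with `query`, and the scores are normalized with softmax.
    Returns the weighted sum of the features and the attention weights.
    """
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [
        sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)
    ]
    return pooled, weights


# Toy input: three time steps with 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
pooled, weights = attention_pool(features, query)
```

Here the third time step has the highest score against the query, so it receives the largest weight and contributes most to the pooled vector. In a trained model the query (or a small scoring network) would be learned rather than fixed.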

Keywords