IEEE Open Journal of the Computer Society (Jan 2024)

A Real-Time 3-Dimensional Object Detection Based Human Action Recognition Model

  • Chhaya Gupta,
  • Nasib Singh Gill,
  • Preeti Gulia,
  • Sangeeta Yadav,
  • Giovanni Pau,
  • Mohammad Alibakhshikenari,
  • Xiangjie Kong

DOI
https://doi.org/10.1109/OJCS.2023.3334528
Journal volume & issue
Vol. 5
pp. 14 – 26

Abstract

Computer vision technologies have improved greatly in the last few years, and many problems have been solved by combining deep learning with greater computational power. Action recognition is one such problem with clear societal importance. Human Action Recognition (HAR) can be adopted in intelligent video surveillance systems, and governments may use it to monitor crime and for security purposes. This paper proposes a deep learning-based HAR model that combines a 3-dimensional convolutional neural network (3DCNN) with a multiplicative LSTM recurrent network and Yolov6 for real-time object detection. The proposed model makes it easier to understand the tasks that an individual or a group of individuals performs. It has four stages: data fusion, feature extraction, object identification, and skeleton articulation. The datasets used to train the model include NTU-RGB-D, KITTI, NTU-RGB-D 120, UCF 101, and a Fused dataset, on which it reaches accuracies of 98.23%, 97.65%, 98.76%, 95.45%, and 97.65%, respectively. These results surpass the other state-of-the-art (SOTA) methods compared in this study: traditional CNN, Yolov6, and CNN with BiLSTM. The results indicate that the proposed model, which combines all these techniques, classifies actions more accurately than the existing ones.

Keywords