Journal of Imaging (Oct 2024)

YOLO-I3D: Optimizing Inflated 3D Models for Real-Time Human Activity Recognition

  • Ruikang Luo,
  • Aman Anand,
  • Farhana Zulkernine,
  • Francois Rivest

DOI
https://doi.org/10.3390/jimaging10110269
Journal volume & issue
Vol. 10, no. 11
p. 269

Abstract

Human Activity Recognition (HAR) plays a critical role in applications such as security surveillance and healthcare. However, existing methods, particularly two-stream models like Inflated 3D (I3D), face significant challenges in real-time applications due to their high computational demand, especially from the optical flow branch. In this work, we address these limitations by proposing two major improvements. First, we introduce a lightweight motion information branch that replaces the computationally expensive optical flow component with a lower-resolution RGB input, significantly reducing computation time. Second, we incorporate YOLOv5, an efficient object detector, to further optimize the RGB branch for faster real-time performance. Experimental results on the Kinetics-400 dataset demonstrate that our proposed two-stream I3D Light model improves the original I3D model’s accuracy by 4.13% while reducing computational cost. Additionally, the integration of YOLOv5 into the I3D model enhances accuracy by 1.42%, providing a more efficient solution for real-time HAR tasks.
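
To make the two-stream idea above concrete, the following is a minimal sketch and not the authors' implementation. It shows one way a lower-resolution input could be produced from a frame, and one way the scores of two branches could be combined. The abstract does not state the downsampling method or the fusion rule, so average pooling and weighted averaging of class probabilities are assumptions here, and the names `downsample_frame`, `fuse_streams`, and `rgb_weight` are hypothetical.

```python
import math


def downsample_frame(frame, factor=2):
    """Average-pool a 2D frame (list of rows) by `factor`.

    Assumption: the paper's lower-resolution input may be produced differently.
    """
    height, width = len(frame), len(frame[0])
    pooled = []
    # Drop any trailing rows/columns that do not fill a complete block.
    for y in range(0, height - height % factor, factor):
        row = []
        for x in range(0, width - width % factor, factor):
            block = [
                frame[y + dy][x + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) / len(block))
        pooled.append(row)
    return pooled


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def fuse_streams(rgb_logits, motion_logits, rgb_weight=0.5):
    """Late fusion: weighted average of per-branch class probabilities.

    Assumption: the fusion rule is illustrative, not taken from the paper.
    """
    p_rgb = softmax(rgb_logits)
    p_motion = softmax(motion_logits)
    return [
        rgb_weight * a + (1.0 - rgb_weight) * b
        for a, b in zip(p_rgb, p_motion)
    ]


if __name__ == "__main__":
    frame = [[0, 2, 4, 6], [2, 4, 6, 8], [1, 1, 1, 1], [1, 1, 1, 1]]
    print(downsample_frame(frame))
    print(fuse_streams([2.0, 0.0, 0.0], [0.0, 0.0, 3.0]))
```

Halving each spatial dimension cuts the number of input pixels per frame by a factor of four, which is the kind of saving a lower-resolution branch relies on. In a real pipeline the logits would come from the two network branches rather than from hand-written lists.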

Keywords