Alexandria Engineering Journal (Apr 2025)

Object detection in real-time video surveillance using attention based transformer-YOLOv8 model

  • Divya Nimma,
  • Omaia Al-Omari,
  • Rahul Pradhan,
  • Zoirov Ulmas,
  • R.V.V. Krishna,
  • Ts. Yousef A.Baker El-Ebiary,
  • Vuda Sreenivasa Rao

Journal volume & issue
Vol. 118
pp. 482 – 495

Abstract

Object detection plays a crucial role in applications such as surveillance, autonomous driving, and industrial automation, where accurate and timely identification of objects is essential. This research proposes a novel framework that combines the YOLOv8 backbone network with an attention mechanism and a Transformer-based detection head, significantly enhancing object detection performance in real-time images and video. The attention mechanism refines feature extraction from complex scenes, enabling the model to focus on relevant regions within images. By integrating the Transformer architecture, the model leverages long-range dependencies and global context, leading to more accurate bounding box predictions. The proposed system processes real-time data effectively, reaching a precision of 96.78 % and a recall of 96.89 %, with a mean average precision (mAP) of 89.67 %, which indicates the framework's robustness across various practical scenarios. The framework is designed to address challenges in object detection such as detecting multiple objects in crowded environments and handling varying lighting conditions. The proposed model is implemented in Python. The results section compares the Attention Transformer-YOLOv8 model against established algorithms such as Faster R-CNN, YOLOv3, YOLOv5n, and SSD using these metrics.

Keywords