IEEE Access (Jan 2024)

Synergistic Integration of Transfer Learning and Deep Learning for Enhanced Object Detection in Digital Images

  • Safa Riyadh Waheed,
  • Norhaida Mohd Suaib,
  • Mohd Shafry Mohd Rahim,
  • Amjad Rehman Khan,
  • Saeed Ali Bahaj,
  • Tanzila Saba

DOI
https://doi.org/10.1109/ACCESS.2024.3354706
Journal volume & issue
Vol. 12
pp. 13525 – 13536

Abstract

The world is moving toward smart and secure cities, and the automatic recognition of human activity is an essential milestone for smart city surveillance projects. Classifying group activity and detecting behavior, however, remains complex and ill-defined. Behavior classification systems that rely on visual data are useful across many domains, including video surveillance, human-computer interaction, and the safety infrastructure of smart cities, yet automatic behavior classification is a significant challenge for live video captured by smart city surveillance systems. In this regard, transfer learning (TL) with pre-trained convolutional neural networks (CNNs) applied to images has emerged as a promising technique for object detection with deep neural networks (DNNs), improving localization performance for smart city surveillance. Against this backdrop, this paper explores strategies for developing advanced synthetic datasets that could improve object detection accuracy, measured as mean average precision (mAP), when used to train modern DNNs. TL was employed to address deep learning's (DL) need for very large datasets. The KITTI datasets were used to train a contemporary DNN, the single-shot multiple box detector (SSMD), in TensorFlow. A variety of metrics were used to assess the efficacy of the proposed automated TL system for object detection within the DL framework (referred to as OD-SSMD) in a real-world context. The results showed that the developed system outperformed previous studies. Notably, it was able to autonomously discern various attributes and entities within digital images, identifying and localizing each item present.

Keywords