Alexandria Engineering Journal (Feb 2025)

DiffuYOLO: A novel method for small vehicle detection in remote sensing based on diffusion models

  • Jing Li,
  • Zhiyong Zhang,
  • Haochen Sun

Journal volume & issue
Vol. 114
pp. 485 – 496

Abstract

Detecting small vehicle targets in remote sensing images is hampered by two major challenges: scarce training data and insufficient algorithm accuracy. This paper introduces DiffuYOLO, a model designed to address both issues. First, we develop the DiffuNet network architecture, which incorporates Perceptual Awareness Attributes (P.A. Attr) and a Perceptual Awareness Loss (P.A. Loss) into a diffusion model to generate high-quality remote sensing images of vehicles, alleviating the shortage of small-target data. Second, we incorporate the Efficient Channel Attention (ESCA) mechanism into YOLOv8, which significantly enhances the model’s ability to recognize small targets. Finally, we introduce a new Inner-SIoU loss function that measures the similarity between predicted and ground-truth bounding boxes more accurately. Experimental results on the VEDAI aerial image dataset show that DiffuYOLO achieves a mean Average Precision (mAP) of 85.7%, surpassing existing methods. On the DOTA dataset, used for independent validation, the model’s mAP rises to 86.8%, demonstrating its efficiency and reliability in detecting small vehicle targets.
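
The abstract does not give the Inner-SIoU formula. As a rough illustration only, the sketch below follows the general Inner-IoU idea of measuring overlap on auxiliary boxes that share each box's center but are scaled by a ratio; the SIoU angle, distance, and shape terms are omitted. The function name `inner_iou`, the `(cx, cy, w, h)` box layout, and the default `ratio` are assumptions for this example and are not taken from the paper.

```python
def inner_iou(box_a, box_b, ratio=0.75):
    """IoU of two boxes after scaling each about its own center.

    Boxes are (cx, cy, w, h). `ratio` is a hypothetical scale factor:
    values below 1 shrink the auxiliary boxes, values above 1 enlarge them.
    """
    def scaled_corners(box):
        # Convert center format to corner format for the scaled auxiliary box.
        cx, cy, w, h = box
        half_w = w * ratio / 2.0
        half_h = h * ratio / 2.0
        return cx - half_w, cy - half_h, cx + half_w, cy + half_h

    ax1, ay1, ax2, ay2 = scaled_corners(box_a)
    bx1, by1, bx2, by2 = scaled_corners(box_b)

    # Overlap of the two auxiliary boxes (zero if they do not intersect).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes whose centers are one unit apart horizontally.
pred = (0.0, 0.0, 2.0, 2.0)
truth = (1.0, 0.0, 2.0, 2.0)
plain = inner_iou(pred, truth, ratio=1.0)   # ratio 1 gives the ordinary IoU
shrunk = inner_iou(pred, truth, ratio=0.5)  # smaller auxiliary boxes
```

With `ratio=1.0` the function reduces to ordinary IoU. Shrinking the auxiliary boxes makes the measure more sensitive to center misalignment, which is one plausible reason such a term could help with small targets.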

Keywords