Complex & Intelligent Systems (Nov 2024)

ATBHC-YOLO: aggregate transformer and bidirectional hybrid convolution for small object detection

  • Dandan Liao,
  • Jianxun Zhang,
  • Ye Tao,
  • Xie Jin

DOI
https://doi.org/10.1007/s40747-024-01652-4
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 15

Abstract

Object detection in UAV images is an active research focus in computer vision and has advanced rapidly in recent years. However, many methods are ineffective on challenging UAV images that feature uneven object scales, sparse spatial distribution, and dense occlusions. We propose a new algorithm for detecting small objects in UAV images, called ATBHC-YOLO. First, the MS-CET module is introduced to strengthen the model’s focus on globally sparse features in the spatial distribution of small objects. Second, the BHC-FB module is proposed to address the large variation in scale among small objects and to enhance the perception of local features. Finally, a more appropriate loss function, WIoU, is used to penalise the quality variance of small-object samples and further improve the model’s detection accuracy. Comparison experiments on the DIOR and VEDAI datasets validate the effectiveness and robustness of the improved method. In experiments on the publicly available UAV benchmark dataset VisDrone, ATBHC-YOLO outperforms the state-of-the-art method (YOLOv7) by 3.5%.
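
The abstract names WIoU as the bounding-box loss but does not say which variant is used or how it is configured. As a rough illustration only, the sketch below follows the publicly described Wise-IoU v1 form: the plain IoU loss is scaled by a distance-based factor computed from the box centres and the smallest enclosing box. The function names `iou` and `wiou_v1_loss` and the `(x1, y1, x2, y2)` box layout are assumptions made for this example and are not taken from the paper.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def wiou_v1_loss(pred, target):
    """Sketch of a Wise-IoU v1 style loss (assumed variant, not confirmed by the paper)."""
    l_iou = 1.0 - iou(pred, target)

    # Centre points of the predicted and ground-truth boxes.
    pcx, pcy = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    tcx, tcy = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2

    # Width and height of the smallest box enclosing both boxes.
    # In a training framework this term is usually excluded from the gradient;
    # plain Python has no gradients, so nothing needs to be detached here.
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])
    denom = wg ** 2 + hg ** 2

    # Distance-based factor: 1 when the centres coincide, larger otherwise.
    dist_sq = (pcx - tcx) ** 2 + (pcy - tcy) ** 2
    r_wiou = math.exp(dist_sq / denom) if denom > 0 else 1.0
    return r_wiou * l_iou
```

For example, `wiou_v1_loss((0, 0, 2, 2), (1, 1, 3, 3))` scales the IoU loss of the two boxes by a factor greater than 1 because their centres differ, while identical boxes give a loss of 0.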

Keywords