Jisuanji kexue yu tansuo (May 2024)

UAV Remote Sensing Object Detection Based on 3D Multi-layer Feature Collaboration

  • LYU Fu, FU Yuheng, HE Lina, YANG Dongpeng

DOI
https://doi.org/10.3778/j.issn.1673-9418.2401007
Journal volume & issue
Vol. 18, no. 5
pp. 1301 – 1317

Abstract

UAV (unmanned aerial vehicle) aerial images contain a large proportion of small targets and complex backgrounds, so current object detection models suffer from low accuracy and missed detections of small targets. To address this, this paper proposes a UAV remote sensing object detection algorithm based on 3D multi-layer feature collaboration, built on the YOLOv8s model. First, building on coordinate attention, a 3D multi-branch coordinate attention (MBCA) module is proposed; it improves the model's global feature extraction ability and reduces computation in the spatial dimension by increasing information interaction in the channel dimension and by splitting and fusing extended branches. Second, SPD-Conv replaces some of the standard convolutions, which retains more feature information during downsampling and speeds up inference. Third, a more efficient FastDBB_Bottleneck module, which combines PConv with DBB structural reparameterization, is used in the C2f module to further reduce the model's computation. Finally, the PG-Detect detection head is introduced to significantly reduce computation and effectively lower the missed detection rate for small targets. Experimental results on the VisDrone2019 dataset show that the proposed method reaches an mAP50 of 44.5%, which is 5.7 percentage points higher than that of the YOLOv8s baseline model. In addition, a crack detection verification experiment is carried out on a self-built dam crack dataset, where the improved method achieves an mAP50 3.3 percentage points higher than that of YOLOv8s at a speed of 289 FPS. These results show that the proposed method improves both the accuracy and the real-time performance of object detection in complex scenes, and has good adaptability and robustness.
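The abstract does not describe how SPD-Conv keeps feature information while downsampling. As generally described, SPD-Conv pairs a space-to-depth rearrangement with a non-strided convolution, so that spatial resolution is reduced by moving pixels into the channel dimension instead of discarding them. The sketch below illustrates only the space-to-depth step in plain Python. It is not the authors' implementation: the function name `space_to_depth`, the nested-list layout (channels x height x width), and the order in which the sub-maps are concatenated are all assumptions made for illustration.

```python
def space_to_depth(x, scale=2):
    """Rearrange a C x H x W feature map into (C * scale**2) x (H / scale) x (W / scale).

    Hypothetical helper for illustration: `x` is a list of channels, each a
    list of rows. No value is dropped; pixels only move from the spatial
    dimensions into the channel dimension.
    """
    height, width = len(x[0]), len(x[0][0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    out = []
    # Assumed ordering: one block of C channels per (column offset, row offset).
    for dx in range(scale):
        for dy in range(scale):
            for channel in x:
                # Keep every `scale`-th row starting at dy, and every
                # `scale`-th column starting at dx.
                out.append([row[dx::scale] for row in channel[dy::scale]])
    return out


# A single-channel 4 x 4 map becomes four 2 x 2 channels.
feature_map = [[[0, 1, 2, 3],
                [4, 5, 6, 7],
                [8, 9, 10, 11],
                [12, 13, 14, 15]]]
out = space_to_depth(feature_map)
print(len(out), len(out[0]), len(out[0][0]))
```

A stride-2 convolution visits only one of every four positions as a kernel center, whereas here all 16 input values remain available to the non-strided convolution that would follow this step.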

Keywords