IET Image Processing (Dec 2024)

Swin‐YOLOX for autonomous and accurate drone visual landing

  • Rongbin Chen,
  • Ying Xu,
  • Mohamad Sabri bin Sinal,
  • Dongsheng Zhong,
  • Xinru Li,
  • Bo Li,
  • Yadong Guo,
  • Qingjia Luo

DOI
https://doi.org/10.1049/ipr2.13282
Journal volume & issue
Vol. 18, no. 14
pp. 4731 – 4744

Abstract


As UAVs become more widely used in military and civilian fields, their intelligent applications have developed rapidly. However, high‐precision autonomous landing remains an industry challenge. GPS‐based methods fail where GPS signals are unavailable; multi‐sensor integrated navigation is hard to deploy widely because of its high equipment requirements; and traditional vision‐based methods are sensitive to scale changes, background complexity and occlusion, all of which degrade detection performance. In this paper, we address these problems by applying deep learning to target detection in the UAV landing phase. First, we optimize the backbone network of YOLOX and propose the Swin Transformer based YOLOX (Swin‐YOLOX) visual positioning algorithm for UAV landing. Second, we extend the UAV‐VPD database with a batch of newly acquired real‐world data, labelled by an AI annotation method, to build the UAV‐VPDV2 database. Finally, we use the RBN batch normalization method to improve the model's ability to extract effective features from the data. Extensive experiments show that the proposed method reaches an AP50 of 98.7%, which is superior to other detection models, at a detection speed of 38.4 frames/second, which meets the requirements of real‐time detection.

Keywords