Swin‐YOLOX for autonomous and accurate drone visual landing

Rongbin Chen; Ying Xu; Mohamad Sabri bin Sinal; Dongsheng Zhong; Xinru Li; Bo Li; Yadong Guo; Qingjia Luo

doi:10.1049/ipr2.13282

IET Image Processing (Dec 2024)

Affiliations

Rongbin Chen: College of Information Engineering, Jiangmen Polytechnic, Jiangmen, Guangdong, China
Ying Xu: Department of Intelligent Manufacturing, Wuyi University, Jiangmen, China
Mohamad Sabri bin Sinal: School of Computing, Universiti Utara Malaysia, Kedah, Malaysia
Dongsheng Zhong: Department of Intelligent Manufacturing, Wuyi University, Jiangmen, China
Xinru Li: Department of Intelligent Manufacturing, Wuyi University, Jiangmen, China
Bo Li: Department of Intelligent Manufacturing, Wuyi University, Jiangmen, China
Yadong Guo: College of Information Engineering, Jiangmen Polytechnic, Jiangmen, Guangdong, China
Qingjia Luo: College of Information Engineering, Jiangmen Polytechnic, Jiangmen, Guangdong, China

DOI: https://doi.org/10.1049/ipr2.13282
Journal volume & issue: Vol. 18, no. 14
pp. 4731 – 4744

Abstract

As UAVs are increasingly used in military and civilian fields, their intelligent applications have developed rapidly. However, high‐precision autonomous landing remains an industry challenge. GPS‐based methods fail where GPS signals are unavailable; multi‐sensor integrated navigation is difficult to deploy widely because of its high equipment requirements; and traditional vision‐based methods are sensitive to scale changes, background complexity and occlusion, all of which degrade detection performance. In this paper, we address these problems by applying deep learning to target detection in the UAV landing phase. First, we optimize the backbone network of YOLOX and propose a Swin Transformer based YOLOX (Swin‐YOLOX) visual positioning algorithm for UAV landing. Second, we extend the UAV‐VPD database with a batch of newly acquired real‐world data, labelled with an AI annotation method, to build the UAV‐VPDV2 database. Finally, we use the RBN batch normalization method to improve the model's ability to extract effective features from the data. Extensive experiments show that the proposed method reaches an AP50 of 98.7%, which is superior to other detection models, at a detection speed of 38.4 frames/second, which meets the requirements of real‐time detection.
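
For readers unfamiliar with the two reported metrics, the following is a minimal sketch and not the authors' evaluation code. It assumes the conventional meaning of AP50, in which a predicted box counts as a correct detection when its intersection over union (IoU) with a ground-truth box is at least 0.5, and it converts a frame rate into a per-frame time budget. The function names and the (x1, y1, x2, y2) box layout are illustrative assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred_box, truth_box):
    """Assumed AP50 matching rule: a prediction matches when IoU >= 0.5."""
    return iou(pred_box, truth_box) >= 0.5


def ms_per_frame(fps):
    """Average processing time per frame, in milliseconds, implied by a frame rate."""
    return 1000.0 / fps
```

Under these assumptions, a prediction shifted by half the box width from its ground truth has an IoU of 1/3 and would not count as a match, and the reported 38.4 frames/second corresponds to roughly 26 ms per frame.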

Published in IET Image Processing

ISSN: 1751-9659 (Print); 1751-9667 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Technology: Photography; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: https://ietresearch.onlinelibrary.wiley.com/journal/17519667
