International Journal of Advanced Robotic Systems (Jun 2023)
Visual localization with a monocular camera for unmanned aerial vehicle based on landmark detection and tracking using YOLOv5 and DeepSORT
Abstract
Absolute visual localization is essential for unmanned aerial vehicles when satellite-based localization is unavailable. With the rapid evolution of deep learning, real-time visual detection and tracking of landmarks can now be implemented onboard an unmanned aerial vehicle. This study demonstrates a landmark-based visual localization framework for unmanned aerial vehicles flying at low altitudes. YOLOv5 and DeepSORT are used for multi-object detection and tracking, respectively. The unmanned aerial vehicle is localized from the geometric similarity between the geotagged transmission towers and their annotated positions in images captured by a monocular camera. The framework is validated both in RflySim-based simulation and in real flights with a quadrotor. The localization precision is about 10 m, and the location update frequency reaches 5 Hz on a commercially available entry-level edge artificial intelligence platform. The proposed visual localization strategy requires no satellite image as a reference map, which saves a significant amount of GPU memory and makes end-to-end implementation possible on small unmanned aerial vehicles.
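The geometric-similarity idea can be illustrated with a minimal sketch. It is not the authors' implementation; it is one plausible reading under strong assumptions that the abstract does not state: a downward-pointing (nadir) camera, flat ground, a known principal point, and tower coordinates already converted to a local east/north frame in meters. The function name `locate_uav` and its parameters are hypothetical. The sketch fits a 2D similarity transform (scale, rotation, translation) from the tracked landmark pixels to their geotagged positions, then maps the principal point to the ground to obtain the vehicle's horizontal position.

```python
import cmath


def locate_uav(pixels, ground_en, principal_point):
    """Estimate the UAV's east/north position from matched landmarks.

    Hypothetical helper, assuming a nadir camera over flat ground.
    pixels:          [(u, v), ...] landmark centers in the image (pixels)
    ground_en:       [(east, north), ...] geotagged positions (meters)
    principal_point: (cx, cy) of the camera (pixels)
    Returns ((east, north), meters_per_pixel, rotation_rad).
    """
    if len(pixels) != len(ground_en) or len(pixels) < 2:
        raise ValueError("need at least two matched landmarks")

    cx, cy = principal_point
    # Center on the principal point and flip v, because image y points
    # down while north points up; points become complex numbers.
    p = [complex(u - cx, -(v - cy)) for u, v in pixels]
    g = [complex(e, n) for e, n in ground_en]

    p_mean = sum(p) / len(p)
    g_mean = sum(g) / len(g)
    denom = sum(abs(pi - p_mean) ** 2 for pi in p)
    if denom == 0:
        raise ValueError("landmark pixels must not coincide")

    # Least-squares fit of g = a * p + b, where the complex number a
    # encodes scale (modulus) and rotation (argument).
    a = sum((gi - g_mean) * (pi - p_mean).conjugate()
            for pi, gi in zip(p, g)) / denom
    b = g_mean - a * p_mean

    # The principal point is p = 0, so it maps to b on the ground.
    return (b.real, b.imag), abs(a), cmath.phase(a)
```

In this sketch, two tracked towers are enough to fix the transform, and additional towers are averaged in by the least-squares fit. The returned rotation is the counter-clockwise angle from east to the image's rightward axis. A tilted camera or uneven terrain would break the similarity assumption and call for a full projective model.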