Remote Sensing (Mar 2025)

RMVAD-YOLO: A Robust Multi-View Aircraft Detection Model for Imbalanced and Similar Classes

  • Keda Li,
  • Xiangyue Zheng,
  • Jingxin Bi,
  • Gang Zhang,
  • Yi Cui,
  • Tao Lei

DOI
https://doi.org/10.3390/rs17061001
Journal volume & issue
Vol. 17, no. 6
p. 1001

Abstract

Aircraft detection technology plays a vital role in civilian applications, with significant attention being devoted to research on related algorithms in recent years. However, most existing research predominantly focuses on aircraft detection from a single top-down viewpoint, which constrains the applicability of detection technology across diverse scenarios. To overcome this limitation, we propose RMVAD-YOLO, a multi-view aircraft detection model built upon YOLOv8. First, we propose a novel Robust Multi-Link Scale Interactive Feature Pyramid Network (RMSFPN), which robustly extracts features of the same aircraft category from multiple views while enhancing feature differentiation between different aircraft categories. Second, we propose the Shared Convolutional Dynamic Alignment Detection Head (SCDADH), which enhances task interaction and collaboration by sharing convolutions between the classification and localization branches while simultaneously reducing the number of parameters, enhancing the model’s ability to deal with multi-scale targets. Additionally, to further leverage background information and enhance the model’s adaptability to multi-scale target variations, we incorporate the LSK Module into the backbone network. Finally, we propose the WFMIoUv3 loss function, which strengthens the model’s focus on challenging samples and improves detection robustness. Experimental results on the newly released Multi-Perspective Aircraft Dataset (MAD) demonstrate that RMVAD-YOLO achieves an accuracy of 90.1%, a recall of 76%, 84.8% mAP@0.5, and 70.5% mAP@0.5:0.95, while reducing parameters and delivering an overall improvement in detection performance compared to the baseline YOLOv8n. RMVAD-YOLO also performed well on the VisDrone 2019 dataset, further demonstrating its reliable generalization capabilities.
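
For readers unfamiliar with the reported metrics, the following is a minimal sketch of the conventional meaning of the two mAP figures, not the authors' evaluation code. mAP@0.5 counts a predicted box as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5, while mAP@0.5:0.95 averages over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The function names `iou` and `matched_thresholds` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds averaged by mAP@0.5:0.95: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, truth_box):
    """Thresholds at which this prediction would count as a correct localization."""
    score = iou(pred_box, truth_box)
    return [t for t in IOU_THRESHOLDS if score >= t]


if __name__ == "__main__":
    # Hypothetical boxes: the prediction is slightly taller than the ground truth.
    pred = (0.0, 0.0, 10.0, 10.0)
    truth = (0.0, 0.0, 10.0, 8.2)
    print(iou(pred, truth))
    print(matched_thresholds(pred, truth))
```

Because the stricter thresholds demand tighter localization, mAP@0.5:0.95 is typically lower than mAP@0.5, which is consistent with the gap between the 70.5% and 84.8% figures above.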

Keywords