IEEE Access (Jan 2025)

An Enhanced End-to-End Object Detector for Drone Aerial Imagery

  • Quan Yu,
  • Qiang Tong,
  • Lin Miao,
  • Lin Qi,
  • Xiulei Liu

DOI
https://doi.org/10.1109/ACCESS.2025.3533037
Journal volume & issue
Vol. 13
pp. 18798 – 18813

Abstract

DETR-like detectors have gained increasing popularity in practical applications. However, we observe that their pipelines still suffer from several challenges, including an unbalanced distribution of positive and negative samples, low-quality initial prediction boxes, and an unreasonable gradient structure in the decoding stage. These challenges hinder both the convergence speed and the detection performance of the model. To address these issues, we propose an enhanced DETR-like model called EM-DETR. It combines three innovative methods: Dynamic Groups Assignment, Mixed Query Re-Selection, and Look Forward Stage. Dynamic Groups Assignment employs adaptive parameters to balance the number of positive and negative samples, providing more effective supervision signals for ground-truth boxes. Mixed Query Re-Selection uses high-quality bounding boxes regressed by a subnet to initialize decoder queries, offering superior prior information for the decoder. Look Forward Stage introduces a more rational gradient structure that eliminates inter-layer information bias between decoders. We conduct extensive experiments to evaluate the effectiveness of our proposed method. On VisDrone2021-DET, EM-DETR with ResNet50 achieves 23.9% AP after 12 epochs of training, an improvement of 4.7% AP over the baseline. Moreover, the excellent performance of EM-DETR on AI-TOD and CrowdHuman demonstrates the generalization capability of the proposed method.

Keywords