Drones (Jul 2022)

A Novel Multi-Scale Transformer for Object Detection in Aerial Scenes

  • Guanlin Lu
  • Xiaohui He
  • Qiang Wang
  • Faming Shao
  • Hongwei Wang
  • Jinkang Wang

DOI
https://doi.org/10.3390/drones6080188
Journal volume & issue
Vol. 6, no. 8
p. 188

Abstract

Deep learning has advanced research on object detection in aerial scenes. However, most existing networks are limited by the large variation in object scale and by confusion between category features. To overcome these limitations, this paper proposes a novel aerial object detection framework called DFCformer. DFCformer consists mainly of three parts: the backbone network DMViT, which introduces deformation patch embedding and multi-scale adaptive self-attention to capture sufficient object features; FRGC, which guides feature interaction layer by layer to break down the barriers between feature layers and improve the discrimination and processing of multi-scale critical features; and CAIM, which uses an attention mechanism to fuse multi-scale features, performing hierarchical reasoning on the relationships between different levels and fully exploiting the complementary information in multi-scale features. Extensive experiments on the FAIR1M dataset show the advantages of DFCformer, which achieves the highest scores with stronger scene adaptability.

Keywords