Remote Sensing (Oct 2023)

HAM-Transformer: A Hybrid Adaptive Multi-Scaled Transformer Net for Remote Sensing in Complex Scenes

  • Keying Ren
  • Xiaoyan Chen
  • Zichen Wang
  • Xiwen Liang
  • Zhihui Chen
  • Xia Miao

DOI
https://doi.org/10.3390/rs15194817
Journal volume & issue
Vol. 15, no. 19
p. 4817

Abstract

Rapid advances in unmanned aerial vehicles (UAVs) have greatly improved the quality of remote sensing images, making it possible to detect small objects even in highly complex scenes. Learning-based object detection has recently been introduced to remote sensing image processing and has gained popularity there. To improve the detection accuracy of small, weak objects in complex scenes, this work proposes a novel hybrid backbone composed of a convolutional neural network and an adaptive multi-scaled transformer, referred to as HAM-Transformer Net. HAM-Transformer Net first extracts the details of feature maps using convolutional local feature extraction blocks. Second, it extracts hierarchical information using multi-scale location coding. Finally, an adaptive multi-scale transformer block extracts further features in different receptive fields and fuses them adaptively. We carried out comparison experiments on a self-constructed dataset; the results show that the method improves significantly over state-of-the-art object detection algorithms. We also conducted a large number of comparative experiments to demonstrate the effectiveness of the method.

Keywords