IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2024)

GeoFormer: An Effective Transformer-Based Siamese Network for UAV Geolocalization

  • Qingge Li,
  • Xiaogang Yang,
  • Jiwei Fan,
  • Ruitao Lu,
  • Bin Tang,
  • Siyu Wang,
  • Shuang Su

DOI
https://doi.org/10.1109/JSTARS.2024.3392812
Journal volume & issue
Vol. 17
pp. 9470 – 9491

Abstract

Cross-view geolocalization of unmanned aerial vehicles (UAVs) is challenging because of the positional discrepancies and the uncertainties in scale and distance between UAV and satellite views. Existing transformer-based geolocalization methods mainly use encoders to mine image contextual information, but they are limited in handling scale changes in cross-view images. We therefore present GeoFormer, an effective transformer-based Siamese network tailored for UAV geolocalization. First, we design an efficient transformer feature extraction network that uses linear attention to reduce computational complexity. Within this network, we design an efficient separable perceptron module based on depthwise separable convolution, which reduces computational cost while improving the feature representation of the network. Second, we propose a multiscale feature aggregation module that deeply fuses salient features at different scales through a feedforward neural network to generate semantically rich global feature representations, improving the model's ability to capture image details and represent robust features. Additionally, we design a semantic-guided region segmentation module that uses a k-modes clustering algorithm to divide the feature map into multiple semantically consistent regions and performs feature recognition within each region to improve image-matching accuracy. Finally, we design a hierarchical reinforcement rotation matching strategy, which achieves accurate UAV geolocalization by applying SuperPoint keypoint extraction and LightGlue rotation matching to the satellite images retrieved for a UAV-view query. Experimental results show that our method effectively achieves UAV geolocalization.
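The cost saving attributed to depthwise separable convolution can be illustrated with a short calculation. The sketch below is not GeoFormer's separable perceptron module, whose layer sizes the abstract does not give. It only compares the parameter and multiply-accumulate (MAC) counts of a standard convolution with those of a depthwise convolution followed by a 1 x 1 pointwise convolution. The function names and the layer sizes (64 input channels, 128 output channels, 3 x 3 kernel, 32 x 32 feature map) are hypothetical, and stride 1, same padding, and no bias terms are assumed.

```python
def standard_conv_cost(c_in, c_out, k, h, w):
    """Parameters and MACs of a k x k convolution (stride 1, same padding, no bias)."""
    params = k * k * c_in * c_out
    macs = params * h * w  # every weight is applied once per output position
    return params, macs


def depthwise_separable_cost(c_in, c_out, k, h, w):
    """Parameters and MACs of a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    params = depthwise + pointwise
    macs = params * h * w
    return params, macs


# Hypothetical layer sizes, chosen only for illustration.
std_params, std_macs = standard_conv_cost(64, 128, 3, 32, 32)
sep_params, sep_macs = depthwise_separable_cost(64, 128, 3, 32, 32)
ratio = sep_params / std_params  # equals 1/c_out + 1/k**2

print(f"standard:  {std_params} params, {std_macs} MACs")
print(f"separable: {sep_params} params, {sep_macs} MACs")
print(f"cost ratio: {ratio:.4f}")
```

Under these assumptions the separable layer needs 8768 parameters against 73728 for the standard layer, a ratio of 1/128 + 1/9, or roughly 12%. The ratio depends only on the kernel size and the number of output channels, so the saving grows with wider layers and larger kernels.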

Keywords