IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2023)

TCIANet: Transformer-Based Context Information Aggregation Network for Remote Sensing Image Change Detection

  • Xintao Xu,
  • Jinjiang Li,
  • Zheng Chen

DOI
https://doi.org/10.1109/JSTARS.2023.3241157
Journal volume & issue
Vol. 16
pp. 1951–1971

Abstract

Change detection based on remote sensing data is an important method for detecting changes on the Earth's surface. With the development of deep learning, convolutional neural networks have excelled in the field of change detection. However, existing neural network models are susceptible to external factors in the change detection process, leading to pseudo changes and missed detections in the results. To improve change detection performance and the ability to discriminate pseudo changes, this article proposes a new method: a transformer-based context information aggregation network (TCIANet) for remote sensing image change detection. First, we use a filter-based visual tokenizer to segment each temporal feature map into multiple visual semantic tokens. Second, a progressive sampling vision transformer not only effectively excludes the interference of irrelevant changes but also uses a transformer encoder to obtain compact spatiotemporal context information from the token set. The tokens, which carry rich semantic information, are then projected back into pixel space, where a transformer decoder acquires pixel-level features. In addition, a feature fusion module incorporates low-level semantic features to extract the coarse contours of changed regions. The semantic relationships between object regions and contours are then captured by a contour-graph reasoning module to obtain feature maps with complete edge information. Finally, a prediction head discriminates changed features and generates the final change map. Extensive experimental results show that our method has clear advantages over competing methods in both visual quality and quantitative evaluation.
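The abstract describes a token-based encode-decode pipeline: tokenize each temporal feature map, model spatiotemporal context over the token set with a transformer encoder, project tokens back to pixel space with a transformer decoder, and predict the change map. Below is a minimal PyTorch sketch of that flow. All class names, layer sizes, and the simple absolute-difference head are assumptions for illustration only; the paper's progressive sampling, feature fusion, and contour-graph reasoning modules are omitted.

import torch
import torch.nn as nn

class VisualTokenizer(nn.Module):
    """Filter-based tokenizer (assumed form): learn L spatial attention
    maps and pool the feature map into L visual semantic tokens."""
    def __init__(self, channels: int, num_tokens: int = 8):
        super().__init__()
        self.attn = nn.Conv2d(channels, num_tokens, kernel_size=1)

    def forward(self, x):                                # x: (B, C, H, W)
        a = self.attn(x).flatten(2).softmax(-1)          # (B, L, H*W)
        feats = x.flatten(2)                             # (B, C, H*W)
        return torch.einsum("blp,bcp->blc", a, feats)    # (B, L, C)

class TCIANetSketch(nn.Module):
    """Hypothetical skeleton of the tokenize -> encode -> decode flow."""
    def __init__(self, channels: int = 64, num_tokens: int = 8, heads: int = 4):
        super().__init__()
        self.tokenizer = VisualTokenizer(channels, num_tokens)
        enc = nn.TransformerEncoderLayer(channels, heads,
                                         dim_feedforward=2 * channels,
                                         batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, num_layers=1)
        dec = nn.TransformerDecoderLayer(channels, heads,
                                         dim_feedforward=2 * channels,
                                         batch_first=True)
        self.decoder = nn.TransformerDecoder(dec, num_layers=1)
        self.head = nn.Conv2d(channels, 2, kernel_size=1)  # change / no-change

    def forward(self, feat_t1, feat_t2):     # bitemporal features (B, C, H, W)
        b, c, h, w = feat_t1.shape
        # 1. Tokenize both temporal feature maps and model spatiotemporal
        #    context over the concatenated token set.
        tokens = torch.cat([self.tokenizer(feat_t1),
                            self.tokenizer(feat_t2)], dim=1)   # (B, 2L, C)
        tokens = self.encoder(tokens)
        # 2. Project tokens back to pixel space: each pixel queries the
        #    refined token set through the transformer decoder.
        pixels = torch.cat([feat_t1, feat_t2], dim=0)          # stack both dates
        q = pixels.flatten(2).transpose(1, 2)                  # (2B, H*W, C)
        q = self.decoder(q, tokens.repeat(2, 1, 1))
        p1, p2 = q.transpose(1, 2).reshape(2 * b, c, h, w).chunk(2, dim=0)
        # 3. Difference the decoded bitemporal features and predict the map
        #    (a simplification of the paper's fusion and reasoning stages).
        return self.head(torch.abs(p1 - p2))                   # (B, 2, H, W)

if __name__ == "__main__":
    model = TCIANetSketch()
    t1 = torch.randn(2, 64, 32, 32)   # backbone features at time 1
    t2 = torch.randn(2, 64, 32, 32)   # backbone features at time 2
    print(model(t1, t2).shape)        # torch.Size([2, 2, 32, 32])

The key idea this sketch captures is that attention is computed over a small token set rather than over all pixel pairs, which keeps the context modeling compact before features are projected back to full resolution for dense prediction.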

Keywords