IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2022)

High-Resolution Remote Sensing Image Semantic Segmentation via Multiscale Context and Linear Self-Attention

  • Peng Yin,
  • Dongmei Zhang,
  • Wei Han,
  • Jiang Li,
  • Jianmei Cheng

DOI
https://doi.org/10.1109/JSTARS.2022.3214889
Journal volume & issue
Vol. 15
pp. 9174 – 9185

Abstract

Remote sensing image semantic segmentation, which aims to classify every pixel according to the content of a remote sensing image, has broad applications in various fields. Thanks to the strengths of deep learning (DL), semantic segmentation models based on convolutional neural networks (CNNs) have dramatically advanced remote sensing image semantic segmentation. Because high-resolution remote sensing images (HRRSI) have high resolution, wide coverage, large data volume, and sizeable spectral differences, existing GPUs are not suited to performing semantic segmentation directly on a whole image. Cutting the image into small patches loses context information and reduces accuracy. To address this issue, we propose the multiscale context self-attention network (MSCSANet), which combines the benefits of the self-attention mechanism with those of a CNN to improve the segmentation quality of various remote sensing images. MSCSANet extracts multiscale features from multiscale context images to counter the feature loss caused by cutting the image into patches. In addition, to exploit large-scale context features, the multiscale context patches guide the local image patch to focus on different fine-grained objects, which enhances the features of the local image patch. Moreover, considering the limited computing resources, we design a linear self-attention module to reduce the computational complexity. Compared with other DL models, our proposed model makes better use of multiscale features in complex scenes and improves mean intersection over union (MIoU) by 1.56% on the Gaofen Image Dataset and by 1.93% on the ISPRS Potsdam Dataset.
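
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how MIoU is commonly computed from a predicted label map and a ground-truth label map. It is not the authors' evaluation code: the function name `mean_iou`, the toy arrays, and the choice to skip classes that appear in neither map are assumptions made for illustration, and the paper's exact protocol (for example, which classes are ignored) is not stated in the abstract.

```python
import numpy as np


def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection over union for two integer label maps (illustrative helper)."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predicted classes.
    cm = np.bincount(
        num_classes * y_true + y_pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    # Assumption: classes absent from both maps are left out of the average.
    present = union > 0
    return float((intersection[present] / union[present]).mean())


# Toy 2 x 3 label maps with three classes (made-up data, not from either dataset).
truth = np.array([[0, 0, 1], [1, 2, 2]])
pred = np.array([[0, 1, 1], [1, 2, 0]])
score = mean_iou(truth, pred, num_classes=3)
print(f"MIoU = {score:.4f}")
```

Per-class IoU is the number of pixels where prediction and ground truth agree on a class, divided by the number of pixels assigned to that class in either map; MIoU averages this ratio over classes.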

Keywords