IEEE Access (Jan 2024)

Local-Global Feature Capture and Boundary Information Refinement Swin Transformer Segmentor for Remote Sensing Images

  • Rui Lin,
  • Ying Zhang,
  • Xue Zhu,
  • Xueyun Chen

DOI
https://doi.org/10.1109/ACCESS.2024.3350645
Journal volume & issue
Vol. 12
pp. 6088 – 6099

Abstract

Semantic segmentation of urban remote sensing images is a highly challenging task. Complex backgrounds, occlusion and overlap, and small-scale targets in these images lead to three recurring deficiencies in segmentation results: confusion between similar targets, blurred target boundaries, and omission of small-scale targets. To address these problems, a local-global feature capture and boundary information refinement Swin Transformer segmentor (LGBSwin) is proposed. First, the dual linear attention module (DLAM) uses spatial and channel linear attention mechanisms to strengthen global modeling and thereby improve the segmentation of similar targets. Second, boundary-aware enhancement (BAE) adaptively mines boundary semantic information by integrating high-level and low-level features, which alleviates blurred boundaries. Finally, feature refinement aggregation (FRA) establishes information relationships between different layers, reduces the loss of local information, and enhances local-global dependence, thus significantly improving the recognition of small targets. Experimental results demonstrate the effectiveness of LGBSwin, which achieves an F1 score of 91.02% on the ISPRS Vaihingen dataset and 93.35% on the ISPRS Potsdam dataset.
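
The abstract does not give the internals of DLAM, so the following is only a minimal sketch of the general idea of applying linear attention along the spatial axis and along the channel axis. It assumes a generic kernel-based linear attention with the feature map elu(x) + 1, and it assumes the two branches are fused by a residual sum. The function names (feature_map, linear_attention, dual_linear_attention) and the fusion rule are hypothetical and are not taken from the paper.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1, a positive kernel feature map (an assumption, not the paper's choice)
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def linear_attention(q, k, v, eps=1e-6):
    # q, k: (n, d); v: (n, d_v). Cost is linear in the number of tokens n,
    # because the (d, d_v) summary k^T v is computed once and reused per query.
    q, k = feature_map(q), feature_map(k)
    kv = k.T @ v                 # (d, d_v) global summary of keys and values
    z = q @ k.sum(axis=0)        # (n,) per-query normalizer
    return (q @ kv) / (z[:, None] + eps)

def dual_linear_attention(x):
    # x: (N, C) feature map flattened to N = H * W positions with C channels.
    # Spatial branch: positions are the tokens.
    spatial = linear_attention(x, x, x)
    # Channel branch: channels are the tokens, so attend over the transpose.
    xt = x.T
    channel = linear_attention(xt, xt, xt).T
    # Hypothetical fusion: residual sum of both branches.
    return x + spatial + channel

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((64, 32))   # e.g. an 8 x 8 map with 32 channels
    print(dual_linear_attention(feats).shape)
```

In a real segmentor the queries, keys, and values would come from learned projections of the feature map; they are omitted here to keep the sketch short.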

Keywords