Attention Swin Transformer UNet for Landslide Segmentation in Remotely Sensed Images

Bingxue Liu; Wei Wang; Yuming Wu; Xing Gao

doi:10.3390/rs16234464

Remote Sensing (Nov 2024)

Affiliations

Bingxue Liu: State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
Wei Wang: State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
Yuming Wu: State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
Xing Gao: State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China

DOI: https://doi.org/10.3390/rs16234464
Journal volume & issue: Vol. 16, no. 23
p. 4464

Abstract

Advances in artificial intelligence have made rapid landslide segmentation possible. However, landslide segmentation from remote sensing images still suffers from low accuracy, caused by features shared with similar-looking targets, inhomogeneous features, and blurred boundaries. To address these issues, we propose a novel deep learning model called AST-UNet. The model is based on the structure of SwinUNet, attaching a channel attention and spatial intersection (CASI) module as a parallel branch of the encoder and a spatial detail enhancement (SDE) module in the skip connection. Specifically, (1) the spatial intersection module expands the spatial attention range, alleviating noise in the image and enhancing the continuity of landslides in the segmentation results; (2) the channel attention module refines the spatial attention weights by feature modeling in the channel dimension, improving the model’s ability to differentiate targets that closely resemble landslides; and (3) the spatial detail enhancement module increases the accuracy of landslide boundaries by strengthening the decoder's attention to detailed features. We conduct experiments on landslide data from the Luding area of Sichuan. Comparative analyses with state-of-the-art (SOTA) models, including FCN, UNet, DeepLab V3+, TransFuse, TransUNet, and SwinUNet, demonstrate the superiority of AST-UNet for landslide segmentation. The experiments also verify the generalization of the model. The proposed AST-UNet obtains an F1-score of 90.14%, an mIoU of 83.45%, a foreground IoU of 70.81%, and a Hausdorff distance of 3.73 on the experimental datasets.
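The overlap metrics reported above (F1-score, foreground IoU, mIoU) can be illustrated with a minimal sketch. It uses the standard pixel-wise definitions for a binary landslide/background mask; the abstract does not state the evaluation protocol, so the two-class averaging for mIoU, the function names, and the toy masks are all assumptions made for illustration. The Hausdorff distance is not covered here.

```python
from typing import Dict, Sequence, Tuple

# 2D binary mask: 1 = landslide (foreground), 0 = background.
Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int, int]:
    """Count (tp, fp, fn, tn) over all pixels of two equally sized masks."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def segmentation_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Foreground F1, foreground IoU, and two-class mean IoU (assumed protocol)."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    fg_iou = safe_div(tp, tp + fp + fn)   # IoU of the landslide class
    bg_iou = safe_div(tn, tn + fp + fn)   # IoU of the background class
    return {"f1": f1, "fg_iou": fg_iou, "miou": (fg_iou + bg_iou) / 2}


# Hypothetical 3x3 masks, for illustration only.
pred = [[1, 1, 0], [0, 1, 0], [0, 0, 0]]
truth = [[1, 1, 0], [0, 0, 0], [0, 0, 1]]
metrics = segmentation_metrics(pred, truth)
print(metrics)
```

Because background pixels usually dominate landslide imagery, the background IoU tends to be high, which is one reason an mIoU can sit well above the foreground IoU, as in the figures quoted above.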

Published in Remote Sensing

ISSN: 2072-4292 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science
Website: http://www.mdpi.com/journal/remotesensing/