IET Image Processing (Oct 2024)

A dual attentional skip connection based Swin‐UNet for real‐time cloud segmentation

  • Fuhao Wei,
  • Shaofan Wang,
  • Yanfeng Sun,
  • Baocai Yin

DOI
https://doi.org/10.1049/ipr2.13186
Journal volume & issue
Vol. 18, no. 12
pp. 3460 – 3479

Abstract

Developing real-time cloud segmentation technology is urgent for many remote sensing based applications such as weather forecasting. Existing deep learning based cloud segmentation methods suffer from two shortcomings: (a) they tend to produce discontinuous boundaries and fail to capture less salient features, which correspond to thin-cloud pixels; (b) they are not robust across different scenarios. These issues are circumvented by integrating U-Net and the Swin transformer, with an efficiently designed dual attention mechanism based skip connection. Specifically, a Swin transformer based encoder-decoder network with a dual attentional skip connection, Swin-UNet (DASUNet), is proposed. DASUNet captures the global relationships among image patches through its window attention mechanism, which meets the real-time requirement. Moreover, DASUNet characterizes less salient features by equipping the skip connections with token dual attention modules, which compensates for the neglect of less salient features incurred by traditional attention mechanisms during the stacking of transformer layers. Experiments on ground-based images (SWINySeg) and remote sensing images (HRC-WHU, 38-Cloud) show that DASUNet achieves state-of-the-art or competitive results for cloud segmentation (six top-1 positions across six metrics among 11 methods on SWINySeg, two top-1 positions across five metrics among 10 methods on HRC-WHU, and two top-1 positions across four metrics among 12 methods with ParaNum ≤ 30M on 38-Cloud), at an average speed of 100 FPS per 224×224 image.
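To make the "token dual attention" idea in the abstract concrete, the sketch below re-weights skip-connection features along both the token (spatial) axis and the channel axis before fusing them with a residual. This is an illustrative NumPy sketch under assumed shapes (N patch tokens of C channels); the function name `dual_attention_skip` and the pooling/softmax choices are hypothetical and are not taken from the paper's actual DASUNet module.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dual_attention_skip(x):
    """Illustrative dual (token + channel) attention on skip features.

    x: (N, C) array of N patch tokens with C channels.
    Hypothetical sketch; the paper's exact module may differ.
    """
    # token (spatial) attention: weight each token by its pooled response,
    # so less salient (e.g. thin-cloud) tokens are explicitly re-scored
    token_w = softmax(x.mean(axis=1), axis=0)        # (N,)
    # channel attention: weight each channel by its pooled response
    chan_w = softmax(x.mean(axis=0), axis=0)         # (C,)
    # re-weight along both axes and fuse with a residual connection
    return x + (token_w[:, None] * x) * chan_w[None, :]
```

The residual term keeps the original skip features intact, so the attention branch only adds emphasis rather than replacing information, which mirrors how attention-augmented skip connections are usually fused in U-Net variants.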

Keywords