IEEE Access (Jan 2025)

DiffRSS: A Diffusion-Guided Multi-Scale Features Remote Sensing Image Segmentation Method

  • Honghao Liu,
  • Ruixia Yang,
  • Yue Xu,
  • Zhengchao Chen,
  • Yuyang Zheng

DOI
https://doi.org/10.1109/ACCESS.2024.3522286
Journal volume & issue
Vol. 13
pp. 802 – 816

Abstract

Semantic segmentation in remote sensing is a fundamental task with crucial applications across various domains. Traditional approaches primarily utilize bottom-up discriminative methods, where network architectures learn image features to generate segmentation masks. However, the complexity of remote sensing images, characterized by diverse ground object types and intricate scenes, often results in information redundancy and confusion during feature extraction, which degrades segmentation accuracy. To address these challenges, we introduce a novel segmentation framework, DiffRSS, based on the denoising model paradigm. This top-down generative approach learns the data distribution of sample labels and uses image features as guiding priors for generating segmentation masks. We formulate the semantic segmentation of remote sensing images as a conditional generation task and design a Multiscale Cyclic Denoising Module (MSCDM) that leverages multiscale features of remote sensing images. Inspired by diffusion models, the MSCDM can be reused multiple times during inference, progressively refining the segmentation mask. This allows image features to be captured and utilized more precisely, resulting in finer and more accurate segmentation masks. Extensive testing on three public remote sensing datasets (ISPRS Vaihingen, ISPRS Potsdam, and GID Fine Land Cover Classification) demonstrates that our method achieves competitive results.
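
The idea of reusing one denoising module several times during inference, with image features as the guiding prior, can be illustrated with a toy loop. This is only a minimal sketch under strong assumptions: `toy_denoiser` and `cyclic_denoise` are hypothetical names, the "denoiser" is a hand-written blend rather than a learned network, and no multiscale processing is modeled. It is not the MSCDM described in the paper.

```python
import random


def toy_denoiser(mask, features, step, total_steps):
    """Hypothetical stand-in for a learned denoising module.

    Blends the current noisy mask toward the conditioning features.
    The blend weight grows with each step and reaches 1.0 on the last one.
    """
    weight = 1.0 / (total_steps - step)
    return [
        [(1.0 - weight) * m + weight * f for m, f in zip(mask_row, feat_row)]
        for mask_row, feat_row in zip(mask, features)
    ]


def cyclic_denoise(features, num_steps=4, seed=0):
    """Start from Gaussian noise and apply the same denoiser num_steps times."""
    rng = random.Random(seed)
    # Initial "mask" is pure noise with the same shape as the feature map.
    mask = [[rng.gauss(0.0, 1.0) for _ in row] for row in features]
    for step in range(num_steps):
        # The same module is reused at every step, conditioned on the features.
        mask = toy_denoiser(mask, features, step, num_steps)
    # Threshold the refined values into a binary segmentation mask.
    return [[1 if value > 0.5 else 0 for value in row] for row in mask]


if __name__ == "__main__":
    # Assumed per-pixel foreground scores standing in for image features.
    features = [[0.9, 0.1], [0.2, 0.8]]
    print(cyclic_denoise(features))
```

In a real diffusion-style segmenter the blend would be replaced by a trained network and a noise schedule; the loop structure (noise in, repeated conditional refinement, mask out) is the only part this sketch is meant to convey.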

Keywords