DiffRSS: A Diffusion-Guided Multi-Scale Features Remote Sensing Image Segmentation Method

Honghao Liu; Ruixia Yang; Yue Xu; Zhengchao Chen; Yuyang Zheng

doi:10.1109/ACCESS.2024.3522286

IEEE Access (Jan 2025)

Affiliations

Honghao Liu: Key Laboratory of Digital Earth Science, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China
Ruixia Yang: Key Laboratory of Digital Earth Science, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China
Yue Xu: Key Laboratory of Remote Sensing and Digital Earth, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China
Zhengchao Chen: Key Laboratory of Remote Sensing and Digital Earth, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China
Yuyang Zheng: Key Laboratory of Digital Earth Science, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China

Journal volume & issue: Vol. 13
pp. 802 – 816

Abstract

Semantic segmentation is a fundamental task in remote sensing with important applications across many domains. Traditional approaches are primarily bottom-up discriminative methods, in which a network learns image features and maps them to segmentation masks. However, remote sensing images contain diverse ground object types and intricate scenes, which often leads to redundant and confused information during feature extraction and reduces segmentation accuracy. To address these challenges, we introduce DiffRSS, a novel segmentation framework based on the denoising model paradigm. This top-down generative approach learns the data distribution of the sample labels and uses image features as guiding priors for generating segmentation masks. We formulate semantic segmentation of remote sensing images as a conditional generation task and design a Multiscale Cyclic Denoising Module (MSCDM) that effectively leverages the multiscale features of remote sensing images, leading to superior segmentation results. Inspired by diffusion models, the MSCDM can be reused multiple times during inference to improve the quality of the segmentation masks. This allows image features to be captured and used more precisely, resulting in finer and more accurate segmentation masks. Extensive experiments on three public remote sensing datasets (ISPRS Vaihingen, ISPRS Potsdam, and the GID Fine Land Cover Classification dataset) demonstrate that our method achieves competitive results.
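
The idea of reusing one denoising module several times at inference, conditioned on multiscale image features, can be illustrated with a toy sketch. This is not the authors' MSCDM: the function names (`multiscale_features`, `toy_denoiser`, `cyclic_denoise`), the random projection weights, the pooling scales, and the linear noise schedule are all hypothetical stand-ins chosen only to show the control flow of conditional, iterative mask refinement.

```python
import numpy as np

def multiscale_features(image, scales=(1, 2, 4)):
    """Hypothetical encoder: average-pool at each scale, then upsample back.

    Assumes image height and width are divisible by every scale.
    """
    h, w, c = image.shape
    feats = []
    for s in scales:
        pooled = image.reshape(h // s, s, w // s, s, c).mean(axis=(1, 3))
        feats.append(pooled.repeat(s, axis=0).repeat(s, axis=1))
    return np.concatenate(feats, axis=-1)  # shape (H, W, C * len(scales))

def toy_denoiser(noisy_mask, features, weights):
    """Hypothetical denoiser: blend the noisy mask with feature-derived logits."""
    cond_logits = features @ weights  # shape (H, W, num_classes)
    return 0.5 * noisy_mask + 0.5 * cond_logits

def cyclic_denoise(image, num_classes, num_steps=4, seed=0):
    """Start from noise and apply the same denoiser num_steps times."""
    rng = np.random.default_rng(seed)
    features = multiscale_features(image)
    # Random weights stand in for a trained network (assumption).
    weights = rng.standard_normal((features.shape[-1], num_classes))
    mask = rng.standard_normal(image.shape[:2] + (num_classes,))
    for step in range(num_steps):
        estimate = toy_denoiser(mask, features, weights)
        # Assumed linear schedule: re-inject less noise each step, none at the end.
        noise_level = 1.0 - (step + 1) / num_steps
        mask = estimate + noise_level * rng.standard_normal(mask.shape)
    return mask.argmax(axis=-1)  # per-pixel class labels, shape (H, W)

image = np.random.default_rng(1).random((8, 8, 3))
labels = cyclic_denoise(image, num_classes=6)
```

In this sketch the image features enter every denoising step as the condition, while the mask itself is what gets refined. A trained model would replace `toy_denoiser` and the random weights.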

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
