IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2025)
Adaptive Semantic-Enhanced Denoising Diffusion Probabilistic Model for Remote Sensing Image Super-Resolution
Abstract
Remote sensing image super-resolution (SR) is a crucial task that restores high-resolution (HR) images from low-resolution (LR) observations. Recently, the denoising diffusion probabilistic model (DDPM) has shown promising performance in image reconstruction by overcoming problems inherent in other generative models, such as oversmoothing and mode collapse. However, the high-frequency details generated by DDPM are often misaligned with HR images because the model tends to overlook long-range semantic contexts. This challenge is partly due to the prevalent use of a U-Net decoder in the conditional noise predictor, which favors local details and can introduce noise with considerable variance during prediction. To tackle these limitations, an adaptive semantic-enhanced DDPM (ASDDPM) is proposed to enhance the detail-preserving capability of the DDPM by integrating low-frequency semantic information through a transformer. Specifically, a novel adaptive diffusion transformer decoder is developed to bridge the semantic gap between the encoder and decoder by regulating the noise prediction with global contextual relationships and long-range dependencies in the diffusion process. In addition, a residual feature fusion strategy establishes information exchange between the U-Net and transformer decoders at multiple levels. As a result, the noise predicted by our approach closely approximates the real noise distribution. Extensive experiments on two SR and two semantic segmentation datasets confirm the superior performance of the proposed ASDDPM in both SR and downstream applications.
Keywords