IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2025)
Single-Image Superresolution for RGB Remote Sensing Imagery via Multiscale CNN-Transformer Feature Fusion
Abstract
Single-image superresolution (SISR) of remote sensing images aims to increase image resolution algorithmically while restoring rich high-frequency detail. Convolutional neural networks (CNNs) have achieved impressive progress in SISR owing to their strong local feature extraction capability. However, CNNs struggle to model long-range dependencies, which limits SISR performance. Transformers are better suited than CNNs to modeling long-range dependencies. Moreover, objects in remote sensing scenes typically exhibit multiscale characteristics and are strongly coupled with their surrounding environment, so attending to both multiscale local and multiscale global information can further improve SISR performance. Existing methods have difficulty exploiting such diverse features within a single-architecture network. In this article, we propose a multiscale CNN-Transformer feature fusion (MSCT) network for remote sensing SISR. MSCT is a hybrid network that consists of the adaptive residual dense block (ARDB), the multiscale Transformer block (MSTB), and the local–global information enhancement block (LGEB). ARDB extracts abundant multiscale local features via densely connected convolutional layers and adaptive residual learning. MSTB generates multiscale tokens with multiple convolutional layers to obtain multiscale global information. LGEB further enhances the extraction of local–global information in the spatial and frequency domains. Extensive experiments on multiple datasets show that MSCT achieves competitive results on RGB remote sensing imagery compared with other representative SISR methods. In addition, SR experiments on Sentinel-2 images show that MSCT generalizes to some extent to multispectral images. Extended experiments also demonstrate that MSCT can effectively improve object detection performance.
Keywords