IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2024)

Pansharpening via Multiscale Embedding and Dual Attention Transformers

  • Wensheng Fan,
  • Fan Liu,
  • Jingzhi Li

DOI
https://doi.org/10.1109/JSTARS.2023.3344215
Journal volume & issue
Vol. 17
pp. 2705 – 2717

Abstract

Pansharpening is a fundamental and crucial image processing task for many remote sensing applications: it generates a high-resolution multispectral image by fusing a low-resolution multispectral image with a high-resolution panchromatic image. Recently, vision transformers have been introduced into pansharpening to exploit global contextual information. However, modeling both long-range and local dependencies, together with multiscale feature learning, is essential to the pansharpening task. Learning and exploiting these different kinds of information poses a major challenge and limits the performance and efficiency of existing pansharpening methods. To address this issue, we propose a pansharpening network based on multiscale embedding and dual attention transformers (MDPNet). Specifically, a multiscale embedding block is proposed to embed multiscale information of the images into vectors, so that the transformers only need to process one multispectral embedding sequence and one panchromatic embedding sequence to use multiscale information efficiently. Furthermore, an additive hybrid attention transformer is proposed to fuse the embedding sequences in an additive injection manner. Finally, a channel self-attention transformer is proposed to exploit channel correlations for high-quality detail generation. Experiments on the QuickBird and WorldView-3 datasets demonstrate that the proposed MDPNet outperforms state-of-the-art methods both visually and quantitatively while keeping running time low. Ablation studies further verify the effectiveness of the proposed multiscale embedding and transformers for pansharpening.
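To make the channel-attention idea concrete, below is a minimal sketch of one common formulation of self-attention over the channel dimension, where each feature channel acts as a token and the attention map captures channel correlations. The layer names, shapes, scaling, and residual connection are illustrative assumptions for this sketch, not the authors' implementation of the channel self-attention transformer in MDPNet.

```python
# Hypothetical sketch of channel self-attention (not the MDPNet implementation).
import torch
import torch.nn as nn


class ChannelSelfAttention(nn.Module):
    """Self-attention over channels: tokens are feature channels, so the
    attention map has shape (C x C) and models channel correlations."""

    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convolutions produce query/key/value feature maps.
        self.to_q = nn.Conv2d(channels, channels, kernel_size=1, bias=False)
        self.to_k = nn.Conv2d(channels, channels, kernel_size=1, bias=False)
        self.to_v = nn.Conv2d(channels, channels, kernel_size=1, bias=False)
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Flatten spatial dims: each channel becomes one token of length h*w.
        q = self.to_q(x).flatten(2)   # (b, c, h*w)
        k = self.to_k(x).flatten(2)   # (b, c, h*w)
        v = self.to_v(x).flatten(2)   # (b, c, h*w)
        # Channel-by-channel similarity, scaled by token length.
        attn = torch.softmax(q @ k.transpose(1, 2) / (h * w) ** 0.5, dim=-1)  # (b, c, c)
        out = (attn @ v).view(b, c, h, w)  # channels reweighted by correlations
        return self.proj(out) + x          # residual connection (an assumption here)


# Usage: refine fused pansharpening features of shape (batch, channels, H, W).
feats = torch.randn(2, 32, 64, 64)
refined = ChannelSelfAttention(32)(feats)
print(refined.shape)  # torch.Size([2, 32, 64, 64])
```

Because the attention matrix is C x C rather than (H*W) x (H*W), this style of attention stays cheap at high spatial resolutions, which is one reason channel attention is attractive for detail generation in image fusion.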

Keywords