International Journal of Digital Earth (Dec 2024)

NGST-Net: A N-Gram based Swin Transformer Network for improving multispectral and hyperspectral image fusion

  • Ziyuan Feng,
  • Xianfeng Zhang,
  • Bo Zhou,
  • Miao Ren,
  • Xiao Chen

DOI: https://doi.org/10.1080/17538947.2024.2359574
Journal volume & issue: Vol. 17, no. 1

Abstract

Transformer-based deep networks have been widely employed for fusing multispectral and hyperspectral images because the global self-attention mechanism ensures that more pixel information is used in the fusion. However, unreasonable information interactions between local windows in window self-attention-based Transformers can adversely affect the network’s modeling of the global relationship, resulting in low-quality fusion results. This study proposes a novel N-Gram based Swin Transformer Network (NGST-Net). It utilizes more pixels for modeling the global relationship, mining more spectral information from similar pixels. An N-Gram strategy is proposed to learn the spectral feature relationships between a local window and its preceding and subsequent windows, guiding the information interactions between local windows. A maskless shifted-window strategy enables efficient modeling of the global relationship. Experimental results show that the proposed NGST-Net has fewer parameters, higher inference speed, and a smaller memory footprint than several recently published Transformer methods. The improvements in SAM, ERGAS, and PSNR over the second-best method are 9.9%, 6.5%, and 1%, respectively, on the CAVE dataset. The inference speed of NGST-Net is 10.7 times faster than that of Fusformer. The proposed NGST-Net is thus a novel deep network model for the effective fusion of multispectral and hyperspectral images.
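To make the N-Gram idea concrete, the sketch below shows one plausible way a local window's tokens could be enriched with context pooled from its preceding and subsequent windows before window self-attention, as the abstract describes. This is a minimal, hypothetical illustration in PyTorch, not the authors' implementation: the module name `NGramWindowMixer`, the mean-pooled neighbour context, and the linear fusion are all illustrative assumptions.

```python
# Minimal sketch (assumed design, not the authors' code) of an N-Gram-style
# cross-window interaction: each window's tokens are fused with context
# pooled from the preceding and subsequent windows.
import torch
import torch.nn as nn


class NGramWindowMixer(nn.Module):
    """Hypothetical mixing of a window with its neighbouring windows' context."""

    def __init__(self, channels: int):
        super().__init__()
        # Learnable projection fusing [window tokens | prev context | next context].
        self.fuse = nn.Linear(3 * channels, channels)

    def forward(self, windows: torch.Tensor) -> torch.Tensor:
        # windows: (num_windows, tokens_per_window, channels)
        context = windows.mean(dim=1)                      # per-window summary
        prev_ctx = torch.roll(context, shifts=1, dims=0)   # preceding window
        next_ctx = torch.roll(context, shifts=-1, dims=0)  # subsequent window
        n = windows.shape[1]
        mixed = torch.cat(
            [windows,
             prev_ctx.unsqueeze(1).expand(-1, n, -1),
             next_ctx.unsqueeze(1).expand(-1, n, -1)],
            dim=-1,
        )
        return self.fuse(mixed)  # enriched tokens fed to window self-attention


# Usage: 16 windows of 49 tokens, 64 channels each.
mixer = NGramWindowMixer(channels=64)
enriched = mixer(torch.randn(16, 49, 64))
print(enriched.shape)  # torch.Size([16, 49, 64])
```

In this sketch the neighbour context guides each window's representation before attention is computed, which is one way the described cross-window information interaction could be realized.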

Keywords