Remote Sensing (Feb 2023)

IFormerFusion: Cross-Domain Frequency Information Learning for Infrared and Visible Image Fusion Based on the Inception Transformer

  • Zhang Xiong,
  • Xiaohui Zhang,
  • Qingping Hu,
  • Hongwei Han

DOI
https://doi.org/10.3390/rs15051352
Journal volume & issue
Vol. 15, no. 5
p. 1352

Abstract

Current deep learning-based image fusion methods cannot sufficiently learn image features across a wide frequency range. We therefore propose IFormerFusion, which is based on the Inception Transformer and cross-domain frequency fusion. To learn features from both high- and low-frequency information, we designed the IFormer mixer, which splits the input features along the channel dimension and feeds them into parallel high- and low-frequency mixers to achieve linear computational complexity. The high-frequency mixer adopts a convolution path and a max-pooling path, while the low-frequency mixer adopts a criss-cross attention path. Because high-frequency information relates to texture detail, we designed a cross-domain frequency fusion strategy that exchanges high-frequency information between the source images. This structure can sufficiently integrate complementary features and strengthens the ability to retain texture. Experiments on the TNO, OSU, and Road Scene datasets demonstrate that IFormerFusion outperforms other methods in both objective and subjective evaluations.
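The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and rests on several assumptions: all function names and the `n_high` parameter are hypothetical, the learned convolution path of the high-frequency mixer is omitted so that only a 3x3 max-pooling path remains, the criss-cross attention uses the raw features as queries, keys, and values with no learned projections, and the cross-domain exchange is modelled as a plain swap of the high-frequency channel groups between the infrared and visible branches. The abstract does not specify the exact exchange rule.

```python
import numpy as np


def split_channels(feat, n_high):
    """Split a (C, H, W) feature map along channels into high/low groups."""
    return feat[:n_high], feat[n_high:]


def max_pool_3x3(feat):
    """3x3 max pooling, stride 1, edge padding; keeps the (C, H, W) shape."""
    _, h, w = feat.shape
    padded = np.pad(feat, ((0, 0), (1, 1), (1, 1)), mode="edge")
    windows = [padded[:, dy:dy + h, dx:dx + w]
               for dy in range(3) for dx in range(3)]
    return np.max(np.stack(windows), axis=0)


def criss_cross_attention(feat):
    """Each pixel attends only to the pixels in its own row and column.

    Assumption: queries, keys and values are the raw features. The pixel
    itself appears once in the row set and once in the column set.
    """
    c, _, w = feat.shape
    row_energy = np.einsum("chw,chv->hwv", feat, feat)  # (H, W, W)
    col_energy = np.einsum("chw,cuw->hwu", feat, feat)  # (H, W, H)
    energy = np.concatenate([row_energy, col_energy], axis=-1) / np.sqrt(c)
    energy -= energy.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(energy)
    attn /= attn.sum(axis=-1, keepdims=True)  # softmax over row + column
    out_row = np.einsum("hwv,chv->chw", attn[..., :w], feat)
    out_col = np.einsum("hwu,cuw->chw", attn[..., w:], feat)
    return out_row + out_col


def iformer_mixer(feat, n_high):
    """Route the two channel groups through parallel mixers."""
    high, low = split_channels(feat, n_high)
    # Assumption: max pooling stands in for the whole high-frequency mixer.
    return max_pool_3x3(high), criss_cross_attention(low)


def cross_domain_fusion(ir_feat, vis_feat, n_high):
    """Assumed exchange rule: swap high-frequency groups between domains."""
    ir_high, ir_low = iformer_mixer(ir_feat, n_high)
    vis_high, vis_low = iformer_mixer(vis_feat, n_high)
    ir_mixed = np.concatenate([vis_high, ir_low], axis=0)
    vis_mixed = np.concatenate([ir_high, vis_low], axis=0)
    return ir_mixed, vis_mixed


# Usage with random stand-ins for infrared and visible feature maps.
rng = np.random.default_rng(0)
ir = rng.standard_normal((4, 5, 6))
vis = rng.standard_normal((4, 5, 6))
ir_mixed, vis_mixed = cross_domain_fusion(ir, vis, n_high=2)
print(ir_mixed.shape, vis_mixed.shape)  # (4, 5, 6) (4, 5, 6)
```

In this sketch each branch keeps its own low-frequency (attention) output and receives the other branch's high-frequency output, which is one simple way to read "exchanging high-frequency information between the source images". A trained model would additionally need learned projections, the convolution path, and a reconstruction stage, none of which are shown here.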

Keywords