AMFuse: Add–Multiply-Based Cross-Modal Fusion Network for Multi-Spectral Semantic Segmentation

Haijun Liu; Fenglei Chen; Zhihong Zeng; Xiaoheng Tan

doi:10.3390/rs14143368

Remote Sensing (Jul 2022)

Affiliations

Haijun Liu: School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400044, China
Fenglei Chen: School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400044, China
Zhihong Zeng: School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400044, China
Xiaoheng Tan: School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400044, China

DOI: https://doi.org/10.3390/rs14143368
Journal volume & issue: Vol. 14, no. 14, p. 3368

Abstract

Multi-spectral semantic segmentation has shown great advantages under poor illumination conditions, especially for remote scene understanding in autonomous vehicles, because the thermal image provides information complementary to the RGB image. However, methods for fusing information from RGB and thermal images remain under-explored. In this paper, we propose a simple but effective module, add–multiply fusion (AMFuse), for RGB and thermal information fusion; it consists of two simple mathematical operations: addition and multiplication. The addition operation focuses on extracting cross-modal complementary features, while the multiplication operation concentrates on cross-modal common features. Moreover, an attention module and atrous spatial pyramid pooling (ASPP) modules are incorporated into the proposed AMFuse modules to enhance multi-scale context information. Finally, within a UNet-style encoder–decoder framework, a ResNet model is adopted as the encoder. In the decoder, the multi-scale information obtained from the proposed AMFuse modules is merged hierarchically, layer by layer, to restore the feature map resolution for semantic segmentation. Experiments on RGBT multi-spectral semantic segmentation and salient object detection demonstrate the effectiveness of the proposed AMFuse module for fusing RGB and thermal information.
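The two operations named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `add_multiply_fuse`, the toy 2×2 feature maps, and the choice to return the two branches separately are assumptions made for illustration. The abstract does not say how the branches are combined, and the attention and ASPP components are omitted here.

```python
# Hypothetical sketch of element-wise add/multiply fusion on toy feature maps.
# Names and values are illustrative assumptions, not taken from the paper.

def add_multiply_fuse(rgb_feat, thermal_feat):
    """Return (added, multiplied) element-wise maps for two same-shaped 2-D lists."""
    same_rows = len(rgb_feat) == len(thermal_feat)
    same_cols = all(len(r) == len(t) for r, t in zip(rgb_feat, thermal_feat))
    if not (same_rows and same_cols):
        raise ValueError("feature maps must have the same shape")

    # Addition: a location stays active if either modality responds there.
    added = [[r + t for r, t in zip(r_row, t_row)]
             for r_row, t_row in zip(rgb_feat, thermal_feat)]

    # Multiplication: a location stays active only if both modalities respond.
    multiplied = [[r * t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(rgb_feat, thermal_feat)]
    return added, multiplied


rgb = [[0.0, 2.0], [1.0, 0.0]]      # toy RGB-branch activations
thermal = [[3.0, 2.0], [0.0, 0.0]]  # toy thermal-branch activations
added, multiplied = add_multiply_fuse(rgb, thermal)
print(added)       # [[3.0, 4.0], [1.0, 0.0]]
print(multiplied)  # [[0.0, 4.0], [0.0, 0.0]]
```

In this toy case the sum keeps the responses that only one modality produces (top-left from thermal, bottom-left from RGB), while the product is non-zero only where both modalities respond (top-right). This matches the complementary-versus-common distinction drawn in the abstract.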

Published in Remote Sensing

ISSN: 2072-4292 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science
Website: http://www.mdpi.com/journal/remotesensing/
