Lightweight Cross-Modal Information Mutual Reinforcement Network for RGB-T Salient Object Detection

Chengtao Lv; Bin Wan; Xiaofei Zhou; Yaoqi Sun; Jiyong Zhang; Chenggang Yan

doi:10.3390/e26020130

Entropy (Jan 2024)

Affiliations

Chengtao Lv: School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China
Bin Wan: School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China
Xiaofei Zhou: School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China
Yaoqi Sun: School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China
Jiyong Zhang: School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China
Chenggang Yan: School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China

DOI: https://doi.org/10.3390/e26020130
Journal volume & issue: Vol. 26, no. 2, p. 130

Abstract

RGB-T salient object detection (SOD) has made significant progress in recent years. However, most existing methods rely on heavy models that are impractical for mobile devices. In addition, the design of cross-modal and cross-level feature fusion still leaves room for improvement. To address these issues, we propose a lightweight cross-modal information mutual reinforcement network for RGB-T SOD. The network consists of a lightweight encoder, a cross-modal information mutual reinforcement (CMIMR) module, and a semantic-information-guided fusion (SIGF) module. To reduce the computational cost and the number of parameters, we employ lightweight modules in both the encoder and the decoder. To fuse the complementary information between the two modalities, we design the CMIMR module, which refines the two-modal features by absorbing previous-level semantic information and inter-modal complementary information. To fuse cross-level features and detect multiscale salient objects, we design the SIGF module, which suppresses background noise in low-level features and extracts multiscale information. We conduct extensive experiments on three RGB-T datasets, and our method achieves competitive performance compared with 15 state-of-the-art methods.

Published in Entropy

ISSN: 1099-4300 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Astronomy: Astrophysics; Science: Physics
Website: http://www.mdpi.com/journal/entropy
