International Journal of Applied Earth Observation and Geoinformation (Mar 2024)

VAM-Net: Vegetation-Attentive deep network for Multi-modal fusion of visible-light and vegetation-sensitive images

  • Yufu Zang,
  • Shuye Wang,
  • Haiyan Guan,
  • Daifeng Peng,
  • Jike Chen,
  • Yanming Chen,
  • Mahmoud R. Delavar

Journal volume & issue
Vol. 127
p. 103642

Abstract


Multi-modal fusion of remote sensing images is challenging because of the intricate imaging mechanisms and radiation variations across modalities, and the fusion of visible-light and vegetation-sensitive images faces the same difficulties. Traditional methods have seldom accounted for the differing imaging mechanisms and radiation differences between modalities, resulting in discrepancies in the corresponding features. To address this issue, we propose VAM-Net (Vegetation-Attentive Multi-modal deep Network), which combines a radiometric correction mechanism with a lightweight multi-modal adaptive feature selection method for fusing multi-modal images. First, the visible-band difference vegetation index (VDVI) is integrated into the visible-light images to mitigate the radiometric differences between visible-light and vegetation-sensitive images (e.g., infrared and red-edge images). Then, a two-branch network incorporating attention mechanisms is designed to independently capture texture features and select similar features across the two image modalities. Last, a new loss function is presented to ensure that the learned features are suitable for multi-modal fusion. VAM-Net is evaluated on visible-light and vegetation-sensitive images from three different areas; the experimental results show that it attains an average precision of 67.02%, an average recall of 35.49%, and an average RMSE of 2.191 px, demonstrating the accuracy and robustness of VAM-Net in multi-modal image fusion.
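
To make the radiometric-correction step concrete, the short Python sketch below computes the visible-band difference vegetation index mentioned in the abstract, VDVI = (2G - R - B) / (2G + R + B), and appends it to a visible-light image as an extra channel. This is an illustrative sketch only, not code from the paper; the array layout, the eps safeguard, and the channel-stacking strategy are assumptions made for the example.

    import numpy as np

    def vdvi(rgb, eps=1e-6):
        # Visible-band Difference Vegetation Index: (2G - R - B) / (2G + R + B).
        # `rgb` is an (H, W, 3) array ordered R, G, B; `eps` is an assumed
        # safeguard (not from the paper) to avoid division by zero.
        r, g, b = (rgb[..., i].astype(np.float64) for i in range(3))
        return (2.0 * g - r - b) / (2.0 * g + r + b + eps)

    # One plausible way to "integrate" VDVI into the visible-light input:
    # stack the index as a fourth channel alongside the RGB bands.
    img = np.random.rand(256, 256, 3)          # placeholder visible-light tile
    fused_input = np.dstack([img, vdvi(img)])  # shape (256, 256, 4)
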

Keywords