IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2025)

Building Extraction From High-Resolution Multispectral and SAR Images Using a Boundary-Link Multimodal Fusion Network

  • Zhe Zhao,
  • Boya Zhao,
  • Yuanfeng Wu,
  • Zutian He,
  • Lianru Gao

DOI
https://doi.org/10.1109/JSTARS.2025.3525709
Journal volume & issue
Vol. 18
pp. 3864 – 3878

Abstract

Automatically extracting buildings with high precision from remote sensing images is crucial for various applications. Due to their distinct imaging modalities and complementary characteristics, optical and synthetic aperture radar (SAR) images serve as primary data sources for this task. We propose a novel boundary-link multimodal fusion network for joint semantic segmentation to leverage the information in these images. An initial building extraction result is obtained from the multimodal fusion network, followed by refinement using building boundaries. The model achieves high-precision building delineation by exploiting building boundary and semantic information from optical and SAR images. It distinguishes buildings from the background in complex environments, such as dense urban areas or regions with mixed vegetation, particularly when small buildings lack distinct texture or color features. We conducted experiments using the MSAW dataset (RGB-NIR and SAR data) and the DFC Track 2 dataset (RGB and SAR data). The results indicate that our model significantly enhances extraction accuracy and improves building boundary delineation. The intersection over union metric is 2.5% to 3.5% higher than that of other multimodal joint segmentation methods.
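
The abstract describes a two-stage design: optical and SAR features are fused to produce an initial building mask, which is then refined with a predicted building boundary. The following PyTorch code is a minimal sketch of that general idea, not the authors' implementation; the class name BoundaryLinkFusionSketch, the channel counts (4-band RGB-NIR optical, 1-band SAR), and the simple concatenate-and-convolve fusion and refinement heads are all illustrative assumptions.

    # Minimal sketch (not the paper's code): two-branch optical/SAR fusion,
    # a coarse building-mask head, a boundary head, and boundary-guided refinement.
    import torch
    import torch.nn as nn

    def conv_block(in_ch, out_ch):
        return nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    class BoundaryLinkFusionSketch(nn.Module):
        def __init__(self, opt_channels=4, sar_channels=1, feat=32):
            super().__init__()
            self.opt_enc = conv_block(opt_channels, feat)   # optical branch (e.g., RGB-NIR)
            self.sar_enc = conv_block(sar_channels, feat)   # SAR branch
            self.fuse = conv_block(2 * feat, feat)          # concat-and-conv fusion (assumed)
            self.seg_head = nn.Conv2d(feat, 1, 1)           # initial building-mask logits
            self.bnd_head = nn.Conv2d(feat, 1, 1)           # building-boundary logits
            self.refine = nn.Sequential(                    # refine mask using the boundary cue
                conv_block(2, feat), nn.Conv2d(feat, 1, 1)
            )

        def forward(self, optical, sar):
            f = self.fuse(torch.cat([self.opt_enc(optical), self.sar_enc(sar)], dim=1))
            seg = self.seg_head(f)                          # initial extraction result
            bnd = self.bnd_head(f)                          # boundary prediction
            refined = self.refine(torch.cat([seg, bnd], dim=1)) + seg  # boundary-linked refinement
            return refined, bnd

    if __name__ == "__main__":
        # Dummy forward pass on RGB-NIR + single-band SAR tiles.
        model = BoundaryLinkFusionSketch()
        optical = torch.randn(1, 4, 256, 256)
        sar = torch.randn(1, 1, 256, 256)
        mask_logits, boundary_logits = model(optical, sar)
        print(mask_logits.shape, boundary_logits.shape)  # both torch.Size([1, 1, 256, 256])

In practice, such a model would be trained with a segmentation loss on the refined mask and an auxiliary loss on the boundary map; the paper's actual architecture and loss design should be taken from the full text.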

Keywords