International Journal of Applied Earth Observation and Geoinformation (Nov 2023)
Multi-scale Feature Fusion and Transformer Network for urban green space segmentation from high-resolution remote sensing images
Abstract
Accurate extraction of urban green space is critical for preserving urban ecological balance and enhancing urban life quality. However, due to the complex morphology of urban green space (e.g., varying sizes and shapes), it remains challenging to extract green space effectively from high-resolution images. To address this issue, we propose a novel hybrid method, the Multi-scale Feature Fusion and Transformer Network (MFFTNet), as a new deep learning approach for extracting urban green space from high-resolution (GF-2) imagery. Our method is characterized by two aspects: (1) a multi-scale feature fusion module and transformer network that enhance the recovery of green space edge information, and (2) a vegetation feature (NDVI) that highlights vegetation information and improves the identification of vegetation boundaries. GF-2 imagery was used to build two labeled urban green space datasets, Greenfield and Greenfield2. We compared the proposed MFFTNet with popular deep learning models (e.g., PSPNet and DenseASPP) using the Mean Intersection Over Union (MIOU) metric on Greenfield, Greenfield2, and a public dataset (WHDLD). Experiments on Greenfield2 showed that MFFTNet achieves a high MIOU (86.50%), outperforming networks such as PSPNet and DenseASPP by 0.86% and 3.28%, respectively. Incorporating the vegetation feature (NDVI) further improved the MIOU of MFFTNet to 86.76% on Greenfield2. Our experimental results demonstrate that the proposed MFFTNet with the vegetation feature (NDVI) outperforms state-of-the-art methods in urban green space segmentation.
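The abstract refers to two concrete quantities: the NDVI feature supplied to the network and the MIOU score used for evaluation. The sketch below is an illustrative example of how both are commonly computed; it is not the authors' implementation. The GF-2 band order (B, G, R, NIR), the toy patch values, and the NDVI threshold used as a stand-in prediction are all assumptions made here for demonstration only.

```python
import numpy as np

def compute_ndvi(nir, red, eps=1e-8):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return (nir - red) / (nir + red + eps)

def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union across classes present in either map."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent in both prediction and ground truth
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious))

# Toy example: a hypothetical 4-band GF-2-like patch (B, G, R, NIR) with random DN values
rng = np.random.default_rng(0)
patch = rng.integers(0, 1024, size=(4, 64, 64))

# NDVI computed from the NIR and red bands; in MFFTNet it serves as an extra input feature
ndvi = compute_ndvi(nir=patch[3], red=patch[2])

# Stand-in prediction (simple NDVI threshold) and stand-in ground-truth labels
pred = (ndvi > 0).astype(np.int64)
target = rng.integers(0, 2, size=(64, 64))

print(f"NDVI range: [{ndvi.min():.2f}, {ndvi.max():.2f}], MIOU: {mean_iou(pred, target, 2):.4f}")
```

In practice, the NDVI layer would be stacked with the spectral bands as an additional input channel, and MIOU would be computed between the network's predicted green space mask and the labeled reference mask rather than the thresholded NDVI shown here.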