Geo-spatial Information Science (Apr 2024)
A cross-stage features fusion network for building extraction from remote sensing images
Abstract
Deep learning-based building extraction methods produce feature maps at different stages of the network, and each stage carries different information. The detail retained in the feature maps decreases with network depth, and this loss of detail limits extraction accuracy. Existing methods, however, do not make full use of the low-level feature maps, which are rich in detail. To overcome these shortcomings, we propose a Cross-stage Features Fusion Network (CFF-Net) for building extraction from remote sensing images. CFF-Net introduces a Cross-stage Features Fusion (CFF) module that fuses the features generated at different stages, and it uses an attention mechanism to focus the network on important information at different scales. To further improve extraction accuracy, we design a Prediction Enhancement (PE) module, in which the last convolutional layer and a feature map generated at an intermediate stage are both used for prediction to enhance the final result. To evaluate the proposed network, we conduct quantitative and qualitative experiments on two publicly available datasets, the Inria dataset and the WHU dataset. CFF-Net outperforms other state-of-the-art algorithms on both datasets in terms of the IoU and F1 metrics. The efficiency analysis shows that CFF-Net strikes a good balance between building extraction performance on the one hand and complexity and efficiency on the other, with faster convergence and higher robustness.
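For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how IoU and F1 are commonly computed for a binary building mask. It is not the evaluation code used in this work: the function name `iou_and_f1`, the nested-list mask format, and the convention for two empty masks are assumptions made for illustration only.

```python
def iou_and_f1(pred, truth):
    """Return (IoU, F1) for two same-shaped binary masks.

    Masks are nested lists of 0/1 values, where 1 marks a building pixel.
    (Hypothetical helper; the mask format is an assumption for this sketch.)
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # building pixel correctly predicted
            elif p and not t:
                fp += 1      # background predicted as building
            elif t and not p:
                fn += 1      # building pixel missed
    union = tp + fp + fn
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0
    iou = tp / union                      # intersection over union
    f1 = 2 * tp / (2 * tp + fp + fn)      # harmonic mean of precision and recall
    return iou, f1


# Toy 2x3 masks: 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
iou, f1 = iou_and_f1(pred, truth)
print(f"IoU={iou:.3f}, F1={f1:.3f}")
```

Both metrics are derived from the same pixel counts, so they rank methods similarly; IoU penalizes errors more heavily and is therefore never higher than F1 on the same prediction.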
Keywords