IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2024)

OCANet: An Overcomplete Convolutional Attention Network for Building Extraction From High-Resolution Remote Sensing Images

  • Bo Zhang,
  • Jiajia Huang,
  • Fan Wu,
  • Wenjuan Zhang

DOI
https://doi.org/10.1109/JSTARS.2024.3471804
Journal volume & issue
Vol. 17
pp. 18427 – 18443

Abstract

Building extraction from remote sensing (RS) images plays a crucial role in urban planning and sustainable development. In high-resolution (HR) RS images, the characteristics of buildings, including their shapes, structures, and textures, become increasingly complex. This complexity poses considerable challenges to the prediction and recognition of small, dense, and complex-shaped buildings. To address these problems, we present a novel overcomplete convolutional attention network (OCANet) to enhance the accuracy of building extraction from HR RS images. Specifically, the proposed method adopts a multiscale convolutional attention encoder to focus on the two-dimensional structure of the building image while enhancing computational efficiency. Additionally, an overcomplete fusion branch module is introduced to control the size of the network's deep receptive field, enabling a more concentrated focus on smaller and denser buildings. Furthermore, an edge refinement fusion module is proposed to improve the network's ability to extract building edge details by integrating shallow feature information from different scales with deep semantic information. The efficacy of each designed module is validated through ablation studies on public datasets, including the WHU aerial building dataset and the Massachusetts building dataset. Additionally, a dataset built from Gaofen-2 imagery and featuring a variety of building types is introduced to benchmark OCANet against other state-of-the-art networks. Both qualitative and quantitative evaluations demonstrate the ability of OCANet to extract dense, small, and complex-shaped buildings in complex urban landscapes. The proposed method provides excellent performance compared to other networks while reducing computational overhead.

Keywords