Hierarchical Disentangling Network for Building Extraction from Very High Resolution Optical Remote Sensing Imagery

Jianhao Li; Yin Zhuang; Shan Dong; Peng Gao; Hao Dong; He Chen; Liang Chen; Lianlin Li

doi:10.3390/rs14071767

Remote Sensing (Apr 2022)

Affiliations

Jianhao Li: Beijing Key Laboratory of Embedded Real-Time Information Processing Technology, Beijing Institute of Technology, Beijing 100081, China
Yin Zhuang: Beijing Key Laboratory of Embedded Real-Time Information Processing Technology, Beijing Institute of Technology, Beijing 100081, China
Shan Dong: Beijing Key Laboratory of Embedded Real-Time Information Processing Technology, Beijing Institute of Technology, Beijing 100081, China
Peng Gao: Shanghai AI Laboratory, Shanghai 200232, China
Hao Dong: Center on Frontiers of Computing Studies and School of Electronic Engineering and Computer Science, Peking University, Beijing 100087, China
He Chen: Beijing Key Laboratory of Embedded Real-Time Information Processing Technology, Beijing Institute of Technology, Beijing 100081, China
Liang Chen: Beijing Key Laboratory of Embedded Real-Time Information Processing Technology, Beijing Institute of Technology, Beijing 100081, China
Lianlin Li: Center on Frontiers of Computing Studies and School of Electronic Engineering and Computer Science, Peking University, Beijing 100087, China

DOI: https://doi.org/10.3390/rs14071767
Journal volume & issue: Vol. 14, no. 7, p. 1767

Abstract

Building extraction from very high resolution (VHR) optical remote sensing imagery is an essential interpretation task that impacts human life. However, buildings in different environments exhibit various scales, complicated spatial distributions, and different imaging conditions. Moreover, as the spatial resolution of images increases, building and background areas contain diverse interior details and redundant context information. These factors create large intra-class variance and poor inter-class discrimination, leading to uncertain feature descriptions and, in turn, to over- or under-extraction of buildings. In this article, a novel hierarchical disentangling network with an encoder–decoder architecture, called HDNet, is proposed to consider both the stable and the uncertain feature descriptions in a convolutional neural network (CNN). First, a hierarchical disentangling strategy individually generates strong and weak semantic zones using a newly designed feature disentangling module (FDM). The strong semantic zones provide the stable description and the weak semantic zones provide the uncertain description, which determine a more stable semantic main body and an uncertain semantic boundary of buildings, respectively. Next, a dual-stream semantic feature description is built to gradually integrate the strong and weak semantic zones through the designed component feature fusion module (CFFM), generating a powerful semantic description for more complete and refined building extraction. Finally, extensive experiments are carried out on three published datasets (i.e., WHU satellite, WHU aerial, and INRIA), and the comparison results show that the proposed HDNet outperforms other state-of-the-art (SOTA) methods.
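
The split into a stable main body and an uncertain boundary can be illustrated with a toy example on a per-pixel building probability map. This is only a sketch of the general idea, not the paper's FDM or CFFM, which are learned CNN modules whose internals are not described in the abstract. The function names, the `margin` threshold, and the rule of overwriting weak-zone pixels with a separately refined prediction are all assumptions made for illustration.

```python
def disentangle(prob, margin=0.2):
    """Split a probability map into strong and weak zones (hypothetical rule).

    A pixel is 'strong' when its probability is at least `margin` away
    from 0.5 (a confident prediction); every other pixel is 'weak'.
    """
    strong = [[abs(p - 0.5) >= margin for p in row] for row in prob]
    weak = [[not s for s in row] for row in strong]
    return strong, weak


def fuse(prob, strong, refined):
    """Build a binary building mask from two prediction streams.

    Strong-zone pixels keep the original prediction; weak-zone pixels
    take the value from `refined`, a stand-in for a second stream that
    re-estimates the uncertain pixels.
    """
    return [
        [(p if s else r) >= 0.5 for p, s, r in zip(p_row, s_row, r_row)]
        for p_row, s_row, r_row in zip(prob, strong, refined)
    ]


# Toy 2x2 map: left column is confident, right column is uncertain.
prob = [[0.9, 0.55], [0.1, 0.45]]
refined = [[0.0, 0.7], [0.0, 0.3]]  # made-up refined values for illustration
strong, weak = disentangle(prob)
mask = fuse(prob, strong, refined)
```

In this toy setting the confident pixels are decided by the first stream alone, while the pixels near the decision boundary are decided by the second stream, which mirrors the main-body and boundary roles that the abstract assigns to the strong and weak semantic zones.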

Published in Remote Sensing

ISSN: 2072-4292 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science
Website: http://www.mdpi.com/journal/remotesensing/
