Building Footprint Extraction from High-Resolution Images via Spatial Residual Inception Convolutional Neural Network

Penghua Liu; Xiaoping Liu; Mengxi Liu; Qian Shi; Jinxing Yang; Xiaocong Xu; Yuanying Zhang

doi:10.3390/rs11070830

Remote Sensing (Apr 2019)

Affiliations

Penghua Liu: School of Geography and Planning, Sun Yat-Sen University, West Xingang Road, Guangzhou 510275, China
Xiaoping Liu: School of Geography and Planning, Sun Yat-Sen University, West Xingang Road, Guangzhou 510275, China
Mengxi Liu: School of Geography and Planning, Sun Yat-Sen University, West Xingang Road, Guangzhou 510275, China
Qian Shi: School of Geography and Planning, Sun Yat-Sen University, West Xingang Road, Guangzhou 510275, China
Jinxing Yang: School of Geographical Sciences, Guangzhou University, West Waihuan Street/Road, Guangzhou 510006, China
Xiaocong Xu: School of Geography and Planning, Sun Yat-Sen University, West Xingang Road, Guangzhou 510275, China
Yuanying Zhang: School of Geography and Planning, Sun Yat-Sen University, West Xingang Road, Guangzhou 510275, China

Journal volume & issue: Vol. 11, no. 7, p. 830

Abstract

Rapid advances in deep learning and computer vision have opened new opportunities and paradigms for building extraction from remote sensing images. In this paper, we propose a novel fully convolutional network (FCN), SRI-Net, in which a spatial residual inception (SRI) module captures and aggregates multi-scale contexts for semantic understanding by successively fusing multi-level features. SRI-Net is capable of accurately detecting large buildings that might otherwise be easily omitted, while retaining global morphological characteristics and local details. To improve computational efficiency, depthwise separable convolutions and convolution factorization are introduced, which significantly decreases the number of model parameters. The proposed model is evaluated on the Inria Aerial Image Labeling Dataset and the Wuhan University (WHU) Aerial Building Dataset. The experimental results show that the proposed method exhibits significant improvements compared with several state-of-the-art FCNs, including SegNet, U-Net, RefineNet, and DeepLab v3+. The proposed model shows promising potential for large-scale building detection from remote sensing images.
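
To make the parameter savings mentioned in the abstract concrete, the following is a minimal arithmetic sketch, not the authors' implementation. It counts weights (biases ignored) for a standard convolution, a depthwise separable convolution, and a k x k convolution factorized into a 1 x k layer followed by a k x 1 layer. The function names, the kernel sizes, the channel width of 256, and the choice to keep the intermediate channel count equal to the output channel count are all assumptions made for illustration; the abstract does not specify the layer configurations used in SRI-Net.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def factorized_conv_params(k, c_in, c_out):
    """Weights when a k x k convolution is replaced by 1 x k then k x 1.

    Assumption: the intermediate feature map has c_out channels.
    """
    first = k * c_in * c_out    # 1 x k layer
    second = k * c_out * c_out  # k x 1 layer
    return first + second


if __name__ == "__main__":
    # Hypothetical layer sizes chosen only to illustrate the ratios.
    channels = 256
    print("standard 3x3:           ", standard_conv_params(3, channels, channels))
    print("depthwise separable 3x3:", depthwise_separable_params(3, channels, channels))
    print("standard 7x7:           ", standard_conv_params(7, channels, channels))
    print("factorized 1x7 + 7x1:   ", factorized_conv_params(7, channels, channels))
```

Under these assumed sizes, the depthwise separable layer needs roughly one ninth of the weights of the standard 3 x 3 layer, and the factorized pair needs 2/7 of the weights of the standard 7 x 7 layer.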

Published in Remote Sensing

ISSN: 2072-4292 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science
Website: http://www.mdpi.com/journal/remotesensing/
