IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (Jan 2021)
Attention-Guided Label Refinement Network for Semantic Segmentation of Very High Resolution Aerial Orthoimages
Abstract
The recent application of fully convolutional networks (FCNs) has been shown to improve the semantic segmentation of very high resolution (VHR) remote-sensing images, owing to their excellent feature representation and end-to-end pixel-labeling capabilities. While many FCN-based methods concatenate features from multilevel encoding stages to refine the coarse labeling results, the semantic gap between features of different levels and the selection of representative features are often overlooked, leading to redundant information and unexpected classification results. In this article, we propose an attention-guided label refinement network (ALRNet) for improved semantic labeling of VHR images. ALRNet follows the encoder-decoder paradigm and progressively refines the coarse labeling maps at different scales using a channelwise attention mechanism. A novel attention-guided feature fusion module based on the squeeze-and-excitation module is designed to fuse higher-level and lower-level features. In this way, the semantic gaps among features of different levels are reduced, and the category discrimination of each pixel in the lower-level features is strengthened, which aids subsequent label refinement. ALRNet is tested on three public datasets: the two ISPRS 2-D labeling datasets and the Wuhan University aerial building dataset. Results demonstrate that ALRNet achieves promising segmentation performance in comparison with state-of-the-art deep learning networks. The source code of ALRNet is made publicly available for further study.
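To illustrate the general idea of squeeze-and-excitation-based, attention-guided feature fusion described above, the following NumPy sketch computes channelwise attention weights from a higher-level feature map and uses them to reweight a lower-level feature map before fusion. This is a minimal illustration of the SE mechanism, not the paper's exact ALRNet module; the function name, the bottleneck weights (w1, w2, b1, b2), and the choice of additive fusion are all assumptions for demonstration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_attention_fuse(high, low, w1, b1, w2, b2):
    """Sketch of SE-style attention-guided fusion (illustrative, not ALRNet's
    exact formulation). Channel weights are derived from the higher-level
    feature and applied to the lower-level feature before additive fusion.

    high, low: feature maps of shape (C, H, W)
    w1, b1:    bottleneck FC layer, reducing C -> C // r
    w2, b2:    expansion FC layer, restoring C // r -> C
    """
    # Squeeze: global average pooling over spatial dimensions -> (C,)
    z = high.mean(axis=(1, 2))
    # Excitation: bottleneck MLP (ReLU, then sigmoid gate) -> (C,)
    s = sigmoid(w2 @ np.maximum(w1 @ z + b1, 0.0) + b2)
    # Reweight lower-level channels with the gate, then fuse by addition
    return high + low * s[:, None, None]

# Example usage with random weights (reduction ratio r = 2)
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
high = rng.standard_normal((C, H, W))
low = rng.standard_normal((C, H, W))
w1, b1 = rng.standard_normal((C // r, C)), np.zeros(C // r)
w2, b2 = rng.standard_normal((C, C // r)), np.zeros(C)
fused = se_attention_fuse(high, low, w1, b1, w2, b2)
print(fused.shape)  # (8, 4, 4)
```

Because the gate is a sigmoid, each lower-level channel is scaled by a factor in (0, 1) before fusion, so channels judged less discriminative by the higher-level context contribute less to the refined map.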
Keywords