Dynamic High-Resolution Network for Semantic Segmentation in Remote-Sensing Images

Shichen Guo; Qi Yang; Shiming Xiang; Pengfei Wang; Xuezhi Wang

doi:10.3390/rs15092293

Remote Sensing (Apr 2023)

Affiliations

Shichen Guo: Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China
Qi Yang: University of Chinese Academy of Sciences, Beijing 100049, China
Shiming Xiang: University of Chinese Academy of Sciences, Beijing 100049, China
Pengfei Wang: Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China
Xuezhi Wang: Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China

DOI: https://doi.org/10.3390/rs15092293
Journal volume & issue: Vol. 15, no. 9, p. 2293

Abstract

Semantic segmentation of remote-sensing (RS) images is one of the most fundamental tasks in understanding a remote-sensing scene. However, high-resolution RS images contain abundant detail about ground objects, which are scattered across the scene and vary in size, style, and visual appearance. Because of the high similarity between classes and the diversity within classes, obtaining accurate semantic segmentation results is challenging. This paper proposes a Dynamic High-Resolution Network (DyHRNet) to address this problem. The proposed network takes HRNet as a super-architecture and aims to identify and exploit its important connections and channels by further investigating the parallel streams at the different resolutions of the original HRNet. Learning is carried out within a neural architecture search (NAS) framework combined with a channel-wise attention module. Specifically, the Accelerated Proximal Gradient (APG) algorithm is introduced to iteratively solve the sparse regularization subproblem from the perspective of neural architecture search, so that valuable connections are selected for cross-resolution feature fusion. In addition, a channel-wise attention module is designed to weight the channel contributions for feature aggregation. By combining the APG algorithm with the channel-wise attention module, DyHRNet adapts dynamically to the data. Compared with nine classical or state-of-the-art models (FCN, UNet, PSPNet, DeepLabV3+, OCRNet, SETR, SegFormer, HRNet+FCN, and HRNet+OCR), DyHRNet achieves high performance on three challenging public RS image datasets (Vaihingen, Potsdam, and LoveDA). Furthermore, the visual segmentation results, the learned structures, the iteration process analysis, and the ablation study all demonstrate the effectiveness of the proposed model.
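
The abstract does not give the exact objective that the APG algorithm solves in DyHRNet, so the following is only a minimal, generic sketch of accelerated proximal gradient descent on an L1-regularized least-squares problem. It is meant to illustrate how a sparse regularization step can drive unimportant weights (here standing in for connection weights) exactly to zero. The function names `soft_threshold` and `apg_lasso`, the quadratic data term, and the parameter `lam` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1: shrink each entry toward zero."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def apg_lasso(A, b, lam, n_iter=200):
    """Minimize 0.5 * ||A w - b||^2 + lam * ||w||_1 with accelerated
    proximal gradient steps (FISTA-style momentum).

    Assumption: this generic lasso objective only stands in for the
    sparse regularization subproblem mentioned in the abstract.
    """
    # Lipschitz constant of the smooth term's gradient: ||A||_2^2.
    lipschitz = np.linalg.norm(A, 2) ** 2
    w = np.zeros(A.shape[1])
    y = w.copy()  # extrapolated point
    t = 1.0       # momentum parameter
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)  # gradient of the smooth term at y
        w_next = soft_threshold(y - grad / lipschitz, lam / lipschitz)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = w_next + ((t - 1.0) / t_next) * (w_next - w)
        w, t = w_next, t_next
    return w
```

With an identity matrix for `A`, the minimizer is simply the soft-thresholded `b`, so entries of `b` whose magnitude is below `lam` come out exactly zero. In a search setting, such zeroed weights would correspond to connections that are pruned, while the surviving ones are kept for feature fusion.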

Published in Remote Sensing

ISSN: 2072-4292 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science
Website: http://www.mdpi.com/journal/remotesensing/
