Remote Sensing (Oct 2024)
An Object-Aware Network Embedding Deep Superpixel for Semantic Segmentation of Remote Sensing Images
Abstract
Semantic segmentation forms the foundation for understanding very high resolution (VHR) remote sensing images, with extensive demand and practical application value. Convolutional neural networks (CNNs), known for their strength in hierarchical feature representation, have dominated the field of semantic image segmentation. Recently, hierarchical vision transformers such as Swin have also shown excellent performance on semantic segmentation tasks. However, the hierarchical structure enlarges the receptive field to aggregate features and inevitably blurs object boundaries. We introduce ESPNet, a novel object-aware network that Embeds deep SuperPixels for VHR image semantic segmentation by integrating the advanced ConvNeXt backbone with a learnable superpixel algorithm. Specifically, the proposed task-oriented superpixel generation module refines the output of the semantic segmentation branch by preserving object boundaries. This study demonstrates that deep convolutional neural networks can accomplish both superpixel generation and semantic segmentation of VHR images within an integrated end-to-end framework. The proposed method achieved mIoU scores of 84.32, 90.13, and 55.73 on the Vaihingen, Potsdam, and LoveDA datasets, respectively. These results indicate that our model surpasses existing advanced methods, demonstrating the effectiveness of the proposed scheme.
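To make the boundary-refinement idea concrete, the following is a minimal, hedged sketch (not the authors' ESPNet code) of how a superpixel assignment map could refine per-pixel class logits: logits are averaged within each superpixel so that predicted labels snap to superpixel, and hence object, boundaries. The function name `superpixel_refine` and the assumption that hard superpixel labels are already available (in ESPNet they come from the learnable superpixel branch) are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch: illustrates superpixel-based refinement of segmentation logits,
# assuming a precomputed integer superpixel label map. Not the authors' code.
import torch


def superpixel_refine(logits: torch.Tensor, sp_labels: torch.Tensor) -> torch.Tensor:
    """
    logits:    (C, H, W) per-pixel class scores from the segmentation branch.
    sp_labels: (H, W) integer superpixel id per pixel (assumed given here).
    Returns refined logits of shape (C, H, W).
    """
    C, H, W = logits.shape
    flat_logits = logits.reshape(C, -1)                # (C, H*W)
    flat_sp = sp_labels.reshape(-1)                    # (H*W,)
    num_sp = int(flat_sp.max().item()) + 1

    # Sum logits and count pixels per superpixel.
    sums = torch.zeros(C, num_sp, dtype=flat_logits.dtype)
    sums.index_add_(1, flat_sp, flat_logits)
    counts = torch.zeros(num_sp, dtype=flat_logits.dtype)
    counts.index_add_(0, flat_sp, torch.ones_like(flat_sp, dtype=flat_logits.dtype))

    # Mean logit per superpixel, broadcast back to every pixel in it.
    means = sums / counts.clamp(min=1)                 # (C, num_sp)
    return means[:, flat_sp].reshape(C, H, W)


# Toy usage: 3 classes on a 4x4 image split into two superpixels.
logits = torch.randn(3, 4, 4)
sp = torch.zeros(4, 4, dtype=torch.long)
sp[:, 2:] = 1
print(superpixel_refine(logits, sp).argmax(dim=0))
```

Because every pixel in a superpixel receives the same refined logits, label transitions can only occur at superpixel edges, which is the intuition behind using a task-oriented superpixel branch to sharpen object boundaries.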
Keywords