International Journal of Applied Earth Observations and Geoinformation (May 2025)

CTSeg: CNN and ViT collaborated segmentation framework for efficient land-use/land-cover mapping with high-resolution remote sensing images

  • Jifa Chen,
  • Gang Chen,
  • Pin Zhou,
  • Yufeng He,
  • Lianzhe Yue,
  • Mingjun Ding,
  • Hui Lin

DOI: https://doi.org/10.1016/j.jag.2025.104546
Journal volume & issue: Vol. 139, p. 104546

Abstract

Semantic segmentation models play a central role in land-use/land-cover (LULC) mapping. Although vision transformers (ViT) with long-sequence interactions have recently emerged as popular alternatives to convolutional neural networks (CNN), they remain less effective on high-resolution remote sensing data, which are typically limited in volume and rich in heterogeneity. In this paper, we propose a novel CNN and ViT collaborated segmentation framework (CTSeg) to address these weaknesses. Following an encoder-decoder architecture, we first introduce an encoding backbone with diverse attention mechanisms that capture global and local contexts. It is designed with parallel dual branches: position-relation aggregation (PRA) blocks and channel-relation aggregation (CRA) blocks form the CNN-based encoding module, whereas the ViT-based module comprises multi-stage window-shifted transformer (WST) blocks with cross-window interactions. We further explore online knowledge distillation, implemented with pixel-wise and channel-wise feature distillation modules, to enable bidirectional learning between the CNN and ViT backbones, supported by a well-designed loss decay strategy. In addition, we develop a multiscale feature decoding module that produces higher-quality segmentation predictions, in which correlation-weighted fusions emphasize the heterogeneous feature representations. Extensive comparison and ablation studies on two benchmark datasets demonstrate its competitive performance for efficient LULC mapping.
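To illustrate the kind of bidirectional online distillation the abstract describes between the CNN and ViT branches, the following is a minimal, hypothetical PyTorch sketch of a pixel-wise feature distillation loss. The function name, tensor shapes, temperature parameter, and KL-divergence formulation are assumptions for illustration only, not the authors' published implementation.

```python
# Hypothetical sketch of bidirectional pixel-wise feature distillation
# between a CNN branch and a ViT branch, loosely following the abstract.
# Names, shapes, and the loss form are illustrative assumptions.
import torch
import torch.nn.functional as F


def pixelwise_distillation(cnn_feat: torch.Tensor,
                           vit_feat: torch.Tensor,
                           temperature: float = 1.0) -> torch.Tensor:
    """Symmetric KL divergence between per-pixel channel distributions of
    two feature maps of shape (B, C, H, W); applied in both directions so
    each branch can learn from the other (online/mutual distillation)."""
    b, c, h, w = cnn_feat.shape
    # Flatten spatial dims: (B, C, H*W); softmax over channels per pixel.
    cnn_logits = cnn_feat.reshape(b, c, -1) / temperature
    vit_logits = vit_feat.reshape(b, c, -1) / temperature
    # CNN branch learns from the (detached) ViT branch ...
    loss_c_from_v = F.kl_div(F.log_softmax(cnn_logits, dim=1),
                             F.softmax(vit_logits.detach(), dim=1),
                             reduction="batchmean")
    # ... and the ViT branch learns from the (detached) CNN branch.
    loss_v_from_c = F.kl_div(F.log_softmax(vit_logits, dim=1),
                             F.softmax(cnn_logits.detach(), dim=1),
                             reduction="batchmean")
    return loss_c_from_v + loss_v_from_c


# Example usage with dummy feature maps (batch 2, 64 channels, 32x32).
if __name__ == "__main__":
    cnn_f = torch.randn(2, 64, 32, 32)
    vit_f = torch.randn(2, 64, 32, 32)
    print(pixelwise_distillation(cnn_f, vit_f, temperature=2.0))
```

In the paper's framework this distillation term would be combined with the segmentation loss and weighted by a decaying factor over training (the "loss decay strategy"); the exact schedule is not specified in the abstract.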

Keywords