Remote Sensing (Apr 2025)
SCRM-Net: Self-Supervised Deep Clustering Feature Representation for Urban 3D Mesh Semantic Segmentation
Abstract
Semantic urban 3D meshes obtained by deep learning networks have been widely applied in urban analytics. Typically, a large number of labeled samples is required to train a deep learning network to extract discriminative features for the semantic segmentation of urban 3D meshes. However, obtaining enough labeled samples is labor-intensive and time-consuming due to the complexity of urban 3D scenes. To obtain discriminative features without extensive labeled data, we propose a novel self-supervised deep clustering feature learning network, named SCRM-Net. The proposed SCRM-Net consists of two mutually self-supervised branches: one branch uses an autoencoder to learn intrinsic feature representations of the urban 3D mesh, while the other applies a graph convolutional network (GCN) to capture the structural relationships among them. During semantic segmentation, only a limited proportion of labeled samples is required to fine-tune the pretrained encoder of SCRM-Net for discriminative feature extraction and to train the segmentation head, which consists of two edge convolution layers. Extensive comparative experiments demonstrate the effectiveness of our approach and show that it is competitive with state-of-the-art semantic segmentation methods.
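To make the two-branch design concrete, the following is a minimal PyTorch sketch of the idea described in the abstract: an autoencoder branch that learns per-face features by reconstruction, a GCN branch that propagates features over the mesh adjacency, and a segmentation head built from two edge-convolution layers. All layer sizes, the simplified EdgeConv variant, and the way the branches are combined are illustrative assumptions, not the authors' exact architecture.

```python
# Hypothetical sketch of the SCRM-Net components; shapes and layer choices
# are assumptions for illustration only.
import torch
import torch.nn as nn


class AutoencoderBranch(nn.Module):
    """Learns intrinsic per-face features by reconstructing the input."""

    def __init__(self, in_dim: int, latent_dim: int):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, in_dim),
        )

    def forward(self, x):
        z = self.encoder(x)          # latent features used for clustering
        return z, self.decoder(z)    # reconstruction for the self-supervised loss


class GCNBranch(nn.Module):
    """Captures structural relationships via normalized adjacency propagation."""

    def __init__(self, in_dim: int, latent_dim: int):
        super().__init__()
        self.w1 = nn.Linear(in_dim, 128)
        self.w2 = nn.Linear(128, latent_dim)

    def forward(self, x, adj_norm):
        # adj_norm: symmetrically normalized adjacency matrix (dense, N x N)
        h = torch.relu(adj_norm @ self.w1(x))
        return adj_norm @ self.w2(h)


class EdgeConv(nn.Module):
    """Simplified edge convolution: aggregates feature differences to neighbors."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * in_dim, out_dim), nn.ReLU())

    def forward(self, x, adj):
        # adj: binary face adjacency (N x N); mean-aggregate neighbor messages
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        neigh = (adj @ x) / deg
        return self.mlp(torch.cat([x, neigh - x], dim=-1))


class SegmentationHead(nn.Module):
    """Two edge-convolution layers on top of the pretrained encoder features."""

    def __init__(self, latent_dim: int, num_classes: int):
        super().__init__()
        self.conv1 = EdgeConv(latent_dim, 64)
        self.conv2 = EdgeConv(64, num_classes)

    def forward(self, z, adj):
        return self.conv2(self.conv1(z, adj), adj)


if __name__ == "__main__":
    n_faces, in_dim, latent_dim, n_classes = 1000, 16, 32, 6
    x = torch.randn(n_faces, in_dim)                # per-face descriptors
    adj = (torch.rand(n_faces, n_faces) > 0.99).float()
    adj = ((adj + adj.T) > 0).float()               # symmetric mesh adjacency
    deg = adj.sum(dim=1).clamp(min=1)
    adj_norm = adj / deg.sqrt().unsqueeze(1) / deg.sqrt().unsqueeze(0)

    ae = AutoencoderBranch(in_dim, latent_dim)
    gcn = GCNBranch(in_dim, latent_dim)
    z_ae, recon = ae(x)
    z_gcn = gcn(x, adj_norm)
    # Pretraining would mutually supervise z_ae and z_gcn and minimize the
    # reconstruction error; here we only verify the tensor shapes.
    logits = SegmentationHead(latent_dim, n_classes)(z_ae, adj)
    print(recon.shape, z_gcn.shape, logits.shape)
```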
Keywords