Point cloud semantic segmentation with adaptive spatial structure graph transformer

Ting Han; Yiping Chen; Jin Ma; Xiaoxue Liu; Wuming Zhang; Xinchang Zhang; Huajuan Wang

International Journal of Applied Earth Observations and Geoinformation (Sep 2024)

Affiliations

Ting Han: School of Geospatial Engineering and Science, Sun Yat-Sen University, Zhuhai, 519082, China
Yiping Chen: School of Geospatial Engineering and Science, Sun Yat-Sen University, Zhuhai, 519082, China; Corresponding author.
Jin Ma: School of Geospatial Engineering and Science, Sun Yat-Sen University, Zhuhai, 519082, China
Xiaoxue Liu: Fujian Key Laboratory of Sensing and Computing for Smart Cities, Xiamen University, Xiamen, 361005, China
Wuming Zhang: School of Geospatial Engineering and Science, Sun Yat-Sen University, Zhuhai, 519082, China
Xinchang Zhang: School of Geography and Remote Sensing, Guangzhou University, Guangzhou, 510006, China; College of Geography and Remote sensing Sciences, Xinjiang University, Urumqi, 830046, China; Guangdong Urban and Rural Planning and Construction Intelligent Service Engineering Technology Research Center, Guangzhou, 511300, China
Huajuan Wang: Zhuhai Surveying and Mapping Institution, Zhuhai, 519000, China

Journal volume & issue: Vol. 133, p. 104105

Abstract

With the rapid development of LiDAR and artificial intelligence technologies, 3D point cloud semantic segmentation has become a prominent research topic. It can significantly enhance building information modeling, navigation, and environmental perception. However, current deep learning-based methods rely primarily on voxelization or multi-layer convolution for feature extraction, and they often struggle to differentiate homogeneous objects or structurally adherent targets in complex real-world scenes. To this end, we propose ASGFormer, a graph transformer network for point cloud semantic segmentation tailored to structurally adherent objects. First, ASGFormer combines a graph with a Transformer to promote understanding of global correlations in the graph. Second, a spatial index and a position embedding are constructed from distance relationships and feature differences. Through a learnable mechanism, the structural weights between points are dynamically adjusted, yielding an adaptive spatial structure within the graph. Finally, dummy nodes are introduced to store global information and transmit it between layers, addressing the loss of information at the terminal nodes of the graph. Comprehensive experiments on several real-world 3D point cloud datasets evaluate the effectiveness of ASGFormer both qualitatively and quantitatively. ASGFormer outperforms existing approaches on the S3DIS dataset, achieving 91.3% overall accuracy (OA), 78.0% mean class accuracy (mAcc), and 72.3% mean intersection over union (mIoU). Moreover, ASGFormer achieves 72.8%, 45.5%, 81.6%, and 70.1% mIoU on the ScanNet, City-Facade, Toronto 3D, and Semantic KITTI datasets, respectively. Notably, the proposed method effectively differentiates homogeneous, structurally adherent objects, further contributing to the intelligent perception and modeling of complex scenes.
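
For readers unfamiliar with the three scores quoted in the abstract, the following is a minimal sketch of how OA, mAcc, and mIoU are conventionally computed from per-point labels. It is not the authors' evaluation code: the function names `confusion_matrix` and `segmentation_metrics` are hypothetical, and the choice to skip classes that never occur is an assumption, since the abstract does not say how the paper treats them.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count points per (ground-truth class, predicted class) pair."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm: List[List[int]]) -> Tuple[float, float, float]:
    """Return (OA, mAcc, mIoU) from a square confusion matrix.

    Assumption: classes with no ground-truth points are skipped for mAcc,
    and classes with an empty union are skipped for mIoU.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    oa = correct / total  # fraction of all points labeled correctly

    accs, ious = [], []
    for i in range(n):
        tp = cm[i][i]
        gt = sum(cm[i])                          # points whose true class is i
        pred = sum(cm[r][i] for r in range(n))   # points predicted as class i
        union = gt + pred - tp                   # TP + FN + FP
        if gt > 0:
            accs.append(tp / gt)
        if union > 0:
            ious.append(tp / union)

    macc = sum(accs) / len(accs)
    miou = sum(ious) / len(ious)
    return oa, macc, miou


# Toy example with three classes and six points (illustrative data only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
oa, macc, miou = segmentation_metrics(confusion_matrix(y_true, y_pred, 3))
```

Because mIoU penalizes both false positives and false negatives for every class, it is typically lower than OA on the same predictions, which matches the ordering of the S3DIS figures quoted above.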

Published in International Journal of Applied Earth Observations and Geoinformation

ISSN: 1569-8432 (Print); 1872-826X (Online)
Publisher: Elsevier
Country of publisher: Netherlands
LCC subjects: Geography. Anthropology. Recreation: Physical geography; Geography. Anthropology. Recreation: Environmental sciences
Website: https://www.journals.elsevier.com/international-journal-of-applied-earth-observation-and-geoinformation
