International Journal of Applied Earth Observations and Geoinformation (Apr 2024)

Dynamic clustering transformer network for point cloud segmentation

  • Dening Lu,
  • Jun Zhou,
  • Kyle (Yilin) Gao,
  • Jing Du,
  • Linlin Xu,
  • Jonathan Li

Journal volume & issue
Vol. 128
p. 103791

Abstract

Point cloud segmentation is one of the most important tasks in LiDAR remote sensing, with widespread scientific, industrial, and commercial applications. Research on this task has produced many breakthroughs in 3D object and scene understanding. Existing methods typically use hierarchical architectures for feature representation. However, the sampling and grouping methods commonly used in hierarchical networks are not only time-consuming but also rely solely on point-wise 3D coordinates, ignoring the local semantic homogeneity of point clusters. To address these issues, we propose a novel 3D point cloud representation network called Dynamic Clustering Transformer Network (DCTNet). It has an encoder–decoder architecture that supports both local and global feature learning. Specifically, the encoder consists of a series of dynamic clustering-based Local Feature Aggregating (LFA) blocks and Transformer-based Global Feature Learning (GFL) blocks. In the LFA block, we propose novel semantic feature-based dynamic sampling and clustering methods, which make the model aware of local semantic homogeneity during local feature aggregation. Furthermore, in place of traditional interpolation approaches, we propose a new semantic feature-guided upsampling method in the decoder for dense prediction. To our knowledge, DCTNet is the first work to introduce semantic information-based dynamic clustering into 3D Transformers. Extensive experiments on an object-based dataset (ShapeNet) and an airborne multispectral LiDAR dataset demonstrate the state-of-the-art (SOTA) segmentation performance of DCTNet in terms of both accuracy and efficiency. Our code will be made publicly available.
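
The contrast the abstract draws between coordinate-only grouping and semantic feature-based clustering can be illustrated with a small NumPy sketch. This is not the DCTNet algorithm, whose details the abstract does not give. The function name `feature_guided_clusters`, the blending weight `alpha`, the random choice of cluster centers, and the max-pooling step are all assumptions made purely for illustration.

```python
import numpy as np

def feature_guided_clusters(xyz, feats, num_clusters, alpha=0.5, seed=0):
    """Assign points to clusters using spatial AND feature distance, then max-pool.

    Illustrative assumption only, not the method from the paper.
    xyz:   (N, 3) point coordinates
    feats: (N, C) per-point feature vectors
    alpha: weight of spatial distance; alpha=1.0 gives coordinate-only grouping
    """
    rng = np.random.default_rng(seed)
    n = xyz.shape[0]
    # Hypothetical center selection: uniform random sampling without replacement.
    center_idx = rng.choice(n, size=num_clusters, replace=False)

    # Squared distances from every point to every center, in both spaces.
    d_xyz = ((xyz[:, None, :] - xyz[None, center_idx, :]) ** 2).sum(axis=-1)
    d_feat = ((feats[:, None, :] - feats[None, center_idx, :]) ** 2).sum(axis=-1)

    # Blend the two distances and assign each point to its cheapest center.
    cost = alpha * d_xyz + (1.0 - alpha) * d_feat
    assign = cost.argmin(axis=1)

    # Aggregate each cluster's features with an element-wise maximum.
    pooled = np.full((num_clusters, feats.shape[1]), -np.inf)
    np.maximum.at(pooled, assign, feats)
    return center_idx, assign, pooled

# Usage on synthetic data (64 points, 8 feature channels, 4 clusters).
rng = np.random.default_rng(42)
xyz = rng.random((64, 3))
feats = rng.random((64, 8))
center_idx, assign, pooled = feature_guided_clusters(xyz, feats, num_clusters=4)
```

Setting `alpha=1.0` reduces the assignment to purely geometric grouping, while lower values let points with similar features join the same cluster even when they are not the closest in space.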

Keywords