IET Computer Vision (Apr 2024)

Point cloud semantic segmentation based on local feature fusion and multilayer attention network

  • Junjie Wen,
  • Jie Ma,
  • Yuehua Zhao,
  • Tong Nie,
  • Mengxuan Sun,
  • Ziming Fan

DOI
https://doi.org/10.1049/cvi2.12255
Journal volume & issue
Vol. 18, no. 3
pp. 381–392

Abstract


Semantic segmentation of three-dimensional point clouds is vital in autonomous driving, computer vision, and augmented reality. However, current semantic segmentation methods do not effectively use the point cloud's local geometric features and contextual information, which are essential for improving segmentation accuracy. To address these challenges, a semantic segmentation network based on local feature fusion and a multilayer attention mechanism is proposed. Specifically, the authors design a local feature fusion module that encodes geometric and feature information separately, fully leveraging the point cloud's feature perception and geometric structure representation. Furthermore, the authors design a multilayer attention pooling module, consisting of local attention pooling and cascade attention pooling, to extract contextual information: local attention pooling learns local neighbourhood information, while cascade attention pooling captures contextual information from deeper local neighbourhoods. Finally, an enhanced feature representation of important information is obtained by aggregating the features from the two deep attention pooling methods. Extensive experiments on the large-scale point-cloud datasets Stanford 3D Large-Scale Indoor Spaces (S3DIS) and SemanticKITTI indicate that the authors' network offers clear advantages over existing representative methods in describing local geometric features and modelling global contextual relationships.
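The core mechanism the abstract describes, attention pooling over a point's local neighbourhood, can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring weights `w` stand in for a learned MLP, and the shapes (N points, K neighbours, C channels) are illustrative assumptions. Scores are normalised over the K neighbours with a softmax, then used to compute a weighted sum of neighbour features.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def local_attention_pool(neigh_feats, w):
    """Attention-style pooling over each point's K neighbours.

    neigh_feats: (N, K, C) features of the K neighbours of each point.
    w:           (C, C) scoring weights (placeholder for a learned MLP).
    Returns:     (N, C) aggregated per-point features.
    """
    scores = neigh_feats @ w                  # (N, K, C) per-channel scores
    attn = softmax(scores, axis=1)            # normalise over the K neighbours
    return (attn * neigh_feats).sum(axis=1)   # attention-weighted sum -> (N, C)

rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 8, 16))   # 4 points, 8 neighbours, 16 channels
w = rng.normal(size=(16, 16))
pooled = local_attention_pool(feats, w)
print(pooled.shape)  # (4, 16)
```

Unlike plain max or average pooling, the softmax weights let the network emphasise the most informative neighbours per channel; the paper's cascade attention pooling repeats this idea over deeper (wider) neighbourhoods before the two pooled features are aggregated.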

Keywords