The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences (Dec 2024)
Pushing the limit to near real-time indoor LiDAR-based semantic segmentation
Abstract
Semantic segmentation of indoor 3D point clouds is a critical technology for understanding three-dimensional indoor environments, with significant applications in indoor navigation, positioning, and intelligent robotics. While real-time semantic segmentation is already a reality for images, existing classification pipelines for LiDAR point clouds assume a pre-existing map built from data collected with accurate but heavy sensors. This assumption is impractical for high-level task planning and autonomous exploration, which benefit from a rapid understanding of the 3D structure of the environment. Furthermore, while RGB cameras remain a popular choice under good visibility conditions, such sensors are ineffective in environments where visibility is hindered; in such circumstances, LiDAR point clouds emerge as a more reliable source of environmental information. In this paper, we adapt an existing semantic segmentation model, Superpoint Transformer, to a LiDAR-only setting where RGB inputs are unavailable and near real-time processing is targeted. To this end, we simulated our robot’s trajectory on the open-source S3DIS dataset and leveraged Hidden Point Removal to train the model. We investigated various strategies, such as modifying the prediction interval, and thoroughly studied its influence on the predictions. Our model improves mean Intersection over Union (mIoU) from 40 to 67.6 compared to the baseline on both simple (floor, ceiling, walls) and complex (doors, windows) classes.
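As a concrete illustration of the visibility simulation described in the abstract, the sketch below shows how Hidden Point Removal can be applied along a simulated sensor trajectory using Open3D's hidden_point_removal operator. This is a minimal example under our own assumptions (synthetic points, an illustrative straight-line trajectory, and a common radius heuristic), not the authors' implementation.

# Minimal sketch: simulate which points a moving LiDAR sensor would see in an
# S3DIS-style point cloud using Hidden Point Removal (Katz et al., as
# implemented in Open3D). All numeric values below are illustrative assumptions.
import numpy as np
import open3d as o3d

def visible_points_along_trajectory(points, trajectory, radius_scale=100.0):
    """Return indices of points visible from at least one trajectory pose."""
    # Build an Open3D point cloud from an (N, 3) array.
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    # Common HPR radius heuristic: a large multiple of the scene diameter.
    diameter = np.linalg.norm(points.max(axis=0) - points.min(axis=0))
    visible = set()
    for camera in trajectory:
        # hidden_point_removal returns a mesh and the indices of the points
        # visible from this camera location.
        _, idx = pcd.hidden_point_removal(camera.tolist(), radius_scale * diameter)
        visible.update(idx)
    return np.fromiter(visible, dtype=np.int64)

# Illustrative usage: a synthetic 10 m x 8 m x 3 m room and a straight-line
# trajectory at roughly sensor height (values are assumptions, not from the paper).
points = np.random.rand(50_000, 3) * np.array([10.0, 8.0, 3.0])
trajectory = np.linspace([1.0, 4.0, 1.2], [9.0, 4.0, 1.2], num=8)
visible_idx = visible_points_along_trajectory(points, trajectory)
print(f"{len(visible_idx)} of {len(points)} points visible along the trajectory")

The union of visible indices over all poses approximates the partial coverage a robot accumulates while moving, which is the kind of incomplete, RGB-free input the adapted model is trained on.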