Voxel- and Bird’s-Eye-View-Based Semantic Scene Completion for LiDAR Point Clouds

Li Liang; Naveed Akhtar; Jordan Vice; Ajmal Mian

doi:10.3390/rs16132266

Remote Sensing (Jun 2024)

Affiliations

Li Liang: Department of Computer Science and Software Engineering, The University of Western Australia, 35 Stirling Hwy, Crawley, WA 6009, Australia
Naveed Akhtar: School of Computing and Information Systems, The University of Melbourne, Parkville, VIC 3052, Australia
Jordan Vice: Department of Computer Science and Software Engineering, The University of Western Australia, 35 Stirling Hwy, Crawley, WA 6009, Australia
Ajmal Mian: Department of Computer Science and Software Engineering, The University of Western Australia, 35 Stirling Hwy, Crawley, WA 6009, Australia

Journal volume & issue: Vol. 16, no. 13, p. 2266

Abstract

Semantic scene completion is a crucial outdoor scene understanding task that has direct implications for technologies like autonomous driving and robotics. It compensates for unavoidable occlusions and partial measurements in LiDAR scans, which may otherwise cause catastrophic failures. Due to the inherent complexity of this task, existing methods generally rely on complex and computationally demanding scene completion models, which limits their practicality in downstream applications. To address this, we propose a novel integrated network that combines the strengths of 3D and 2D semantic scene completion techniques for efficient LiDAR point cloud scene completion. Our network leverages a newly devised lightweight multi-scale convolutional block (MSB) to efficiently aggregate multi-scale features, thereby improving the identification of small and distant objects. It further utilizes a layout-aware semantic block (LSB), developed to grasp the overall layout of the scene and precisely guide the reconstruction and recognition of features. We also develop a feature fusion module (FFM) for effective interaction between the data derived from the two disparate streams in our network, ensuring a robust and cohesive scene completion process. Extensive experiments with the popular SemanticKITTI dataset demonstrate that our method achieves highly competitive performance, with an mIoU of 35.7 and an IoU of 51.4. Notably, the proposed method achieves an mIoU improvement of 2.6% compared to previous methods.
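
The two figures reported above, IoU and mIoU, can be illustrated with a minimal sketch. It assumes the definitions commonly used for semantic scene completion: IoU is computed on binary occupancy (any non-empty label counts as occupied), and mIoU is the mean of per-class IoU over the semantic classes. The abstract does not spell out the evaluation protocol, so the function names, the label 0 for empty space, and the toy voxel labels below are all illustrative assumptions, not the authors' code.

```python
def completion_iou(pred, gt, empty=0):
    """Binary occupancy IoU: a voxel is 'occupied' if its label is not `empty`."""
    tp = fp = fn = 0
    for p, g in zip(pred, gt):
        p_occ, g_occ = p != empty, g != empty
        if p_occ and g_occ:
            tp += 1          # occupied in both prediction and ground truth
        elif p_occ:
            fp += 1          # predicted occupied, actually empty
        elif g_occ:
            fn += 1          # predicted empty, actually occupied
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


def mean_iou(pred, gt, classes):
    """Mean of per-class IoU over `classes`; classes absent from both are skipped."""
    ious = []
    for c in classes:
        inter = sum(1 for p, g in zip(pred, gt) if p == c and g == c)
        union = sum(1 for p, g in zip(pred, gt) if p == c or g == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical flattened voxel labels: 0 = empty, 1 and 2 = semantic classes.
gt = [0, 1, 1, 2, 2, 0]
pred = [0, 1, 2, 2, 2, 1]

print(completion_iou(pred, gt))      # occupancy IoU for the toy grid
print(mean_iou(pred, gt, [1, 2]))    # mean IoU over classes 1 and 2
```

In this toy grid the occupancy IoU is 4/5 and the mIoU is (1/3 + 2/3) / 2 = 0.5, which shows how a model can score well on geometry while still confusing semantic classes.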

Published in Remote Sensing

ISSN: 2072-4292 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science
Website: http://www.mdpi.com/journal/remotesensing/
