iScience (Sep 2024)
Voxel self-attention and center-point for 3D object detector
Abstract
Summary: With the advancement of autonomous driving, industrial demand for 3D object detection has grown steadily, leading to the development of anchor-based LiDAR object detectors built on convolutional neural networks (CNNs). However, the limited receptive field of CNNs restricts how much of the scene the detector can take into account, and anchor-based methods cannot accurately predict the orientation of objects while they are steering. In this paper, we therefore propose the voxel self-attention and center-point (VSAC) detector. First, a voxel self-attention network is incorporated into VSAC to capture long-range relationships between voxels. Second, considering the impact of feature weights on the prediction results, we propose the pseudo spatiotemporal feature pyramid net (PST-FPN). Finally, we employ a center-point detection head so that the predicted orientation more closely matches that of the real object during steering. Experimental results on the widely used KITTI dataset, the Waymo Open Dataset, and the nuScenes dataset demonstrate the effectiveness of VSAC.
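To illustrate how self-attention lets each voxel draw on every other voxel rather than only on a local convolutional neighborhood, the following is a minimal sketch of generic scaled dot-product self-attention over the feature vectors of non-empty voxels. It is not the VSAC layer itself: the abstract does not specify the attention formulation, so the function name `voxel_self_attention`, the identity query/key/value projections, and the toy input are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def voxel_self_attention(features):
    """Scaled dot-product self-attention over voxel features.

    features: list of N vectors (one per non-empty voxel), each of length d.
    Assumption: query, key, and value projections are the identity;
    a trained network would learn separate projection matrices.
    """
    d = len(features[0])
    scale = 1.0 / math.sqrt(d)
    output = []
    for query in features:
        # Similarity of this voxel to every voxel, including itself.
        scores = [scale * sum(q * k for q, k in zip(query, key))
                  for key in features]
        weights = softmax(scores)
        # Weighted sum of all voxel features: a global receptive field.
        attended = [sum(w * value[j] for w, value in zip(weights, features))
                    for j in range(d)]
        output.append(attended)
    return output


# Hypothetical toy input: three non-empty voxels with 2-D features.
voxels = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = voxel_self_attention(voxels)
```

Each output vector is a convex combination of all input voxel features, so a single layer already mixes information across the whole set of voxels, whereas a convolution would need many stacked layers to reach distant voxels.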