Voxel self-attention and center-point for 3D object detector

Likang Fan; Jie Cao; Xulei Liu; Xianyong Li; Liting Deng; Hongwei Sun; Yiqiang Peng

iScience (Sep 2024)

Affiliations

Likang Fan: Vehicle Measurement, Control and Safety Key Laboratory of Sichuan Province, Xihua University, Sichuan 100089, China
Jie Cao: Vehicle Measurement, Control and Safety Key Laboratory of Sichuan Province, Xihua University, Sichuan 100089, China
Xulei Liu: Vehicle Measurement, Control and Safety Key Laboratory of Sichuan Province, Xihua University, Sichuan 100089, China
Xianyong Li: Vehicle Measurement, Control and Safety Key Laboratory of Sichuan Province, Xihua University, Sichuan 100089, China
Liting Deng: Vehicle Measurement, Control and Safety Key Laboratory of Sichuan Province, Xihua University, Sichuan 100089, China
Hongwei Sun: Guangdong Xinbao Electrical Appliances Holdings CO, LTD, Longzhou Road, Leliu Town, Shunde District, Foshan City, Guangdong, P.R. China
Yiqiang Peng: Vehicle Measurement, Control and Safety Key Laboratory of Sichuan Province, Xihua University, Sichuan 100089, China; Corresponding author

Journal volume & issue: Vol. 27, no. 9, p. 110759

Abstract

Summary: As autonomous driving advances, industrial demand for 3D object detection has grown steadily, driving the development of anchor-based LiDAR object detectors built on convolutional neural networks (CNNs). However, the limited receptive field of CNNs restricts scene understanding, and anchor-based methods cannot accurately predict the orientation of objects that are turning. In this paper, we therefore propose voxel self-attention and center-point (VSAC). First, a voxel self-attention network is built into VSAC to capture long-range relationships among voxels. Second, to account for the impact of feature weights on prediction results, we propose the pseudo spatiotemporal feature pyramid net (PST-FPN). Finally, we employ a center-point detection head so that the predicted orientation stays closer to that of the real object during turning. Experiments on the widely used KITTI dataset, Waymo Open Dataset, and nuScenes dataset demonstrate the effectiveness of VSAC.

Published in iScience

ISSN: 2589-0042 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Science
Website: http://www.cell.com/iscience/home
