CenterTransFuser: radar point cloud and visual information fusion for 3D object detection

Yan Li; Kai Zeng; Tao Shen

doi:10.1186/s13634-022-00944-6

EURASIP Journal on Advances in Signal Processing (Jan 2023)


Affiliations

Yan Li: School of Information Engineering and Automation, Kunming University of Science and Technology
Kai Zeng: School of Information Engineering and Automation, Kunming University of Science and Technology
Tao Shen: School of Information Engineering and Automation, Kunming University of Science and Technology

DOI: https://doi.org/10.1186/s13634-022-00944-6
Journal volume & issue: Vol. 2023, no. 1
pp. 1 – 23

Abstract


Sensor fusion is an important component of the perception system in autonomous driving, and fusing radar point cloud information with camera visual information can improve the perception capability of autonomous vehicles. However, most existing studies ignore the extraction of local neighborhood information and consider only shallow fusion of the two modalities based on extracted global information, so they cannot achieve deep fusion through cross-modal contextual interaction. Moreover, in data preprocessing, noise in radar data is usually filtered using only the depth information predicted from image features; such methods impair the accuracy with which the radar branch generates regions of interest and cannot effectively filter out irrelevant radar points. This paper proposes the CenterTransFuser model, which makes full use of millimeter-wave radar point cloud information and visual information to enable cross-modal fusion of the two heterogeneous sources. Specifically, a new interaction mechanism called cross-transformer is explored, which cooperatively exploits cross-modal cross-multiple attention and joint cross-multiple attention to mine complementary radar and image information. In addition, an adaptive depth thresholding filtering method is designed to reduce noise from modality-independent radar information projected onto the image. The CenterTransFuser model is evaluated on the challenging nuScenes dataset, and it achieves excellent performance. In particular, detection accuracy improves significantly for pedestrians, motorcycles, and bicycles, showing the superiority and effectiveness of the proposed model.

Published in EURASIP Journal on Advances in Signal Processing

ISSN: 1687-6172 (Print); 1687-6180 (Online)
Publisher: SpringerOpen
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Telecommunication; Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics
Website: https://asp-eurasipjournals.springeropen.com
