Cyborg and Bionic Systems (Jan 2024)
Camera–Radar Fusion with Modality Interaction and Radar Gaussian Expansion for 3D Object Detection
Abstract
The fusion of millimeter-wave radar and camera modalities is crucial for improving the accuracy and completeness of 3-dimensional (3D) object detection. Most existing methods extract features from each modality separately and perform fusion with specially designed modules, potentially causing information loss during modality transformation. To address this issue, we propose a novel framework for 3D object detection that iteratively updates radar and camera features through an interaction module. This module serves a dual purpose: it facilitates the fusion of multimodal data while preserving the original features. Specifically, radar and image features are sampled and aggregated with a set of sparse 3D object queries, while the integrity of the original radar features is retained to prevent information loss. Additionally, an innovative radar augmentation technique named Radar Gaussian Expansion is proposed. This module distributes the radar measurements within each voxel to neighboring voxels according to a Gaussian distribution, reducing association errors during projection and enhancing detection accuracy. Our framework offers a comprehensive solution to the fusion of radar and camera data, leading to improved accuracy and completeness in 3D object detection. On the nuScenes test benchmark, our camera–radar fusion method achieves state-of-the-art 3D object detection results with a 41.6% mean average precision and a 52.5% nuScenes detection score.
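The Radar Gaussian Expansion idea described above can be sketched as spreading each voxel's radar measurement to its neighbors with Gaussian weights. The following is a minimal illustrative sketch only, not the paper's implementation: the function name `radar_gaussian_expansion`, the dense-grid representation, and the parameters `sigma` and `radius` are assumptions made for this example.

```python
import numpy as np

def radar_gaussian_expansion(grid, sigma=1.0, radius=1):
    """Hypothetical sketch of Radar Gaussian Expansion.

    `grid` is a dense 3D voxel array of radar measurements. Each
    measurement is spread to voxels within `radius` cells, weighted by
    a Gaussian of the integer voxel offset. Note: np.roll wraps around
    at grid edges, a simplification acceptable when occupied voxels
    stay away from the boundary.
    """
    expanded = np.zeros_like(grid, dtype=float)
    offsets = range(-radius, radius + 1)
    for dx in offsets:
        for dy in offsets:
            for dz in offsets:
                # Gaussian weight for this integer voxel offset
                w = np.exp(-(dx * dx + dy * dy + dz * dz) / (2.0 * sigma ** 2))
                # Shift the whole grid so each voxel's mass lands on its neighbor
                expanded += w * np.roll(grid, (dx, dy, dz), axis=(0, 1, 2))
    return expanded
```

For example, a single occupied voxel keeps its full weight at its own location while each face-adjacent neighbor receives a weight of exp(-1/2), softening the hard voxel boundary that would otherwise cause radar-to-image association errors during projection.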