Cyborg and Bionic Systems (Jan 2024)
Camera–Radar Fusion with Modality Interaction and Radar Gaussian Expansion for 3D Object Detection
Abstract
The fusion of millimeter-wave radar and camera modalities is crucial for improving the accuracy and completeness of 3-dimensional (3D) object detection. Most existing methods extract features from each modality separately and perform fusion with specially designed modules, potentially causing information loss during modality transformation. To address this issue, we propose a novel framework for 3D object detection that iteratively updates radar and camera features through an interaction module. This module serves a dual purpose: it facilitates the fusion of multimodal data while preserving the original features. Specifically, radar and image features are sampled and aggregated with a set of sparse 3D object queries, while the integrity of the original radar features is retained to prevent information loss. Additionally, an innovative radar augmentation technique named Radar Gaussian Expansion is proposed. This module distributes the radar measurements within each voxel to neighboring voxels according to a Gaussian distribution, reducing association errors during projection and enhancing detection accuracy. Our framework offers a comprehensive solution to the fusion of radar and camera data, leading to improved accuracy and completeness in 3D object detection. On the nuScenes test benchmark, our camera–radar fusion method achieves state-of-the-art 3D object detection results with a 41.6% mean average precision and a 52.5% nuScenes detection score.
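The Radar Gaussian Expansion idea described above can be sketched as spreading each voxel's radar measurement to its neighbors with Gaussian weights. The following is a minimal illustrative sketch only, not the paper's implementation: the function name `radar_gaussian_expansion`, the dense-grid representation, and the parameters `sigma` and `radius` are assumptions made for this example.

```python
import numpy as np

def radar_gaussian_expansion(grid, sigma=1.0, radius=1):
    """Hypothetical sketch of Radar Gaussian Expansion.

    `grid` is a dense 3D voxel array of radar measurements. Each
    measurement is spread to voxels within `radius` cells, weighted by
    a Gaussian of the integer voxel offset. Note: np.roll wraps around
    at grid edges, a simplification acceptable when occupied voxels
    stay away from the boundary.
    """
    expanded = np.zeros_like(grid, dtype=float)
    offsets = range(-radius, radius + 1)
    for dx in offsets:
        for dy in offsets:
            for dz in offsets:
                # Gaussian weight for this integer voxel offset
                w = np.exp(-(dx * dx + dy * dy + dz * dz) / (2.0 * sigma ** 2))
                # Shift the whole grid so each voxel's mass lands on its neighbor
                expanded += w * np.roll(grid, (dx, dy, dz), axis=(0, 1, 2))
    return expanded
```

For example, a single occupied voxel keeps its full weight at its own location while each face-adjacent neighbor receives a weight of exp(-1/2), softening the hard voxel boundary that would otherwise cause radar-to-image association errors during projection.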