IEEE Access (Jan 2023)

Infrastructure 3D Target Detection Based on Multi-Mode Fusion for Intelligent and Connected Vehicles

  • Xiucai Zhang,
  • Lei He,
  • Rui Lv,
  • Changcheng Jin,
  • Yuhai Wang

DOI
https://doi.org/10.1109/ACCESS.2023.3292174
Journal volume & issue
Vol. 11
pp. 72803 – 72812

Abstract

Autonomous driving technology faces significant safety challenges due to the lack of a global perspective and the limitations of long-range perception. It is widely recognized that vehicle-infrastructure cooperation is essential for achieving Level 5 autonomy, so it is imperative to develop vehicle-road collaboration that enables accurate, wide-range, multi-target 3D detection on the infrastructure side. In this paper, we propose using ResNet50+FPN as the backbone network and adding CoTNet and CBAM dual attention mechanisms to extract and encode four levels of image features. For point cloud feature extraction, we divide the point cloud into equally spaced 3D voxels and transform the group of points within each voxel into a unified feature representation through a voxel feature encoding (VFE) layer. For multi-mode fusion, we propose a simple and effective method based on regional point fusion and regional voxel fusion, and the VoxelNet architecture is used to combine the image and point cloud features. The proposed algorithm is evaluated on the DAIR-V2X dataset from both the 3D and BEV perspectives; compared with existing 3D object detection algorithms, it significantly improves the average precision (AP) for vehicles, pedestrians, and cyclists in wide-area, multi-object 3D detection on the infrastructure side.
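As a rough illustration of the point-cloud branch described above, the sketch below shows a voxel feature encoding layer in the style of VoxelNet, assuming PyTorch. The class name VFELayer, the tensor shapes, and the channel sizes are illustrative assumptions, not the authors' implementation; the intent is only to show the point-wise transform, max-pooling within a voxel, and concatenation of the pooled voxel feature back onto each point.

    import torch
    import torch.nn as nn


    class VFELayer(nn.Module):
        # Voxel Feature Encoding layer in the style of VoxelNet: a point-wise
        # fully connected layer followed by max-pooling over the points in a
        # voxel, with the pooled feature concatenated back onto every point.
        def __init__(self, in_channels: int, out_channels: int):
            super().__init__()
            self.units = out_channels // 2  # half the output comes from the point-wise branch
            self.linear = nn.Linear(in_channels, self.units)
            self.bn = nn.BatchNorm1d(self.units)

        def forward(self, points: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
            # points: (num_voxels, max_points_per_voxel, in_channels)
            # mask:   (num_voxels, max_points_per_voxel, 1); 1 for real points, 0 for padding
            x = self.linear(points)                          # point-wise fully connected layer
            x = self.bn(x.transpose(1, 2)).transpose(1, 2)   # batch norm over the channel dimension
            x = torch.relu(x) * mask                         # zero out padded points
            pooled, _ = x.max(dim=1, keepdim=True)           # locally aggregated voxel feature
            pooled = pooled.expand(-1, points.shape[1], -1)  # broadcast back to every point
            return torch.cat([x, pooled], dim=2)             # (num_voxels, max_points, out_channels)


    # Illustrative usage: 32 voxels, at most 35 points per voxel, 7 input features per point
    vfe = VFELayer(in_channels=7, out_channels=64)
    pts = torch.randn(32, 35, 7)
    msk = torch.ones(32, 35, 1)
    out = vfe(pts, msk)  # shape: (32, 35, 64)

Stacking such layers yields the unified per-voxel representation that is later fused with the image features; the region-based fusion strategy itself is specific to the paper and is not reproduced here.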

Keywords