Alexandria Engineering Journal (Oct 2024)

Enhancing 3D object detection through multi-modal fusion for cooperative perception

  • Bin Xia,
  • Jun Zhou,
  • Fanyu Kong,
  • Yuhe You,
  • Jiarui Yang,
  • Lin Lin

Journal volume & issue
Vol. 104
pp. 46–55

Abstract

Fueled by substantial advances in deep learning, autonomous driving is rapidly progressing toward more robust and effective intelligent systems. A critical challenge in this field is accurate 3D object detection, which is often hindered by data sparsity and occlusion. To address these issues, we propose a multi-modal fusion strategy that leverages vehicle-road cooperation to enhance perception. Our approach integrates label information from roadside perception point clouds to harmonize and enrich the representations of image and LiDAR data, and this integration significantly improves detection accuracy by providing a fuller understanding of the surrounding environment. Rigorous evaluations on two benchmark datasets, KITTI and the Waymo Open Dataset, demonstrate superior performance: our model achieves 87.52% 3D Average Precision (3D AP) and 93.71% Bird's Eye View Average Precision (BEV AP) on the KITTI val set. These results highlight the method's effectiveness in detecting sparse and distant objects, contributing to safer and more efficient autonomous driving solutions.
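The abstract describes fusing image and LiDAR representations into a shared view before detection. As a rough illustration of feature-level fusion only (a minimal sketch; the shapes, weights, and function names here are hypothetical and not the authors' implementation):

```python
import numpy as np

def fuse_features(image_feat, lidar_feat, w_img=0.5, w_lidar=0.5):
    """Toy feature-level fusion: combine two modality feature maps that
    have already been projected into a common bird's-eye-view grid,
    using a fixed weighted sum (illustrative only)."""
    assert image_feat.shape == lidar_feat.shape, "features must be aligned"
    return w_img * image_feat + w_lidar * lidar_feat

rng = np.random.default_rng(0)
img_feat = rng.standard_normal((64, 200, 176))    # C x H x W image-branch features (hypothetical sizes)
lidar_feat = rng.standard_normal((64, 200, 176))  # C x H x W LiDAR BEV features
fused = fuse_features(img_feat, lidar_feat)
print(fused.shape)  # (64, 200, 176)
```

In practice such fusion is typically learned (e.g. via concatenation followed by convolution) rather than a fixed weighted sum; this sketch only shows the aligned, per-cell combination step.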

Keywords