Frontiers in Computer Science (Jan 2025)

Monocular 3D object detection for occluded targets based on spatial relationships and decoupled depth predictions

  • Yanfei Gao,
  • Xiongwei Miao,
  • Guoye Zhang

DOI
https://doi.org/10.3389/fcomp.2024.1382080
Journal volume & issue
Vol. 6

Abstract

Read online

Autonomous driving is the future trend. Accurate 3D object detection is a prerequisite for achieving autonomous driving. Currently, 3D object detection relies on three main sensors: monocular cameras, stereo cameras, and lidar. In comparison to methods based on stereo cameras and lidar, monocular 3D object detection offers advantages such as a broad detection field and low deployment costs. However, the accuracy of existing monocular 3D object detection methods is not ideal, especially for occluded targets. To tackle this challenge, the paper introduces a novel approach for monocular 3D object detection, denoted as SRDDP-M3D, aiming to improve monocular 3D object detection by considering spatial relationships between targets, and by refining depth predictions through a decoupled approach. We consider how objects are positioned relative to each other in the environment and encode the spatial relationships between neighboring objects, the detection performance is enhanced specially for occluded targets. Furthermore, a strategy of decoupling the prediction of target depth into two components of target visual depth and target attribute depth is introduced. This decoupling is designed to improve the accuracy of predicting the overall depth of the target. Experimental results using the KITTI dataset demonstrate that this approach substantially enhances the detection accuracy of occluded targets.

Keywords