Monocular 3D object detection for occluded targets based on spatial relationships and decoupled depth predictions

Yanfei Gao; Xiongwei Miao; Guoye Zhang

doi:10.3389/fcomp.2024.1382080

Frontiers in Computer Science (Jan 2025)

Affiliations

Yanfei Gao: Shanxi Finance and Taxation College, Taiyuan, China
Xiongwei Miao: Shanxi Intelligent Big Data Industry Technology Innovation Research Institute, Taiyuan, China
Guoye Zhang: Shanxi Provincial Digital Government Service Center, Taiyuan, China

DOI: https://doi.org/10.3389/fcomp.2024.1382080
Journal volume & issue: Vol. 6

Abstract

Autonomous driving is a future trend, and accurate 3D object detection is a prerequisite for achieving it. Currently, 3D object detection relies on three main sensors: monocular cameras, stereo cameras, and lidar. Compared with methods based on stereo cameras and lidar, monocular 3D object detection offers advantages such as a broad detection field and low deployment costs. However, the accuracy of existing monocular 3D object detection methods is not ideal, especially for occluded targets. To tackle this challenge, the paper introduces a novel approach for monocular 3D object detection, denoted SRDDP-M3D, which aims to improve detection by considering spatial relationships between targets and by refining depth predictions through a decoupled approach. By considering how objects are positioned relative to each other in the environment and encoding the spatial relationships between neighboring objects, the method enhances detection performance, especially for occluded targets. Furthermore, the prediction of target depth is decoupled into two components: target visual depth and target attribute depth. This decoupling is designed to improve the accuracy of the predicted overall target depth. Experimental results on the KITTI dataset demonstrate that this approach substantially enhances the detection accuracy of occluded targets.
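
The decoupled depth prediction mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how the two components are defined or recombined, so everything here is an assumption: visual depth is taken to be the distance to the visible surface of the target, attribute depth is taken to be the offset from that surface to the object center, and the overall depth is assumed to be their sum. The names `DepthPrediction` and `combine_depth` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class DepthPrediction:
    """Hypothetical container for the two decoupled depth components."""
    visual_depth: float     # assumed: distance to the visible surface, in metres
    attribute_depth: float  # assumed: offset from visible surface to object centre, in metres


def combine_depth(pred: DepthPrediction) -> float:
    """Recombine the two components into one overall depth.

    Assumption: the components are added. The abstract only says the
    depth is decoupled into two parts, not how they are merged.
    """
    return pred.visual_depth + pred.attribute_depth


# Illustrative values only, not results from the paper.
example = DepthPrediction(visual_depth=18.5, attribute_depth=1.25)
total = combine_depth(example)
print(f"overall depth: {total} m")
```

Under this assumed formulation, each component could be supervised separately, which is one plausible reason a decoupled prediction might be easier to learn than a single depth value.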

Published in Frontiers in Computer Science

ISSN: 2624-9898 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.frontiersin.org/journals/computer-science#
