GAC3D: improving monocular 3D object detection with ground-guide model and adaptive convolution

Minh-Quan Viet Bui; Duc Tuan Ngo; Hoang-Anh Pham; Duc Dung Nguyen

doi:10.7717/peerj-cs.686

PeerJ Computer Science (Oct 2021)

Affiliations

Minh-Quan Viet Bui: Computer Science and Engineering Faculty, Ho Chi Minh City University of Technology (HCMUT), Ho Chi Minh City, Vietnam
Duc Tuan Ngo: Computer Science and Engineering Faculty, Ho Chi Minh City University of Technology (HCMUT), Ho Chi Minh City, Vietnam
Hoang-Anh Pham: Computer Science and Engineering Faculty, Ho Chi Minh City University of Technology (HCMUT), Ho Chi Minh City, Vietnam
Duc Dung Nguyen: Computer Science and Engineering Faculty, Ho Chi Minh City University of Technology (HCMUT), Ho Chi Minh City, Vietnam

DOI: https://doi.org/10.7717/peerj-cs.686
Journal volume & issue: Vol. 7, p. e686

Abstract

Monocular 3D object detection has recently become prevalent in autonomous driving and navigation applications because it is cost-efficient and easy to integrate into existing vehicles. The most challenging task in monocular vision is to reliably estimate an object’s location, because RGB images lack depth information. Many methods tackle this ill-posed problem by directly regressing the object’s depth or by taking a depth map as a supplementary input to enhance the model’s results. However, their performance relies heavily on the quality of the estimated depth map, which is biased toward the training data. In this work, we first propose depth-adaptive convolution to replace the traditional 2D convolution and to deal with the divergent context of the image’s features. This leads to significant improvement in both training convergence and testing accuracy. Second, we propose a ground plane model that utilizes geometric constraints in the pose estimation process. With the new method, named GAC3D, we achieve better detection results. We demonstrate our approach on the KITTI 3D Object Detection benchmark, where it outperforms existing monocular methods.

Published in PeerJ Computer Science

ISSN: 2376-5992 (Online)
Publisher: PeerJ Inc.
Country of publisher: United States
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://peerj.com/computer-science/
