IEEE Access (Jan 2024)

Optimization-Based Monocular 3D Object Tracking via Combined Ellipsoid-Cuboid Representation

  • Gyeong Chan Kim,
  • Youngseok Jang,
  • H. Jin Kim

DOI
https://doi.org/10.1109/ACCESS.2024.3438162
Journal volume & issue
Vol. 12
pp. 109281 – 109292

Abstract

Monocular 3D object tracking is challenging because a monocular image lacks the depth information necessary for 3D scene understanding. Modern methods typically rely on deep learning to reconstruct 3D information from learned priors, which demands strenuous effort to acquire ground-truth annotated data and does not generalize across camera settings. We present a method that continuously tracks the 3D location and orientation of a target object in a monocular image sequence, using the output of 2D instance segmentation methods. We reconstruct the structure and trajectory of the objects using factor graph optimization that incorporates the reprojection error of keypoint tracks, a kinematic motion model, and bounding box constraints. We propose a combined ellipsoid-cuboid object representation and a bounding box constraint to model the object dimensions. We evaluate our algorithm on a simulation dataset generated using CARLA, and the results indicate that the method is robust to 2D bounding box error and that the proposed object representation yields more accurate pose and size estimates than using either representation alone.

Keywords