Applied Sciences (Jan 2023)

Relative Pose Estimation between Image Object and ShapeNet CAD Model for Automatic 4-DoF Annotation

  • Soon-Yong Park,
  • Chang-Min Son,
  • Won-Jae Jeong,
  • Sieun Park

DOI: https://doi.org/10.3390/app13020693
Journal volume & issue: Vol. 13, no. 2, p. 693

Abstract

Estimating the three-dimensional (3D) pose of real objects from only a single RGB image is an interesting and difficult topic. This study proposes a new pipeline that estimates and represents the pose of an object in an RGB image with only a 4-DoF annotation relative to a matching CAD model. The proposed method retrieves CAD candidates from the ShapeNet dataset and uses pose-constrained 2D renderings of the candidates to find the best-matching CAD model. The pose estimation pipeline consists of several learned networks followed by image similarity measurements. First, the category and region of the object are determined and segmented from the single RGB image. Second, the 3-DoF rotational pose of the object is estimated by a learned pose-contrast network using only the segmented object region. Then, 2D rendering images of the CAD candidates are generated from the estimated rotational pose. Finally, an image similarity measurement is performed to find the best-matching CAD model and to determine the 1-DoF focal length of the camera that aligns the model with the object. Conventional pose estimation methods employ 9-DoF pose parameters because the scales of both the image object and the CAD model are unknown. However, this study shows that only four annotation parameters between the real object and the CAD model are enough to project the CAD model into the RGB image space for image-graphic applications such as Extended Reality. In the experiments, the performance of the proposed method is analyzed using ground truth and compared with a triplet-loss learning method.
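The 4-DoF annotation described above (3-DoF rotation plus a 1-DoF focal length) is enough to project a scale-normalized CAD model into the image once the translation is fixed by convention. The following is a minimal sketch of that projection, not the authors' code: the Euler-angle convention, the fixed object depth, and the image center used here are assumptions for illustration.

```python
import numpy as np

def euler_to_rotation(yaw, pitch, roll):
    """Compose a rotation matrix from Z-Y-X Euler angles (radians).
    The angle convention is an assumption, not taken from the paper."""
    cz, sz = np.cos(yaw), np.sin(yaw)
    cy, sy = np.cos(pitch), np.sin(pitch)
    cx, sx = np.cos(roll), np.sin(roll)
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    return Rz @ Ry @ Rx

def project_cad_points(points, yaw, pitch, roll, focal,
                       depth=2.0, center=(320.0, 240.0)):
    """Rotate normalized CAD points (N, 3) by the 3-DoF pose, place them
    at a fixed depth on the optical axis, and project them with a pinhole
    camera whose focal length is the remaining 1-DoF parameter."""
    R = euler_to_rotation(yaw, pitch, roll)
    cam = points @ R.T + np.array([0.0, 0.0, depth])  # camera-frame coords
    u = focal * cam[:, 0] / cam[:, 2] + center[0]
    v = focal * cam[:, 1] / cam[:, 2] + center[1]
    return np.stack([u, v], axis=1)
```

Because the object sits at a fixed depth, varying `focal` rescales the rendered model around the image center, which is what allows the final similarity step to align the CAD rendering with the segmented object using a single parameter.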

Keywords