IEEE Access (Jan 2024)

A Unified Framework for Depth-Assisted Monocular Object Pose Estimation

  • Dinh-Cuong Hoang,
  • Phan Xuan Tan,
  • Thu-Uyen Nguyen,
  • Hai-Nam Pham,
  • Chi-Minh Nguyen,
  • Son-Anh Bui,
  • Quang-Tri Duong,
  • van-Duc Vu,
  • van-Thiep Nguyen,
  • van-Hiep Duong,
  • Ngoc-Anh Hoang,
  • Khanh-Toan Phan,
  • Duc-Thanh Tran,
  • Ngoc-Trung Ho,
  • Cong-Trinh Tran

DOI
https://doi.org/10.1109/ACCESS.2024.3443148
Journal volume & issue
Vol. 12
pp. 111723 – 111740

Abstract

Monocular Depth Estimation (MDE) and Object Pose Estimation (OPE) are important tasks in visual scene understanding. Traditionally, these challenges have been addressed independently, with separate deep neural networks designed for each task. However, we contend that MDE, which provides information about object distances from the camera, and OPE, which focuses on determining precise object position and orientation, are inherently connected. Combining these tasks in a unified approach facilitates the integration of spatial context, offering a more comprehensive understanding of object distribution in three-dimensional space. Consequently, this work addresses both challenges simultaneously, treating them as a multi-task learning problem. Our proposed solution is encapsulated in a Unified Framework for Depth-Assisted Monocular Object Pose Estimation. Taking Red-Green-Blue (RGB) images as input, our framework estimates the poses of multiple object instances alongside an instance-level depth map. During training, we utilize both depth and color images, but during inference, the model relies exclusively on color images. To enhance the depth-aware features crucial for robust object pose estimation, we introduce a depth estimation branch supervised by depth images. These features are further refined by a cross-task attention module, which significantly improves feature discriminability and robustness in object pose estimation. Through extensive experiments, our approach demonstrates competitive performance compared to state-of-the-art methods in object pose estimation. Moreover, our method operates in real time, underscoring its efficiency and practical applicability in various scenarios. This unified framework not only advances the state of the art in monocular depth estimation and object pose estimation but also underscores the potential of multi-task learning for enhancing the understanding of complex visual scenes.
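The cross-task attention described in the abstract can be sketched as a standard scaled dot-product attention in which pose-branch features act as queries and depth-branch features supply keys and values. The sketch below is a minimal NumPy illustration under our own assumptions (function and variable names are hypothetical; the paper's actual module may use multi-head attention, learned projections, and spatial feature maps rather than flat token vectors):

```python
import numpy as np

def cross_task_attention(pose_feat, depth_feat):
    """Hypothetical sketch: pose-branch tokens (queries) attend to
    depth-branch tokens (keys/values) to inject depth-aware context.

    pose_feat:  (N_p, d) pose-branch feature vectors
    depth_feat: (N_d, d) depth-branch feature vectors
    returns:    (N_p, d) depth-refined pose features
    """
    d = pose_feat.shape[-1]
    # Similarity between every pose token and every depth token.
    scores = pose_feat @ depth_feat.T / np.sqrt(d)      # (N_p, N_d)
    # Softmax over depth tokens (shifted for numerical stability).
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    # Aggregate depth context and fuse with a residual connection.
    refined = weights @ depth_feat                      # (N_p, d)
    return pose_feat + refined

# Tiny example: 4 pose tokens attend to 6 depth tokens, both 8-dim.
rng = np.random.default_rng(0)
pose = rng.standard_normal((4, 8))
depth = rng.standard_normal((6, 8))
out = cross_task_attention(pose, depth)
```

The residual fusion means the pose branch keeps its original features even when the depth context is uninformative, which is a common design choice for this kind of auxiliary-task refinement.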

Keywords