Breaking New Ground in Monocular Depth Estimation with Dynamic Iterative Refinement and Scale Consistency

Akmalbek Abdusalomov; Sabina Umirzakova; Makhkamov Bakhtiyor Shukhratovich; Azamat Kakhorov; Young-Im Cho

doi:10.3390/app15020674

Applied Sciences (Jan 2025)

Affiliations

Akmalbek Abdusalomov: Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 461-701, Gyeonggi-do, Republic of Korea
Sabina Umirzakova: Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 461-701, Gyeonggi-do, Republic of Korea
Makhkamov Bakhtiyor Shukhratovich: Department of Computer Systems, Tashkent University of Information Technologies Named After Muhammad Al-Khwarizmi, Tashkent 100200, Uzbekistan
Azamat Kakhorov: Department of Artificial Intelligence, Tashkent State University of Economics, Tashkent 100066, Uzbekistan
Young-Im Cho: Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 461-701, Gyeonggi-do, Republic of Korea

DOI: https://doi.org/10.3390/app15020674
Journal volume & issue: Vol. 15, no. 2, p. 674

Abstract

Monocular depth estimation (MDE) is a critical task in computer vision with applications in autonomous driving, robotics, and augmented reality. However, predicting depth from a single image poses significant challenges, especially in dynamic scenes where moving objects introduce scale ambiguity and inaccuracies. In this paper, we propose the Dynamic Iterative Monocular Depth Estimation (DI-MDE) framework, which integrates an iterative refinement process with a novel scale-alignment module to address these issues. Our approach combines elastic depth bins that adjust dynamically based on uncertainty estimates with a scale-alignment mechanism to ensure consistency between static and dynamic regions. Leveraging self-supervised learning, DI-MDE does not require ground-truth depth labels, making it scalable and applicable to real-world environments. Experimental results on standard datasets such as SUN RGB-D and KITTI demonstrate that our method achieves state-of-the-art performance, significantly improving depth prediction accuracy in dynamic scenes. This work contributes a robust and efficient solution to the challenges of monocular depth estimation, offering advancements in both depth refinement and scale consistency.

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
