IEEE Access (Jan 2025)
Improving Monocular Depth Estimation Through Knowledge Distillation: Better Visual Quality and Efficiency
Abstract
This paper introduces a novel knowledge distillation (KD) framework for monocular depth estimation (MDE) that incorporates dynamic weight adaptation to address both visual and computational challenges. The proposed approach mitigates visual limitations of existing MDE models, including blurred object boundaries and discontinuity artifacts, while preserving quantitative accuracy. Beyond these visual improvements, the KD framework reduces the complexity of the distilled depth representation, thereby significantly enhancing computational efficiency. To validate the effectiveness of the proposed framework, extensive comparative evaluations were performed against state-of-the-art models, including AdaBins, LocalBins, BinsFormer, PixelFormer, and ZoeDepth. These evaluations were conducted on benchmark datasets, namely NYU Depth V2 and SUN RGB-D for indoor environments and KITTI for outdoor scenarios, to ensure a rigorous and comprehensive assessment of robustness and generalization. The results demonstrate that the proposed KD framework outperforms existing methods in visual quality across all datasets while achieving notable computational benefits, including a 15.45% reduction in floating-point operations (FLOPs) for the LocalBins model and a 7.72% reduction for the ZoeDepth model.
Keywords