IEEE Access (Jan 2025)
Improving Monocular Depth Estimation Through Knowledge Distillation: Better Visual Quality and Efficiency
Abstract
This paper introduces a novel knowledge distillation (KD) framework for monocular depth estimation (MDE) that incorporates dynamic weight adaptation to address both visual and computational challenges. The proposed approach mitigates visual limitations of existing MDE models, including blurred object boundaries and discontinuity artifacts, while preserving quantitative accuracy. Beyond these visual improvements, the KD framework reduces the complexity of the distilled depth representation, thereby significantly enhancing computational efficiency. To validate the effectiveness of the proposed framework, extensive comparative evaluations were performed against state-of-the-art models, including AdaBins, LocalBins, BinsFormer, PixelFormer, and ZoeDepth. These evaluations were conducted on benchmark datasets, namely NYU Depth V2 and SUN RGB-D for indoor environments and KITTI for outdoor scenarios, to ensure a rigorous and comprehensive assessment of robustness and generalization. The results demonstrate that the proposed KD framework outperforms existing methods in visual quality across all datasets while achieving notable computational benefits, including a 15.45% reduction in floating-point operations (FLOPs) for the LocalBins model and a 7.72% reduction for the ZoeDepth model.
Keywords