International Journal of Information Management Data Insights (Apr 2024)
Deciphering pixel insights: A deep dive into deep learning strategies for enhanced indoor depth estimation
Abstract
Depth estimation is a crucial task for autonomous systems because it provides information about the distance between the system and its surroundings. Traditionally, Light Detection and Ranging (LiDAR) sensors and stereo cameras have been used for distance measurement, despite their significant cost. Monocular cameras are a more cost-effective alternative, but they lack inherent depth information. The synergy of big data and deep learning has led to various advanced architectures for monocular depth estimation. However, because monocular depth estimation is an ill-posed problem, we incorporate Attention Gates (AG) within an encoder-decoder architecture. This helps prevent pattern recognition failures caused by objects of different sizes that share identical depth values. Our research involves evaluating popular pretrained architectures, assessing the impact of using AG, and creating effective head blocks to tackle depth estimation challenges. Notably, our approach improves the evaluation metrics on the DIODE dataset, positioning Attention U-Net as a promising solution. Using Attention U-Net for monocular depth estimation on low-cost autonomous systems could therefore reduce cost relative to measuring distance with LiDAR or stereo cameras.

Repository: https://github.com/KrisnaPinasthika/Deciphering-Pixel-Insights
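
The abstract does not spell out how the Attention Gates are formulated, so the following is only a minimal sketch of the additive attention gate commonly associated with Attention U-Net, not the authors' implementation. It assumes the gating signal from the decoder has already been resampled to the spatial size of the encoder skip features, treats each 1x1 convolution as a per-pixel linear map over channels, and omits bias terms. All names (`attention_gate`, `w_x`, `w_g`, `psi`) and the random weights are hypothetical and chosen for illustration.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def attention_gate(skip, gate, w_x, w_g, psi):
    """Additive attention gate (illustrative sketch, biases omitted).

    skip: encoder skip features, shape (C_x, H, W)
    gate: decoder gating signal, shape (C_g, H, W), same H and W as skip
    w_x:  weights of a 1x1 conv on skip, shape (C_int, C_x)
    w_g:  weights of a 1x1 conv on gate, shape (C_int, C_g)
    psi:  weights of a 1x1 conv to one channel, shape (C_int,)
    Returns the gated skip features and the attention map alpha of shape (H, W).
    """
    # A 1x1 convolution is a linear map applied to the channel vector of each pixel.
    theta = np.einsum("ic,chw->ihw", w_x, skip)
    phi = np.einsum("ic,chw->ihw", w_g, gate)
    # Combine both projections additively and apply ReLU.
    f = np.maximum(theta + phi, 0.0)
    # Collapse to a single channel and squash to [0, 1] to get per-pixel weights.
    alpha = sigmoid(np.einsum("i,ihw->hw", psi, f))
    # Scale every skip channel by the same spatial attention map.
    return skip * alpha[None, :, :], alpha


# Hypothetical shapes and random weights, for illustration only.
rng = np.random.default_rng(0)
skip = rng.normal(size=(8, 4, 4))   # C_x = 8
gate = rng.normal(size=(16, 4, 4))  # C_g = 16
w_x = rng.normal(size=(4, 8))       # C_int = 4
w_g = rng.normal(size=(4, 16))
psi = rng.normal(size=4)

gated, alpha = attention_gate(skip, gate, w_x, w_g, psi)
print(gated.shape, alpha.shape)
```

In an encoder-decoder network, the gated features would typically replace the raw skip connection before concatenation in the decoder, so that regions the gating signal deems irrelevant are suppressed.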