A Review of Benchmark Datasets and Training Loss Functions in Neural Depth Estimation

Faisal Khan; Shahid Hussain; Shubhajit Basak; Mohamed Moustafa; Peter Corcoran

doi:10.1109/ACCESS.2021.3124978

IEEE Access (Jan 2021)

Affiliations

Faisal Khan: Department of Electronic Engineering, College of Science and Engineering, National University of Ireland Galway, Galway, Ireland
Shahid Hussain: Data Science Institute, National University of Ireland Galway, Galway, Ireland
Shubhajit Basak: School of Computer Science, National University of Ireland Galway, Galway, Ireland
Mohamed Moustafa: Department of Electronic Engineering, College of Science and Engineering, National University of Ireland Galway, Galway, Ireland
Peter Corcoran: Department of Electronic Engineering, College of Science and Engineering, National University of Ireland Galway, Galway, Ireland

DOI: https://doi.org/10.1109/ACCESS.2021.3124978
Journal volume & issue: Vol. 9, pp. 148479–148503

Abstract

Depth estimation from images is a fundamentally ill-posed problem that arises in many applications, such as robotic perception, scene understanding, augmented reality, 3D reconstruction, and medical image analysis. The success of depth estimation models relies on assembling a suitably large and diverse training dataset and on selecting appropriate loss functions. Researchers in this field therefore need to be aware of the wide range of publicly available depth datasets and of the properties of the loss functions that have been applied to depth estimation. Selecting the right training data together with appropriate loss functions will accelerate new research and enable better comparison with the state of the art. Accordingly, this work offers a comprehensive review of available depth datasets and of the loss functions applied in this problem domain. The depth datasets are grouped into five primary categories based on their application: (i) people detection and action recognition, (ii) faces and facial pose, (iii) perception-based navigation (e.g., street signs, roads), (iv) object and scene recognition, and (v) medical applications. The important characteristics and properties of each depth dataset are described and compared. A mixing strategy for depth datasets is presented in order to generalise model results across different environments and use cases. Furthermore, loss functions that can help with training deep learning depth estimation models across different datasets are discussed. Evaluations of state-of-the-art deep learning-based depth estimation methods are presented for three of the most popular datasets. Finally, challenges and future research directions are discussed, along with recommendations for building comprehensive depth datasets, to help researchers select appropriate datasets and loss functions for evaluating their results and algorithms.

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
