Scientific Reports (Sep 2023)

Confidence-aware self-supervised learning for dense monocular depth estimation in dynamic laparoscopic scene

  • Yasuhide Hirohata,
  • Maina Sogabe,
  • Tetsuro Miyazaki,
  • Toshihiro Kawase,
  • Kenji Kawashima

DOI
https://doi.org/10.1038/s41598-023-42713-x
Journal volume & issue
Vol. 13, no. 1
pp. 1–13

Abstract

This paper tackles the challenge of accurate depth estimation from monocular laparoscopic images in dynamic surgical environments. The lack of reliable ground truth, due to inconsistencies within these images, makes this a complex task. The presence of noise elements such as bleeding and smoke further complicates the learning process. We propose a model learning framework that trains on a generic laparoscopic surgery video dataset, aimed at achieving precise monocular depth estimation in dynamic surgical settings. The architecture employs binocular disparity confidence information as a self-supervisory signal, along with the disparity information from a stereo laparoscope. Our method ensures robust learning in the presence of outliers caused by tissue deformation, smoke, and surgical instruments by using a dedicated loss function. This function adjusts the selection and weighting of depth data for learning based on their given confidence. We trained the model on the Hamlyn Dataset and verified it with Hamlyn Dataset test data and a static dataset. The results show strong generalization performance and efficacy across varied scene dynamics, laparoscope types, and surgical sites.
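The confidence-weighted supervision described above can be sketched as follows. This is a minimal illustrative example, not the paper's actual loss: it assumes a per-pixel confidence map from stereo matching, discards pixels below a hypothetical confidence threshold (e.g. regions corrupted by smoke or instruments), and weights the remaining L1 depth residuals by their confidence.

```python
import numpy as np

def confidence_weighted_depth_loss(pred_depth, stereo_depth, confidence, threshold=0.5):
    """Illustrative confidence-weighted L1 depth loss.

    Pixels whose binocular-disparity confidence falls below `threshold`
    (e.g. due to smoke, bleeding, or instrument occlusion) are excluded;
    the remaining residuals are weighted by confidence so that reliable
    stereo depth dominates the self-supervisory signal.
    """
    mask = confidence >= threshold                  # keep only trustworthy pixels
    if not np.any(mask):
        return 0.0                                  # no reliable supervision in this frame
    residual = np.abs(pred_depth - stereo_depth)    # per-pixel L1 error vs. stereo depth
    weighted = confidence * residual                # down-weight uncertain pixels
    return float(weighted[mask].sum() / confidence[mask].sum())
```

In a real training pipeline this scalar would be computed per batch on GPU tensors, and the threshold and weighting scheme would follow the paper's specific formulation rather than this simple hard cutoff.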