Synergizing Deep Learning-Enabled Preprocessing and Human–AI Integration for Efficient Automatic Ground Truth Generation

Christopher Collazo; Ian Vargas; Brendon Cara; Carla J. Weinheimer; Ryan P. Grabau; Dmitry Goldgof; Lawrence Hall; Samuel A. Wickline; Hua Pan

doi:10.3390/bioengineering11050434

Bioengineering (Apr 2024)

Affiliations

Christopher Collazo: College of Engineering, University of South Florida, Tampa, FL 33620, USA
Ian Vargas: The Heart Institute, College of Medicine, University of South Florida, Tampa, FL 33602, USA
Brendon Cara: The Heart Institute, College of Medicine, University of South Florida, Tampa, FL 33602, USA
Carla J. Weinheimer: Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA
Ryan P. Grabau: The Heart Institute, College of Medicine, University of South Florida, Tampa, FL 33602, USA
Dmitry Goldgof: College of Engineering, University of South Florida, Tampa, FL 33620, USA
Lawrence Hall: College of Engineering, University of South Florida, Tampa, FL 33620, USA
Samuel A. Wickline: The Heart Institute, College of Medicine, University of South Florida, Tampa, FL 33602, USA
Hua Pan: Department of Medicine, Washington University in St. Louis, St. Louis, MO 63110, USA

DOI: https://doi.org/10.3390/bioengineering11050434
Journal volume & issue: Vol. 11, no. 5, p. 434

Abstract

Progress in incorporating deep learning into medical image interpretation has been greatly hindered by the tremendous cost and time associated with generating ground truth for supervised machine learning, alongside concerns about the inconsistent quality of the images acquired. Active learning offers a potential solution to the problem of expanding dataset ground truth by algorithmically choosing the most informative samples for ground truth labeling. Still, this effort incurs human labeling costs, which need to be minimized. Furthermore, automatic labeling approaches employing active learning often tend to overfit, selecting samples closely aligned with the training set distribution and excluding out-of-distribution samples that could potentially improve the model’s effectiveness. We propose that the majority of out-of-distribution instances can be attributed to inconsistencies across images. Since the FDA approved the first whole-slide image system for medical diagnosis in 2017, whole-slide images have provided enriched critical information to advance the field of automated histopathology. Here, we exemplify the benefits of a novel deep learning strategy that utilizes high-resolution whole-slide microscopic images. We quantitatively assess and visually highlight the inconsistencies within the whole-slide image dataset employed in this study. Accordingly, we introduce a deep learning-based preprocessing algorithm designed to normalize unknown samples to the training set distribution, effectively mitigating the overfitting issue. Consequently, our approach significantly increases the amount of automatic region-of-interest ground truth labeling on high-resolution whole-slide images using active deep learning. We accept 92% of the automatic labels generated for our unlabeled data cohort, expanding the labeled dataset by 845%. Additionally, we demonstrate expert time savings of 96% relative to manual expert ground-truth labeling.
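
The abstract describes active learning as algorithmically choosing the most informative samples for labeling but does not state which selection criterion is used. As a rough illustration only, the sketch below ranks unlabeled samples by predictive entropy, one common uncertainty criterion. This is an assumption for illustration, not the authors' method; the function names and the probability values are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_informative(predictions, k):
    """Return the indices of the k samples with the highest predictive entropy."""
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: entropy(predictions[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical softmax outputs for four unlabeled samples (made-up values,
# not taken from the paper).
predictions = [
    [0.98, 0.02],  # confident prediction, low entropy
    [0.55, 0.45],
    [0.70, 0.30],
    [0.50, 0.50],  # maximally uncertain for two classes
]

# Indices of the two samples a human expert would be asked to label first.
queried = select_most_informative(predictions, k=2)
```

Under this criterion, the samples the model is least sure about go to the expert first, while confident predictions are candidates for automatic labeling.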

Published in Bioengineering

ISSN: 2306-5354 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology; Science: Biology (General)
Website: https://www.mdpi.com/journal/bioengineering
