Histology segmentation using active learning on regions of interest in oral cavity squamous cell carcinoma

Jonathan Folmsbee; Lei Zhang; Xulei Lu; Jawaria Rahman; John Gentry; Brendan Conn; Marilena Vered; Paromita Roy; Ruta Gupta; Diana Lin; Shabnam Samankan; Pooja Dhorajiva; Anu Peter; Minhua Wang; Anna Israel; Margaret Brandwein-Weber; Scott Doyle

doi:10.1016/j.jpi.2022.100146

Journal of Pathology Informatics (Jan 2022)

Affiliations

Jonathan Folmsbee: Department of Pathology & Anatomical Sciences, University at Buffalo SUNY, Buffalo, NY, USA; Department of Biomedical Engineering, University at Buffalo SUNY, Buffalo, NY, USA; Corresponding author at: Jacobs School 955 Main Street, Room 4205, Pathology and Anatomical Sciences, Buffalo, NY 14203, USA
Lei Zhang: Department of Pathology & Anatomical Sciences, University at Buffalo SUNY, Buffalo, NY, USA
Xulei Lu: Icahn School of Medicine, The Mount Sinai Hospital, New York, NY, USA
Jawaria Rahman: Department of Pathology, Case Western University, Cleveland, OH, USA
John Gentry: Department of Pathology, Nebraska Medical Health System, Omaha, NE, USA
Brendan Conn: Department of Pathology, University of Edinburgh, Edinburgh, UK
Marilena Vered: Department of Oral Pathology, Oral Medicine and Maxillofacial Imaging, School of Dental Medicine, Tel Aviv University, Tel Aviv, Israel; Institute of Pathology, Sheba Medical Center, Tel Hashomer, Ramat Gan, Israel
Paromita Roy: Department of Pathology, Tata Memorial Cancer Center, Mumbai, India
Ruta Gupta: Department of Tissue Pathology and Diagnostic Oncology, NSW Health Pathology, Royal Prince Alfred Hospital and University of Sydney, Sydney, Australia
Diana Lin: Department of Pathology, The University of Alabama at Birmingham, Birmingham, AL, USA
Shabnam Samankan: Department of Pathology, George Washington University Hospital, Washington, DC, USA
Pooja Dhorajiva: Department of Oncologic Surgical Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
Anu Peter: Department of Pathology, University of Pennsylvania, Philadelphia, PA, USA
Minhua Wang: Department of Pathology, Yale University School of Medicine, New Haven, CT, USA
Anna Israel: Department of Anatomic Pathology, Robert J. Tomsich Pathology and Laboratory Medicine Institute, Cleveland Clinic, Cleveland, OH, USA
Margaret Brandwein-Weber: Icahn School of Medicine, The Mount Sinai Hospital, New York, NY, USA
Scott Doyle: Department of Pathology & Anatomical Sciences, University at Buffalo SUNY, Buffalo, NY, USA; Department of Biomedical Engineering, University at Buffalo SUNY, Buffalo, NY, USA

Journal volume & issue: Vol. 13, p. 100146

Abstract

In digital pathology, deep learning has been shown to have a wide range of applications, from cancer grading to segmenting structures like glomeruli. One of the main hurdles for digital pathology to be truly effective is the size of the dataset needed to generalize across the spectrum of possible morphologies: small datasets limit classifiers’ ability to generalize. Yet larger datasets of whole slide images (WSIs) of tissue may cause network bottlenecks, as each WSI at its original magnification can be upwards of 100 000 by 100 000 pixels and over a gigabyte in file size. Compounding this problem, high-quality pathologist annotations are difficult to obtain, as the volume of annotations needed to create a classifier that can generalize would be extremely costly in terms of pathologist-hours. In this work, we use Active Learning (AL), a process for iterative interactive training, to create a modified U-net classifier on the region of interest (ROI) scale. We then compare this to Random Learning (RL), where images added to the dataset for retraining are randomly selected. Our hypothesis is that AL shows benefits for generating segmentation results versus randomly selecting images to annotate. We show that after 3 iterations, AL, with an average Dice coefficient of 0.461, outperforms RL, with an average Dice coefficient of 0.375, by 0.086.
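
As a rough illustration of the two ideas the abstract relies on, the sketch below computes a Dice coefficient between binary masks and contrasts an uncertainty-driven selection step (AL) with a random one (RL). The abstract does not state which query criterion the authors used, so the mean per-pixel entropy score here is an assumption, and all function names (`dice_coefficient`, `mean_entropy`, `select_active`, `select_random`) are hypothetical.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    intersection = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    if total == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return float(2.0 * intersection / total)


def mean_entropy(prob_map, eps=1e-12):
    """Mean per-pixel entropy of a (classes, H, W) class-probability map."""
    p = np.clip(np.asarray(prob_map, dtype=float), eps, 1.0)
    return float(-(p * np.log(p)).sum(axis=0).mean())


def select_active(prob_maps, k):
    """AL step (assumed criterion): indices of the k most uncertain ROIs."""
    scores = np.array([mean_entropy(p) for p in prob_maps])
    return np.argsort(scores)[::-1][:k].tolist()


def select_random(n_pool, k, rng):
    """RL step: k distinct ROI indices drawn uniformly from the pool."""
    return rng.choice(n_pool, size=k, replace=False).tolist()


def constant_map(p_foreground, shape=(4, 4)):
    """Toy two-class probability map with the same probabilities at every pixel."""
    fg = np.full(shape, p_foreground)
    return np.stack([1.0 - fg, fg])


# Toy pool of three ROIs: confident, maximally uncertain, moderately uncertain.
pool = [constant_map(0.01), constant_map(0.5), constant_map(0.2)]
al_pick = select_active(pool, k=1)
rl_pick = select_random(len(pool), k=2, rng=np.random.default_rng(0))
toy_dice = dice_coefficient([[1, 1], [0, 0]], [[1, 0], [0, 0]])
```

In a full loop, the selected ROIs would be sent for pathologist annotation, added to the training set, and the network retrained before the next iteration; the reported gap is simply the difference of the two averages, 0.461 - 0.375 = 0.086.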

Published in Journal of Pathology Informatics

ISSN: 2229-5089 (Print); 2153-3539 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Medicine: Pathology
Website: https://www.journals.elsevier.com/journal-of-pathology-informatics
