npj Imaging
(Jul 2024)
CohortFinder: an open-source tool for data-driven partitioning of digital pathology and imaging cohorts to yield robust machine-learning models
Fan Fan,
Georgia Martinez,
Thomas DeSilvio,
John Shin,
Yijiang Chen,
Jackson Jacobs,
Bangchen Wang,
Takaya Ozeki,
Maxime W. Lafarge,
Viktor H. Koelzer,
Laura Barisoni,
Anant Madabhushi,
Satish E. Viswanath,
Andrew Janowczyk
Affiliations
Fan Fan
Emory University and Georgia Institute of Technology, Department of Biomedical Engineering
Georgia Martinez
Case Western Reserve University, Department of Biomedical Engineering
Thomas DeSilvio
Case Western Reserve University, Department of Biomedical Engineering
John Shin
Case Western Reserve University, Department of Biomedical Engineering
Yijiang Chen
Case Western Reserve University, Department of Biomedical Engineering
Jackson Jacobs
Emory University and Georgia Institute of Technology, Department of Biomedical Engineering
Bangchen Wang
Duke University, Department of Pathology, Division of AI & Computational Pathology
Takaya Ozeki
University of Michigan, Department of Internal Medicine, Division of Nephrology
Maxime W. Lafarge
University Hospital of Zurich, University of Zurich, Department of Pathology and Molecular Pathology
Viktor H. Koelzer
University Hospital of Zurich, University of Zurich, Department of Pathology and Molecular Pathology
Laura Barisoni
Duke University, Department of Pathology, Division of AI & Computational Pathology
Anant Madabhushi
Emory University and Georgia Institute of Technology, Department of Biomedical Engineering
Satish E. Viswanath
Case Western Reserve University, Department of Biomedical Engineering
Andrew Janowczyk
Emory University and Georgia Institute of Technology, Department of Biomedical Engineering
DOI
https://doi.org/10.1038/s44303-024-00018-2
Journal volume & issue
Vol. 2, no. 1, pp. 1–7
Abstract
Batch effects (BEs) are systematic technical differences in data collection that are unrelated to biological variation; the noise they introduce has been shown to reduce the generalizability of machine learning (ML) models. Here we release CohortFinder (http://cohortfinder.com), an open-source tool aimed at mitigating BEs via data-driven cohort partitioning. We demonstrate that CohortFinder improves ML model performance in downstream digital pathology and medical image processing tasks. CohortFinder is freely available for download at cohortfinder.com.
Published in npj Imaging
ISSN
2948-197X (Online)
Publisher
Nature Portfolio
Country of publisher
United Kingdom
LCC subjects
Medicine: Medicine (General): Medical technology
Medicine: Medicine (General): Medical physics. Medical radiology. Nuclear medicine
Website
https://www.nature.com/npjimaging/