Performance of Automated Machine Learning in Predicting Outcomes of Pneumatic Retinopexy

Arina Nisanova, BA; Arefeh Yavary, MSc; Jordan Deaner, MD; Ferhina S. Ali, MD, MPH; Priyanka Gogte, MD; Richard Kaplan, MD; Kevin C. Chen, MD; Eric Nudleman, MD, PhD; Dilraj Grewal, MD; Meenakashi Gupta, MD; Jeremy Wolfe, MD; Michael Klufas, MD; Glenn Yiu, MD, PhD; Iman Soltani, PhD; Parisa Emami-Naeini, MD, MPH

doi:10.1016/j.xops.2024.100470

Ophthalmology Science (Sep 2024)


Affiliations

Arina Nisanova, BA: School of Medicine, University of California Davis, Davis, California
Arefeh Yavary, MSc: Department of Computer Science, University of California Davis, Davis, California
Jordan Deaner, MD: Mid Atlantic Retina, Wills Eye Hospital, Philadelphia, Pennsylvania
Ferhina S. Ali, MD, MPH: New York Medical College, Valhalla, New York
Priyanka Gogte, MD: Associated Retinal Consultants, Royal Oak, Michigan
Richard Kaplan, MD: New York Eye and Ear Infirmary of Mount Sinai, New York, New York
Kevin C. Chen, MD: Vantage Eye Center, Salinas, California
Eric Nudleman, MD, PhD: Shiley Eye Center, University of California San Diego, La Jolla, California
Dilraj Grewal, MD: Eye Center, Duke University, Durham, North Carolina
Meenakashi Gupta, MD: New York Eye and Ear Infirmary of Mount Sinai, New York, New York
Jeremy Wolfe, MD: Associated Retinal Consultants, Royal Oak, Michigan
Michael Klufas, MD: Wills Eye Hospital, Thomas Jefferson University, Philadelphia, Pennsylvania
Glenn Yiu, MD, PhD: Tschannen Eye Institute, University of California Davis, Sacramento, California
Iman Soltani, PhD: Department of Mechanical and Aerospace Engineering, University of California Davis, Davis, California; Correspondence: Iman Soltani-Bozchalooi, PhD, 1 Shields Ave, Davis, CA 95616.
Parisa Emami-Naeini, MD, MPH: Tschannen Eye Institute, University of California Davis, Sacramento, California; Correspondence: Parisa Emami-Naeini, MD, MPH, 4860 Y Street, Suite 2400, Sacramento, CA 95817.

DOI: https://doi.org/10.1016/j.xops.2024.100470
Journal volume & issue: Vol. 4, no. 5
p. 100470

Abstract


Purpose: Automated machine learning (AutoML) has emerged as a tool that enables medical professionals without coding experience to develop predictive models for treatment outcomes. This study evaluated the performance of AutoML tools in developing models that predict the success of pneumatic retinopexy (PR) in the treatment of rhegmatogenous retinal detachment (RRD). These models were then compared with custom models created by machine learning (ML) experts.

Design: Retrospective multicenter study.

Participants: Five hundred thirty-nine consecutive patients with primary RRD who underwent PR by a vitreoretinal fellow at 6 training hospitals between 2002 and 2022.

Methods: We used 2 AutoML platforms: MATLAB Classification Learner and Google Cloud AutoML. Additional models were developed by computer scientists. We included patient demographics and baseline characteristics, including lens and macula status, RRD size, number and location of breaks, presence of vitreous hemorrhage and lattice degeneration, and physicians’ experience. The dataset was split into a training set (n = 483) and a test set (n = 56). The training set, with a 2:1 success-to-failure ratio, was used to train the MATLAB models. Because Google Cloud AutoML requires a minimum of 1000 samples, the training set was tripled to create a new set with 1449 datapoints. Additionally, balanced datasets with a 1:1 success-to-failure ratio were created using Python.

Main Outcome Measures: Single-procedure anatomic success rate, as predicted by the ML models. F2 scores and area under the receiver operating characteristic curve (AUROC) were used as primary metrics to compare models.

Results: The best-performing AutoML model (F2 score: 0.85; AUROC: 0.90; MATLAB) showed performance comparable to that of the custom model (F2 score: 0.92; AUROC: 0.86) when trained on the balanced datasets. However, training the AutoML model with imbalanced data yielded a misleadingly high AUROC (0.81) despite a low F2 score (0.2) and low sensitivity (0.17).

Conclusions: We demonstrated the feasibility of using AutoML as an accessible tool for medical professionals to develop models from clinical data. Such models can ultimately aid clinical decision-making, contributing to better patient outcomes. However, results can be misleading or unreliable if the tools are used naively. Limitations exist, particularly if datasets contain missing variables or are highly imbalanced. Proper model selection and data preprocessing can improve the reliability of AutoML tools.

Financial Disclosure(s): Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
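
The class balancing, dataset tripling, and F2 metric mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the 1:1 datasets were produced, so random oversampling of the minority class is an assumption here, as are the helper names `fbeta` and `oversample_to_balance`, the 322/161 class counts (chosen only to give a 2:1 ratio over 483 rows), and the confusion-matrix counts, which are hypothetical and not study data.

```python
import random
from collections import Counter


def fbeta(tp, fp, fn, beta=2.0):
    """F-beta from confusion-matrix counts.

    With beta=2, false negatives (missed positives) cost four times as much
    as false positives, so low sensitivity drags the score down sharply.
    """
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def oversample_to_balance(rows, label_of, seed=0):
    """Duplicate randomly chosen minority rows until all classes are equal in size.

    Assumption: this is one common way to reach a 1:1 ratio; the study's
    actual balancing method is not described in the abstract.
    """
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        balanced.extend(rng.choices(members, k=target - len(members)))
    rng.shuffle(balanced)
    return balanced


# Illustrative training set: 483 rows at a 2:1 success-to-failure ratio.
# Each row is (patient_index, label), with 1 = success and 0 = failure.
train = [(i, 1) for i in range(322)] + [(i, 0) for i in range(322, 483)]

# Tripling the training rows gives 483 * 3 = 1449 datapoints.
tripled = train * 3

# Balance to 1:1 by oversampling the smaller class (assumed method).
balanced = oversample_to_balance(train, label_of=lambda row: row[1])
class_counts = Counter(label for _, label in balanced)

# Hypothetical counts: a model that finds only 2 of 12 positive cases.
sensitivity = 2 / (2 + 10)
f2 = fbeta(tp=2, fp=1, fn=10)

print(len(tripled), dict(class_counts))
print(round(sensitivity, 2), round(f2, 2))
```

Duplicating rows, whether by tripling or by oversampling, adds no new information, so in a sketch like this it should be applied to the training set only; duplicated rows that reach the test set would inflate the measured performance.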

Published in Ophthalmology Science

ISSN: 2666-9145 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Medicine: Ophthalmology
Website: https://www.journals.elsevier.com/ophthalmology-science/
