PLOS Digital Health (Jun 2023)
AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning
Abstract
Diagnostic and prognostic models are increasingly important in medicine and inform many clinical decisions. Recently, machine learning approaches have shown improvement over conventional modeling techniques by better capturing complex interactions between patient covariates in a data-driven manner. However, the use of machine learning introduces technical and practical challenges that have thus far restricted widespread adoption of such techniques in clinical settings. To address these challenges and empower healthcare professionals, we present an open-source machine learning framework, AutoPrognosis 2.0, to facilitate the development of diagnostic and prognostic models. AutoPrognosis leverages state-of-the-art advances in automated machine learning to develop optimized machine learning pipelines, incorporates model explainability tools, and enables deployment of clinical demonstrators, without requiring significant technical expertise. To demonstrate AutoPrognosis 2.0, we provide an illustrative application where we construct a prognostic risk score for diabetes using the UK Biobank, a prospective study of 502,467 individuals. The models produced by our automated framework achieve greater discrimination for diabetes than expert clinical risk scores. We have implemented our risk score as a web-based decision support tool, which can be publicly accessed by patients and clinicians. By open-sourcing our framework as a tool for the community, we aim to provide clinicians and other medical practitioners with an accessible resource to develop new risk scores, personalized diagnostics, and prognostics using machine learning techniques.
Software: https://github.com/vanderschaarlab/AutoPrognosis
Author summary
Previous studies have reported promising applications of machine learning (ML) approaches in healthcare. However, there remain significant challenges to using ML for diagnostic and prognostic modeling, particularly for non-ML experts, that currently prevent broader adoption of these approaches. We developed an open-source tool, AutoPrognosis 2.0, to address these challenges and make modern statistical and machine learning methods available to expert and non-expert ML users. AutoPrognosis configures and optimizes ML pipelines using automated machine learning to develop powerful predictive models, while also providing interpretability methods to allow users to understand and debug these models. This study illustrates the application of AutoPrognosis to diabetes risk prediction using data from UK Biobank. The risk score developed using AutoPrognosis outperforms existing risk scores and has been implemented as a web-based decision support tool that can be publicly accessed by patients and clinicians. This study suggests that AutoPrognosis 2.0 can be used by healthcare experts to create new clinical tools and predictive pipelines across various clinical outcomes, employing advanced machine learning techniques.
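As an illustration of what "greater discrimination" means when two risk scores are compared, the sketch below computes the area under the ROC curve (AUROC, also called the C-statistic) from first principles. It assumes discrimination is summarized by AUROC, which this section does not state explicitly. The function name `auroc`, the labels, and both sets of scores are hypothetical values invented for the example; they are not UK Biobank data, and the code does not use the AutoPrognosis API.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a randomly chosen case (label 1) receives a higher
    risk score than a randomly chosen non-case (label 0); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0   # case ranked above non-case
            elif p == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(pos) * len(neg))


# Hypothetical toy data: 1 = developed the outcome, 0 = did not.
labels = [0, 0, 1, 0, 1, 1]
baseline_scores = [0.2, 0.6, 0.5, 0.3, 0.4, 0.9]   # stand-in for an existing risk score
candidate_scores = [0.1, 0.4, 0.5, 0.2, 0.7, 0.8]  # stand-in for a newly fitted model

print(f"baseline AUROC:  {auroc(labels, baseline_scores):.3f}")
print(f"candidate AUROC: {auroc(labels, candidate_scores):.3f}")
```

An AUROC of 0.5 corresponds to ranking cases and non-cases no better than chance, and 1.0 to ranking every case above every non-case. The pairwise loop is quadratic in the number of individuals, so for a cohort the size of UK Biobank a rank-based implementation from an established library would be the practical choice.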