Predicting individual patient and hospital-level discharge using machine learning

Jia Wei; Jiandong Zhou; Zizheng Zhang; Kevin Yuan; Qingze Gu; Augustine Luk; Andrew J. Brent; David A. Clifton; A. Sarah Walker; David W. Eyre

doi:10.1038/s43856-024-00673-x

Communications Medicine (Nov 2024)

Affiliations

Jia Wei: Nuffield Department of Medicine, University of Oxford
Jiandong Zhou: Nuffield Department of Medicine, University of Oxford
Zizheng Zhang: Big Data Institute, Nuffield Department of Population Health, University of Oxford
Kevin Yuan: Big Data Institute, Nuffield Department of Population Health, University of Oxford
Qingze Gu: Nuffield Department of Medicine, University of Oxford
Augustine Luk: Nuffield Department of Medicine, University of Oxford
Andrew J. Brent: Nuffield Department of Medicine, University of Oxford
David A. Clifton: Department of Engineering Science, University of Oxford
A. Sarah Walker: Nuffield Department of Medicine, University of Oxford
David W. Eyre: Big Data Institute, Nuffield Department of Population Health, University of Oxford

DOI: https://doi.org/10.1038/s43856-024-00673-x
Journal volume & issue: Vol. 4, no. 1, pp. 1–14

Abstract

Background: Accurately predicting hospital discharge events could help improve patient flow and the efficiency of healthcare delivery. However, using machine learning and diverse electronic health record (EHR) data for this task remains incompletely explored.

Methods: We used EHR data from February-2017 to January-2020 from Oxfordshire, UK to predict hospital discharges in the next 24 h. We fitted separate extreme gradient boosting models for elective and emergency admissions, trained on the first two years of data and tested on the final year of data. We examined individual-level and hospital-level model performance and evaluated the impact of training data size and recency, prediction time, and performance in subgroups.

Results: Our models achieve AUROCs of 0.87 and 0.86, AUPRCs of 0.66 and 0.64, and F1 scores of 0.61 and 0.59 for elective and emergency admissions, respectively. These models outperform a logistic regression model using the same features and are substantially better than a baseline logistic regression model with more limited features. Notably, the relative performance increase from adding additional features is greater than the increase from using a sophisticated model. Aggregating individual probabilities, daily total discharge estimates are accurate with mean absolute errors of 8.9% (elective) and 4.9% (emergency). The most informative predictors include antibiotic prescriptions, medications, and hospital capacity factors. Performance remains robust across patient subgroups and different training strategies, but is lower in patients with longer admissions and those who died in hospital.

Conclusions: Our findings highlight the potential of machine learning in optimising hospital patient flow and facilitating patient care and recovery.
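The hospital-level step described in the abstract, aggregating individual discharge probabilities into a daily total and scoring it against the observed count, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy probabilities, and the toy observed counts are all hypothetical, and the percentage error shown is only one plausible reading of the "mean absolute error" reported above.

```python
from collections import defaultdict


def expected_daily_discharges(predictions):
    """Sum per-patient discharge probabilities within each day.

    predictions: iterable of (day, probability) pairs, one per inpatient.
    Returns a dict mapping each day to the expected number of discharges.
    """
    totals = defaultdict(float)
    for day, prob in predictions:
        totals[day] += prob
    return dict(totals)


def mean_absolute_percentage_error(predicted, observed):
    """Average of |predicted - observed| / observed over days, in percent.

    Assumption: days with zero observed discharges are skipped to avoid
    division by zero; the paper may define its error differently.
    """
    days = [d for d in observed if observed[d] > 0]
    errors = [abs(predicted.get(d, 0.0) - observed[d]) / observed[d] for d in days]
    return 100.0 * sum(errors) / len(errors)


# Made-up illustrative data, not taken from the study.
toy_predictions = [
    ("day1", 0.9), ("day1", 0.6), ("day1", 0.5),  # expected total: 2.0
    ("day2", 0.75), ("day2", 0.5),                # expected total: 1.25
]
toy_observed = {"day1": 2, "day2": 1}

predicted_totals = expected_daily_discharges(toy_predictions)
error_pct = mean_absolute_percentage_error(predicted_totals, toy_observed)
print(predicted_totals, error_pct)
```

Summing probabilities gives the expected number of discharges if the per-patient probabilities are well calibrated, which is why individual-level calibration matters for the hospital-level estimate.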

Published in Communications Medicine

ISSN: 2730-664X (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Medicine
Website: https://www.nature.com/commsmed/
