Risk prediction of 30-day mortality after stroke using machine learning: a nationwide registry-based cohort study

Wenjuan Wang; Anthony G. Rudd; Yanzhong Wang; Vasa Curcin; Charles D. Wolfe; Niels Peek; Benjamin Bray

doi:10.1186/s12883-022-02722-1

BMC Neurology (May 2022)

Affiliations

Wenjuan Wang: School of Population Health & Environmental Sciences, Faculty of Life Science and Medicine, King’s College London
Anthony G. Rudd: School of Population Health & Environmental Sciences, Faculty of Life Science and Medicine, King’s College London
Yanzhong Wang: School of Population Health & Environmental Sciences, Faculty of Life Science and Medicine, King’s College London
Vasa Curcin: School of Population Health & Environmental Sciences, Faculty of Life Science and Medicine, King’s College London
Charles D. Wolfe: School of Population Health & Environmental Sciences, Faculty of Life Science and Medicine, King’s College London
Niels Peek: Division of Informatics, Imaging and Data Science, School of Health Sciences, University of Manchester
Benjamin Bray: School of Population Health & Environmental Sciences, Faculty of Life Science and Medicine, King’s College London

DOI: https://doi.org/10.1186/s12883-022-02722-1
Journal volume & issue: Vol. 22, no. 1
pp. 1 – 9

Abstract


Background: We aimed to develop and validate machine learning (ML) models for 30-day stroke mortality, for use in mortality risk stratification and as benchmarking models for quality improvement in stroke care.

Methods: Data from the UK Sentinel Stroke National Audit Program from 2013 to 2019 were used. Models were developed using XGBoost, Logistic Regression (LR), and LR with elastic net, with and without interaction terms, on 80% of randomly selected admissions from 2013 to 2018. They were validated on the remaining 20% of admissions and temporally validated on 2019 admissions. The models were developed with 30 variables. A reference model was developed using LR and 4 variables. Performance of all models was evaluated in terms of discrimination, calibration, reclassification, Brier scores and decision curves.

Results: In total, 488,497 stroke patients with a 12.3% 30-day mortality rate were included in the analysis. In the 2019 temporal validation set, the XGBoost model obtained the lowest Brier score (0.069 (95% CI: 0.068–0.071)) and the highest area under the ROC curve (AUC) (0.895 (95% CI: 0.891–0.900)), which outperformed the LR reference model by 0.04 AUC (p 15%). The XGBoost model reclassified 1648 (8.1%) cases rated low-risk by the LR reference model as moderate or high-risk, and it gained the most net benefit in decision curve analysis.

Conclusions: All models with 30 variables are potentially useful as benchmarking models in stroke-care quality improvement, with the ML model slightly outperforming the others.
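The two headline metrics reported in the abstract, the Brier score and the AUC, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `brier_score` and `roc_auc` and the toy outcomes and predicted probabilities are assumptions made up for illustration, and the AUC is computed by direct pairwise comparison, which is only practical for small samples.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome.

    Lower is better; 0 means perfect probabilistic predictions.
    """
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the probability that a randomly chosen positive case receives a
    higher predicted risk than a randomly chosen negative case (ties count half).
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration only (1 = died within 30 days); not study data.
y_true = [0, 0, 1, 0, 1, 0, 0, 1]
y_prob = [0.05, 0.10, 0.80, 0.20, 0.60, 0.02, 0.70, 0.40]

brier = brier_score(y_true, y_prob)
auc = roc_auc(y_true, y_prob)
print(f"Brier score: {brier:.3f}, AUC: {auc:.3f}")
```

The Brier score reflects both discrimination and calibration, whereas the AUC reflects only how well predicted risks rank patients, which is why the abstract reports the two separately.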

Published in BMC Neurology

ISSN: 1471-2377 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Internal medicine: Neurosciences. Biological psychiatry. Neuropsychiatry: Neurology. Diseases of the nervous system
Website: http://bmcneurol.biomedcentral.com
