BMJ Open (Aug 2019)
Training machine learning models to predict 30-day mortality in patients discharged from the emergency department: a retrospective, population-based registry study
Abstract
Objectives The aim of this work was to train machine learning models to identify patients at end of life with clinically meaningful diagnostic accuracy, using 30-day mortality in patients discharged from the emergency department (ED) as a proxy.

Design Retrospective, population-based registry study.

Setting Swedish health services.

Primary and secondary outcome measures All-cause 30-day mortality.

Methods Electronic health records (EHRs) and administrative data were used to train six supervised machine learning models to predict all-cause mortality within 30 days in patients discharged from EDs in southern Sweden, Europe.

Participants The models were trained using 65 776 ED visits and validated on 55 164 visits from a separate ED to which the models were not exposed during training.

Results The outcome occurred in 136 visits (0.21%) in the development set and in 83 visits (0.15%) in the validation set. The model with the highest discrimination attained ROC–AUC 0.95 (95% CI 0.93 to 0.96), with sensitivity 0.87 (95% CI 0.80 to 0.93) and specificity 0.86 (95% CI 0.86 to 0.86) on the validation set.

Conclusions Multiple models displayed excellent discrimination on the validation set and outperformed available indexes for short-term mortality prediction in terms of ROC–AUC (by indirect comparison). The practical utility of the models is increased because the data they were trained on did not require costly de novo collection but were real-world data generated as a by-product of routine care delivery.
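
The outcome rates reported in the Results can be reproduced from the visit counts with simple arithmetic. The sketch below does that and, as an illustration only, also derives the positive predictive value implied by the reported sensitivity and specificity at the validation-set outcome rate. The implied value is not a figure reported in the abstract; it assumes the two point estimates apply at the same operating threshold, and the function and variable names are hypothetical.

```python
# Counts and point estimates copied from the Results of the abstract.
DEV_VISITS, DEV_DEATHS = 65_776, 136
VAL_VISITS, VAL_DEATHS = 55_164, 83
SENSITIVITY, SPECIFICITY = 0.87, 0.86


def outcome_rate(events: int, visits: int) -> float:
    """Fraction of visits followed by the outcome (30-day all-cause death)."""
    return events / visits


def implied_ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule.

    Assumption (not stated in the abstract): sensitivity and specificity
    refer to the same decision threshold.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


dev_rate = outcome_rate(DEV_DEATHS, DEV_VISITS)
val_rate = outcome_rate(VAL_DEATHS, VAL_VISITS)
ppv = implied_ppv(SENSITIVITY, SPECIFICITY, val_rate)

print(f"Development outcome rate: {dev_rate:.2%}")
print(f"Validation outcome rate:  {val_rate:.2%}")
print(f"Implied PPV (illustrative): {ppv:.2%}")
```

Because the outcome is rare, the implied positive predictive value is low even with high sensitivity and specificity, which is worth keeping in mind when reading discrimination figures such as ROC–AUC on their own.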