Using machine learning to model older adult inpatient trajectories from electronic health records data
Maria Herrero-Zazo,
Tomas Fitzgerald,
Vince Taylor,
Helen Street,
Afzal N. Chaudhry,
John R. Bradley,
Ewan Birney,
Victoria L. Keevil
Affiliations
Maria Herrero-Zazo
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK; Department of Medicine for the Elderly, Addenbrooke’s Hospital, Cambridge University Hospitals NHS Foundation Trust, Hills Road, Cambridge CB2 0QQ, UK
Tomas Fitzgerald
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
Vince Taylor
Cambridge Clinical Informatics, Addenbrooke’s Hospital, Cambridge University Hospitals NHS Foundation Trust, Hills Road, Cambridge CB2 0QQ, UK
Helen Street
Research and Development, Cambridge University Hospitals NHS Foundation Trust, Hills Road, Cambridge CB2 0QQ, UK
Afzal N. Chaudhry
Department of Medicine, University of Cambridge, Addenbrooke’s Hospital, Hills Road, Cambridge CB2 0QQ, UK; NIHR Cambridge Biomedical Research Centre, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK
John R. Bradley
Department of Medicine, University of Cambridge, Addenbrooke’s Hospital, Hills Road, Cambridge CB2 0QQ, UK; NIHR Cambridge Biomedical Research Centre, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK
Ewan Birney
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK; Corresponding author
Victoria L. Keevil
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK; Department of Medicine for the Elderly, Addenbrooke’s Hospital, Cambridge University Hospitals NHS Foundation Trust, Hills Road, Cambridge CB2 0QQ, UK; Department of Medicine, University of Cambridge, Addenbrooke’s Hospital, Hills Road, Cambridge CB2 0QQ, UK; Corresponding author
Summary: Electronic Health Records (EHR) data can provide novel insights into inpatient trajectories. Blood tests and vital signs from de-identified patients’ hospital admission episodes (AE) were represented as multivariate time-series (MVTS) and used to train unsupervised Hidden Markov Models (HMM), which assigned each AE day to one of 17 states. All HMM states were clinically interpreted based on their patterns of MVTS variables and their relationships with clinical information. Visualization differentiated patients progressing toward stable ‘discharge-like’ states from those remaining at risk of inpatient mortality (IM). Chi-square tests confirmed these relationships (two states were associated with IM; 12 states with ≥1 diagnosis). Logistic Regression and Random Forest (RF) models predicted IM better when trained with MVTS data than with HMM states, but performance was comparable (best RF model AUC-ROC: MVTS data = 0.85; HMM states = 0.79). Machine learning (ML) models extracted clinically interpretable signals from hospital data. The potential of ML to develop decision-support tools for EHR systems warrants investigation.
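As an illustration of the kind of association test mentioned in the summary, the sketch below computes a Pearson chi-square statistic for a 2x2 table relating one HMM state to inpatient mortality. It is a minimal sketch under stated assumptions, not the authors' analysis: the function name `chi_square_2x2`, the choice to count admission episodes that did or did not visit a state, the absence of a continuity correction, and the counts in the example are all hypothetical and chosen only to show the arithmetic.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table

        [[a, b],    e.g. episodes that visited the state: died / survived
         [c, d]]    e.g. episodes that never visited it:  died / survived

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one count")

    # Closed form of sum((observed - expected)^2 / expected) for a 2x2 table.
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

    # With 1 degree of freedom the statistic is a squared standard normal,
    # so the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts, for illustration only (not taken from the study).
stat, p_value = chi_square_2x2(30, 70, 10, 90)
print(f"chi-square = {stat:.2f}, p = {p_value:.5f}")
```

With 17 states and many diagnoses tested, an analysis of this kind would normally also need a multiple-testing correction; the summary does not say which, if any, was applied.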