PLoS ONE (Jan 2019)

A validation of machine learning-based risk scores in the prehospital setting.

  • Douglas Spangler,
  • Thomas Hermansson,
  • David Smekal,
  • Hans Blomberg

DOI
https://doi.org/10.1371/journal.pone.0226518
Journal volume & issue
Vol. 14, no. 12
p. e0226518

Abstract

Read online

BACKGROUND:The triage of patients in prehospital care is a difficult task, and improved risk assessment tools are needed both at the dispatch center and on the ambulance to differentiate between low- and high-risk patients. This study validates a machine learning-based approach to generating risk scores based on hospital outcomes using routinely collected prehospital data. METHODS:Dispatch, ambulance, and hospital data were collected in one Swedish region from 2016-2017. Dispatch center and ambulance records were used to develop gradient boosting models predicting hospital admission, critical care (defined as admission to an intensive care unit or in-hospital mortality), and two-day mortality. Composite risk scores were generated based on the models and compared to National Early Warning Scores (NEWS) and actual dispatched priorities in a prospectively gathered dataset from 2018. RESULTS:A total of 38203 patients were included from 2016-2018. Concordance indexes (or areas under the receiver operating characteristics curve) for dispatched priorities ranged from 0.51-0.66, while those for NEWS ranged from 0.66-0.85. Concordance ranged from 0.70-0.79 for risk scores based only on dispatch data, and 0.79-0.89 for risk scores including ambulance data. Dispatch data-based risk scores consistently outperformed dispatched priorities in predicting hospital outcomes, while models including ambulance data also consistently outperformed NEWS. Model performance in the prospective test dataset was similar to that found using cross-validation, and calibration was comparable to that of NEWS. CONCLUSIONS:Machine learning-based risk scores outperformed a widely-used rule-based triage algorithm and human prioritization decisions in predicting hospital outcomes. Performance was robust in a prospectively gathered dataset, and scores demonstrated adequate calibration. Future research should explore the robustness of these methods when applied to other settings, establish appropriate outcome measures for use in determining the need for prehospital care, and investigate the clinical impact of interventions based on these methods.