Machine Learning Model to Identify Sepsis Patients in the Emergency Department: Algorithm Development and Validation

Pei-Chen Lin; Kuo-Tai Chen; Huan-Chieh Chen; Md. Mohaimenul Islam; Ming-Chin Lin

doi:10.3390/jpm11111055

Journal of Personalized Medicine (Oct 2021)

Affiliations

Pei-Chen Lin: Graduate Institute of Biomedical Informatics, College of Medicine Science and Technology, Taipei Medical University, Taipei 106, Taiwan
Kuo-Tai Chen: Emergency Department, Chi-Mei Medical Center, Tainan 710, Taiwan
Huan-Chieh Chen: Department of Neurosurgery, Taipei Medical University-Wan Fang Hospital, Taipei 116, Taiwan
Md. Mohaimenul Islam: Graduate Institute of Biomedical Informatics, College of Medicine Science and Technology, Taipei Medical University, Taipei 106, Taiwan
Ming-Chin Lin: Graduate Institute of Biomedical Informatics, College of Medicine Science and Technology, Taipei Medical University, Taipei 106, Taiwan

Journal volume & issue: Vol. 11, no. 11, p. 1055

Abstract

Accurate stratification of sepsis can effectively guide the triage of patient care and shared decision making in the emergency department (ED). However, previous research on sepsis identification models focused mainly on ICU patients, and discrepancies in model performance between the development and external validation datasets are rarely evaluated. The aim of our study was to develop and externally validate a machine learning (ML) model to stratify sepsis patients in the ED. We retrospectively collected clinical data from two geographically separate institutes that provided different levels of care at different time periods. The Sepsis-3 criteria were used as the reference standard in both datasets for identifying true sepsis cases. An eXtreme Gradient Boosting (XGBoost) algorithm was developed to stratify sepsis patients, and the performance of the model was compared with traditional clinical sepsis tools: quick Sequential Organ Failure Assessment (qSOFA) and Systemic Inflammatory Response Syndrome (SIRS). There were 8296 patients (of whom 1752 (21%) were septic) in the development dataset and 1744 patients (of whom 506 (29%) were septic) in the external validation dataset. The mortality of septic patients in the development and validation datasets was 13.5% and 17%, respectively. In the internal validation, XGBoost achieved an area under the receiver operating characteristic curve (AUROC) of 0.86, exceeding SIRS (0.68) and qSOFA (0.56). The performance of XGBoost deteriorated in the external validation (the AUROC of XGBoost, SIRS and qSOFA was 0.75, 0.57 and 0.66, respectively). Heterogeneity in patient characteristics, such as sepsis prevalence, severity, age, comorbidity and infection focus, could reduce model performance. Our model showed good discriminative capabilities for the identification of sepsis patients and outperformed the existing sepsis identification tools. Implementation of the ML model in the ED can facilitate timely sepsis identification and treatment. However, dataset discrepancies should be carefully evaluated before implementing the ML approach in clinical practice. This finding reinforces the necessity for future studies to perform external validation to ensure the generalisability of any developed ML approaches.
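
To make the comparison metric in the abstract concrete, the following is a minimal sketch of how a bedside score such as qSOFA can be evaluated by AUROC. It is not the authors' code. The qSOFA thresholds are the standard published criteria (respiratory rate of at least 22/min, systolic blood pressure of at most 100 mmHg, Glasgow Coma Scale below 15); the function names, the toy patients and the pairwise AUROC routine are illustrative assumptions, and the data are synthetic rather than taken from the study.

```python
from typing import Sequence


def qsofa_score(resp_rate: float, systolic_bp: float, gcs: int) -> int:
    """Return the qSOFA score (0-3): one point per criterion met."""
    return int(resp_rate >= 22) + int(systolic_bp <= 100) + int(gcs < 15)


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive case scores higher
    than a random negative case, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic toy patients (NOT study data):
# (respiratory rate, systolic BP, GCS, septic label)
toy_patients = [
    (24, 95, 14, 1),
    (18, 120, 15, 0),
    (22, 110, 15, 1),
    (16, 130, 15, 0),
    (20, 98, 15, 0),
]

labels = [p[3] for p in toy_patients]
scores = [qsofa_score(*p[:3]) for p in toy_patients]
print(auroc(labels, scores))
```

The same `auroc` function could be applied to the predicted probabilities of any classifier, which is how a model and a clinical score can be compared on one scale. The quadratic pairwise loop is chosen for readability; a rank-based implementation would be preferable for datasets of the size reported here.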

Published in Journal of Personalized Medicine

ISSN: 2075-4426 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Medicine
Website: http://www.mdpi.com/journal/jpm
