Development and utility assessment of a machine learning bloodstream infection classifier in pediatric patients receiving cancer treatments

Lillian Sung; Conor Corbin; Ethan Steinberg; Emily Vettese; Aaron Campigotto; Loreto Lecce; George A. Tomlinson; Nigam Shah

doi:10.1186/s12885-020-07618-2

BMC Cancer (Nov 2020)

Affiliations

Lillian Sung: Division of Haematology/Oncology, The Hospital for Sick Children
Conor Corbin: Biomedical Informatics Research, Stanford University
Ethan Steinberg: Biomedical Informatics Research, Stanford University
Emily Vettese: Division of Haematology/Oncology, The Hospital for Sick Children
Aaron Campigotto: Division of Infectious Diseases, The Hospital for Sick Children
Loreto Lecce: Division of Neonatology, The Hospital for Sick Children
George A. Tomlinson: Department of Medicine, University Health Network
Nigam Shah: Biomedical Informatics Research, Stanford University

DOI: https://doi.org/10.1186/s12885-020-07618-2
Journal volume & issue: Vol. 20, no. 1
pp. 1 – 9

Abstract

Background: Objectives were to build a machine learning algorithm to identify bloodstream infection (BSI) among pediatric patients with cancer and hematopoietic stem cell transplantation (HSCT) recipients, and to compare this approach with the presence of neutropenia for identifying BSI.

Methods: We included patients 0–18 years of age at cancer diagnosis or HSCT between January 2009 and November 2018. Eligible blood cultures were those with no previous blood culture (regardless of result) within 7 days. The primary outcome was BSI. Four machine learning algorithms were used: elastic net, support vector machine, and two implementations of gradient boosting machine (GBM and XGBoost). Model training and evaluation were performed using temporally disjoint training (60%), validation (20%) and test (20%) sets. The best model was compared to neutropenia alone in the test set.

Results: Of 11,183 eligible blood cultures, 624 (5.6%) were positive. The best model in the validation set was GBM, which achieved an area under the receiver operating characteristic curve (AUROC) of 0.74 in the test set. Among the 2,236 blood cultures in the test set, the number of false positives was 508 for GBM vs. 592 for neutropenia, and the specificity was 0.76 vs. 0.72, respectively. Among 139 test set BSIs, six (4.3%) occurred in non-neutropenic patients and were identified by GBM. All six received antibiotics prior to culture result availability.

Conclusions: We developed a machine learning algorithm to classify BSI. GBM achieved an AUROC of 0.74 and identified 4.3% additional true cases in the test set. The machine learning algorithm did not perform substantially better than using the presence of neutropenia alone to predict BSI.
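The abstract's split and its specificity figures can be illustrated with a short Python sketch. This is not the authors' code: the record layout, the function names `temporal_split` and `specificity`, and the rule of cutting by rank order of culture date are assumptions made for illustration only. The arithmetic at the end uses only counts stated in the abstract (2,236 test cultures, 139 BSIs, 508 and 592 false positives) and reproduces the reported specificities under the assumption that every non-BSI culture in the test set counts as a negative.

```python
from datetime import date


def temporal_split(records, train_frac=0.6, valid_frac=0.2):
    """Split records into temporally disjoint train/validation/test sets.

    `records` is a list of (culture_date, label) tuples; this layout is
    hypothetical. The earliest cultures go to training and the latest to test.
    """
    ordered = sorted(records, key=lambda r: r[0])
    n = len(ordered)
    train_end = int(n * train_frac)
    valid_end = train_end + int(n * valid_frac)
    return ordered[:train_end], ordered[train_end:valid_end], ordered[valid_end:]


def specificity(false_positives, negatives):
    """Specificity = true negatives / all negatives = 1 - FP / negatives."""
    return 1 - false_positives / negatives


# Toy data (made up): one culture per year, deliberately out of order.
toy = [(date(2018 - i, 1, 1), 0) for i in range(10)]
train, valid, test = temporal_split(toy)

# Counts taken from the abstract.
test_size = 2236
test_bsi = 139
negatives = test_size - test_bsi  # cultures without BSI in the test set

spec_gbm = specificity(508, negatives)
spec_neutropenia = specificity(592, negatives)
extra_case_share = 6 / test_bsi  # BSIs in non-neutropenic patients found by GBM

print(f"GBM specificity: {spec_gbm:.2f}")
print(f"Neutropenia specificity: {spec_neutropenia:.2f}")
print(f"Additional true cases: {extra_case_share:.1%}")
```

Splitting by time rather than at random keeps later cultures out of training, so the test set behaves more like data the model would meet after deployment.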

Published in BMC Cancer

ISSN: 1471-2407 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Internal medicine: Neoplasms. Tumors. Oncology. Including cancer and carcinogens
Website: http://bmccancer.biomedcentral.com
