Prediction of Bacteremia Based on 12-Year Medical Data Using a Machine Learning Approach: Effect of Medical Data by Extraction Time

Kyoung Hwa Lee; Jae June Dong; Subin Kim; Dayeong Kim; Jong Hoon Hyun; Myeong-Hun Chae; Byeong Soo Lee; Young Goo Song

doi:10.3390/diagnostics12010102

Diagnostics (Jan 2022)

Affiliations

Kyoung Hwa Lee: Division of Infectious Diseases, Department of Internal Medicine, Yonsei University College of Medicine, Seoul 06273, Korea
Jae June Dong: Department of Family Medicine, Yonsei University College of Medicine, Seoul 06273, Korea
Subin Kim: Division of Infectious Diseases, Department of Internal Medicine, Yonsei University College of Medicine, Seoul 06273, Korea
Dayeong Kim: Division of Infectious Diseases, Department of Internal Medicine, Yonsei University College of Medicine, Seoul 06273, Korea
Jong Hoon Hyun: Division of Infectious Diseases, Department of Internal Medicine, Yonsei University College of Medicine, Seoul 06273, Korea
Myeong-Hun Chae: Selvas Artificial Intelligence Incorporate, Seoul 08594, Korea
Byeong Soo Lee: Selvas Artificial Intelligence Incorporate, Seoul 08594, Korea
Young Goo Song: Division of Infectious Diseases, Department of Internal Medicine, Yonsei University College of Medicine, Seoul 06273, Korea

Journal volume & issue: Vol. 12, no. 1, p. 102

Abstract

Early detection of bacteremia is important for preventing antibiotic misuse. We therefore aimed to develop a clinically applicable bacteremia prediction model using machine learning. Data were extracted from the electronic medical records of two tertiary medical centers over a 12-year period. Multi-layer perceptron (MLP), random forest, and gradient boosting algorithms were applied. Clinical data obtained within 12 and 24 h of blood culture were analyzed and compared. Of 622,771 blood cultures, 38,752 episodes of bacteremia were identified. For the MLP with 128 hidden-layer nodes, the area under the receiver operating characteristic curve (AUROC) was 0.762 (95% confidence interval (CI): 0.7617–0.7623) for the 12-h data model and 0.753 (95% CI: 0.7520–0.7529) for the 24-h data model. In the subgroup analysis by causative pathogen, the AUROC was highest for Acinetobacter baumannii bacteremia, at 0.839 (95% CI: 0.8388–0.8394). Compared with primary bacteremia, sepsis caused by pneumonia had the highest AUROC. Predictive performance was better in younger age groups. Bacteremia prediction using machine learning appeared feasible for acute infectious diseases. The model was particularly suited to pneumonia caused by Acinetobacter baumannii. With the 24-h blood culture data, bacteremia could be predicted using only the values of the continuous variables.
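
The abstract reports each AUROC with a 95% CI but does not say how the intervals were computed. As an illustration only, the sketch below computes AUROC from the rank-sum (Mann-Whitney) identity and attaches a percentile bootstrap interval. The bootstrap is an assumption chosen for simplicity, not the authors' stated method, and the function names `auroc` and `bootstrap_ci` and the synthetic scores are hypothetical.

```python
import random


def auroc(labels, scores):
    """AUROC via the rank-sum identity; tied scores receive average ranks."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        # Extend j over the block of tied scores starting at i.
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(lab for _, lab in pairs[i:j])
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUROC (assumed method, for illustration)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[k] for k in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auroc(ys, [scores[k] for k in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


if __name__ == "__main__":
    # Synthetic data only: positives score slightly higher on average.
    rng = random.Random(42)
    y = [1 if rng.random() < 0.06 else 0 for _ in range(2000)]
    s = [rng.gauss(1.0 if label else 0.0, 1.0) for label in y]
    print("AUROC:", round(auroc(y, s), 3))
    print("95% CI:", bootstrap_ci(y, s))
```

The width of such an interval depends on the number of positive and negative samples, so the numbers printed for this small synthetic set are not comparable to those in the abstract.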

Published in Diagnostics

ISSN: 2075-4418 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Medicine: Medicine (General)
Website: http://www.mdpi.com/journal/diagnostics
