Journal of Global Antimicrobial Resistance (Dec 2024)
Machine Learning for Community-Acquired Pneumonia Diagnosis Using Routine Clinical and Laboratory Data
Abstract
BACKGROUND: Community-acquired pneumonia (CAP) is diagnosed from clinical information, laboratory tests, and chest imaging. However, chest radiography is often unavailable in primary care, which leads to variability in clinical diagnosis. This study aimed to develop a machine learning model that diagnoses CAP using only clinical and laboratory data. METHODS: This study included patients who presented with fever and respiratory symptoms to the outpatient clinic or emergency room of a tertiary care center between 2009 and 2018. A total of 10,707 adult patients were randomly divided into training (70%) and test (30%) sets. For internal validation, the model was further evaluated on 1,364 patients who visited the same institution between August 2019 and December 2020. The performance of the machine learning models was measured using the area under the receiver operating characteristic curve (AUROC). RESULTS: Among the algorithms tested, eXtreme Gradient Boosting (XGBoost) achieved the highest AUROC in the test set (0.936, 95% CI: 0.924-0.947), followed by gradient boosting (0.931, 95% CI: 0.919-0.943) and random forest (0.926, 95% CI: 0.912-0.938). The most important predictors of pneumonia were the presence of cough, crackles on lung auscultation, and C-reactive protein (CRP) level. In the validation set, XGBoost achieved an AUROC of 0.919 (95% CI: 0.886-0.933), with a sensitivity of 82.30%, specificity of 88.92%, and accuracy of 87.90%. CONCLUSIONS: The machine learning model diagnosed CAP accurately, indicating its potential to support diagnosis in primary care settings without relying on chest imaging.
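The reported metrics (AUROC with a 95% CI, sensitivity, specificity, and accuracy) can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not state how the confidence intervals were obtained or which probability cutoff was used, so the percentile bootstrap and the 0.5 threshold below are assumptions, and the function names and toy data are hypothetical.

```python
import random


def auroc(y_true, y_score):
    """AUROC via the rank-sum (Mann-Whitney) identity; tied scores share an average rank."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative case")
    pairs = sorted(zip(y_score, y_true))
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        # Items i..j-1 hold 1-based ranks i+1..j; their mean rank is (i + 1 + j) / 2.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def bootstrap_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUROC (assumed method; not stated in the abstract)."""
    auroc(y_true, y_score)  # fails early if only one class is present
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[k] for k in idx]
        if 0 < sum(yt) < n:  # skip resamples that contain a single class
            stats.append(auroc(yt, [y_score[k] for k in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def classification_metrics(y_true, y_score, threshold=0.5):
    """Sensitivity, specificity, and accuracy at a probability cutoff (0.5 is an assumption)."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }


# Toy labels (1 = pneumonia) and predicted probabilities; made up for illustration only.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))
print(classification_metrics(toy_labels, toy_scores))
```

In practice the scores would come from a fitted classifier applied to the held-out test or validation cohort, and an established library routine would normally replace the hand-written AUROC function.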