Machine learning-assisted prediction of pneumonia based on non-invasive measures

Clement Yaw Effah; Ruoqi Miao; Emmanuel Kwateng Drokow; Clement Agboyibor; Ruiping Qiao; Yongjun Wu; Lijun Miao; Yanbin Wang

doi:10.3389/fpubh.2022.938801

Frontiers in Public Health (Jul 2022)

Affiliations

Clement Yaw Effah: College of Public Health, Zhengzhou University, Zhengzhou, China
Ruoqi Miao: College of Public Health, Zhengzhou University, Zhengzhou, China
Emmanuel Kwateng Drokow: Department of Radiation Oncology, Zhengzhou University People's Hospital, Henan Provincial People's Hospital, Zhengzhou, China
Clement Agboyibor: School of Pharmaceutical Sciences, Zhengzhou University, Zhengzhou, China
Ruiping Qiao: Department of Respiratory and Critical Care Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
Yongjun Wu: College of Public Health, Zhengzhou University, Zhengzhou, China
Lijun Miao: Department of Respiratory and Critical Care Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
Yanbin Wang: Center of Health Management, General Hospital of Anyang Iron and Steel Group Co., Ltd, Anyang, China

DOI: https://doi.org/10.3389/fpubh.2022.938801
Journal volume & issue: Vol. 10

Abstract

Background: Pneumonia is an infection of the lungs that is characterized by high morbidity and mortality. The use of machine learning systems to detect respiratory diseases via non-invasive measures such as physical and laboratory parameters is gaining momentum and has been proposed to decrease the diagnostic uncertainty associated with bacterial pneumonia. In this study, we conducted several experiments using eight machine learning models to predict pneumonia based on biomarkers, laboratory parameters, and physical features.

Methods: We performed machine-learning analysis on 535 patients, each with 45 features. Data normalization was performed to rescale all real-valued features. Since this is a binary classification problem, we categorized each patient into one class at a time. We designed three experiments to evaluate the models: (1) feature selection techniques to select appropriate features for the models, (2) experiments on the imbalanced original dataset, and (3) experiments on the SMOTE (synthetic minority oversampling technique) data. We then compared eight machine learning models to evaluate their effectiveness in predicting pneumonia.

Results: Biomarkers such as C-reactive protein and procalcitonin demonstrated the most significant discriminating power. Ensemble machine learning models such as random forest (RF; accuracy = 92.0%, precision = 91.3%, recall = 96.0%, F1-score = 93.6%) and XGBoost (accuracy = 90.8%, precision = 92.6%, recall = 92.3%, F1-score = 92.4%) achieved the highest performance on the original dataset, with AUCs of 0.96 and 0.97, respectively. On the SMOTE dataset, RF and XGBoost achieved the highest prediction results, with F1-scores of 92.0% and 91.2%, respectively. An AUC of 0.97 was achieved for both the RF and XGBoost models.

Conclusions: Our models showed that, in the diagnosis of pneumonia, individual clinical history, laboratory indicators, and symptoms do not have adequate discriminatory power. We can also conclude that the ensemble ML models performed better in this study.
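The preprocessing and scoring steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code. The abstract says only that real-valued features were rescaled, so min-max scaling is an assumption here. The SMOTE step shows just the core interpolation between a minority sample and a neighbor that the caller supplies (a full implementation would pick that neighbor among the k nearest minority samples). All function names are hypothetical. The last part recomputes the F1-score from the precision and recall reported for the original dataset.

```python
import random


def min_max_scale(column):
    """Rescale a list of real values to [0, 1] (assumed normalization scheme)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def smote_point(sample, neighbor, rng):
    """Create one synthetic minority sample on the segment between two real ones."""
    gap = rng.random()  # uniform in [0, 1)
    return [s + gap * (n - s) for s, n in zip(sample, neighbor)]


def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported in the abstract (original dataset), as fractions.
reported = {
    "RF": (0.913, 0.960),
    "XGBoost": (0.926, 0.923),
}
f1_percent = {
    name: round(100 * f1_score(p, r), 1) for name, (p, r) in reported.items()
}
print(f1_percent)  # {'RF': 93.6, 'XGBoost': 92.4}
```

The recomputed values agree with the F1-scores stated in the abstract (93.6% for RF and 92.4% for XGBoost), which is a quick consistency check on the reported metrics.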

Published in Frontiers in Public Health

ISSN: 2296-2565 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Medicine: Public aspects of medicine
Website: https://www.frontiersin.org/journals/public-health
