Heliyon (Jul 2024)

Integrating ensemble and machine learning models for early prediction of pneumonia mortality using laboratory tests

  • Seung Min Baik,
  • Kyung Sook Hong,
  • Jae-Myeong Lee,
  • Dong Jin Park

Journal volume & issue
Vol. 10, no. 14
p. e34525

Abstract

Background: The recent use of artificial intelligence (AI) in medical research is noteworthy, but most of this research has focused on medical imaging. Although clinicians acknowledge the importance of laboratory tests, these tests are undervalued in medical AI research. Our study aims to develop an AI model for early prediction of pneumonia mortality, primarily using laboratory test results.

Materials and methods: We developed a mortality prediction model using the initial laboratory results and basic clinical information of patients with pneumonia. Several machine learning (ML) models and one deep learning method, the multilayer perceptron (MLP), were selected for model development. The models were tuned to optimize the area under the receiver operating characteristic curve (AUROC) and the F1-score. In addition, an ensemble model was developed by blending several models to improve prediction performance. We used 80,940 data instances for model development.

Results: Among the ML models, XGBoost exhibited the best performance (AUROC = 0.8989, accuracy = 0.88, F1-score = 0.80). MLP achieved an AUROC of 0.8498, accuracy of 0.86, and F1-score of 0.75. The ensemble model performed best among the developed models, with an AUROC of 0.9006, accuracy of 0.90, and F1-score of 0.81. To identify risk factors that affect pneumonia mortality, we analyzed the input variables using the feature importance technique and SHapley Additive exPlanations. We identified several variables, including systolic blood pressure, serum glucose level, age, aspartate aminotransferase-to-alanine aminotransferase ratio, and monocyte-to-lymphocyte ratio, as significant predictors of mortality in patients with pneumonia.

Conclusions: Our study demonstrates that the ensemble model, incorporating XGBoost, CatBoost, and LGBM techniques, outperforms individual ML and deep learning models in predicting pneumonia mortality. Our findings emphasize the importance of integrating AI techniques to leverage laboratory test data effectively, offering a promising direction for advancing AI applications in medical research and clinical decision-making.
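As a rough illustration of what blending several models and scoring the result by AUROC and F1-score can look like, the sketch below averages the predicted mortality probabilities of three hypothetical models and evaluates the blend. The abstract does not state how the models were blended or weighted, so the equal-weight average, the 0.5 decision threshold, the function names, and the toy probabilities are all assumptions made for this example and are not the study's data or method.

```python
from typing import List, Optional, Sequence


def blend_probabilities(model_probs: Sequence[Sequence[float]],
                        weights: Optional[Sequence[float]] = None) -> List[float]:
    """Weighted average of per-model predicted probabilities (soft blending).

    Equal weights are an assumption; the source does not give a weighting scheme.
    """
    if weights is None:
        weights = [1.0] * len(model_probs)
    total = sum(weights)
    n_samples = len(model_probs[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, model_probs)) / total
        for i in range(n_samples)
    ]


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """F1-score after thresholding probabilities (the 0.5 threshold is an assumption)."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy labels (1 = died, 0 = survived) and made-up probabilities from three
# hypothetical models standing in for XGBoost, CatBoost, and LGBM.
y_true = [0, 0, 1, 1, 0, 1]
probs_a = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90]
probs_b = [0.20, 0.30, 0.60, 0.70, 0.50, 0.60]
probs_c = [0.30, 0.20, 0.70, 0.90, 0.10, 0.45]

blended = blend_probabilities([probs_a, probs_b, probs_c])

print("model A AUROC:", round(auroc(y_true, probs_a), 4))
print("blend AUROC:  ", round(auroc(y_true, blended), 4))
print("blend F1:     ", round(f1_score(y_true, blended), 4))
```

In this toy case each model misranks a different patient, so averaging cancels the individual errors. That is the general motivation for blending; it does not reproduce the size of the gain reported above.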

Keywords