Scientific Reports (Nov 2024)

Prediction of acute respiratory infections using machine learning techniques in Amhara Region, Ethiopia

  • Abdulaziz Kebede Kassaw,
  • Gashaw Bekele,
  • Ahmed Kebede Kassaw,
  • Ali Yimer

DOI
https://doi.org/10.1038/s41598-024-76847-3
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 15

Abstract

Many studies have shown that infectious diseases are responsible for the majority of deaths in children under five. Among these children, acute respiratory infections (ARI) are the most prevalent illness and cause of death worldwide, and they continue to be the leading cause of death in developing countries, including Ethiopia. Machine learning techniques were employed to identify the main factors contributing to acute respiratory infections in the Amhara regional state of Ethiopia. This study utilized data from the 2016 Ethiopian Demographic and Health Survey. Seven machine learning models (logistic regression, random forest, decision tree, gradient boosting, support vector machine, Naïve Bayes, and K-nearest neighbors) were used to predict acute respiratory infections and identify the factors influencing them. The performance of each model was assessed using receiver operating characteristic curves and other metrics. Among the seven models, the Random Forest algorithm demonstrated the highest accuracy in predicting acute respiratory infections, with an accuracy of 90.35% and an area under the curve of 94.80%. This was followed by the Decision Tree model with an accuracy of 88.69%, K-nearest neighbors with 86.35%, and Gradient Boosting with 82.69%. The Random Forest algorithm also exhibited positive and negative predictive values of 92.22% and 88.83%, respectively. Several factors were identified as significantly associated with ARI among children under five in the Amhara regional state, Ethiopia.
These factors included families with a poorer wealth status (log odds of 0.18) compared to their counterparts, families with four to six children (log odds of 0.1) compared to families with fewer than three living children, children without a history of diarrhea (log odds of -0.08), mothers who had an occupation (log odds of 0.06) compared to mothers who did not, children under six months of age (log odds of -0.05) compared to children older than six months, mothers with no education (log odds of 0.04) compared to mothers with primary education or higher, rural residents (log odds of 0.03) compared to non-rural residents, and families using wood as a cooking fuel (log odds of 0.03) compared to those using electricity. Through Shapley Additive exPlanations (SHAP) value analysis of the Random Forest algorithm, we identified significant risk factors for acute respiratory infections among children in the Amhara regional state of Ethiopia. The study found that the family’s wealth index, the number of children in the household, the mother’s occupation, the mother’s educational level, the type of residence, and the type of fuel used for cooking were all associated with acute respiratory infections. Additionally, the research emphasized the importance of children being free from diarrhea and living in households with fewer children as essential factors for improving children’s health outcomes in the Amhara regional state, Ethiopia.
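
For readers unfamiliar with the evaluation metrics named in the abstract (accuracy, positive and negative predictive values, and area under the ROC curve), the following is a minimal sketch of how they are defined. It is not the authors' code. The labels, scores, the 0.5 threshold, and the function names are all made up for illustration, and the AUC is computed with the pairwise-ranking definition rather than any particular library.

```python
# Illustrative only: toy labels and scores, NOT data from the study.
# 1 = child has ARI, 0 = child does not.

def summary_metrics(y_true, y_pred):
    """Accuracy, positive predictive value and negative predictive value."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
        "npv": tn / (tn + fn) if (tn + fn) else float("nan"),
    }

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical model output for eight children.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

metrics = summary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy and the predictive values depend on the chosen classification threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why the two can differ for the same model.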