Journal of Epidemiology and Global Health (Jul 2024)
Exploring Machine Learning Algorithms to Predict Diarrhea Disease and Identify its Determinants among Under-Five Years Children in East Africa
Abstract
Background: Diarrhea is the second most common cause of death among children under five. Predicting diarrheal disease early and identifying its determinants (factors) with advanced machine learning models is an effective way to help save children’s lives. Hence, this study aimed to predict diarrheal disease, identify its determinants, and generate rules using machine learning models. Methods: The study used secondary Demographic and Health Survey (DHS) data from 12 East African countries, analyzed using Python. Machine learning models, namely Random Forest, Decision Tree (DT), K-Nearest Neighbor, and Logistic Regression (LR), were applied, and wrapper feature selection and SHAP values were used to identify determinants. Results: In the final experiments, the random forest model performed best at predicting diarrheal disease, with an accuracy of 86.5%, precision of 89%, recall of 82%, F-measure of 86%, and AUC of 92%. The important predictors identified were age, country, wealth status, mother’s educational status, mother’s age, source of drinking water, number of under-five children, immunization status, media exposure, timing of breastfeeding, mother’s working status, type of toilet, and twin status; these were associated with a higher predicted probability of diarrheal disease. Conclusion: The findings suggest that ensuring child caregivers are fully aware of sanitation and child feeding practices, and that mothers are educated, can reduce diarrhea-related child mortality in East Africa. These findings can inform policy direction for reducing infant mortality in East Africa.
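The abstract reports accuracy, precision, recall, and F-measure for a binary classifier. As a reading aid, the sketch below shows how these four metrics are conventionally derived from confusion-matrix counts. The function name `classification_metrics` and the counts in the usage line are hypothetical illustrations, not the study's code or data. AUC is omitted because it depends on predicted scores rather than on a single set of counts.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. Ratios with a zero denominator are returned as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure (F1) is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical counts for illustration only (not taken from the study).
metrics = classification_metrics(tp=45, fp=5, fn=15, tn=35)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the F-measure is the harmonic mean of precision and recall, it always lies between the two, which matches the pattern in the reported figures (precision 89%, recall 82%, F-measure 86%).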
Keywords