Application of machine learning methods for predicting infant mortality in Rwanda: analysis of Rwanda demographic health survey 2014–15 dataset

Emmanuel Mfateneza; Pierre Claver Rutayisire; Emmanuel Biracyaza; Sanctus Musafiri; Willy Gasafari Mpabuka

doi:10.1186/s12884-022-04699-8

BMC Pregnancy and Childbirth (May 2022)

Affiliations

Emmanuel Mfateneza: African Centre of Excellence in Data Science, University of Rwanda
Pierre Claver Rutayisire: Applied Statistics Department, University of Rwanda
Emmanuel Biracyaza: Prison Fellowship Rwanda
Sanctus Musafiri: Clinical Department of Internal Medicine, University of Rwanda
Willy Gasafari Mpabuka: Transparency International Rwanda

Journal volume & issue: Vol. 22, no. 1
pp. 1 – 13

Abstract

Background Extensive research on infant mortality (IM) exists in developing countries; however, most of the methods applied thus far have relied on conventional regression analyses with limited predictive capability. Advanced Machine Learning (AML) methods can provide accurate prediction of IM; however, no study using ML methods has been conducted in Rwanda. This study therefore applied machine learning methods to predict infant mortality in Rwanda.

Methods A cross-sectional study was conducted using the 2014–15 Rwanda Demographic and Health Survey. Python version 3.8 was used to test and apply four ML methods: Random Forest (RF), Decision Tree, Support Vector Machine and Logistic Regression. STATA version 13 was used for the conventional analyses. The performance of the predictive models was evaluated using the confusion matrix, accuracy, precision, recall, F1 score, and Area Under the Receiver Operating Characteristic curve (AUROC).

Results Ability of prediction was between 68.6% and 61.5% for AML. We preferred the RF model (61.5%), which presented the best performance. The RF model was the best predictive model of IM, with accuracy (84.3%), recall (91.3%), precision (80.3%), F1 score (85.5%), and AUROC (84.2%). It was followed by the decision tree model, with accuracy (83%), recall (91%), precision (79%), F1 score (84.67%) and AUROC (82.9%), and then by the support vector machine, with accuracy (68.6%), recall (74.9%), precision (67%), F1 score (70.73%) and AUROC (68.6%). Logistic regression ranked last, with the lowest accuracy (61.5%), recall (61.1%), precision (62.2%), F1 score (61.6%) and AUROC (61.5%) of the four predictive models. Our predictive models showed that marital status, children ever born, birth order and wealth index are the top four predictors of IM.

Conclusions In developing a predictive model, ML methods are used to classify certain hidden information that could not be detected by traditional statistical methods. Random Forest was identified as the best classifier for predictive models of IM.
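
The evaluation metrics named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the `ConfusionMatrix` class, the `f1_from` helper and the confusion-matrix counts are hypothetical and chosen only for illustration. The precision and recall pairs in `reported` are taken from the Results above, so the loop simply recomputes F1 as the harmonic mean of those two published figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (hypothetical helper, not from the paper)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    def precision(self) -> float:
        # Assumes at least one positive prediction (tp + fp > 0).
        return self.tp / (self.tp + self.fp)

    def recall(self) -> float:
        # Assumes at least one actual positive (tp + fn > 0).
        return self.tp / (self.tp + self.fn)

    def f1(self) -> float:
        p, r = self.precision(), self.recall()
        return 2 * p * r / (p + r)


def f1_from(precision: float, recall: float) -> float:
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only; these are NOT the study's confusion matrix.
cm = ConfusionMatrix(tp=80, fp=20, fn=10, tn=90)
print(f"accuracy={cm.accuracy():.3f} precision={cm.precision():.3f} "
      f"recall={cm.recall():.3f} f1={cm.f1():.3f}")

# (precision, recall) pairs as reported in the Results section.
reported = {
    "Random Forest": (0.803, 0.913),
    "Decision Tree": (0.79, 0.91),
    "Support Vector Machine": (0.67, 0.749),
    "Logistic Regression": (0.622, 0.611),
}
for name, (p, r) in reported.items():
    print(f"{name}: F1 recomputed from reported precision/recall = {f1_from(p, r):.4f}")
```

Because the published precision and recall values are rounded, the recomputed F1 scores may differ from the reported ones in the last digit.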

Published in BMC Pregnancy and Childbirth

ISSN: 1471-2393 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Gynecology and obstetrics
Website: http://bmcpregnancychildbirth.biomedcentral.com
