Bulletin of the National Research Centre (Jan 2025)
Machine learning algorithms for predictive modeling of dyslipidemia-associated cardiovascular disease risk in pregnancy: a comparison of boosting, random forest, and decision tree regression
Abstract
Background: Cardiovascular diseases (CVD) are major contributors to maternal mortality and morbidity during pregnancy, and an elevated atherogenic index of plasma is associated with a higher risk of CVD and obesity.

Methods: In this study, we used three machine learning algorithms (boosting, random forest, and decision tree regression) to predict dyslipidemia-associated cardiovascular disease from the atherogenic index and lipid profile parameters. The dataset came from a cross-sectional study of 112 pregnant women aged 15 to 49 years conducted at Aminu Kano Teaching Hospital.

Results: Random forest regression outperformed both boosting and decision tree regression, recording the lowest error values (mean squared error, MSE = 0.071; root mean squared error, RMSE = 0.266). All three algorithms showed the potential to model the atherogenic index and lipid profile data effectively, but random forest and boosting outperformed the decision tree model, with R² values of 0.95 and 0.92, respectively.

Conclusions: Overall, the study highlights the accuracy of machine learning models (random forest, boosting, and decision trees) in predicting dyslipidemia-associated cardiovascular disease. The findings could contribute to the development of effective strategies for its prevention and treatment.
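The error criteria named in the Results (MSE, RMSE, and R²) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the function names and the sample values in `y_true` and `y_pred` are hypothetical and are not taken from the study's data. The only figures drawn from the abstract are MSE = 0.071 and RMSE = 0.266, which are consistent with each other because RMSE is the square root of MSE.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted values, for illustration only.
y_true = [0.10, 0.35, 0.52, 0.20, 0.41]
y_pred = [0.12, 0.30, 0.55, 0.25, 0.38]

print("MSE :", mse(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
print("R^2 :", r2(y_true, y_pred))

# Consistency check on the figures reported in the abstract:
# the square root of MSE = 0.071 rounds to the reported RMSE of 0.266.
reported_rmse_check = round(math.sqrt(0.071), 3)
print("sqrt(0.071) rounded:", reported_rmse_check)
```

Lower MSE and RMSE indicate smaller prediction errors, while an R² closer to 1 indicates that the model explains more of the variance in the target, which is the basis on which the three regressors are ranked above.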
Keywords