PLoS ONE (Jan 2022)
An evolutionary machine learning algorithm for cardiovascular disease risk prediction.
Abstract
Introduction: This study developed a novel risk assessment model to predict the occurrence of cardiovascular disease (CVD) events. It uses a Genetic Algorithm (GA) to develop an easy-to-use, high-accuracy model calibrated on the Isfahan Cohort Study (ICS) database.

Methods: The ICS was a population-based prospective cohort study of 6,504 healthy Iranian adults aged ≥ 35 years, followed for incident CVD over ten years, from 2001 to 2010. To develop a risk score, the CVD prediction problem was solved using a purpose-designed GA, and the results were then compared with classic machine learning (ML) and statistical methods.

Results: Risk scores such as the WHO and PARS models were used as baselines for comparison because they are similarly chart-based. The Framingham and PROCAM models were also applied to the dataset, yielding an area under the Receiver Operating Characteristic curve (AUROC) of 0.633 and 0.683, respectively. Among the ML models, the more complex Deep Learning model, a three-layered Convolutional Neural Network (CNN), performed best, with an AUROC of 0.74. The GA-based eXplainable Persian Atherosclerotic CVD Risk Stratification (XPARS) outperformed the statistical methods: XPARS with eight features showed an AUROC of 0.76, and XPARS with four features showed an AUROC of 0.72.

Conclusion: A risk model extracted using a GA substantially improves the prediction of CVD compared to conventional methods. The model is clear and interpretable, and can be a suitable replacement for conventional statistical methods.
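
Since every model in this abstract is compared by AUROC, a minimal sketch of how that metric can be computed may help in reading the figures. The function name `auroc` and the toy labels and scores below are hypothetical illustrations, not ICS data and not the authors' code. The sketch uses the standard rank interpretation: AUROC is the probability that a randomly chosen participant who had an event receives a higher risk score than a randomly chosen participant who did not, with ties counted as one half.

```python
def auroc(labels, scores):
    """Return the AUROC of `scores` against binary `labels` (1 = event, 0 = no event).

    Computed as the fraction of (event, non-event) pairs in which the event
    case has the higher score; tied pairs count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two non-events followed by two events.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))  # 3 of 4 pairs ranked correctly -> 0.75
```

On this scale, 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which is the context for the reported values between 0.633 and 0.76. The pairwise loop is quadratic in the number of participants; for a cohort of several thousand, a sort-based implementation or a library routine would be the more practical choice.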