Journal of Mazandaran University of Medical Sciences (Apr 2021)
Comparing the Results of Logistic Regression Model and Classification and Regression Tree Analysis in Determining Prognostic Factors for Coronary Artery Disease in Mashhad, Iran
Abstract
Background and purpose: Understanding the risk factors for coronary artery disease, the leading cause of death worldwide, can substantially change how its etiology, prevalence, and treatment are approached. The aim of this study was to compare the results of the logistic regression model and Classification and Regression Tree (CART) analysis in determining the prognostic factors for coronary artery disease in people living in Mashhad, Iran.

Materials and methods: This case-control study used data from the Mashhad Stroke and Heart Atherosclerotic Disorder (MASHAD) cohort study, 2009. All patients with coronary artery disease were treated as cases, and three controls were selected for each case. The prognostic factors for coronary artery disease were determined with CART and logistic regression models using R and Stata 14. The performance of the models was then compared by computing the area under the receiver operating characteristic curve (AUC).

Results: According to the logistic model, the prognostic factors for coronary artery disease were age, history of myocardial infarction, diabetes, history of hyperlipidemia, and family history of heart disease (father and brother). The CART algorithm identified age, history of myocardial infarction, history of hypertension, depression, physical activity level, and body mass index as prognostic factors for coronary artery disease in people in Mashhad.

Conclusion: History of myocardial infarction and age were prognostic factors for coronary artery disease in both models. Given the performance of the logistic model, the multiple binary logistic regression model is suggested for identifying the factors affecting coronary artery disease when there is no interaction between the predictors.
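The AUC comparison described in the methods can be illustrated with a minimal sketch. The function name `auc`, the toy labels, and both score vectors below are hypothetical and chosen only for illustration; they are not the study's data or results, and the authors' own analysis was carried out in R and Stata 14. The sketch uses the rank-based (Mann-Whitney) formulation, in which the AUC is the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control.

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney formulation.

    Equals the fraction of (case, control) pairs in which the case has
    the higher score, counting tied pairs as one half.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: 2 cases and 6 controls (a 1:3 ratio).
labels = [1, 1, 0, 0, 0, 0, 0, 0]
# Illustrative predicted probabilities from two hypothetical fitted models.
scores_logistic = [0.80, 0.55, 0.60, 0.30, 0.20, 0.25, 0.10, 0.40]
# A tree assigns one probability per leaf, so tied scores are common.
scores_cart = [0.70, 0.30, 0.70, 0.30, 0.10, 0.10, 0.10, 0.30]

auc_logistic = auc(labels, scores_logistic)
auc_cart = auc(labels, scores_cart)
print(f"logistic AUC = {auc_logistic:.3f}, CART AUC = {auc_cart:.3f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from controls, so the model with the larger AUC discriminates better on the data at hand.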