Cardiovascular Therapy and Prevention (Jan 2022)

Machine learning for predicting 5-year mortality risks: data from the ESSE-RF study in Primorsky Krai

  • V. A. Nevzorova,
  • T. A. Brodskaya,
  • K. I. Shakhgeldyan,
  • B. I. Geltser,
  • V. V. Kosterin,
  • L. G. Priseko

DOI
https://doi.org/10.15829/1728-8800-2022-2908
Journal volume & issue
Vol. 21, no. 1

Abstract

Aim. To develop models for predicting 5-year mortality risk and compare their accuracy, using data from the Epidemiology of Cardiovascular Diseases and their Risk Factors in Regions of Russian Federation (ESSE-RF) study in Primorsky Krai.

Material and methods. The study included 2131 people (1257 women and 874 men) aged 23-67 years, with a median age of 47 years (95% confidence interval [46; 48]). The study protocol included measurement of blood pressure (BP), heart rate (HR), waist circumference, hip circumference, and waist-to-hip ratio (WHR). The following blood biochemical parameters were determined: total cholesterol (TC), low- and high-density lipoprotein cholesterol, triglycerides, apolipoproteins AI and B, lipoprotein(a), N-terminal pro-brain natriuretic peptide (NT-proBNP), D-dimer, fibrinogen, C-reactive protein (CRP), glucose, creatinine, and uric acid. The study endpoint was 5-year all-cause death (2013-2018). During this period, 42 (2%) participants died, while 2089 (98%) continued in the study. The χ², Fisher, and Mann-Whitney tests and univariate logistic regression (LR) were used for data processing and analysis. To build predictive models, we used the following machine learning (ML) methods: multivariate LR, Weibull regression, and stochastic gradient boosting.

Results. The ML-based prognostic models that used age, sex, smoking, systolic blood pressure (SBP), and TC level as predictors had higher quality metrics than the Systematic COronary Risk Evaluation (SCORE) system. Adding CRP, glucose, NT-proBNP, and HR to the predictors increased the accuracy of all models, with the largest gain in quality metrics in the multivariate LR model. The predictive potential of other factors (WHR, lipid profile, fibrinogen, D-dimer, etc.) was low and did not improve prediction quality. An analysis of the influence of individual predictors on the mortality rate showed that five factors made the prevailing contribution: age and the levels of TC, NT-proBNP, CRP, and glucose. HR, SBP, and smoking had a less noticeable effect, while the contribution of sex was minimal.

Conclusion. The use of modern ML methods increases the accuracy of predictive models and makes risk stratification more efficient, especially among individuals with a low or moderate risk of death from cardiovascular diseases.
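
The multivariate LR approach named in the abstract can be sketched as follows. This is a minimal illustration and not the authors' code: the cohort is synthetic, and the three predictors (standardized age, standardized SBP, smoking status), the coefficients used to simulate outcomes, and the helper names `fit_logistic` and `auc` are all assumptions. The area under the ROC curve (AUC) is used only as one plausible quality metric, since the abstract does not name the metrics that were compared.

```python
import math
import random

random.seed(0)  # fixed seed so the synthetic cohort is reproducible


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Synthetic cohort (NOT ESSE-RF data); all coefficients are assumed values.
N = 1000
X, y = [], []
for _ in range(N):
    age_z = random.gauss(0.0, 1.0)                   # standardized age
    sbp_z = random.gauss(0.0, 1.0)                   # standardized SBP
    smoker = 1.0 if random.random() < 0.3 else 0.0   # smoking status
    true_logit = -3.0 + 1.2 * age_z + 0.5 * sbp_z + 0.7 * smoker
    X.append([age_z, sbp_z, smoker])
    y.append(1 if random.random() < sigmoid(true_logit) else 0)


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b


def auc(scores, labels):
    """Probability that a random event case outranks a random non-event case."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


w, b = fit_logistic(X, y)
risk = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]
train_auc = auc(risk, y)
print("coefficients (age, SBP, smoking):", [round(v, 2) for v in w])
print("training AUC:", round(train_auc, 3))
```

The AUC here is computed on the training data, so it is optimistic. A real comparison against a reference score such as SCORE would need held-out data or cross-validation, which matters all the more when events are as rare as in the cohort described above.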

Keywords