Ecotoxicology and Environmental Safety (Jan 2025)

Predicting the risk of cardiovascular disease in adults exposed to heavy metals: Interpretable machine learning

  • Meiyue Shen,
  • Yine Zhang,
  • Runqing Zhan,
  • Tingwei Du,
  • Peixuan Shen,
  • Xiaochuan Lu,
  • Shengnan Liu,
  • Rongrong Guo,
  • Xiaoli Shen

Journal volume & issue
Vol. 290
p. 117570

Abstract

Machine learning offers strong predictive performance. We aimed to construct an interpretable machine learning model using National Health and Nutrition Examination Survey data to investigate the relationship between heavy metal exposure and cardiovascular disease (CVD). A total of 4600 adults were included in the analysis. Least Absolute Shrinkage and Selection Operator (LASSO) regression was used to select relevant feature variables. Six machine learning models were then constructed: random forest, decision tree, gradient boosting decision tree, k-nearest neighbors, support vector machine, and AdaBoost. Feature importance analysis, partial dependence plots, and Shapley additive explanations (SHAP) were combined to enhance the interpretability of the CVD prediction model. Among all models, the random forest performed best, with an accuracy of 90%, an area under the curve of 0.85, and an F1 score of 0.86. Urine cadmium (Cd), blood lead (Pb), urine thallium (Tl), and urine tungsten (W) were identified as the most important predictors of CVD, with importance scores of 0.062, 0.057, 0.051, and 0.050, respectively. At the overall level, higher levels of urine Cd, blood Pb, and urine W were associated with an increased risk of CVD, whereas a lower level of urine Tl was linked to a reduced CVD risk. Additionally, the analysis of synergistic effects revealed that Cd was the predominant determinant of CVD risk. The random forest-based CVD prediction model demonstrated excellent predictive power and provided valuable insights for personalized patient care and optimal resource allocation in populations exposed to heavy metals.
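
The Shapley additive explanations mentioned in the abstract attribute a single prediction to its input features by averaging each feature's marginal contribution over all feature orderings. The sketch below illustrates that idea only; it is not the authors' pipeline. The function toy_risk, its coefficients, and the feature values are invented for illustration, and absent features are replaced by a fixed baseline sample, which is one common simplification. Exact enumeration is feasible here only because there are three features.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Averages each feature's marginal contribution over every feature
    ordering; "absent" features take their value from `baseline`.
    """
    names = list(x)
    totals = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(baseline)        # start from the reference sample
        previous = predict(current)
        for name in order:
            current[name] = x[name]     # switch this feature to the sample's value
            updated = predict(current)
            totals[name] += updated - previous
            previous = updated
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in totals.items()}


def toy_risk(f):
    # Hypothetical risk score with invented coefficients (NOT the paper's model).
    # The product term mimics an interaction between two metals.
    return (0.05
            + 0.30 * f["urine_cd"]
            + 0.20 * f["blood_pb"]
            + 0.10 * f["urine_cd"] * f["urine_w"])


# Assumed, standardized exposure values chosen purely for illustration.
baseline = {"urine_cd": 0.0, "blood_pb": 0.0, "urine_w": 0.0}
sample = {"urine_cd": 1.0, "blood_pb": 0.5, "urine_w": 1.0}

phi = shapley_values(toy_risk, sample, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

In this toy setting the contributions sum to the difference between the sample's score and the baseline score, and the interaction term is split equally between the two features involved. Practical SHAP implementations for tree ensembles rely on faster algorithms than this enumeration.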

Keywords