Scientific Reports (Jul 2024)

Random survival forest for predicting the combined effects of multiple physiological risk factors on all-cause mortality

  • Bu Zhao,
  • Vy Kim Nguyen,
  • Ming Xu,
  • Justin A. Colacino,
  • Olivier Jolliet

DOI
https://doi.org/10.1038/s41598-024-66261-0
Journal volume & issue
Vol. 14, no. 1
pp. 1–10

Abstract

Understanding the combined effects of risk factors on all-cause mortality is crucial for implementing effective risk stratification and designing targeted interventions, but such combined effects are understudied. We aim to use survival-tree-based machine learning models as more flexible nonparametric techniques to examine the combined effects of multiple physiological risk factors on mortality. More specifically, we (1) study the combined effects of multiple physiological factors on all-cause mortality, (2) identify the five most influential factors and visualize their combined influence on all-cause mortality, and (3) compare the mortality-based cutoffs with the current clinical thresholds. Data from the 1999–2014 National Health and Nutrition Examination Survey (NHANES) were linked to National Death Index data with follow-up through 2015 for 17,790 adults. We observed that the five most influential factors affecting mortality are the tobacco smoking biomarker cotinine, glomerular filtration rate (GFR), plasma glucose, sex, and white blood cell count. Specifically, high mortality risk is associated with being male, active smoking, low GFR, elevated plasma glucose levels, and high white blood cell count. The identified mortality-based cutoffs for these factors are mostly consistent with relevant studies and current clinical thresholds. This approach enabled us to identify important cutoffs and to provide enhanced risk prediction, an important basis for informing clinical practice and developing new strategies for precision medicine.

Keywords