Computer Methods and Programs in Biomedicine Update (Jan 2024)

Machine learning approaches for predicting frailty based on multimorbidities in US adults using NHANES data (1999–2018)

  • Teng Li,
  • Xueke Li,
  • Haoran Xu,
  • Yanyan Wang,
  • Jingyu Ren,
  • Shixiang Jing,
  • Zichen Jin,
  • Gang Chen,
  • Youyou Zhai,
  • Zeyu Wu,
  • Ge Zhang,
  • Yuying Wang

Journal volume & issue
Vol. 6
p. 100164

Abstract

Background: The global growth of the aging population has made age-related health challenges, particularly multimorbidity and frailty, more common, but a significant gap remains.

Methods: This cross-sectional study used data from the National Health and Nutrition Examination Survey (NHANES, 1999–2018). The association between age and frailty was assessed with a restricted cubic spline (RCS) model, and weighted, adjusted multivariable logistic regression was used to evaluate the effect of diseases on frailty. In the machine learning process, feature selection for the frailty prediction model involved three algorithms. Model performance was optimized using nested cross-validation and tested with six algorithms: decision tree, logistic regression, k-nearest neighbor, random forest, recursive partitioning and regression trees, and eXtreme Gradient Boosting (XGBoost). We used the area under the receiver operating characteristic curve (AUC) and the area under the precision-recall curve (AU-PRC) to evaluate the six algorithms and select the optimal model, and then tested the discrimination and consistency of that model.

Results: The study included 46,187 participants, of whom 6,009 had frailty. RCS analysis showed a non-linear association between age and frailty, with a turning point at 49 years. The key influencing variables identified were anemia, arthritis, diabetes mellitus, coronary heart disease, and hypertension. In the machine learning process, feature selection yielded an optimal data set of 13 variables. Through nested cross-validation, a total of 31,900 models were built using the 6 algorithms. The XGBoost model showed the highest performance (AUC = 0.8828 and AU-PRC = 0.624) and clear proficiency in both discrimination and calibration.

Conclusions: Our findings suggest that 49 years is the age at which the balance between physiological reserve and external aggression is maintained. In addition, chronic diseases are trigger factors for frailty, while acute diseases are contributing factors that exacerbate the body's rapid decline. Finally, the XGBoost frailty prediction model, with its simplicity, high performance, and high clinical value, holds potential for clinical application.
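
The two evaluation metrics named in the abstract, AUC and AU-PRC, can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names `roc_auc` and `average_precision`, the toy labels, and the toy scores are assumptions made for illustration, and AU-PRC is approximated here by the step-wise average precision, which may differ from the estimator used in the study. Only the participant and case counts come from the abstract.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random
    negative (Mann-Whitney formulation); tied scores count as half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve.
    Assumes no tied scores (a simplification for this sketch)."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank cases from highest to lowest predicted score.
    ranked = sorted(zip(scores, y_true), key=lambda t: -t[0])
    true_pos = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            total += true_pos / rank  # precision at this recall step
    return total / n_pos


# Hypothetical toy data: 1 = frail, 0 = not frail.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)

# Frailty prevalence from the counts reported in the abstract.
prevalence = 6009 / 46187

print(f"AUC = {auc:.4f}, AU-PRC = {ap:.4f}, prevalence = {prevalence:.4f}")
```

A classifier that ranks cases at random has an expected AU-PRC close to the prevalence of the positive class (about 0.13 with the counts above), whereas its expected AUC is 0.5. This is one reason AU-PRC is often reported alongside AUC when the outcome is imbalanced.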

Keywords