Brain and Behavior (Dec 2023)

Ischemic stroke prediction using machine learning in elderly Chinese population: The Rugao Longitudinal Ageing Study

  • Huai‐Wen Chang,
  • Hui Zhang,
  • Guo‐Ping Shi,
  • Jiang‐Hong Guo,
  • Xue‐Feng Chu,
  • Zheng‐Dong Wang,
  • Yin Yao,
  • Xiao‐Feng Wang

DOI
https://doi.org/10.1002/brb3.3307
Journal volume & issue
Vol. 13, no. 12

Abstract

Objective: To compare logistic regression (LR) with machine learning (ML) models for predicting the risk of ischemic stroke in an elderly population in China.

Methods: We used 2208 records from the Rugao Longitudinal Ageing Study (RLAS) to assess ischemic stroke risk prediction. Input variables comprised 103 phenotypes. For 3‐year ischemic stroke risk prediction, we compared the discrimination and calibration of the LR model with those of five ML methods: Random Forest (RF), Gaussian kernel Support Vector Machine (SVM), Multilayer Perceptron (MLP), K‐Nearest Neighbors (KNN), and Gradient Boosting Decision Tree (GBDT).

Results: Age, pulse, waist circumference, education level, β2‐microglobulin, homocysteine, cystatin C, folate, free triiodothyronine, platelet distribution width, QT interval, and QTc interval were significant predictors of ischemic stroke. The ML approach drew on more biochemical and ECG‐related multidimensional phenotypic indicators than the LR model, which placed more importance on general demographic indicators. Compared with the LR model, SVM provided the best discrimination and calibration (C‐index: 0.79 vs. 0.71, an 11.27% improvement in model utility), with the best performance in both validation and test data.

Conclusion: In a comparison of LR with five ML models, combining ML with multiple phenotypes yielded higher accuracy in ischemic stroke prediction. Together with other studies of elderly populations in China, these results indicate that ML techniques, especially SVM, show good long‐term predictive performance, which points to the potential value of ML in clinical practice.
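As a reading aid for the C‐index figures above, the following minimal Python sketch shows how a C‐index can be computed for a binary outcome and how the reported 11.27% is consistent with a relative gain of 0.79 over 0.71. The abstract does not state how the percentage was derived, so treating it as (0.79 − 0.71) / 0.71 is an assumption; the function names and the toy data are hypothetical and are not taken from the study.

```python
from itertools import product


def c_index(y_true, scores):
    """Concordance index for a binary outcome (equal to ROC AUC).

    Counts the share of (event, non-event) pairs in which the event
    received the higher predicted risk; ties count as half.
    """
    events = [s for y, s in zip(y_true, scores) if y == 1]
    non_events = [s for y, s in zip(y_true, scores) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e, n in product(events, non_events):
        if e > n:
            concordant += 1.0
        elif e == n:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, in percent (assumed definition)."""
    return (new - baseline) / baseline * 100.0


# Toy labels and predicted risks, invented purely for illustration.
y = [0, 0, 1, 0, 1, 1]
risk = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]
toy_c = c_index(y, risk)

# C-index values as reported in the abstract (SVM vs. LR).
gain = relative_improvement(0.79, 0.71)
print(f"toy C-index: {toy_c:.3f}; relative gain: {gain:.2f}%")
```

A C‐index of 0.5 corresponds to chance‐level ranking and 1.0 to perfect ranking, so both reported models rank stroke cases above non‐cases more often than not.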

Keywords