BMJ Open (Mar 2023)

Development of rapid and effective risk prediction models for stroke in the Chinese population: a cross-sectional study

  • Yan Xu
  • Wei Yan
  • Xiaoyun Chen
  • Xiaona Chen
  • Yuexin Qiu
  • Shiqi Cheng
  • Yuhang Wu
  • Songbo Hu
  • Yiying Chen
  • Junsai Yang
  • Huilie Zheng

DOI
https://doi.org/10.1136/bmjopen-2022-068045
Journal volume & issue
Vol. 13, no. 3

Abstract

Objectives: This study aimed to build predictive models, using easily obtained and directly observable clinical features, to identify patients at increased risk of stroke.

Setting and participants: A total of 46 240 valid records were obtained from 8 research centres and 14 communities in Jiangxi province, China, between February and September 2018.

Primary and secondary outcome measures: The area under the receiver operating characteristic curve (AUC), sensitivity, specificity and accuracy were calculated to test the performance of five models: logistic regression (LR), random forest (RF), decision tree (DT), extreme gradient boosting (XGBoost) and gradient boosting DT. A calibration curve was used to show calibration performance.

Results: XGBoost (AUC: 0.924, accuracy: 0.873, sensitivity: 0.776, specificity: 0.916) and RF (AUC: 0.924, accuracy: 0.872, sensitivity: 0.778, specificity: 0.913) demonstrated excellent performance in predicting stroke. Physical inactivity, hypertension, meat-based diet and high salt intake were important prediction features of stroke.

Conclusion: All five machine learning models had good predictive and discriminatory performance for stroke. RF and XGBoost performed slightly better than LR, although LR was easier to interpret and less prone to overfitting. This work provides a rapid and accurate tool for stroke risk assessment, which can help to improve the efficiency of stroke screening services and the management of high-risk groups.
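
For readers unfamiliar with the outcome measures named in the abstract, the following is a minimal Python sketch of how sensitivity, specificity, accuracy and AUC can be computed from predicted risk scores. It is not the authors' code: the function names, the 0.5 decision threshold and the toy labels and scores are all assumptions made for illustration, and none of the numbers come from the study.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = stroke)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_score, threshold=0.5):
    """Sensitivity, specificity and accuracy at an assumed decision threshold."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases flagged
        "specificity": tn / (tn + fp),  # share of non-cases correctly cleared
        "accuracy": (tp + tn) / len(y_true),
    }


def auc(y_true, y_score):
    """AUC as the probability that a random case outranks a random non-case.

    Ties between a case and a non-case count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 stroke cases and 5 non-cases with model risk scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_score = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

metrics = classification_metrics(y_true, y_score)
area = auc(y_true, y_score)
print(metrics, area)
```

Sensitivity, specificity and accuracy depend on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is one reason two models can share an AUC while differing slightly on the other three measures.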