Machine learning–based predictive model for post-stroke dementia

Zemin Wei; Mengqi Li; Chenghui Zhang; Jinli Miao; Wenmin Wang; Hong Fan

doi:10.1186/s12911-024-02752-4

BMC Medical Informatics and Decision Making (Nov 2024)

Affiliations

Zemin Wei: Department of Geriatrics, Shaoxing People’s Hospital
Mengqi Li: School of Medicine, Shaoxing University
Chenghui Zhang: School of Medicine, Shaoxing University
Jinli Miao: The Yangtze River Delta Biological Medicine Research and Development Center of Zhejiang Province, Yangtze Delta Region Institution of Tsinghua University
Wenmin Wang: The Yangtze River Delta Biological Medicine Research and Development Center of Zhejiang Province, Yangtze Delta Region Institution of Tsinghua University
Hong Fan: Department of Geriatrics, Shaoxing People’s Hospital

DOI: https://doi.org/10.1186/s12911-024-02752-4
Journal volume & issue: Vol. 24, no. 1
pp. 1 – 9

Abstract

Background: Post-stroke dementia (PSD), a common complication of stroke, diminishes rehabilitation efficacy and affects prognosis in stroke patients. Many factors may be related to PSD, including demographic characteristics, comorbidities, and examination findings. However, most existing methods evaluate individual factors qualitatively and ignore the interactions among them. This study therefore explores the applicability of machine learning (ML) methods for predicting PSD.

Methods: Nine features were selected using Spearman correlation analysis and the Boruta algorithm. We developed and evaluated eight ML models: logistic regression, elastic net, k-nearest neighbors, decision tree, extreme gradient boosting, support vector machine, random forest, and multilayer perceptron.

Results: A total of 539 stroke patients were included in this study. Among the eight models used to predict PSD, extreme gradient boosting and random forest achieved the highest area under the receiver operating characteristic (ROC) curve (AUC), with values of 0.7287 and 0.7285, respectively. The most important features for predicting PSD included age, high-sensitivity C-reactive protein, stroke side and location, and the occurrence of cerebral hemorrhage.

Conclusion: Our findings suggest that ML models can predict the risk of PSD, with extreme gradient boosting performing best.
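The abstract ranks its models by AUC. As an illustration of what that metric measures, the following is a minimal sketch of computing AUC from predicted risk scores using the rank-sum (Mann-Whitney U) identity. It is not the authors' code: the function name `roc_auc` and the toy labels and scores are hypothetical, and the study's actual evaluation pipeline is not described on this page.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve for binary labels (1 = event, 0 = no event).

    Uses the identity AUC = U / (n_pos * n_neg), where U is the
    Mann-Whitney statistic computed from the ranks of the positive cases.
    Tied scores share their average rank.
    """
    n = len(labels)
    if n != len(scores):
        raise ValueError("labels and scores must have the same length")

    n_pos = sum(1 for y in labels if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")

    # Assign 1-based ranks in ascending score order, averaging over ties.
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


if __name__ == "__main__":
    # Made-up labels and risk scores, for illustration only.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print(f"AUC = {roc_auc(toy_labels, toy_scores):.4f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported values near 0.73 mean that a randomly chosen PSD patient receives a higher predicted risk than a randomly chosen non-PSD patient about 73% of the time.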

Published in BMC Medical Informatics and Decision Making

ISSN: 1472-6947 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: http://bmcmedinformdecismak.biomedcentral.com
