Development and validation of an interpretable machine learning model—Predicting mild cognitive impairment in a high-risk stroke population

Feng-Juan Yan; Xie-Hui Chen; Xiao-Qing Quan; Li-Li Wang; Xin-Yi Wei; Jia-Liang Zhu

doi:10.3389/fnagi.2023.1180351

Frontiers in Aging Neuroscience (Jun 2023)

Affiliations

Feng-Juan Yan: Department of Geriatrics, Shenzhen Longhua District Central Hospital, Shenzhen, Guangdong, China
Xie-Hui Chen: Department of Geriatrics, Shenzhen Longhua District Central Hospital, Shenzhen, Guangdong, China
Xiao-Qing Quan: Department of Geriatrics, Shenzhen Longhua District Central Hospital, Shenzhen, Guangdong, China
Li-Li Wang: Department of Cardiology, Affiliated Hospital of Shandong University of Traditional Chinese Medicine, Jinan, Shandong, China
Xin-Yi Wei: Department of Cardiology, The Third Hospital of Jinan, Jinan, Shandong, China
Jia-Liang Zhu: The First Affiliated Hospital of Jinan University, Guangzhou, Guangdong, China

Journal volume & issue: Vol. 15

Abstract

Background: Mild cognitive impairment (MCI) is considered a preclinical stage of Alzheimer’s disease (AD), and people with MCI have a higher risk of developing dementia than healthy people. Stroke is one of the risk factors for MCI and is already a target of active treatment and intervention. Selecting a population at high risk of stroke as the study population and identifying the risk factors for MCI as early as possible may therefore help prevent MCI more effectively.

Methods: The Boruta algorithm was used to screen variables, and eight machine learning models were built and evaluated. The best-performing models were used to assess variable importance and to build an online risk calculator. Shapley additive explanations (SHAP) were used to explain the model.

Results: A total of 199 patients were included in the study, 99 of whom were male. Transient ischemic attack (TIA), homocysteine, education, hematocrit (HCT), diabetes, hemoglobin, red blood cells (RBC), hypertension, and prothrombin time (PT) were selected by the Boruta algorithm. Logistic regression (AUC = 0.8595) was the best model for predicting MCI in people at high risk of stroke, followed by elastic net (ENET) (AUC = 0.8312), multilayer perceptron (MLP) (AUC = 0.7908), extreme gradient boosting (XGBoost) (AUC = 0.7691), support vector machine (SVM) (AUC = 0.7527), random forest (RF) (AUC = 0.7451), K-nearest neighbors (KNN) (AUC = 0.7380), and decision tree (DT) (AUC = 0.6972). Variable importance analysis ranked TIA, diabetes, education, and hypertension as the four most important variables.

Conclusion: TIA, diabetes, education, and hypertension are the most important risk factors for MCI in people at high risk of stroke, and early intervention should be performed to reduce the occurrence of MCI.
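The models above are compared by AUC, which can be read as the probability that a randomly chosen patient with MCI receives a higher predicted risk than a randomly chosen patient without it. The sketch below computes that quantity directly and ranks two models by it. It is only an illustration: the function name `roc_auc`, the model names, and all labels and scores are invented here and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented purely for illustration (not from the study).
y_true = [1, 1, 1, 0, 0, 0]  # 1 = MCI, 0 = no MCI
model_scores = {
    "model_a": [0.9, 0.8, 0.4, 0.5, 0.2, 0.1],
    "model_b": [0.6, 0.3, 0.7, 0.65, 0.2, 0.4],
}

# Score each model, then sort model names from highest to lowest AUC.
aucs = {name: roc_auc(y_true, s) for name, s in model_scores.items()}
ranking = sorted(aucs, key=aucs.get, reverse=True)

for name in ranking:
    print(f"{name}: AUC = {aucs[name]:.4f}")
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; a library routine such as scikit-learn's `roc_auc_score` would be the usual choice in practice.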

Published in Frontiers in Aging Neuroscience

ISSN: 1663-4365 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Medicine: Internal medicine: Neurosciences. Biological psychiatry. Neuropsychiatry
Website: https://www.frontiersin.org/journals/aging-neuroscience
