Zhongguo quanke yixue (Sep 2023)

Application of metaPRS and APOEε4 to Optimize Genetic Risk Prediction Modeling Strategy for Mild Cognitive Impairment

  • LI Zimeng, WANG Rong, CHEN Shuai, ZHAO Caili, WANG Xiaocong, WEN Yalu, LIU Long

DOI
https://doi.org/10.12114/j.issn.1007-9572.2022.0756
Journal volume & issue
Vol. 26, no. 25
pp. 3104 – 3111

Abstract


Background Mild cognitive impairment (MCI) is an important stage at which to intervene and delay progression to dementia. MCI has been shown to be closely associated with genetic factors, among which apolipoprotein E (APOE) ε4 is widely recognized as an important risk allele. Because Genome-Wide Association Study (GWAS) summary data for MCI are lacking, the GWAS summary data of Alzheimer's disease (AD) are commonly used as the base dataset to calculate the polygenic risk score (PRS) of MCI, which results in suboptimal PRS-based genetic risk prediction for MCI.
Objective To explore and optimize the statistical modeling strategy for genetic risk in MCI from the perspectives of generalized linear models and machine learning, using the meta-polygenic risk score (metaPRS) and APOEε4 as important predictors.
Methods PRS for 12 MCI-related traits were calculated and integrated into a metaPRS for MCI using an elastic-net Logistic regression model. SCOREAPOE was calculated by weighting the APOEε4 effect size with age correction. XGBoost, GBM, Logistic regression and Lasso regression were used as statistical modeling methods to verify the inclusion strategies of different predictors based on metaPRS, SCOREAPOE and basic demographic information (age, gender, education level). AUC and F-measure were used to evaluate the predictive performance of the genetic risk models for MCI.
Results metaPRS and SCOREAPOE have high predictive value for the genetic risk of MCI. After including metaPRS, SCOREAPOE and basic demographic information (age, gender, education level), the predictive performance of each modeling method was: XGBoost (AUC=0.69, F-measure=0.88), GBM (AUC=0.76, F-measure=0.87), Logistic regression (AUC=0.77, F-measure=0.89), and Lasso regression (AUC=0.76, F-measure=0.92).
Conclusion With a sample size of 325 (fewer than 500), the Lasso regression model that includes metaPRS, SCOREAPOE and basic demographic information (age, gender, education level) as predictors performs best for MCI genetic risk prediction, providing a new idea and perspective for statistical modeling of the genetic risk of complex diseases such as MCI.
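
The two-stage strategy described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes scikit-learn, uses randomly generated stand-in data in place of the study cohort, treats SCOREAPOE as an already computed column, substitutes an L1-penalized logistic regression for the Lasso model, and takes the F-measure to be the F1 score. All variable names and hyperparameters (`l1_ratio`, `C`, the 70/30 split) are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study data): 325 subjects, 12 trait-level PRS.
n_subjects, n_traits = 325, 12
prs = rng.normal(size=(n_subjects, n_traits))
score_apoe = rng.normal(size=n_subjects)          # placeholder for SCOREAPOE
age = rng.normal(70, 8, size=n_subjects)
sex = rng.integers(0, 2, size=n_subjects)
education = rng.integers(0, 20, size=n_subjects)  # hypothetical years of schooling

# Simulated MCI status that depends on two PRS columns and the APOE score.
logit = 0.5 * prs[:, 0] + 0.4 * prs[:, 1] + 0.8 * score_apoe
y = (rng.random(n_subjects) < 1 / (1 + np.exp(-logit))).astype(int)

idx_train, idx_test = train_test_split(
    np.arange(n_subjects), test_size=0.3, stratify=y, random_state=0
)

# Stage 1: elastic-net logistic regression learns one weight per trait PRS;
# the weighted sum of standardized PRS serves as the metaPRS.
prs_scaler = StandardScaler().fit(prs[idx_train])
enet = LogisticRegression(
    penalty="elasticnet", solver="saga", l1_ratio=0.5, C=1.0, max_iter=5000
)
enet.fit(prs_scaler.transform(prs[idx_train]), y[idx_train])
meta_prs = prs_scaler.transform(prs) @ enet.coef_.ravel()

# Stage 2: L1-penalized (Lasso-type) logistic regression on
# metaPRS + SCOREAPOE + demographics.
features = np.column_stack([meta_prs, score_apoe, age, sex, education])
feat_scaler = StandardScaler().fit(features[idx_train])
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
lasso.fit(feat_scaler.transform(features[idx_train]), y[idx_train])

# Evaluate on the held-out subjects with AUC and F1.
x_test = feat_scaler.transform(features[idx_test])
auc = roc_auc_score(y[idx_test], lasso.predict_proba(x_test)[:, 1])
f1 = f1_score(y[idx_test], lasso.predict(x_test))
print(f"AUC={auc:.2f}, F1={f1:.2f}")
```

Because the data are simulated, the printed values say nothing about the reported results. In a real analysis the metaPRS weights would ideally be learned on subjects separate from those used to fit and evaluate the final model, to limit optimistic bias in a sample of this size.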

Keywords