Communications Biology (Aug 2024)

Improving genetic risk modeling of dementia from real-world data in underrepresented populations

  • Mingzhou Fu,
  • Leopoldo Valiente-Banuet,
  • Satpal S. Wadhwa,
  • Bogdan Pasaniuc,
  • Keith Vossel,
  • Timothy S. Chang

DOI
https://doi.org/10.1038/s42003-024-06742-0
Journal volume & issue
Vol. 7, no. 1
pp. 1 – 12

Abstract

Genetic risk modeling for dementia offers significant benefits, but studies based on real-world data, particularly for underrepresented populations, are limited. We employ an Elastic Net model for dementia risk prediction using single-nucleotide polymorphisms prioritized by functional genomic data from multiple neurodegenerative disease genome-wide association studies. We compare this model with APOE and polygenic risk score models across genetic ancestry groups (Hispanic Latino American sample: 610 patients with 126 cases; African American sample: 440 patients with 84 cases; East Asian American sample: 673 patients with 75 cases), using electronic health records from UCLA Health for discovery and the All of Us cohort for validation. Our model significantly outperforms the APOE and polygenic risk score models across multiple ancestries, improving the area under the precision-recall curve by 31–84% (Wilcoxon signed-rank test, p < 0.05) and the area under the receiver operating characteristic curve by 11–17% (DeLong test, p < 0.05). We identify shared and ancestry-specific risk genes and biological pathways, reinforcing and adding to existing knowledge. Our study highlights the benefits of integrating functional mapping, multiple neurodegenerative diseases, and machine learning for genetic risk models in diverse populations. Our findings hold potential for refining precision medicine strategies in dementia diagnosis.
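
The two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the toy labels, and the toy scores are hypothetical, the area under the precision-recall curve is approximated by average precision, and the improvement helper assumes the reported percentages are relative changes, which the abstract does not state.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random case outscores a random control (ties count 0.5)."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise estimate of the area under the precision-recall curve.

    Averages the precision observed at the rank of each case.
    Assumes scores are untied (true for the toy data below).
    """
    n_cases = sum(labels)
    if n_cases == 0:
        raise ValueError("need at least one case")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_cases


def relative_improvement(new: float, baseline: float) -> float:
    """Percent change of `new` relative to `baseline` (assumed definition)."""
    return 100.0 * (new - baseline) / baseline


# Synthetic toy data, NOT from the study: 1 = dementia case, 0 = control.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
baseline_scores = [0.6, 0.7, 0.4, 0.3, 0.5, 0.8, 0.2, 0.1]  # stand-in for a baseline model
model_scores = [0.9, 0.4, 0.5, 0.3, 0.6, 0.8, 0.2, 0.1]     # stand-in for the new model

for name, fn in (("AUROC", auroc), ("AUPRC", average_precision)):
    base = fn(labels, baseline_scores)
    new = fn(labels, model_scores)
    print(f"{name}: baseline={base:.3f} model={new:.3f} "
          f"relative change={relative_improvement(new, base):.1f}%")
```

With few cases relative to controls, as in the samples described above, precision-recall metrics are more sensitive than AUROC to how well cases are ranked near the top. That is one plausible reason both metrics are reported.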