Alzheimer’s & Dementia: Translational Research & Clinical Interventions (Jan 2024)

The power of representation: Statistical analysis of diversity in US Alzheimer's disease genetics data

  • Diane Xue,
  • Elizabeth E. Blue,
  • Matthew P. Conomos,
  • Alison E. Fohner

DOI
https://doi.org/10.1002/trc2.12462
Journal volume & issue
Vol. 10, no. 1

Abstract

INTRODUCTION: Alzheimer's disease (AD) is a complex disease influenced by genetics and environment. More than 75 susceptibility loci have been linked to late‐onset AD, but most of these loci were discovered in genome‐wide association studies (GWAS) restricted to non‐Hispanic White individuals. There are wide disparities in AD risk across racially stratified groups, and while these disparities are not due to genetic differences, underrepresentation in genetic research can exacerbate them and contribute to their persistence. We investigated the racial/ethnic representation of participants in United States (US)‐based AD genetics studies and the statistical implications of current representation.

METHODS: We compared racial/ethnic data of participants from array and sequencing studies in US AD genetics databases, including the National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site (NIAGADS) and the NIAGADS Data Sharing Service (dssNIAGADS), to AD and related dementia (ADRD) prevalence and mortality. We then simulated the statistical power of these datasets to identify risk variants in non‐White populations.

RESULTS: There is insufficient statistical power (probability < 80%) to detect single nucleotide polymorphisms (SNPs) with low to moderate effect sizes (odds ratio [OR] < 1.5) using array data from Black and Hispanic participants; studies of Asian participants are not powered to detect variants with OR ≤ 2. Using available and projected sequencing data from Black and Hispanic participants, risk variants with OR = 1.2 are detectable at high allele frequencies. Sample sizes remain too small to detect these variants in Asian populations.

DISCUSSION: AD genetics datasets are largely representative of US ADRD burden. However, there is a wide discrepancy between proportional representation and statistically meaningful representation. Most variants identified in GWAS of non‐Hispanic White individuals have low to moderate effects. Comparable risk variants in non‐White populations are not detectable given current sample sizes, which could lead to disparities in future studies and drug development. We urge AD genetics researchers and institutions to continue investing in recruiting diverse participants and to use community‐based participatory research practices.
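
To illustrate why effect size, allele frequency, and sample size jointly determine whether a variant is detectable, the following is a minimal sketch of a power calculation. It is not the authors' simulation procedure, which the abstract does not describe. It assumes an analytic normal approximation to a two‐sided allelic case‐control test, an allelic odds ratio, and the conventional genome‐wide significance threshold of 5e‐8; the function name `allelic_power`, the allele frequency of 0.3, and the sample sizes in the loop are all hypothetical values chosen for illustration.

```python
from statistics import NormalDist


def allelic_power(n_cases, n_controls, maf, odds_ratio, alpha=5e-8):
    """Approximate power of a two-sided allelic case-control test.

    Assumptions (not taken from the paper): normal approximation to a
    two-proportion z-test on allele counts, `maf` is the risk-allele
    frequency in controls, and `odds_ratio` is the per-allele odds ratio.
    """
    std_normal = NormalDist()
    p_control = maf
    # Allele frequency in cases implied by the allelic odds ratio.
    p_case = odds_ratio * maf / (1 - maf + odds_ratio * maf)
    # Each participant contributes two alleles.
    se = (p_case * (1 - p_case) / (2 * n_cases)
          + p_control * (1 - p_control) / (2 * n_controls)) ** 0.5
    z_effect = (p_case - p_control) / se
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


if __name__ == "__main__":
    # Hypothetical balanced designs; compare each result with the 80% threshold.
    for n in (1_000, 5_000, 20_000, 100_000):
        power = allelic_power(n, n, maf=0.3, odds_ratio=1.2)
        print(f"n per group = {n:>7,}: power = {power:.3f}")
```

Under these assumptions, power for a fixed odds ratio rises with sample size and with allele frequency (up to 0.5), which is the relationship behind the gap between proportional representation and statistically meaningful representation.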

Keywords