PLoS Genetics (Apr 2011)

Enhanced statistical tests for GWAS in admixed populations: assessment using African Americans from CARe and a Breast Cancer Consortium.

  • Bogdan Pasaniuc,
  • Noah Zaitlen,
  • Guillaume Lettre,
  • Gary K Chen,
  • Arti Tandon,
  • W H Linda Kao,
  • Ingo Ruczinski,
  • Myriam Fornage,
  • David S Siscovick,
  • Xiaofeng Zhu,
  • Emma Larkin,
  • Leslie A Lange,
  • L Adrienne Cupples,
  • Qiong Yang,
  • Ermeg L Akylbekova,
  • Solomon K Musani,
  • Jasmin Divers,
  • Joe Mychaleckyj,
  • Mingyao Li,
  • George J Papanicolaou,
  • Robert C Millikan,
  • Christine B Ambrosone,
  • Esther M John,
  • Leslie Bernstein,
  • Wei Zheng,
  • Jennifer J Hu,
  • Regina G Ziegler,
  • Sarah J Nyante,
  • Elisa V Bandera,
  • Sue A Ingles,
  • Michael F Press,
  • Stephen J Chanock,
  • Sandra L Deming,
  • Jorge L Rodriguez-Gil,
  • Cameron D Palmer,
  • Sarah Buxbaum,
  • Lynette Ekunwe,
  • Joel N Hirschhorn,
  • Brian E Henderson,
  • Simon Myers,
  • Christopher A Haiman,
  • David Reich,
  • Nick Patterson,
  • James G Wilson,
  • Alkes L Price

DOI
https://doi.org/10.1371/journal.pgen.1001371
Journal volume & issue
Vol. 7, no. 4
p. e1001371

Abstract

While genome-wide association studies (GWAS) have primarily examined populations of European ancestry, more recent studies often involve additional populations, including admixed populations such as African Americans and Latinos. In admixed populations, linkage disequilibrium (LD) exists both at a fine scale in ancestral populations and at a coarse scale (admixture-LD) due to chromosomal segments of distinct ancestry. Disease association statistics in admixed populations have previously considered SNP association (LD mapping) or admixture association (mapping by admixture-LD), but not both. Here, we introduce a new statistical framework for combining SNP and admixture association in case-control studies, as well as methods for local ancestry-aware imputation. We illustrate the gain in statistical power achieved by these methods by analyzing data from 6,209 unrelated African Americans from the CARe project genotyped on the Affymetrix 6.0 chip, in conjunction with both simulated and real phenotypes, as well as by analyzing the FGFR2 locus using breast cancer GWAS data from 5,761 African-American women. We show that, at typed SNPs, our method yields an 8% increase in statistical power for finding disease risk loci compared to the power achieved by standard methods in case-control studies. At imputed SNPs, we observe an 11% increase in statistical power for mapping disease loci when our local ancestry-aware imputation framework and the new scoring statistic are jointly employed. Finally, we show that our method increases statistical power in regions harboring the causal SNP when the causal SNP is untyped and cannot be imputed. Our methods and our publicly available software are broadly applicable to GWAS in admixed populations.
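
As a rough illustration of why combining SNP and admixture evidence can help, the sketch below adds two 1-degree-of-freedom chi-square statistics and evaluates the sum on 2 degrees of freedom. This is a generic textbook construction, not the scoring statistic defined in the paper: it assumes the two statistics are independent under the null, and the function names and input values are hypothetical.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def chi2_sf_2df(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


def combined_pvalue(snp_stat, admixture_stat):
    """Sum two 1-df chi-square statistics and test the sum on 2 df.

    Assumption: the two statistics are independent under the null
    hypothesis, so their sum follows a chi-square distribution with 2 df.
    """
    return chi2_sf_2df(snp_stat + admixture_stat)


if __name__ == "__main__":
    # Hypothetical statistics at a single locus (not taken from the paper).
    snp_stat = 20.0        # SNP association signal
    admixture_stat = 12.0  # admixture association signal

    print("SNP only:       ", chi2_sf_1df(snp_stat))
    print("Admixture only: ", chi2_sf_1df(admixture_stat))
    print("Combined (2 df):", combined_pvalue(snp_stat, admixture_stat))
```

When both signals point to the same locus, the combined p-value can be smaller than either individual one, although the extra degree of freedom costs power when only one signal is present.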