HGG Advances (Jul 2022)

Accounting for population structure in genetic studies of cystic fibrosis

  • Hanley Kingston,
  • Adrienne M. Stilp,
  • William Gordon,
  • Jai Broome,
  • Stephanie M. Gogarten,
  • Hua Ling,
  • John Barnard,
  • Shannon Dugan-Perez,
  • Patrick T. Ellinor,
  • Stacey Gabriel,
  • Soren Germer,
  • Richard A. Gibbs,
  • Namrata Gupta,
  • Kenneth Rice,
  • Albert V. Smith,
  • Michael C. Zody,
  • Scott M. Blackman,
  • Garry Cutting,
  • Michael R. Knowles,
  • Yi-Hui Zhou,
  • Margaret Rosenfeld,
  • Ronald L. Gibson,
  • Michael Bamshad,
  • Alison Fohner,
  • Elizabeth E. Blue

Journal volume & issue
Vol. 3, no. 3
p. 100117

Abstract

Summary: CFTR F508del (c.1521_1523delCTT, p.Phe508delPhe) is the most common pathogenic allele underlying cystic fibrosis (CF), and its frequency varies in a geographic cline across Europe. We hypothesized that genetic variation associated with this cline is overrepresented in a large cohort (N > 5,000) of persons with CF who underwent whole-genome sequencing and that this pattern could result in spurious associations between variants correlated with both the F508del genotype and CF-related outcomes. Using principal-component (PC) analyses, we showed that variation in the CFTR region disproportionately contributes to a PC explaining a relatively high proportion of genetic variance. Variation near CFTR was correlated with population structure among persons with CF, and this correlation was driven by a subset of the sample inferred to have European ancestry. We performed genome-wide association studies comparing persons with CF with one versus two copies of the F508del allele; this allowed us to identify genetic variation associated with the F508del allele and to determine that standard PC-adjustment strategies eliminated the significant association signals. Our results suggest that PC adjustment can adequately prevent spurious associations between genetic variants and CF-related traits and is therefore an effective tool for controlling for population structure, even when that structure is confounded with disease severity and a common pathogenic variant.
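The effect of PC adjustment described in the summary can be illustrated with a small simulation. The sketch below is not the authors' pipeline: it uses NumPy only, simulated genotypes from two hypothetical subpopulations, and a binary label whose frequency differs between them (a stand-in for a structure-confounded comparison such as one versus two F508del copies). All function names, sample sizes, and effect sizes are assumptions chosen for illustration. No variant has a direct effect on the label, so any association is spurious.

```python
import numpy as np

def simulate(n=800, m=500, seed=0):
    """Simulate genotypes for two subpopulations and a structure-confounded label."""
    rng = np.random.default_rng(seed)
    pop = np.repeat([0, 1], n // 2)                      # hypothetical subpopulation label
    p0 = rng.uniform(0.2, 0.8, size=m)                   # allele frequencies in subpopulation 0
    p1 = np.clip(p0 + rng.normal(0, 0.15, size=m), 0.05, 0.95)
    freqs = np.where(pop[:, None] == 0, p0, p1)          # n x m matrix of allele frequencies
    G = rng.binomial(2, freqs).astype(float)             # genotypes coded 0/1/2
    # Binary label depends only on subpopulation, never on any single variant.
    y = rng.binomial(1, np.where(pop == 0, 0.3, 0.7)).astype(float)
    return G, y

def top_pcs(G, k=2):
    """Top-k principal components of the column-standardized genotype matrix."""
    Z = (G - G.mean(axis=0)) / G.std(axis=0)
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :k] * S[:k]

def snp_tstat(g, y, covars=None):
    """t statistic for one variant in a linear model of y, with optional covariates."""
    cols = [np.ones(len(y)), g]
    if covars is not None:
        cols.append(covars)
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se

def run_demo():
    """Return mean squared t statistics without and with PC adjustment."""
    G, y = simulate()
    pcs = top_pcs(G, k=2)
    unadj = np.mean([snp_tstat(G[:, j], y) ** 2 for j in range(G.shape[1])])
    adj = np.mean([snp_tstat(G[:, j], y, pcs) ** 2 for j in range(G.shape[1])])
    return unadj, adj

if __name__ == "__main__":
    unadj, adj = run_demo()
    print(f"mean t^2 without PCs: {unadj:.2f}")
    print(f"mean t^2 with 2 PCs:  {adj:.2f}")
```

Under the null, the mean squared t statistic should be close to 1. In this toy setting the unadjusted value is inflated well above 1, while including the top PCs as covariates brings it back toward 1. A linear model is used here only for brevity; a real case-control analysis would more likely use logistic regression or a mixed model.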

Keywords