PLoS Genetics (Sep 2011)

Phased whole-genome genetic risk in a family quartet using a major allele reference sequence.

  • Frederick E Dewey,
  • Rong Chen,
  • Sergio P Cordero,
  • Kelly E Ormond,
  • Colleen Caleshu,
  • Konrad J Karczewski,
  • Michelle Whirl-Carrillo,
  • Matthew T Wheeler,
  • Joel T Dudley,
  • Jake K Byrnes,
  • Omar E Cornejo,
  • Joshua W Knowles,
  • Mark Woon,
  • Katrin Sangkuhl,
  • Li Gong,
  • Caroline F Thorn,
  • Joan M Hebert,
  • Emidio Capriotti,
  • Sean P David,
  • Aleksandra Pavlovic,
  • Anne West,
  • Joseph V Thakuria,
  • Madeleine P Ball,
  • Alexander W Zaranek,
  • Heidi L Rehm,
  • George M Church,
  • John S West,
  • Carlos D Bustamante,
  • Michael Snyder,
  • Russ B Altman,
  • Teri E Klein,
  • Atul J Butte,
  • Euan A Ashley

DOI
https://doi.org/10.1371/journal.pgen.1002280
Journal volume & issue
Vol. 7, no. 9
p. e1002280

Abstract

Whole-genome sequencing harbors unprecedented potential for characterization of individual and family genetic variation. Here, we develop a novel synthetic human reference sequence that is ethnically concordant and use it for the analysis of genomes from a nuclear family with a history of familial thrombophilia. We demonstrate that the use of the major allele reference sequence results in improved genotype accuracy for disease-associated variant loci. We infer recombination sites to the lowest median resolution demonstrated to date (< 1,000 base pairs). We use family inheritance state analysis to control sequencing error and inform family-wide haplotype phasing, allowing quantification of genome-wide compound heterozygosity. We develop a sequence-based methodology for Human Leukocyte Antigen typing that contributes to disease risk prediction. Finally, we advance methods for analysis of disease and pharmacogenomic risk across the coding and non-coding genome that incorporate phased variant data. We show these methods are capable of identifying multigenic risk for inherited thrombophilia and informing the appropriate pharmacological therapy. These ethnicity-specific, family-based approaches to interpretation of genetic variation are emblematic of the next generation of genetic risk assessment using whole-genome sequencing.