HGG Advances (Jan 2023)

High-quality read-based phasing of cystic fibrosis cohort informs genetic understanding of disease modification

  • Scott Mastromatteo,
  • Angela Chen,
  • Jiafen Gong,
  • Fan Lin,
  • Bhooma Thiruvahindrapuram,
  • Wilson W.L. Sung,
  • Joe Whitney,
  • Zhuozhi Wang,
  • Rohan V. Patel,
  • Katherine Keenan,
  • Anat Halevy,
  • Naim Panjwani,
  • Julie Avolio,
  • Cheng Wang,
  • Guillaume Côté-Maurais,
  • Stéphanie Bégin,
  • Damien Adam,
  • Emmanuelle Brochiero,
  • Candice Bjornson,
  • Mark Chilvers,
  • April Price,
  • Michael Parkins,
  • Richard van Wylick,
  • Dimas Mateos-Corral,
  • Daniel Hughes,
  • Mary Jane Smith,
  • Nancy Morrison,
  • Elizabeth Tullis,
  • Anne L. Stephenson,
  • Pearce Wilcox,
  • Bradley S. Quon,
  • Winnie M. Leung,
  • Melinda Solomon,
  • Lei Sun,
  • Felix Ratjen,
  • Lisa J. Strug

Journal volume & issue
Vol. 4, no. 1
p. 100156

Abstract

Summary: Phasing of heterozygous alleles is critical for interpreting the cis-effects of disease-relevant variation. We sequenced 477 individuals with cystic fibrosis (CF) using linked-read sequencing; the resulting samples display an average phase block N50 of 4.39 Mb. We use these samples to construct a graph representation of CFTR haplotypes and demonstrate its utility for understanding complex CF alleles. The haplotypes are visualized in a web app, CFTbaRcodes, that enables interactive exploration of the CFTR haplotypes present in this cohort. We perform fine-mapping and phasing of the chr7q35 trypsinogen locus associated with CF meconium ileus, an intestinal obstruction at birth associated with more severe CF outcomes and pancreatic disease. A 20-kb deletion polymorphism and a PRSS2 missense variant, p.Thr8Ile (rs62473563), are shown to contribute independently to meconium ileus risk (p = 0.0028 and p = 0.011, respectively) and are PRSS2 pancreas eQTLs (p = 9.5 × 10⁻⁷ and p = 1.4 × 10⁻⁴, respectively), suggesting the mechanism by which these polymorphisms contribute to CF. The phase information from linked reads provides a putative causal explanation for variation at a CF-relevant locus. It also has implications for the genetic basis of non-CF pancreatitis, to which this locus has been reported to contribute.
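
For readers unfamiliar with the phase block N50 statistic quoted above, the following is a minimal sketch of how it can be computed for one sample. It assumes the conventional N50 definition (the block length at which blocks of that length or longer cover at least half of the total phased length). The abstract does not state the authors' exact definition or pipeline, and the function name and example lengths are hypothetical.

```python
def phase_block_n50(block_lengths):
    """Return the N50 of a collection of phase block lengths (in bp).

    Assumes the conventional definition: sort blocks from longest to
    shortest and return the length of the block at which the running
    total first reaches half of the summed block length.
    """
    lengths = sorted(block_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one phase block")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical phase block lengths for a single sample, in base pairs.
example_blocks = [6_000_000, 4_000_000, 3_000_000, 2_000_000, 1_000_000]
print(phase_block_n50(example_blocks))  # 4000000
```

A cohort-level figure such as the reported average would then be the mean of the per-sample values, again under the assumption that the statistic is computed per sample.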

Keywords