PLoS Genetics (Jan 2013)

Fusion of large-scale genomic knowledge and frequency data computationally prioritizes variants in epilepsy.

  • Ian M Campbell,
  • Mitchell Rao,
  • Sean D Arredondo,
  • Seema R Lalani,
  • Zhilian Xia,
  • Sung-Hae L Kang,
  • Weimin Bi,
  • Amy M Breman,
  • Janice L Smith,
  • Carlos A Bacino,
  • Arthur L Beaudet,
  • Ankita Patel,
  • Sau Wai Cheung,
  • James R Lupski,
  • Paweł Stankiewicz,
  • Melissa B Ramocki,
  • Chad A Shaw

DOI
https://doi.org/10.1371/journal.pgen.1003797
Journal volume & issue
Vol. 9, no. 9
p. e1003797

Abstract

Curation and interpretation of copy number variants identified by genome-wide testing are challenged by the large number of events harbored in each personal genome. Conventional determination of phenotypic relevance relies on patterns of higher frequency in affected individuals versus controls; however, an increasing amount of ascertained variation is rare or private to clans. Consequently, frequency data have less utility for distinguishing pathogenic from benign variants. One solution is disease-specific algorithms that leverage gene knowledge together with variant frequency to aid prioritization. We used large-scale resources including Gene Ontology, protein-protein interactions and other annotation systems together with a broad set of 83 genes with known associations to epilepsy to construct a pathogenicity score for the phenotype. We evaluated the score for all annotated human genes and applied Bayesian methods to combine the derived pathogenicity score with frequency information from our diagnostic laboratory. Analysis determined Bayes factors and posterior distributions for each gene. We applied our method to subjects with abnormal chromosomal microarray results and confirmed epilepsy diagnoses gathered by electronic medical record review. Genes deleted in our subjects with epilepsy had significantly higher pathogenicity scores and Bayes factors compared to genes deleted in subjects referred for non-neurologic indications. We also applied our scores to identify a recently validated epilepsy gene in a complex genomic region and to reveal candidate genes for epilepsy. We propose a potential use in clinical decision support for our results in the context of genome-wide screening. Our approach demonstrates the utility of integrative data in medical genomics.
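The abstract does not specify the statistical model, so the following is only an illustrative sketch of how a knowledge-based prior and frequency data might be combined for a single gene. It assumes a beta-binomial comparison of deletion counts in cases versus controls and a prior probability derived from the pathogenicity score by some mapping not shown here. All function names, parameters, and counts are hypothetical and are not taken from the paper.

```python
from math import lgamma, exp

def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_bayes_factor(case_del, case_n, ctrl_del, ctrl_n, a=1.0, b=1.0):
    """Log Bayes factor for H1 versus H0 (assumed model, not the paper's).

    H1: cases and controls have separate deletion rates.
    H0: cases and controls share one deletion rate.
    Each rate gets a Beta(a, b) prior; binomial coefficients cancel in the ratio.
    """
    log_m1 = (log_beta(a + case_del, b + case_n - case_del) - log_beta(a, b)
              + log_beta(a + ctrl_del, b + ctrl_n - ctrl_del) - log_beta(a, b))
    total_del = case_del + ctrl_del
    total_n = case_n + ctrl_n
    log_m0 = log_beta(a + total_del, b + total_n - total_del) - log_beta(a, b)
    return log_m1 - log_m0

def posterior_probability(prior_prob, log_bf):
    """Update a prior probability (hypothetically derived from the
    pathogenicity score) using posterior odds = prior odds * Bayes factor."""
    prior_odds = prior_prob / (1.0 - prior_prob)
    post_odds = prior_odds * exp(log_bf)
    return post_odds / (1.0 + post_odds)

# Hypothetical counts: a gene deleted in 10 of 100 cases and 1 of 1000 controls.
log_bf = log_bayes_factor(case_del=10, case_n=100, ctrl_del=1, ctrl_n=1000)
posterior = posterior_probability(prior_prob=0.2, log_bf=log_bf)
```

With no observations the Bayes factor is 1 and the posterior equals the prior, which is the property that matters for rare or private variants: when frequency data are uninformative, the ranking falls back on the knowledge-based score.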