PLoS ONE (Jan 2017)

DoEstRare: A statistical test to identify local enrichments in rare genomic variants associated with disease.

  • Elodie Persyn,
  • Matilde Karakachoff,
  • Solena Le Scouarnec,
  • Camille Le Clézio,
  • Dominique Campion,
  • French Exome Consortium,
  • Jean-Jacques Schott,
  • Richard Redon,
  • Lise Bellanger,
  • Christian Dina

DOI
https://doi.org/10.1371/journal.pone.0179364
Journal volume & issue
Vol. 12, no. 7
p. e0179364

Abstract

Next-generation sequencing technologies have made it possible to assay the effect of rare variants on complex diseases. As an extension of the "common disease-common variant" paradigm, rare variant studies are necessary to gain a more complete insight into the genetic architecture of human traits. Association studies of these rare variants pose new challenges for statistical analysis. Because of their low frequency, rare variants must be tested in groups. This approach is hindered by the fact that an unknown proportion of the variants may be neutral. The risk level of a rare variant may be determined by its impact but also by its position in the protein sequence. More generally, the molecular mechanisms underlying the disease architecture may involve specific protein domains or intergenic regulatory regions. While a large variety of methods optimize functionality weights for each single marker, few evaluate differences in variant position between cases and controls. Here, we propose a test called DoEstRare, which aims to simultaneously detect clusters of disease risk variants and global allele frequency differences in genomic regions. This test estimates, for cases and controls, variant position densities in the genetic region by a kernel method, weighted by a function of allele frequencies. We compared DoEstRare with previously published strategies through simulation studies as well as re-analysis of real datasets. In simulations under various scenarios, DoEstRare was the only test to consistently show the highest performance in terms of type I error and power, whether or not variants were clustered. DoEstRare was also applied to Brugada syndrome and early-onset Alzheimer's disease data and provided results complementary to those of existing tests. By integrating variant position information, DoEstRare offers new opportunities to explain disease susceptibility. DoEstRare is implemented in a user-friendly R package.
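
To make the idea of frequency-weighted position densities concrete, the following is a minimal Python sketch of a test of this general kind. It is not the DoEstRare statistic and not the API of the R package: the abstract does not specify the kernel, the weight function, the distance between the case and control curves, or how significance is assessed. The Gaussian kernel, the use of the raw allele frequency as the weight, the integrated absolute difference, the label-permutation p-value, and the function names `weighted_profile`, `position_statistic`, and `permutation_p_value` are all assumptions made for illustration.

```python
import numpy as np


def weighted_profile(positions, weights, grid, bandwidth):
    """Sum of Gaussian kernels centred on variant positions, scaled by weights.

    Assumption: the curve is left unnormalised, so its integral equals the
    sum of the weights and an overall frequency difference is preserved.
    """
    positions = np.asarray(positions, dtype=float)
    weights = np.asarray(weights, dtype=float)
    z = (grid[:, None] - positions[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernel @ weights


def position_statistic(genotypes, is_case, positions, grid, bandwidth):
    """Integrated absolute difference between case and control profiles.

    genotypes: (n_individuals, n_variants) array of allele counts (0, 1, 2).
    Assumption: each variant is weighted by its allele frequency in the group.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    case_freq = genotypes[is_case].mean(axis=0) / 2.0
    ctrl_freq = genotypes[~is_case].mean(axis=0) / 2.0
    diff = (weighted_profile(positions, case_freq, grid, bandwidth)
            - weighted_profile(positions, ctrl_freq, grid, bandwidth))
    step = grid[1] - grid[0]  # the grid is assumed to be evenly spaced
    return float(np.sum(np.abs(diff)) * step)


def permutation_p_value(genotypes, is_case, positions, grid, bandwidth,
                        n_perm=999, seed=0):
    """Empirical p-value obtained by shuffling case/control labels."""
    rng = np.random.default_rng(seed)
    is_case = np.asarray(is_case, dtype=bool)
    observed = position_statistic(genotypes, is_case, positions, grid, bandwidth)
    hits = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(is_case)
        permuted = position_statistic(genotypes, shuffled, positions, grid, bandwidth)
        if permuted >= observed:
            hits += 1
    # Adding one to both counts keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

In this sketch, a cluster of variants carried mainly by cases produces a local bump in the case curve, while a region-wide excess of rare alleles raises the whole curve; both increase the statistic, which mirrors the two signals the abstract says the test targets. The bandwidth controls how close two variants must be to count as a cluster and would need to be chosen for the region under study.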