PLoS ONE (Jan 2014)

Rare variant association testing by adaptive combination of P-values.

  • Wan-Yu Lin,
  • Xiang-Yang Lou,
  • Guimin Gao,
  • Nianjun Liu

DOI
https://doi.org/10.1371/journal.pone.0085728
Journal volume & issue
Vol. 9, no. 1
p. e85728

Abstract

With the development of next-generation sequencing technology, there is a great demand for powerful statistical methods to detect rare variants (minor allele frequency (MAF) < 1%) associated with diseases. Testing each variant site individually is known to be underpowered, and therefore many methods have been proposed to test for the association of a group of variants with phenotypes by pooling the signals of the variants in a chromosomal region. However, this pooling strategy inevitably leads to the inclusion of a large proportion of neutral variants, which may compromise the power of association tests. To address this issue, we extend the [Formula: see text]-MidP method (Cheung et al., 2012, Genet Epidemiol 36: 675-685) and propose an approach (named 'adaptive combination of P-values for rare variant association testing', abbreviated as 'ADA') that adaptively combines per-site P-values with weights based on MAFs. Before combining P-values, we first impose a truncation threshold on the per-site P-values to guard against the noise caused by the inclusion of neutral variants. The ADA method is shown to outperform popular burden tests and non-burden tests under many scenarios. ADA is recommended for next-generation sequencing data analysis where many neutral variants may be included in a functional region.
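
The truncate-then-combine idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the candidate thresholds, and the weight 1/sqrt(MAF(1 - MAF)) are assumptions chosen for illustration, since the abstract states only that the weights are based on MAFs. The sketch computes a weighted sum of -log(P) over the sites that survive truncation. Assessing significance, and choosing among thresholds adaptively, would require a resampling step that is not shown here.

```python
import math


def truncated_weighted_score(p_values, mafs, threshold):
    """Weighted sum of -log(p) over sites whose p-value is below `threshold`.

    The weight 1 / sqrt(maf * (1 - maf)) is an assumed, illustrative choice
    that gives rarer variants more influence.
    """
    if len(p_values) != len(mafs):
        raise ValueError("p_values and mafs must have the same length")
    score = 0.0
    for p, maf in zip(p_values, mafs):
        if p >= threshold:
            continue  # truncated: site treated as likely neutral
        weight = 1.0 / math.sqrt(maf * (1.0 - maf))
        score += -weight * math.log(p)
    return score


def scores_over_thresholds(p_values, mafs, thresholds=(0.10, 0.15, 0.20)):
    """Score the same region at several hypothetical truncation thresholds."""
    return {t: truncated_weighted_score(p_values, mafs, t) for t in thresholds}
```

For example, `scores_over_thresholds([0.01, 0.5, 0.12], [0.005, 0.008, 0.002])` returns one score per threshold. The site with P = 0.5 never contributes, and the site with P = 0.12 contributes only at the 0.15 and 0.20 thresholds.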