BMC Medical Genomics (Apr 2018)

Exact association test for small size sequencing data

  • Joowon Lee,
  • Seungyeoun Lee,
  • Jin-Young Jang,
  • Taesung Park

DOI
https://doi.org/10.1186/s12920-018-0344-z
Journal volume & issue
Vol. 11, no. S2
pp. 21 – 31

Abstract

Background: Recent statistical methods for next-generation sequencing (NGS) data have been successfully applied to identify rare genetic variants associated with certain diseases. However, the most commonly used methods (e.g., burden tests and variance-component tests) rely on large sample sizes. Because of its still-high cost, NGS data are generally restricted to small sample sizes, which most existing methods cannot analyze.

Methods: In this work, we propose a new exact association test for sequencing data that does not require a large-sample approximation and is applicable to both common and rare variants. Our method, based on the generalized Cochran-Mantel-Haenszel (GCMH) statistic, was applied to NGS datasets from intraductal papillary mucinous neoplasm (IPMN) patients. IPMN is a unique pancreatic cancer subtype that can turn into an invasive and hard-to-treat metastatic disease.

Results: Application of our method to IPMN data successfully identified susceptibility genes associated with progression of IPMN to pancreatic cancer.

Conclusions: Our method is expected to identify disease-associated genetic variants, and the corresponding signaling pathways, more successfully, improving our understanding of the etiology and prognosis of specific diseases.

Keywords