A scan statistic to extract causal gene clusters from case-control genome-wide rare CNV data

Scherer Stephen W; Pinto Dalila; Tango Toshiro; Takahashi Kunihiko; Nishiyama Takeshi; Takami Satoshi; Kishino Hirohisa

doi:10.1186/1471-2105-12-205

BMC Bioinformatics (May 2011)

Journal volume & issue: Vol. 12, no. 1, p. 205

Abstract

Background: Several statistical tests have been developed for analyzing genome-wide association data by incorporating gene pathway information in terms of gene sets. Using these methods, hundreds of gene sets are typically tested, and the tested gene sets often overlap. This overlapping greatly increases the probability of generating false positives, and the results obtained are difficult to interpret, particularly when many gene sets show statistical significance.

Results: We propose a flexible statistical framework to circumvent these problems. Inspired by spatial scan statistics for detecting clustering of disease occurrence in the field of epidemiology, we developed a scan statistic to extract disease-associated gene clusters from a whole gene pathway. Extracting one or a few significant gene clusters from a global pathway limits the overall false positive probability, which results in increased statistical power, and facilitates the interpretation of test results. In the present study, we applied our method to genome-wide association data for rare copy-number variations, which have been strongly implicated in common diseases. Application of our method to a simulated dataset demonstrated the high accuracy of this method in detecting disease-associated gene clusters in a whole gene pathway.

Conclusions: The scan statistic approach proposed here shows a high level of accuracy in detecting gene clusters in a whole gene pathway. This study has provided a sound statistical framework for analyzing genome-wide rare CNV data by incorporating topological information on the gene pathway.
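
The abstract does not spell out the statistic itself, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a Kulldorff-style Bernoulli likelihood ratio, a toy six-gene pathway, made-up CNV event counts, and a Monte Carlo permutation test; every name in it (`bernoulli_llr`, `connected_subsets`, `scan`, `p_value`, the genes A to F) is hypothetical.

```python
import math
import random

def xlogy(x, y):
    # x * log(y), with the convention 0 * log(0) = 0
    return 0.0 if x == 0 else x * math.log(y)

def bernoulli_llr(c, n, C, N):
    """Log-likelihood ratio for a cluster holding c case events out of n,
    given C case events out of N overall (assumed Bernoulli scan model)."""
    if n == 0 or n == N:
        return 0.0
    if c / n <= (C - c) / (N - n):  # only clusters enriched for cases count
        return 0.0
    def ll(k, m):
        p = k / m
        return xlogy(k, p) + xlogy(m - k, 1 - p)
    return ll(c, n) + ll(C - c, N - n) - ll(C, N)

def connected_subsets(adj, max_size):
    """Enumerate connected gene sets with at most max_size genes."""
    seen = set()
    frontier = [frozenset([g]) for g in adj]
    while frontier:
        grown = []
        for s in frontier:
            if s in seen:
                continue
            seen.add(s)
            if len(s) < max_size:
                for g in s:
                    for nb in adj[g]:
                        if nb not in s:
                            grown.append(s | {nb})
        frontier = grown
    return seen

def scan(adj, events, max_size=3, subsets=None):
    """events is a list of (gene, is_case). Returns (max LLR, best cluster)."""
    subsets = subsets or connected_subsets(adj, max_size)
    C = sum(is_case for _, is_case in events)
    N = len(events)
    cases = {g: 0 for g in adj}
    totals = {g: 0 for g in adj}
    for g, is_case in events:
        totals[g] += 1
        cases[g] += is_case
    best = (0.0, frozenset())
    for s in subsets:
        c = sum(cases[g] for g in s)
        n = sum(totals[g] for g in s)
        llr = bernoulli_llr(c, n, C, N)
        if llr > best[0]:
            best = (llr, s)
    return best

def p_value(adj, events, max_size=3, n_perm=199, seed=0):
    """Monte Carlo p-value: shuffle case/control labels, rescan each time."""
    rng = random.Random(seed)
    subsets = connected_subsets(adj, max_size)
    observed, cluster = scan(adj, events, max_size, subsets)
    genes = [g for g, _ in events]
    labels = [is_case for _, is_case in events]
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        stat, _best = scan(adj, list(zip(genes, labels)), max_size, subsets)
        exceed += stat >= observed
    return cluster, observed, (exceed + 1) / (n_perm + 1)

# Toy pathway (a chain A-B-C-D-E-F) with invented CNV events.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"],
       "D": ["C", "E"], "E": ["D", "F"], "F": ["E"]}
events = ([("B", True)] * 8 + [("B", False)] + [("C", True)] * 7 + [("C", False)]
          + [(g, True) for g in "ADEF" for _ in range(2)]
          + [(g, False) for g in "ADEF" for _ in range(4)])

cluster, llr, p = p_value(adj, events)
print(sorted(cluster), round(llr, 2), p)
```

Because only the single maximum over all candidate clusters is compared with its permutation distribution, one test covers the whole pathway, which is the property the abstract credits with limiting the overall false positive probability.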

Published in BMC Bioinformatics

ISSN: 1471-2105 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Biology (General)
Website: http://www.biomedcentral.com/bmcbioinformatics/