PLoS ONE (Jan 2009)
How many genetic variants remain to be discovered?
Abstract
A great majority of genetic markers discovered in recent genome-wide association studies have small effect sizes, and they explain only a small fraction of the genetic contribution to the diseases. How many more variants can we expect to discover and what study sizes are needed? We derive the connection between the cumulative risk of the SNP variants to the latent genetic risk model and heritability of the disease. We determine the sample size required for case-control studies in order to achieve a certain expected number of discoveries in a collection of most significant SNPs. Assuming similar allele frequencies and effect sizes of the currently validated SNPs, complex phenotypes such as type-2 diabetes would need approximately 800 variants to explain its 40% heritability. Much smaller numbers of variants are needed if we assume rare-variants but higher penetrance models. We estimate that up to 50,000 cases and an equal number of controls are needed to discover 800 common low-penetrant variants among the top 5000 SNPs. Under common and rare low-penetrance models, the very large studies required to discover the numerous variants are probably at the limit of practical feasibility. Under rare-variant with medium- to high-penetrance models (odds-ratios between 1.6 and 4.0), studies comparable in size to many existing studies are adequate provided the genotyping technology can interrogate more and rarer variants.