PLoS ONE (Jan 2013)
Considerations for subgroups and phenocopies in complex disease genetics.
Abstract
The genetic variants identified so far as associated with complex diseases cannot fully explain heritability. This may be partly due to more complicated patterns of predisposition than previously suspected. Diseases such as multiple sclerosis (MS) may involve multiple disease-causing mechanisms, each comprising several elements. We describe how the effect of subgroups can be calculated using the standard measure of association, the odds ratio, which is then manipulated to provide a formula for the true underlying association present within the subgroup. This formula is sensitive to the initial minor allele frequencies present in both cases and the subgroup of patients. The methodology is then extended to the χ² statistic for two related scenarios. First, to determine the true χ² when phenocopies or disease subtypes reduce association and are reclassified as controls when calculating statistics. Here, the χ² is given by (1 + σ(a + b)/(c + d))/(1 - σ), or (1 + σ)/(1 - σ) for equal numbers of cases and controls. Second, when subgroups corresponding to heterogeneity mask the true effect size, but no reclassification is made. Here, the proportional increase in total sample size required to attain the same χ² statistic as the subgroup is given as γ = (1 - σ/2)/((1 - σ)(1 - σc/(a + c))(1 - σd/(b + d))), and a Python script to calculate and plot this value is provided at kirc.se. Practical examples show how, in a study of modest size (1000 cases and 1000 controls), a non-significant SNP may exceed genome-wide significance when corresponding to a subgroup of 20% of cases, and may occur in heterozygous form in all cases. This methodology may explain the modest association found in diseases such as MS, wherein heterogeneity confounds straightforward measurement of association.
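
The two closed-form expressions in the abstract can be evaluated directly. The following is a minimal sketch, not the authors' script from kirc.se. The abstract does not define its symbols, so the code assumes that σ is a subgroup or phenocopy proportion with 0 ≤ σ < 1 and that a, b, c, d are the four cell counts of the 2×2 association table. The function names are hypothetical.

```python
def chi2_expression_reclassified(sigma, a, b, c, d):
    """Evaluate (1 + sigma*(a + b)/(c + d)) / (1 - sigma) as written in the abstract.

    Assumption: sigma is a proportion in [0, 1) and a, b, c, d are the
    cell counts of the 2x2 table; the abstract does not define them.
    """
    if not 0 <= sigma < 1:
        raise ValueError("sigma must satisfy 0 <= sigma < 1")
    return (1 + sigma * (a + b) / (c + d)) / (1 - sigma)


def gamma_sample_size_increase(sigma, a, b, c, d):
    """Evaluate gamma = (1 - sigma/2) /
    ((1 - sigma) * (1 - sigma*c/(a + c)) * (1 - sigma*d/(b + d)))
    as written in the abstract, under the same assumptions as above.
    """
    if not 0 <= sigma < 1:
        raise ValueError("sigma must satisfy 0 <= sigma < 1")
    denominator = (
        (1 - sigma)
        * (1 - sigma * c / (a + c))
        * (1 - sigma * d / (b + d))
    )
    return (1 - sigma / 2) / denominator


if __name__ == "__main__":
    # Illustrative counts only; these are not data from the paper.
    a, b, c, d = 300, 700, 250, 750
    for sigma in (0.0, 0.2, 0.5):
        print(
            sigma,
            chi2_expression_reclassified(sigma, a, b, c, d),
            gamma_sample_size_increase(sigma, a, b, c, d),
        )
```

When a + b equals c + d, the first function reduces to (1 + σ)/(1 - σ), matching the equal-sized case stated in the abstract. Both expressions equal 1 at σ = 0.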