PLoS Genetics (Aug 2006)
Regional admixture mapping and structured association testing: conceptual unification and an extensible general linear model.
Abstract
Individual genetic admixture estimates, determined both across the genome and at specific genomic regions, have been proposed for use in regional admixture mapping (RAM) to identify genomic regions harboring loci that influence phenotypes. Estimates of individual ancestry can also be used in structured association tests (SAT) to reduce confounding induced by various forms of population substructure. Although RAM and SAT have been presented as two distinct approaches, we provide a conceptual framework in which both are special cases of a more general linear model. We clarify which variables are sufficient to condition upon in order to prevent spurious associations and also provide a simple closed-form "semiparametric" method of evaluating the reliability of individual admixture estimates. An estimate of this reliability is required to make an inherent errors-in-variables problem tractable. Casting RAM and SAT methods as a general linear model offers enormous flexibility, enabling application to a rich set of phenotypes, populations, covariates, and situations, including interaction terms and multilocus models. This approach should allow far wider use of RAM and SAT, often using standard software, in addressing admixture as either a confounder of association studies or a tool for finding loci influencing complex phenotypes in species as diverse as plants, humans, and nonhuman animals.
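As a rough illustration of the unifying idea, the sketch below fits SAT-style and RAM-style tests with the same least-squares routine, changing only which regressor is tested. It is not the authors' model: the simulated two-population admixture, the allele frequencies, the effect sizes, the helper name `fit_linear_model`, and the choice to condition on individual ancestry alone are all assumptions made for the example, and individual ancestry is treated as known without error.

```python
import numpy as np

def fit_linear_model(y, *predictors):
    """Ordinary least squares with an intercept (hypothetical helper).

    Returns coefficients ordered as (intercept, predictor_1, predictor_2, ...).
    """
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Simulated data; every number here is an illustrative assumption.
rng = np.random.default_rng(0)
n = 5000

# Genome-wide individual ancestry: proportion from parental population 1.
q = rng.beta(2.0, 2.0, size=n)

# Regional (local) ancestry: copies (0, 1, 2) inherited from population 1
# at a region that influences the phenotype.
local_anc = rng.binomial(2, q)

# Genotype at an unlinked marker with no effect on the phenotype, whose
# allele frequency differs between parental populations (0.1 vs 0.9).
marker = rng.binomial(2, 0.1 + 0.8 * q)

# Phenotype depends on individual ancestry and on regional ancestry only.
y = 1.0 * q + 0.5 * local_anc + rng.normal(0.0, 1.0, size=n)

# Unadjusted association test: the null marker picks up a spurious effect
# because both marker and phenotype track ancestry.
naive = fit_linear_model(y, marker)

# SAT-style test: marker effect conditional on individual ancestry.
sat = fit_linear_model(y, marker, q)

# RAM-style test: regional ancestry effect conditional on individual ancestry.
ram = fit_linear_model(y, local_anc, q)

print("naive marker coefficient:   ", naive[1])
print("SAT marker coefficient:     ", sat[1])
print("RAM regional ancestry coef.:", ram[1])
```

In this simulation the unadjusted marker coefficient is clearly nonzero, the ancestry-adjusted marker coefficient is close to zero, and the regional ancestry coefficient is close to the simulated value of 0.5. If `q` were replaced by a noisy estimate, the adjustment would be incomplete, which is the errors-in-variables issue the abstract refers to.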