BMC Bioinformatics (Aug 2017)

Identifying pleiotropic genes in genome-wide association studies from related subjects using the linear mixed model and Fisher combination function

  • James J. Yang,
  • L Keoki Williams,
  • Anne Buu

DOI
https://doi.org/10.1186/s12859-017-1791-9
Journal volume & issue
Vol. 18, no. 1
pp. 1 – 14

Abstract

Background: A multivariate genome-wide association test is proposed for analyzing multiple quantitative phenotypes collected from related subjects. The proposed method is a two-step approach. The first step models the association between the genotype and each phenotype marginally using a linear mixed model. The second step uses the correlation between the residuals of the linear mixed models to estimate the null distribution of the Fisher combination test statistic.

Results: The simulation results show that the proposed method controls the type I error rate and is more powerful than the marginal tests across different population structures (admixed or non-admixed) and degrees of relatedness (related or independent subjects). An analysis of data from the Study of Addiction: Genetics and Environment (SAGE) demonstrates that the multivariate association test may facilitate identification of pleiotropic genes contributing to the risk for alcohol dependence, as commonly expressed by four correlated phenotypes.

Conclusions: This study proposes a multivariate method for identifying pleiotropic genes while adjusting for cryptic relatedness and population structure between subjects. The two-step approach is not only powerful but also computationally efficient, even when the number of subjects and the number of phenotypes are both very large.
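The second step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the null distribution is estimated, so this sketch assumes a simple Monte Carlo scheme in which correlated z-scores are drawn using the residual correlation matrix. The linear mixed model of the first step is not shown; the marginal p-values and the correlation matrix are treated as given inputs. All function names and input values are hypothetical.

```python
import math
import random


def fisher_statistic(p_values):
    """Fisher combination statistic T = -2 * sum(log p_i)."""
    # Guard against log(0) for numerically tiny p-values.
    return -2.0 * sum(math.log(max(p, 1e-300)) for p in p_values)


def fisher_pvalue_independent(t, k):
    """Upper tail of a chi-square with 2k degrees of freedom.

    This closed form is the null distribution of T only when the
    k marginal tests are independent.
    """
    half = t / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(k))


def cholesky(a):
    """Lower-triangular Cholesky factor of a positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][m] * low[j][m] for m in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def fisher_pvalue_correlated(t_obs, corr, n_draws=20000, seed=0):
    """Monte Carlo p-value for T when the marginal tests are correlated.

    Assumption of this sketch: under the null, the marginal z-scores are
    multivariate normal with correlation matrix `corr` (for example, the
    correlation of the linear mixed model residuals across phenotypes).
    """
    rng = random.Random(seed)
    k = len(corr)
    low = cholesky(corr)
    exceed = 0
    for _ in range(n_draws):
        e = [rng.gauss(0.0, 1.0) for _ in range(k)]
        z = [sum(low[i][m] * e[m] for m in range(i + 1)) for i in range(k)]
        # Two-sided p-value of a standard normal z-score.
        p = [math.erfc(abs(zi) / math.sqrt(2.0)) for zi in z]
        if fisher_statistic(p) >= t_obs:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_draws + 1)


if __name__ == "__main__":
    # Hypothetical marginal p-values for four correlated phenotypes.
    p_marginal = [0.2, 0.3, 0.1, 0.4]
    # Hypothetical residual correlation matrix (exchangeable, rho = 0.9).
    corr = [[1.0 if i == j else 0.9 for j in range(4)] for i in range(4)]
    t = fisher_statistic(p_marginal)
    print("independent null:", fisher_pvalue_independent(t, 4))
    print("correlated null: ", fisher_pvalue_correlated(t, corr))
```

With positively correlated phenotypes, the independence-based chi-square reference tends to understate the combined p-value, which is why the null distribution has to account for the correlation. A per-variant Monte Carlo loop like this one would be slow at genome-wide scale and is meant only to make the idea concrete.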

Keywords