BMC Genomics (Oct 2011)

A genome-wide survey for SNPs altering microRNA seed sites identifies functional candidates in GWAS

  • Lee Yu-Chi,
  • Parnell Laurence D,
  • Lai Chao-Qiang,
  • Richardson Kris,
  • Ordovas Jose M

DOI
https://doi.org/10.1186/1471-2164-12-504
Journal volume & issue
Vol. 12, no. 1
p. 504

Abstract

Background: Gene variants within regulatory regions are thought to be major contributors to the variation of complex traits and diseases. Genome-wide association studies (GWAS) have identified scores of genetic variants that appear to contribute to human disease risk. However, most of these variants do not appear to be functional. Thus, the association may arise from still unknown mechanisms or from linkage disequilibrium (LD) with functional polymorphisms. In the present study, which focuses on functional variants related to the binding of microRNAs (miR), we used SNP data, including newly released 1000 Genomes Project data, to perform a genome-wide scan for SNPs that abrogate or create miR recognition element (MRE) seed sites (MRESS).

Results: We identified 2723 SNPs disrupting and 22295 SNPs creating MRESSs. We estimated the percentage of SNPs falling within both validated (5%) and predicted conserved (3%) MRESSs. We determined that 87 of these MRESS SNPs were listed in GWAS, or were in strong LD with a GWAS SNP, and may represent the functional variants behind the identified GWAS SNPs. Furthermore, 39 of these have evidence of co-expression of the target mRNA and the predicted miR. We also gathered previously published eQTL data supporting a functional role for four of these SNPs shown to associate with disease phenotypes. Comparison of FST statistics (a measure of population subdivision) for predicted MRESS SNPs against non-MRESS SNPs revealed a significantly higher (P = 0.0004) degree of subdivision among MRESS SNPs, suggesting a role for these SNPs in environmentally driven selection.

Conclusions: We have demonstrated the potential of publicly available resources to identify high-priority candidate SNPs for functional studies and for disease risk prediction.
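To make the idea of a SNP that abrogates or creates a seed site concrete, the following is a minimal Python sketch, not the authors' pipeline. It assumes a 7-mer seed match to miR positions 2-8 (the abstract does not state which seed definition was used), 0-based SNP coordinates on the mRNA sense strand, and DNA-alphabet input; the function names `seed_site` and `classify_snp` are hypothetical.

```python
# Illustrative sketch only: assumes a 7-mer match to miRNA positions 2-8.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def seed_site(mirna_seq):
    """Target-side 7-mer (DNA alphabet) that pairs with miRNA positions 2-8."""
    seed = mirna_seq.upper().replace("U", "T")[1:8]
    return reverse_complement(seed)


def classify_snp(utr, pos, alt, mirna_seq):
    """Classify a SNP at 0-based index `pos` of `utr` (mRNA sense strand).

    Returns "disrupts" if the seed site overlaps the SNP only with the
    reference allele, "creates" if only with the alternate allele, and
    "no_change" otherwise.
    """
    utr = utr.upper()
    site = seed_site(mirna_seq)
    k = len(site)
    alt_utr = utr[:pos] + alt.upper() + utr[pos + 1:]
    # Only windows of length k that overlap the SNP position can change.
    lo = max(0, pos - k + 1)
    hi = min(len(utr), pos + k)
    in_ref = site in utr[lo:hi]
    in_alt = site in alt_utr[lo:hi]
    if in_ref and not in_alt:
        return "disrupts"
    if in_alt and not in_ref:
        return "creates"
    return "no_change"


# Example with the let-7a mature sequence and a made-up 3'UTR fragment.
LET7A = "UGAGGUAGUAGGUUGUAUAGUU"
print(classify_snp("AAAACTACCTCAAAA", 6, "G", LET7A))
print(classify_snp("AAAACTGCCTCAAAA", 6, "A", LET7A))
```

A genome-wide scan would apply a check of this kind to every SNP in annotated 3'UTRs against every miR of interest; restricting the comparison to windows that overlap the SNP keeps each test cheap.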