PLoS ONE (Jan 2019)

Abundance of ethnically biased microsatellites in human gene regions.

  • Nick Kinney,
  • Lin Kang,
  • Laurel Eckstrand,
  • Arichanah Pulenthiran,
  • Peter Samuel,
  • Ramu Anandakrishnan,
  • Robin T Varghese,
  • P Michalak,
  • Harold R Garner

DOI
https://doi.org/10.1371/journal.pone.0225216
Journal volume & issue
Vol. 14, no. 12
p. e0225216

Abstract

Microsatellites, a type of short tandem repeat (STR), have been used for decades as putatively neutral markers to study the genetic structure of diverse human populations. However, recent studies have demonstrated that some microsatellites contribute to gene expression, cis heritability, and phenotype. As a corollary, some microsatellites may contribute to differential gene expression and RNA/protein structure stability in distinct human populations. To test this hypothesis, we investigate genotype frequencies, functional relevance, and adaptive potential of microsatellites in five super-populations (ethnicities) drawn from the 1000 Genomes Project. We discover 3,984 ethnically biased microsatellite loci (EBML); for each EBML, at least one ethnicity has genotype frequencies statistically different from the remaining four. South Asian, East Asian, European, and American EBML show significant overlap; in contrast, the set of African EBML is mostly unique. We cross-reference the 3,984 EBML with 2,060 previously identified expression STRs (eSTRs); repeats known to affect gene expression (64 total) are over-represented. The most significant pathway enrichments are those associated with the matrisome: a broad collection of genes encoding the extracellular matrix and its associated proteins. At least 14 of the EBML have established links to human disease. Analysis of the 3,984 EBML with respect to known selective sweep regions in the genome shows that allelic variation in some of them is likely associated with adaptive evolution.
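
The EBML criterion stated above (one ethnicity whose genotype frequencies differ from the remaining four) can be illustrated with a one-vs-rest comparison at a single locus. The sketch below is not the authors' pipeline: the abstract does not name the statistical test, so a Pearson chi-square statistic is assumed here purely for illustration. The function name `one_vs_rest_chi2`, the three genotype classes, and all counts are hypothetical; only the super-population labels (AFR, AMR, EAS, EUR, SAS) follow 1000 Genomes Project usage.

```python
from typing import Dict, List, Tuple


def one_vs_rest_chi2(counts: Dict[str, List[int]], target: str) -> Tuple[float, int]:
    """Pearson chi-square statistic comparing one population's genotype
    counts with the pooled counts of every other population.

    Assumption: this test is an illustrative stand-in; the abstract does
    not say which test was used to call a locus ethnically biased.
    """
    focal = counts[target]
    # Pool genotype counts across all populations except the target.
    others = [v for k, v in counts.items() if k != target]
    rest = [sum(col) for col in zip(*others)]

    table = [focal, rest]  # 2 x G contingency table
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:  # skip genotype classes absent everywhere
                stat += (observed - expected) ** 2 / expected

    # Degrees of freedom for a 2 x G table: (2 - 1) * (G - 1),
    # counting only genotype classes that were actually observed.
    dof = sum(1 for c in col_totals if c > 0) - 1
    return stat, dof


# Hypothetical genotype counts at one locus (made-up numbers).
locus_counts = {
    "AFR": [60, 30, 10],
    "AMR": [20, 50, 30],
    "EAS": [20, 50, 30],
    "EUR": [20, 50, 30],
    "SAS": [20, 50, 30],
}

for population in locus_counts:
    stat, dof = one_vs_rest_chi2(locus_counts, population)
    print(f"{population}: chi2 = {stat:.2f}, dof = {dof}")
```

In this made-up example the AFR row yields the largest statistic because its genotype distribution departs most from the pooled remainder. A real analysis would also need a p-value and a correction for testing many loci, neither of which is attempted here.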