PLoS ONE (Jan 2013)
Microsatellite tandem repeats are abundant in human promoters and are associated with regulatory elements.
Abstract
Tandem repeats are genomic elements that are prone to changes in repeat number and are thus often polymorphic. These sequences are found at high density at the start of human genes, in gene promoters. Increasing empirical evidence suggests that length variation in these tandem repeats can affect gene regulation. Microsatellites, one class of tandem repeats, change rapidly in repeat number. Some of the genetic variation induced by microsatellites is known to result in phenotypic variation. Recently, our group developed a novel method for measuring the evolutionary conservation of microsatellites, and with it we discovered that human microsatellites near transcription start sites are often highly conserved. In this study, we examined the properties of microsatellites found in promoters. We found a high density of microsatellites at the start of genes. Using a wavelet analysis, which allowed us to test for associations on multiple scales and to control for other promoter-related elements, we showed that microsatellites are statistically associated with promoters. Because promoter microsatellites tend to be G/C-rich, we hypothesized that G/C-rich regulatory elements may drive the association between microsatellites and promoters. Our results indicate that CpG islands, G-quadruplexes (G4) and untranslated regulatory regions have highly significant associations with microsatellites, but controlling for these elements in the analysis does not remove the association between microsatellites and promoters. Given the intrinsic lability of promoter microsatellites and their overlap with predicted functional elements, these results suggest that many of them have the potential to affect human phenotypes by generating mutations in regulatory elements, which may ultimately result in disease. We discuss the potential functions of human promoter microsatellites in this context.
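
To make the notion of a microsatellite concrete, the sketch below scans a DNA string for perfect tandem repeats and reports the G/C fraction of each repeated motif. It is an illustration only, not the detection method used in the study: the function names, the motif length range of 1 to 6 bp, and the minimum of four copies are assumptions chosen for the example.

```python
import re

# Assumed thresholds for illustration (not taken from the study):
# a motif of 1-6 bp repeated at least MIN_COPIES times in tandem.
MIN_COPIES = 4

# Lazy motif capture so the shortest repeating unit is reported,
# followed by at least MIN_COPIES - 1 further exact copies.
_REPEAT = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (MIN_COPIES - 1))


def find_microsatellites(seq):
    """Return (start, end, motif, copies) for each perfect tandem repeat."""
    hits = []
    for match in _REPEAT.finditer(seq.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), match.end(), motif, copies))
    return hits


def gc_fraction(motif):
    """Fraction of G and C bases in a motif."""
    return sum(base in "GC" for base in motif) / len(motif)


if __name__ == "__main__":
    for start, end, motif, copies in find_microsatellites("TTACGCGCGCGCGATT"):
        print(start, end, motif, copies, gc_fraction(motif))
```

Coordinates are 0-based and half-open, so a hit can be sliced directly as `seq[start:end]`. Imperfect repeats, which real annotation tools usually tolerate, are not handled here.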