Genetics and Molecular Biology (Jan 2005)
In silico characterization of microsatellites in Eucalyptus spp.: abundance, length variation and transposon associations
Abstract
This study assessed the abundance of microsatellites, or simple sequence repeats (SSRs), in 19 Eucalyptus EST libraries from FORESTs, containing cDNA sequences from five species: E. grandis, E. globulus, E. saligna, E. urophylla and E. camaldulensis. In total, 11,534 SSRs and 8,447 SSR-containing sequences (25.5% of all ESTs) were identified, with an average of 1 SSR/2.5 kb when all motifs were considered and 1 SSR/3.1 kb when mononucleotides were excluded. Dimeric repeats were the most abundant (41.03%), followed by trimerics (36.11%) and monomerics (19.59%). The most frequent motifs were A/T (87.24%) for monomerics, AG/CT (94.44%) for dimerics, CCG/CGG (37.87%) for trimerics, AAGG/CCTT (18.75%) for tetramerics, AGAGG/CCTCT (14.04%) for pentamerics and ACGGCG/CGCCGT (6.30%) for hexamerics. In terms of sequence length, Class II, or potentially variable, markers were the most common, followed by Class III. Two sequences showed high similarity to previously published Eucalyptus sequences in the NCBI database, EMBRA_72 and EMBRA_122. A local blastn search for transposons, with a cut-off value of 10⁻⁵⁰, did not reveal any transposable elements. The large number of microsatellites identified will contribute to the refinement of marker-assisted mapping and to the discovery of novel markers for virtually all genes of economic interest.
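As an illustration of the kind of in silico scan summarized above, the following Python sketch finds perfect SSRs with motifs of one to six bases and groups each motif with its rotations and reverse complement (so that AG, GA, CT and TC are all counted as AG/CT). It is a minimal sketch and not the pipeline used in the study: the minimum repeat counts in `MIN_REPEATS` and the function names are assumptions chosen for the example.

```python
import re

# Hypothetical minimum repeat counts per motif length (assumed for
# illustration; the abstract does not state the thresholds used).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_motif(motif):
    """Smallest rotation of the motif or its reverse complement.

    This groups equivalent motifs, e.g. AG, GA, CT and TC all map to AG.
    """
    candidates = []
    for m in (motif, revcomp(motif)):
        for i in range(len(m)):
            candidates.append(m[i:] + m[:i])
    return min(candidates)


def is_compound(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. AGAG)."""
    n = len(motif)
    return any(
        motif == motif[:k] * (n // k)
        for k in range(1, n)
        if n % k == 0
    )


def find_ssrs(seq):
    """Return (start, end, canonical motif, repeat count) for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_compound(motif):
                continue  # already reported under the shorter motif
            repeats = (m.end() - m.start()) // unit_len
            hits.append((m.start(), m.end(), canonical_motif(motif), repeats))
    return hits


if __name__ == "__main__":
    example = "GGC" + "AG" * 8 + "TTC" + "CCG" * 5 + "A"
    for start, end, motif, repeats in find_ssrs(example):
        print(start, end, motif, repeats)
```

A density figure of the form "1 SSR per N kb" would then follow by dividing the total number of bases scanned (in kb) by the number of hits across all sequences.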
Keywords