Comparative Genomics Analysis of Repetitive Elements in Ten Gymnosperm Species: “Dark Repeatome” and Its Abundance in Conifer and Gnetum Species
Avi Titievsky,
Yuliya A. Putintseva,
Elizaveta A. Taranenko,
Sofya Baskin,
Natalia V. Oreshkova,
Elia Brodsky,
Alexandra V. Sharova,
Vadim V. Sharov,
Julia Panov,
Dmitry A. Kuzmin,
Leonid Brodsky,
Konstantin V. Krutovsky
Affiliations
Avi Titievsky
Tauber Bioinformatics Research Center, University of Haifa, Haifa 3498838, Israel
Yuliya A. Putintseva
Laboratory of Forest Genomics, Genome Research and Education Center, Institute of Fundamental Biology and Biotechnology, Siberian Federal University, 660036 Krasnoyarsk, Russia
Elizaveta A. Taranenko
Tauber Bioinformatics Research Center, University of Haifa, Haifa 3498838, Israel
Sofya Baskin
Tauber Bioinformatics Research Center, University of Haifa, Haifa 3498838, Israel
Natalia V. Oreshkova
Laboratory of Forest Genomics, Genome Research and Education Center, Institute of Fundamental Biology and Biotechnology, Siberian Federal University, 660036 Krasnoyarsk, Russia
Elia Brodsky
Pine Biotech Inc., New Orleans, LA 70112, USA
Alexandra V. Sharova
Tauber Bioinformatics Research Center, University of Haifa, Haifa 3498838, Israel
Vadim V. Sharov
Tauber Bioinformatics Research Center, University of Haifa, Haifa 3498838, Israel
Julia Panov
Tauber Bioinformatics Research Center, University of Haifa, Haifa 3498838, Israel
Dmitry A. Kuzmin
Department of High Performance Computing, Institute of Space and Information Technologies, Siberian Federal University, 660074 Krasnoyarsk, Russia
Leonid Brodsky
Tauber Bioinformatics Research Center, University of Haifa, Haifa 3498838, Israel
Konstantin V. Krutovsky
Laboratory of Forest Genomics, Genome Research and Education Center, Institute of Fundamental Biology and Biotechnology, Siberian Federal University, 660036 Krasnoyarsk, Russia
Repetitive elements (REs) and transposable elements (TEs) can comprise up to 80% of some plant genomes and may be essential for regulating their evolution and adaptation. Information on the “repeatome” is often unavailable in assembled genomes because repeat-rich genomic regions are challenging to assemble and are often missing from the final assembly. However, raw genomic sequencing data contain rich information about REs and TEs. Here, raw genomic next-generation sequencing (NGS) reads of 10 gymnosperm species were studied for the content and abundance patterns of their “repeatome”. We combined alignment of reads against databases of repetitive elements with de novo assembly of highly repetitive sequences from the same reads to characterize and quantify the abundance of known and putative repetitive elements in the genomes of these 10 species: Pinus taeda, Pinus sylvestris, Pinus sibirica, Picea glauca, Picea abies, Abies sibirica, Larix sibirica, Juniperus communis, Taxus baccata, and Gnetum gnemon. We found that the genomic abundances of known and newly discovered putative repeats are specific to groups of phylogenetically close species and correspond to biological taxa. The grouping of species based on abundances of known repeats closely matches the grouping based on abundances of newly discovered putative repeats (kChains), and both match known taxonomic relationships.
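The idea of grouping species by repeat abundance can be illustrated with a minimal sketch. It is not the authors' pipeline. The sample names, family names, and read counts are invented placeholders, and the choice of a Pearson-correlation distance with a nearest-neighbour check is an assumption made only for illustration. The sketch converts per-family read counts into relative abundances and reports, for each sample, the sample with the most similar abundance profile.

```python
from math import sqrt


def relative_abundance(counts):
    """Convert raw read counts per repeat family into fractions of the total."""
    total = sum(counts.values())
    return {family: n / total for family, n in counts.items()}


def pearson_distance(a, b):
    """Return 1 - Pearson correlation between two abundance profiles."""
    families = sorted(set(a) | set(b))
    x = [a.get(f, 0.0) for f in families]
    y = [b.get(f, 0.0) for f in families]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sd_x = sqrt(sum((xi - mean_x) ** 2 for xi in x))
    sd_y = sqrt(sum((yi - mean_y) ** 2 for yi in y))
    if sd_x == 0 or sd_y == 0:
        return 1.0  # flat profile: treat as uncorrelated
    return 1.0 - cov / (sd_x * sd_y)


def nearest_neighbours(profiles):
    """Map each sample to the other sample with the smallest profile distance."""
    result = {}
    for sample in profiles:
        others = [other for other in profiles if other != sample]
        result[sample] = min(
            others,
            key=lambda other: pearson_distance(profiles[sample], profiles[other]),
        )
    return result


# Hypothetical toy data (NOT from the study): reads assigned to three
# placeholder repeat families in four placeholder samples.
toy_counts = {
    "sp_A1": {"fam1": 900, "fam2": 80, "fam3": 20},
    "sp_A2": {"fam1": 850, "fam2": 100, "fam3": 50},
    "sp_B1": {"fam1": 100, "fam2": 200, "fam3": 700},
    "sp_B2": {"fam1": 150, "fam2": 150, "fam3": 700},
}

profiles = {sample: relative_abundance(c) for sample, c in toy_counts.items()}
neighbours = nearest_neighbours(profiles)

for sample, closest in neighbours.items():
    print(f"{sample}: most similar profile is {closest}")
```

Running the same procedure separately on known repeats and on putative repeats, then comparing the resulting neighbour maps, would be one simple way to check whether the two groupings agree. A full analysis would more likely use hierarchical clustering over many repeat families.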