Mobile DNA (May 2024)

Teaching transposon classification as a means to crowd source the curation of repeat annotation – a tardigrade perspective

  • Valentina Peona,
  • Jacopo Martelossi,
  • Dareen Almojil,
  • Julia Bocharkina,
  • Ioana Brännström,
  • Max Brown,
  • Alice Cang,
  • Tomàs Carrasco-Valenzuela,
  • Jon DeVries,
  • Meredith Doellman,
  • Daniel Elsner,
  • Pamela Espíndola-Hernández,
  • Guillermo Friis Montoya,
  • Bence Gaspar,
  • Danijela Zagorski,
  • Paweł Hałakuc,
  • Beti Ivanovska,
  • Christopher Laumer,
  • Robert Lehmann,
  • Ljudevit Luka Boštjančić,
  • Rahia Mashoodh,
  • Sofia Mazzoleni,
  • Alice Mouton,
  • Maria Anna Nilsson,
  • Yifan Pei,
  • Giacomo Potente,
  • Panagiotis Provataris,
  • José Ramón Pardos-Blas,
  • Ravindra Raut,
  • Tomasa Sbaffi,
  • Florian Schwarz,
  • Jessica Stapley,
  • Lewis Stevens,
  • Nusrat Sultana,
  • Radka Symonova,
  • Mohadeseh S. Tahami,
  • Alice Urzì,
  • Heidi Yang,
  • Abdullah Yusuf,
  • Carlo Pecoraro,
  • Alexander Suh

DOI
https://doi.org/10.1186/s13100-024-00319-8
Journal volume & issue
Vol. 15, no. 1
pp. 1 – 12

Abstract

Background: The advancement of sequencing technologies results in the rapid release of hundreds of new genome assemblies a year providing unprecedented resources for the study of genome evolution. Within this context, the significance of in-depth analyses of repetitive elements, transposable elements (TEs) in particular, is increasingly recognized in understanding genome evolution. Despite the plethora of available bioinformatic tools for identifying and annotating TEs, the phylogenetic distance of the target species from a curated and classified database of repetitive element sequences constrains any automated annotation effort. Moreover, manual curation of raw repeat libraries is deemed essential due to the frequent incompleteness of automatically generated consensus sequences.

Results: Here, we present an example of a crowd-sourcing effort aimed at curating and annotating TE libraries of two non-model species built around a collaborative, peer-reviewed teaching process. Manual curation and classification are time-consuming processes that offer limited short-term academic rewards and are typically confined to a few research groups where methods are taught through hands-on experience. Crowd-sourcing efforts could therefore offer a significant opportunity to bridge the gap between learning the methods of curation effectively and empowering the scientific community with high-quality, reusable repeat libraries.

Conclusions: The collaborative manual curation of TEs from two tardigrade species, for which there were no TE libraries available, resulted in the successful characterization of hundreds of new and diverse TEs in a reasonable time frame. Our crowd-sourcing setting can be used as a teaching reference guide for similar projects: A hidden treasure awaits discovery within non-model organisms.

Keywords