Genome Biology (Nov 2023)

Fake IDs? Widespread misannotation of DNA transposons as a general transcription factor

  • Nozhat T. Hassan,
  • David L. Adelson

DOI
https://doi.org/10.1186/s13059-023-03102-9
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 8

Abstract

Accurate annotation of genes and transposable elements (TEs) is vital for understanding genomes, but current annotation pipelines often misannotate TEs as genes. This study reveals how DNA transposons in non-mammalian species have been erroneously annotated as the general transcription factor II-I repeat domain-containing protein 2 (GTF2IRD2), because GTF2IRD2 contains a 3′ fused hAT transposase domain. We also demonstrate the generality of this problem by identifying TEs misannotated as genes in other vertebrate genomes. Such misannotations can lead to errors in phylogenetic analyses and waste investigators' time. To mitigate this problem, the study proposes adding a final TE check to gene annotation pipelines.

Keywords