BMC Research Notes (Jun 2023)

Identification of annotation artifacts concerning the chalcone synthase (CHS)

  • Martin Bartas
  • Adriana Volna
  • Jiri Cerven
  • Boas Pucker

DOI
https://doi.org/10.1186/s13104-023-06386-z
Journal volume & issue
Vol. 16, no. 1
pp. 1 – 6

Abstract

Objective: Chalcone synthase (CHS) catalyzes the initial step of flavonoid biosynthesis. The CHS-encoding gene is well studied in numerous plant species. Rapidly growing sequence databases contain hundreds of CHS entries that are the result of automatic annotation. In this study, we evaluated the apparent multiplication of CHS domains in CHS gene models of four plant species.

Main findings: CHS genes with an apparent triplication of the CHS domain-encoding part were discovered through database searches. Such genes were found in Macadamia integrifolia, Musa balbisiana, Musa troglodytarum, and Nymphaea colorata. Manual inspection of the CHS gene models in these four species against massive RNA-seq data suggests that these gene models are the result of artificial fusions in the annotation process. While there are hundreds of seemingly correct CHS records in the databases, it is not clear why these annotation artifacts appeared.

Keywords