PLoS Computational Biology (Mar 2021)

CODON-Software to manual curation of prokaryotic genomes.

  • Bruno Merlin,
  • Jorianne Thyeska Castro Alves,
  • Pablo Henrique Caracciolo Gomes de Sá,
  • Mônica Silva de Oliveira,
  • Larissa Maranhão Dias,
  • Gislenne da Silva Moia,
  • Victória Cardoso Dos Santos,
  • Adonney Allan de Oliveira Veras

DOI
https://doi.org/10.1371/journal.pcbi.1008797
Journal volume & issue
Vol. 17, no. 3
p. e1008797

Abstract

Genome annotation consists of inferring and assigning biological information to gene products. Over the years, numerous pipelines and computational tools have been developed to automate this task and to help researchers learn about their target genes. Even with these advances, manual annotation or manual curation remains necessary to verify and enrich the information attributed to gene products. Although manual curation is regarded as the gold standard for depositing data in a biological database, it requires significant time and effort from researchers, who sometimes have to parse through numerous products in various public databases. To address this problem, we present CODON, a tool for manual curation of genomic data that also performs gene prediction and annotation. The software uses a finite state machine in the prediction process and automatically annotates products based on information obtained from the UniProt database. CODON provides a simple and intuitive graphical interface that assists in manual curation, enabling the user to make decisions based on the identity, the alignment length, and the name of the organism in which the product obtained a match. In addition, all matches found in the database can be inspected visually, which significantly aids curation because the user has all the information available for a given product at their disposal. To test the efficiency of the tool, an analysis was performed on eleven organisms, comparing CODON's prediction and annotation results with those from the NCBI and RAST platforms.
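
To illustrate what finite-state-machine gene prediction can look like, the following is a minimal sketch and not CODON's actual implementation; the abstract does not describe its states or transitions. It assumes a two-state machine, ATG as the only start codon, the three standard stop codons, and forward-strand scanning only. The names `State`, `find_orfs`, and `min_codons` are hypothetical.

```python
from enum import Enum, auto

# Assumptions for illustration only: canonical start codon, standard stops.
START_CODONS = {"ATG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


class State(Enum):
    SEEK_START = auto()  # scanning for a start codon
    IN_ORF = auto()      # inside a candidate open reading frame


def find_orfs(seq, min_codons=2):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    Coordinates are 0-based and half-open; the stop codon is included.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        state = State.SEEK_START
        orf_start = None
        # Read the sequence codon by codon in the current frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if state is State.SEEK_START:
                if codon in START_CODONS:
                    state = State.IN_ORF
                    orf_start = i
            elif codon in STOP_CODONS:
                # A stop codon closes the ORF and resets the machine.
                if (i + 3 - orf_start) // 3 >= min_codons:
                    orfs.append((orf_start, i + 3, frame))
                state = State.SEEK_START
    return orfs


if __name__ == "__main__":
    print(find_orfs("CCATGAAATAGGG"))
```

A real predictor would also scan the reverse complement, accept alternative bacterial start codons, and apply a realistic minimum length; those are omitted here to keep the state transitions visible.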