Frontiers in Veterinary Science (Feb 2023)

Benchmark study for evaluating the quality of reference genomes and gene annotations in 114 species

  • Sinwoo Park,
  • Jinbaek Lee,
  • Jaeryeong Kim,
  • Dohyeon Kim,
  • Jin Hyup Lee,
  • Seung Pil Pack,
  • Minseok Seo

DOI
https://doi.org/10.3389/fvets.2023.1128570
Journal volume & issue
Vol. 10

Abstract

Introduction: Reference genomes and gene annotations are key resources that can determine the limits of molecular biology research on a species; however, systematic research on assessing their quality remains insufficient.

Methods: We collected reference assemblies, gene annotations, and 3,420 RNA-sequencing (RNA-seq) datasets from 114 species and selected effective indicators for simultaneously evaluating reference genome quality across species, including statistics that can be obtained empirically while mapping short reads. Furthermore, we introduced and applied transcript diversity and quantification success rate as measures for relatively evaluating the quality of gene annotations across species. Finally, we proposed a next-generation sequencing (NGS) applicability index that integrates a total of 10 effective indicators for evaluating the genome and gene annotation of a given species.

Results and discussion: Based on these indicators, we evaluated and demonstrated the relative accessibility of NGS applications in all 114 species, which will directly contribute to determining the technological boundaries in each species. We also expect the index to serve as a key indicator of the direction of future development, through relative quality evaluation of genomes and gene annotations in each species, including the many organisms whose genomes and gene annotations have yet to be constructed.

Keywords