Data in Brief (Oct 2018)
Data on genome annotation and analysis of earthworm Eisenia fetida
Abstract
The present article reports the complete annotation of the draft genome of the earthworm Eisenia fetida, which was obtained from the manuscript entitled “Timing and Scope of Genomic Expansion within Annelida: Evidence from Homeoboxes in the Genome of the Earthworm E. fetida” (Zwarycz et al., 2015), and provides data on the repetitive elements, protein-coding genes and noncoding RNAs present in the genome dataset of the species. The E. fetida protein-coding genes were predicted with AUGUSTUS and subsequently annotated based on sequence similarity, Gene Ontology (GO) functional terms, InterPro domains, Clusters of Orthologous Groups (COGs) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway information. A genome-wide comparison of orthologous clusters and a phylogenomic analysis of the core genes were performed to understand the events of genome evolution and the genomic diversity between E. fetida and related metazoans. In addition, the genome dataset was screened to identify crucial stem cell markers, regeneration-specific genes and immune-related genes, and their functionally enriched GO terms were predicted by Fisher's enrichment analysis. The E. fetida genome annotation data, comprising the GFF (general feature format) annotation file, the predicted coding gene sequences and the translated protein sequences, were deposited in the figshare repository under the DOI: https://doi.org/10.6084/m9.figshare.6142322.v1.
Keywords: Eisenia fetida, Genome annotation, Orthologous groups, Regeneration
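For readers unfamiliar with Fisher's enrichment analysis of GO terms, the underlying calculation can be sketched as a one-sided Fisher's exact test on gene counts. This is a minimal illustration only: the abstract does not state which software or settings were used, and the function name `fisher_enrichment_p` and its parameters are hypothetical, introduced here for the example.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test p-value for over-representation.

    Hypothetical helper, not taken from the article.
    k: genes in the gene set of interest annotated with the GO term
    n: size of the gene set of interest
    K: genes in the whole annotated genome carrying the GO term
    N: total number of annotated genes in the genome
    """
    if not (0 <= k <= min(n, K) and max(n, K) <= N):
        raise ValueError("inconsistent counts")
    # Number of ways to draw any n genes out of N.
    total = comb(N, n)
    # Sum the hypergeometric probabilities of seeing k or more annotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total
```

A small p-value suggests that the GO term occurs in the gene set (for example, the regeneration-specific genes) more often than expected by chance. In practice, the p-values across all tested terms would usually also be corrected for multiple testing.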