Dataset of the first transcriptome assembly of the tree crop “yerba mate” (Ilex paraguariensis) and systematic characterization of protein coding genes

Patricia M. Aguilera; Humberto J. Debat; Mauro Grabiele

Data in Brief (Apr 2018)

Affiliations

Patricia M. Aguilera: Instituto de Biología Subtropical (UNaM-CONICET) and Instituto de Biotecnología de Misiones, Universidad Nacional de Misiones, 3300 Posadas, Misiones, Argentina
Humberto J. Debat: Instituto de Patología Vegetal, Centro de Investigaciones Agropecuarias (INTA), 5000 Córdoba, Argentina
Mauro Grabiele: Instituto de Biología Subtropical (UNaM-CONICET) and Instituto de Biotecnología de Misiones, Universidad Nacional de Misiones, 3300 Posadas, Misiones, Argentina; Corresponding author.

Journal volume & issue: Vol. 17, pp. 1036–1040

Abstract

This contribution contains data associated with the research article entitled “Exploring the genes of yerba mate (Ilex paraguariensis A. St.-Hil.) by NGS and de novo transcriptome assembly” (Debat et al., 2014) [1]. Using a bioinformatic approach involving extensive NGS data analyses, we provide a resource encompassing the full transcriptome assembly of yerba mate, the first available reference for the genus Ilex L. This dataset (Supplementary files 1 and 2) consolidates the transcriptome-wide assembled sequences of I. paraguariensis with a comprehensive annotation of the protein-coding genes of yerba mate, obtained by integrating Arabidopsis thaliana databases. The generated data are pivotal for the characterization of agronomically relevant genes in the tree crop yerba mate, a non-model species, and in related Ilex taxa. The raw sequencing data analyzed here are available at the DDBJ/ENA/GenBank (NCBI Resource Coordinators, 2016) [2] Sequence Read Archive (SRA) under accession SRP043293, and the assembled sequences have been deposited at the Transcriptome Shotgun Assembly Sequence Database (TSA) under accession GFHV00000000.

Published in Data in Brief

ISSN: 2352-3409 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Science (General)
Website: http://www.journals.elsevier.com/data-in-brief/
