Frontiers in Plant Science (Aug 2015)

ReprOlive: a Database with Linked Data for the Olive Tree (Olea europaea L.) Reproductive Transcriptome

  • Rosario M. Carmona,
  • Adoración Zafra,
  • Pedro Seoane,
  • Antonio J. Castro,
  • Darío Guerrero-Fernández,
  • Trinidad Castillo-Castillo,
  • Ana Medina-García,
  • Francisco M. Cánovas,
  • José Aldana-Montes,
  • Ismael Navas-Delgado,
  • Juan de Dios Alché,
  • M. Gonzalo Claros

DOI
https://doi.org/10.3389/fpls.2015.00625
Journal volume & issue
Vol. 6

Abstract

Plant reproductive transcriptomes have been analysed in different species because of the agronomical and biotechnological importance of plant reproduction. Here we present an olive tree reproductive transcriptome database built from pollen and pistil samples at different developmental stages, with leaf and root as control vegetative tissues (http://reprolive.eez.csic.es). It was developed from 2,077,309 raw reads and 1,549 Sanger sequences. Using a pre-defined workflow based on open-source tools, sequences were pre-processed, assembled, mapped and annotated with expression data, descriptions, GO terms, InterPro signatures, EC numbers, KEGG pathways, ORFs, and SSRs. Tentative transcripts were also annotated with the corresponding orthologues in Arabidopsis thaliana from the TAIR and RefSeq databases to enable Linked Data integration. The result is a reproductive transcriptome comprising 72,846 contigs with an average length of 686 bp, of which 63,965 (87.8%) include at least one functional annotation and 55,356 (75.9%) have an orthologue. A minimum of 23,568 different tentative transcripts were identified, 5,835 of which contain a complete ORF. The representative reproductive transcriptome can be reduced to 28,972 tentative transcripts for further gene expression studies. Partial transcriptomes from pollen, pistil and control vegetative tissues were also constructed. ReprOlive provides free access to these results and allows them to be downloaded. Retrieval mechanisms for sequences and transcript annotations are provided, and annotated enzymes can be located graphically in KEGG pathways. Finally, ReprOlive includes a semantic conceptualisation by means of the Resource Description Framework (RDF), allowing a Linked Data search to extract the most up-to-date information related to enzymes, interactions, allergens, structures and reactive oxygen species.
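
As a quick arithmetic check of the coverage figures quoted above, the following minimal Python sketch recomputes the two percentages from the contig counts. The constant and function names are illustrative and do not come from the ReprOlive workflow, and the choice to truncate (rather than round) to one decimal place is an assumption made here because it reproduces both quoted values.

```python
# Counts quoted in the abstract.
TOTAL_CONTIGS = 72_846
ANNOTATED_CONTIGS = 63_965      # at least one functional annotation
ORTHOLOGUE_CONTIGS = 55_356     # have an orthologue


def percent_truncated(part: int, whole: int) -> float:
    """Return part/whole as a percentage truncated to one decimal place.

    Integer arithmetic is used so that the truncation is exact.
    Truncation (not rounding) is an assumption of this sketch.
    """
    if whole <= 0:
        raise ValueError("whole must be positive")
    tenths_of_percent = (part * 1000) // whole
    return tenths_of_percent / 10


if __name__ == "__main__":
    print("annotated:", percent_truncated(ANNOTATED_CONTIGS, TOTAL_CONTIGS), "%")
    print("orthologue:", percent_truncated(ORTHOLOGUE_CONTIGS, TOTAL_CONTIGS), "%")
```

The unrounded orthologue fraction is just under 76%, so a rounding convention would report 76.0% instead of 75.9%; the underlying counts are unaffected either way.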

Keywords