Dataset of de novo assembly and functional annotation of the transcriptomes of three native oleaginous microalgae from the Peruvian Amazon
Marianela Cobos,
Hicler N. Rodríguez,
Segundo L. Estela,
Carlos G. Castro,
J. Dylan Maddox,
Jae D. Paredes,
Juan R. Saldaña,
Álvaro B. Tresierra,
Juan C. Castro
Affiliations
Marianela Cobos
Laboratorio de Biotecnología y Bioenergética (LBB), Universidad Científica del Perú (UCP), Av. Abelardo Quiñones Km 2.5, San Juan Bautista, Loreto, 16006, Perú; Corresponding authors at: Laboratorio de Biotecnología y Bioenergética, Universidad Científica del Perú (UCP), Av. Abelardo Quiñones Km 2.5, San Juan Bautista, Loreto, 16006, Perú
Hicler N. Rodríguez
Laboratorio de Biotecnología y Bioenergética (LBB), Universidad Científica del Perú (UCP), Av. Abelardo Quiñones Km 2.5, San Juan Bautista, Loreto, 16006, Perú; Unidad Especializada de Biotecnología, Centro de Investigaciones de Recursos Naturales de la Amazonía (CIRNA), Universidad Nacional de la Amazonia Peruana (UNAP), Psje. Los Paujiles S/N, San Juan Bautista, Loreto, 16000, Perú
Segundo L. Estela
Laboratorio de Biotecnología y Bioenergética (LBB), Universidad Científica del Perú (UCP), Av. Abelardo Quiñones Km 2.5, San Juan Bautista, Loreto, 16006, Perú
Carlos G. Castro
Laboratorio de Biotecnología y Bioenergética (LBB), Universidad Científica del Perú (UCP), Av. Abelardo Quiñones Km 2.5, San Juan Bautista, Loreto, 16006, Perú; Unidad Especializada de Biotecnología, Centro de Investigaciones de Recursos Naturales de la Amazonía (CIRNA), Universidad Nacional de la Amazonia Peruana (UNAP), Psje. Los Paujiles S/N, San Juan Bautista, Loreto, 16000, Perú
J. Dylan Maddox
Laboratorio de Biotecnología y Bioenergética (LBB), Universidad Científica del Perú (UCP), Av. Abelardo Quiñones Km 2.5, San Juan Bautista, Loreto, 16006, Perú; Pritzker Laboratory for Molecular Systematics and Evolution, Field Museum of Natural History, 1400 S. Lake Shore Drive, Chicago, IL 60605, USA; Environmental Sciences, American Public University System, Charles Town, WV 25414, USA
Jae D. Paredes
Laboratorio de Biotecnología y Bioenergética (LBB), Universidad Científica del Perú (UCP), Av. Abelardo Quiñones Km 2.5, San Juan Bautista, Loreto, 16006, Perú
Juan R. Saldaña
Laboratorio de Biotecnología y Bioenergética (LBB), Universidad Científica del Perú (UCP), Av. Abelardo Quiñones Km 2.5, San Juan Bautista, Loreto, 16006, Perú
Álvaro B. Tresierra
Laboratorio de Biotecnología y Bioenergética (LBB), Universidad Científica del Perú (UCP), Av. Abelardo Quiñones Km 2.5, San Juan Bautista, Loreto, 16006, Perú
Juan C. Castro
Unidad Especializada de Biotecnología, Centro de Investigaciones de Recursos Naturales de la Amazonía (CIRNA), Universidad Nacional de la Amazonia Peruana (UNAP), Psje. Los Paujiles S/N, San Juan Bautista, Loreto, 16000, Perú; Departamento Académico de Ciencias Biomédicas y Biotecnología, Facultad de Ciencias Biológicas, Universidad Nacional de la Amazonia Peruana (UNAP), Ciudad Universitaria - Zungarococha, San Juan Bautista, Loreto, 16000, Perú; Corresponding authors at: Laboratorio de Biotecnología y Bioenergética, Universidad Científica del Perú (UCP), Av. Abelardo Quiñones Km 2.5, San Juan Bautista, Loreto, 16006, Perú
Microalgae are photosynthetic organisms with a cosmopolitan distribution (i.e., marine, freshwater and terrestrial habitats) and possess a great diversity of species [1] and consequently an immense variation in biochemical compositions [2]. To date, genomic information is available mainly from the model green microalga Chlamydomonas reinhardtii [3]. Here we provide the dataset of a de novo assembly and functional annotation of the transcriptomes of three native oleaginous microalgae from the Peruvian Amazon. The native oleaginous microalgae Ankistrodesmus sp., Chlorella sp., and Scenedesmus sp. were cultured in triplicate using Chu-10 medium with or without a source of nitrate (NaNO3). Total RNA was purified, and cDNA libraries were constructed and sequenced as paired-end reads on an Illumina HiSeq™2500 platform. Transcriptomes were de novo assembled using Trinity v2.9.1. The assemblies comprised 48,554 transcripts (length range 250 to 7966 bp; N50 = 1047 bp) for Ankistrodesmus sp., 108,126 transcripts (length range 250 to 8160 bp; N50 = 1090 bp) for Chlorella sp., and 77,689 transcripts (length range 250 to 8481 bp; N50 = 1281 bp) for Scenedesmus sp. Completeness of the assembled transcriptomes was evaluated with the Benchmarking Universal Single-Copy Orthologs (BUSCO) software v2/v3. Functional annotation of the assembled transcriptomes was conducted with TransDecoder v3.0.1 and the web-based platforms Kyoto Encyclopedia of Genes and Genomes (KEGG) Automatic Annotation Server (KAAS) and FunctionAnnotator.
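The N50 values reported for each assembly summarize contiguity: N50 is the length L such that transcripts of length L or longer account for at least half of the total assembled bases. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the toy lengths are illustrative assumptions and do not come from the dataset.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    # Sort from longest to shortest, then accumulate until half of the
    # total assembled bases is reached.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths, for illustration only (not real transcripts).
toy_lengths = [250, 400, 800, 1000, 1500]
print(n50(toy_lengths))
```

In practice the lengths would be read from the assembled FASTA file of each species rather than typed in by hand.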
The raw reads were deposited in NCBI and are accessible under BioProject accession number PRJNA628966 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA628966) and in the Sequence Read Archive (SRA) under accession numbers SRX8295665 (https://www.ncbi.nlm.nih.gov/sra/SRX8295665), SRX8295666 (https://www.ncbi.nlm.nih.gov/sra/SRX8295666), SRX8295667 (https://www.ncbi.nlm.nih.gov/sra/SRX8295667), SRX8295668 (https://www.ncbi.nlm.nih.gov/sra/SRX8295668), SRX8295669 (https://www.ncbi.nlm.nih.gov/sra/SRX8295669), and SRX8295670 (https://www.ncbi.nlm.nih.gov/sra/SRX8295670). Additionally, transcriptome shotgun assembly sequences and functional annotations are available via Discover Mendeley Data (https://data.mendeley.com/datasets/47wdjmw9xr/1).
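The six SRA experiment accessions are consecutive, so their record URLs can be generated programmatically. The sketch below only builds the URL strings from the pattern given in the text; the helper name `sra_urls` is hypothetical, and no network request is made.

```python
SRA_BASE = "https://www.ncbi.nlm.nih.gov/sra/"


def sra_urls(first=8295665, last=8295670):
    """Build SRA record URLs for the consecutive SRX accessions first..last."""
    # The numeric range is taken from the accessions listed in the text.
    return [f"{SRA_BASE}SRX{number}" for number in range(first, last + 1)]


for url in sra_urls():
    print(url)
```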