Genomics Data (Mar 2017)

Unraveling the microbial and functional diversity of Coamo thermal spring in Puerto Rico using metagenomic library generation and shotgun sequencing

  • Ricky Padilla-Del Valle,
  • Luis R. Morales-Vale,
  • Carlos Ríos-Velázquez

DOI
https://doi.org/10.1016/j.gdata.2016.12.010
Journal volume & issue
Vol. 11, no. C
pp. 98 – 101

Abstract

The microbial diversity of the thermal spring (ThS) in Coamo, Puerto Rico, has never been studied using metagenomics. The focus of our research was to generate a metagenomic library from the Coamo ThS and explore its microbial and functional diversity. The metagenomic library was generated from the ThS waters using direct DNA isolation. High-molecular-weight (40 kbp) DNA was end-repaired, electroeluted and ligated into a fosmid vector (pCCFOS1), then transduced into Escherichia coli EPI300-T1R using T1 bacteriophages. The library consisted of approximately 6000 clones, 90% of which contained metagenomic DNA. Next-generation sequencing technology (Illumina MiSeq) was used to process the ThS metagenome. After removal of the cloning vector, 122,026 sequences totaling 33.10 Mbp, with a G + C content of 64%, were annotated and analyzed using the MG-RAST online server. Bacteria were the most abundant domain (95.84%), followed by unidentified sequences (2.28%), viruses (1.67%), eukaryotes (0.15%), and archaea (0.01%). The most abundant phyla were Proteobacteria (95.03%), followed by unidentified (2.28%), unclassified from viruses (1.74%), Firmicutes (0.20%) and Actinobacteria (0.18%). The most abundant species were Escherichia coli, Polaromonas naphthalenivorans, Albidiferax ferrireducens and Acidovorax sp. Subsystem functional analysis showed that 20% of genes belong to transposable elements, 10% to clustering-based subsystems, and 8% to the production of cofactors. Functional analysis using NOG annotation showed that 82.79% of proteins are poorly characterized, indicating the possibility of novel microbial functions with potential biomedical and biotechnological applications. The metagenomic data were deposited in the NCBI database under accession number SAMN06131862.

Keywords