Canadian Journal of Biotechnology (Oct 2017)
De novo assembly, functional annotation and comparative alignment of whole genome of a halo-tolerant Exiguobacterium profundum PHM11 with related genomes
Abstract
Advances in next-generation sequencing (NGS) technologies have invigorated the exploration of microbial genomes for hidden traits. In this study, the whole genome of the halotolerant E. profundum PHM11 was sequenced on the Illumina HiSeq paired-end platform and assembled de novo in a Linux environment using the Velvet (v1.2.10) package. Quality filtering produced 72,947,390 reads with a total size of 7,335,195,362 bp (7335.1 Mb). High-quality reads, which yielded contigs with minimum and maximum lengths of 202 bp and 958,471 bp (~0.95 Mb), were considered further for assembly. The final PHM11 genome is ~2.92 Mb in size and comprises 70 contigs with a G+C content of 47.93%; adenine, thymine, cytosine and guanine account for 761,858 (26.08%), 757,313 (25.92%), 699,924 (23.96%) and 700,172 (23.97%) nucleotides, respectively. Microsatellite mining across the genome identified a total of 3005 SSRs covering 0.1 of the whole PHM11 genome, with a relative abundance of 1029 and a relative density of 10,951; penta-, hexa-, mono- and tetra-nucleotide repeats made up 65.75%, 28.75%, 3.89% and 1.59% of these, respectively. A chromosome map of the PHM11 genome revealed gene networks related to the arrangement of key genes and the presence of lysogenic phage DNA. Functional annotation of the genome revealed different protein families and inherent metabolic pathways that confer unusual features. A total of 3033 protein-coding genes and 33 non-protein-coding genes were identified; of these, only 2316 could be characterized and 737 were reported as hypothetical. Randomly selected genes from different metabolic pathways were amplified from the genome and verified by sequencing. Genome rearrangements in PHM11 were deciphered by aligning its genome with thirteen other genomes of different Exiguobacterium species.
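The base-composition figures in the abstract follow from simple arithmetic on the nucleotide counts. The following is a minimal Python sketch of that calculation, using only the four counts reported above; the function names are illustrative and are not part of the study's pipeline. Because the denominator here is restricted to A, T, C and G, the values may differ slightly from the reported percentages if the assembly total includes other symbols (an assumption, not something the abstract states).

```python
# Nucleotide counts as reported in the abstract for the PHM11 assembly.
REPORTED_COUNTS = {"A": 761858, "T": 757313, "C": 699924, "G": 700172}


def base_composition(counts):
    """Return each base's percentage of the total of the supplied counts."""
    total = sum(counts.values())
    return {base: 100.0 * n / total for base, n in counts.items()}


def gc_content(counts):
    """Return G+C as a percentage of the total of the supplied counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


if __name__ == "__main__":
    # Denominator is A+T+C+G only (assumption: no ambiguous bases counted).
    for base, pct in base_composition(REPORTED_COUNTS).items():
        print(f"{base}: {pct:.2f}%")
    print(f"G+C: {gc_content(REPORTED_COUNTS):.2f}%")
```

The same two functions can be applied to counts tallied from any assembled FASTA file to check reported composition values against the sequence itself.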