BMC Bioinformatics (Nov 2011)

HabiSign: a novel approach for comparison of metagenomes and rapid identification of habitat-specific sequences

  • Ghosh Tarini,
  • Mohammed Monzoorul,
  • Rajasingh Hannah,
  • Chadaram Sudha,
  • Mande Sharmila S

DOI
https://doi.org/10.1186/1471-2105-12-S13-S9
Journal volume & issue
Vol. 12, no. Suppl 13
p. S9

Abstract

Background: One of the primary goals of comparative metagenomic projects is to study the differences in the microbial communities residing in diverse environments. Besides providing valuable insights into the inherent structure of the microbial populations, these studies have potential applications in several important areas of medical research, such as disease diagnostics, detection of pathogenic contamination, and identification of hitherto unknown pathogens. Here we present a novel, rapid, alignment-free method called HabiSign, which utilizes patterns of tetranucleotide usage in microbial genomes to bring out the differences in the composition of both diverse and related microbial communities.

Results: Validation results show that the metagenomic signatures obtained using the HabiSign method are able to accurately cluster metagenomes at biome, phenotypic and species levels, as compared to an average tetranucleotide frequency-based approach and the recently published dinucleotide relative abundance-based approach. More importantly, the method is able to identify subsets of sequences that are specific to a particular habitat. In addition, being alignment-free, the method can rapidly compare and group multiple metagenomic data sets.

Conclusions: The proposed method is expected to have immense applicability in diverse areas of metagenomic research, ranging from disease diagnostics and pathogen detection to bio-prospecting. A web server for the HabiSign algorithm is available at http://metagenomics.atc.tcs.com/HabiSign/.
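The abstract does not spell out how HabiSign derives its signatures, so the following is only a minimal sketch of the general idea behind alignment-free comparison using tetranucleotide usage. It is not the HabiSign algorithm. The function names `tetra_profile` and `profile_distance` are hypothetical, and the choice of Euclidean distance between profiles is an assumption made for illustration.

```python
from collections import Counter
from itertools import product
import math

# All 256 possible tetranucleotides over A, C, G, T, in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_profile(seq):
    """Return the normalized frequency of each tetranucleotide in seq.

    Hypothetical helper for illustration; not taken from HabiSign.
    """
    seq = seq.upper()
    # Count every overlapping window of length 4.
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    # Only windows made entirely of unambiguous bases contribute.
    total = sum(counts[t] for t in TETRAMERS)
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[t] / total for t in TETRAMERS]


def profile_distance(p, q):
    """Euclidean distance between two profiles (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
```

Because each sequence is reduced to a fixed-length vector of 256 frequencies, two sequences or data sets can be compared without aligning them, which is the property the abstract credits for the method's speed.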