BMC Genomics (Apr 2011)

Comparative supragenomic analyses among the pathogens Staphylococcus aureus, Streptococcus pneumoniae, and Haemophilus influenzae using a modification of the finite supragenome model

  • Yu Susan,
  • Hayes Jay,
  • Powell Evan,
  • Hiller Luisa N,
  • Pusch Gordon D,
  • Hogg Justin S,
  • Hall Barry G,
  • Earl Josh,
  • Janto Benjamin,
  • Ahmed Azad,
  • Boissy Robert,
  • Kathju Sandeep,
  • Stoodley Paul,
  • Post J Christopher,
  • Ehrlich Garth D,
  • Hu Fen Z

DOI
https://doi.org/10.1186/1471-2164-12-187
Journal volume & issue
Vol. 12, no. 1
p. 187

Abstract

Background

Staphylococcus aureus is associated with a spectrum of symbiotic relationships with its human host, from carriage to sepsis, and is frequently associated with nosocomial and community-acquired infections; the differential gene content among strains is therefore of interest.

Results

We sequenced three clinical strains and combined these data with those from 13 publicly available human isolates and one bovine strain for comparative genomic analyses. All genomes were annotated using RAST, and their gene similarities and differences were then delineated. Gene clustering yielded 3,155 orthologous gene clusters, of which 2,266 were core, 755 were distributed, and 134 were unique. Individual genomes contained between 2,524 and 2,648 genes. Gene-content comparisons among all possible S. aureus strain pairs (n = 136) revealed a mean difference of 296 genes and a maximum difference of 476 genes. We developed a revised version of our finite supragenome model to estimate the size of the S. aureus supragenome (3,221 genes, with 2,245 core genes) and compared it with those of Haemophilus influenzae and Streptococcus pneumoniae. There was excellent agreement between RAST's annotations and our CDS clustering procedure, providing for high-fidelity metabolomic subsystem analyses that extend our comparative genomic characterization of these strains.

Conclusions

Using a multi-species comparative supragenomic analysis, enabled by an improved version of our finite supragenome model, we provide data and an interpretation explaining the relatively larger core genome of S. aureus compared with other opportunistic nasopharyngeal pathogens. In addition, we provide independent validation of the efficiency and effectiveness of our orthologous gene clustering algorithm.
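
The cluster categories and pair counts reported above can be illustrated with a minimal Python sketch. It assumes the conventional presence/absence definitions (core: found in every strain; unique: found in exactly one strain; distributed: everything in between), which the abstract itself does not spell out. The function names and the toy data are hypothetical and are not the authors' clustering pipeline or their finite supragenome model.

```python
from itertools import combinations
from math import comb


def classify_clusters(clusters, n_strains):
    """Label each cluster by how many strains carry it.

    Assumed definitions (not stated in the abstract):
    core = all strains, unique = exactly one strain, distributed = otherwise.
    """
    labels = {}
    for cluster_id, strains in clusters.items():
        k = len(strains)
        if k == n_strains:
            labels[cluster_id] = "core"
        elif k == 1:
            labels[cluster_id] = "unique"
        else:
            labels[cluster_id] = "distributed"
    return labels


def pairwise_differences(clusters, strain_names):
    """For every strain pair, count clusters present in exactly one of the two."""
    diffs = {}
    for a, b in combinations(sorted(strain_names), 2):
        diffs[(a, b)] = sum(
            1 for members in clusters.values() if (a in members) != (b in members)
        )
    return diffs


# Toy presence/absence data (hypothetical, for illustration only).
toy_strains = {"s1", "s2", "s3"}
toy_clusters = {
    "c1": {"s1", "s2", "s3"},  # carried by every strain
    "c2": {"s1", "s2"},        # carried by some strains
    "c3": {"s3"},              # carried by a single strain
}
labels = classify_clusters(toy_clusters, len(toy_strains))
diffs = pairwise_differences(toy_clusters, toy_strains)

# Arithmetic cross-checks on the figures quoted in the abstract.
n_genomes = 3 + 13 + 1             # sequenced + public human isolates + bovine
n_pairs = comb(n_genomes, 2)       # number of distinct strain pairs
total_clusters = 2266 + 755 + 134  # core + distributed + unique
```

With 17 genomes, `n_pairs` evaluates to the 136 strain pairs quoted in the Results, and the three cluster categories sum to the 3,155 clusters reported.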