BMC Genomics (Jan 2018)

PGAP-X: extension on pan-genome analysis pipeline

  • Yongbing Zhao,
  • Chen Sun,
  • Dongyu Zhao,
  • Yadong Zhang,
  • Yang You,
  • Xinmiao Jia,
  • Junhui Yang,
  • Lingping Wang,
  • Jinyue Wang,
  • Haohuan Fu,
  • Yu Kang,
  • Fei Chen,
  • Jun Yu,
  • Jiayan Wu,
  • Jingfa Xiao

DOI
https://doi.org/10.1186/s12864-017-4337-7
Journal volume & issue
Vol. 19, no. S1
pp. 115 – 124

Abstract

Background
Since PGAP (pan-genome analysis pipeline) was published in 2012, it has been widely used in bacterial genomics research. Although PGAP integrates several modules for pan-genome analysis, interpreting and visualizing its result data properly and effectively remains a challenge.

Results
To better present bacterial genomic characteristics, a novel cross-platform software package named PGAP-X was developed. Four data analysis modules were developed and integrated: whole-genome sequence alignment, orthologous gene clustering, pan-genome profile analysis, and genetic variant analysis. The results of these analyses can be visualized directly in PGAP-X. The data visualization modules in PGAP-X cover comparison of genome structure, gene distribution by conservation, the pan-genome profile curve, and variation in genic and genomic regions. In addition, result data produced by other programs with similar functions can be imported for further analysis and visualization in PGAP-X. To test the performance of PGAP-X, we comprehensively analyzed 14 Streptococcus pneumoniae strains and 14 Chlamydia trachomatis strains. The results show that S. pneumoniae strains have higher diversity in genome structure and gene content than C. trachomatis strains. In addition, S. pneumoniae strains might have undergone many evolutionary events, such as genomic rearrangements, frequent horizontal gene transfer, homologous recombination, and other evolutionary processes.

Conclusion
In brief, PGAP-X directly presents the characteristics of bacterial genomic diversity using different visualization methods, which can help researchers intuitively understand the dynamics and evolution of bacterial genomes. The source code and the pre-compiled executable programs are freely available from http://pgapx.ybzhao.com .

Keywords