International Journal of Mycobacteriology (Jan 2016)

The first population structure and comparative genomics analysis of Mycobacterium africanum strains from Ghana reveals higher diversity of Lineage 5

  • Isaac D Otchere,
  • Simon R Harris,
  • Sanches L Busso,
  • Adwoa Asante-Poku,
  • Stephen Osei-Wusu,
  • Kwadwo Korama,
  • Julian Parkhill,
  • Sebastien Gagneux,
  • Dorothy Yeboah-Manu

DOI
https://doi.org/10.1016/j.ijmyco.2016.09.051
Journal volume & issue
Vol. 5, no. 5
pp. 80 – 81

Abstract

Objective/background: Mycobacterium africanum (MAF) remains an important TB-causing pathogen in West Africa; however, little is known about its population structure and actual diversity, which may have implications for diagnostics and vaccines. We carried out a comparative genomics analysis of candidate Mycobacterium tuberculosis (MTB) and MAF strains using whole genome sequencing.

Methods: Clinical MTB complex strains (n = 187), comprising L4 (n = 22), L5 (n = 126), and L6 (n = 39), isolated over 8 years in Ghana, were whole genome sequenced. The reads were mapped onto a reference genome for phylogenetic and functional genomics analysis. A maximum likelihood tree with 100 bootstrap replicates was constructed with RAxML from the single nucleotide polymorphisms (SNPs) found, and the strains were clustered with hierBAPS. A total of 147 genomes (18 L4, 36 L6, and 93 L5) were de novo assembled and annotated for comparative pangenome analysis using Roary.

Results: The population structure of MAF revealed at least five clusters of L5, compared with three for L6. We also identified a group of three multidrug-resistant (MDR) strains within a single cluster of L5 strains from Southern Ghana, isolated in 2013. Within the global collection of MTB complex strains, there were four Ghana-specific L5 clusters, one of which (L5.1.1) showed traits of clonal expansion. Of the 5,947 pan-genome genes extracted from the collection, 3,215 (54.1%) were core to all 147 genomes, whereas 719 (12.1%) were found in single genomes. Most of the variable genes were PE-PGRS/PPE genes (1,281) or duplicates of other genes (431). Genome degradation was more pronounced in Lineages 4 and 6 than in Lineage 5. Certain genes were absent from specific lineages and/or clades, with possible clinical implications. For example, mpt64 and mlaD, which encode an immunogenic protein and a mammalian cell entry protein, respectively, were missing from all L6 genomes. In addition, all L5 strains carried the amino acid substitution I43N in mpt64. Analysis of SNPs within some genes encoding proteins for substrate metabolism, ion transport, and secretory systems showed a higher proportion of SNPs in L6 than in L5 and L4. We also identified a number of lineage- and sublineage-specific SNPs and indels that may be used for rapid PCR-based genotyping of the MTB complex.

Conclusion: This work highlights the possibility that mpt64-based rapid diagnostic kits would not be effective in MAF-endemic settings. The greater number of mutations in the ESAT-6 secretory system of MAF compared with MTB sensu stricto may affect the efficacy of future ESAT-6-based vaccines.
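
The core and single-genome gene counts reported in the Results can be illustrated with a minimal Python sketch. It assumes a simple gene presence/absence mapping; the function name `summarize_pangenome`, the input layout, and the toy data are hypothetical and do not reproduce Roary's output format or the authors' actual pipeline.

```python
def summarize_pangenome(presence, n_genomes):
    """Count core and single-genome genes in a pan-genome.

    presence: dict mapping gene name -> set of genome IDs carrying that gene
              (hypothetical layout, not Roary's file format).
    n_genomes: total number of genomes in the collection.
    """
    # Number of genomes in which each gene was found.
    counts = {gene: len(genomes) for gene, genomes in presence.items()}
    total = len(counts)
    # Core genes are present in every genome; singletons in exactly one.
    core = sum(1 for c in counts.values() if c == n_genomes)
    singletons = sum(1 for c in counts.values() if c == 1)
    return {
        "total": total,
        "core": core,
        "singletons": singletons,
        "core_pct": round(100 * core / total, 1),
        "singleton_pct": round(100 * singletons / total, 1),
    }


# Toy data: three genomes (g1..g3) and four genes, invented for illustration.
toy_presence = {
    "geneA": {"g1", "g2", "g3"},
    "geneB": {"g1", "g2", "g3"},
    "geneC": {"g2"},
    "geneD": {"g1", "g3"},
}
summary = summarize_pangenome(toy_presence, n_genomes=3)
```

The same arithmetic applied to the reported figures gives 3,215 / 5,947 ≈ 54.1% core genes and 719 / 5,947 ≈ 12.1% single-genome genes, consistent with the percentages in the abstract.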

Keywords