mSystems (Dec 2023)

Informed interpretation of metagenomic data by StrainPhlAn enables strain retention analyses of the upper airway microbiome

  • Nadja Mostacci,
  • Tsering Monika Wüthrich,
  • Léa Siegwald,
  • Silas Kieser,
  • Ruth Steinberg,
  • Olga Sakwinska,
  • Philipp Latzin,
  • Insa Korten,
  • Markus Hilty

DOI
https://doi.org/10.1128/msystems.00724-23
Journal volume & issue
Vol. 8, no. 6

Abstract

ABSTRACT Shotgun metagenomic sequencing has the potential to provide bacterial strain-level resolution, which is of key importance for tackling a host of clinical questions. While bioinformatic tools that achieve strain-level resolution are available, thorough benchmarking is needed to validate their use for less investigated and low biomass microbiomes like those from the upper respiratory tract. We analyzed a previously published data set of longitudinally collected nasopharyngeal samples from Bangladeshi infants (Microbiota and Health study) and a novel data set of oropharyngeal samples from Swiss children with cystic fibrosis. Data from bacterial cultures were used for benchmarking the parameters of StrainPhlAn 3, a bioinformatic tool designed for strain-level resolution. In addition, StrainPhlAn 3 results were compared with metagenomic assemblies derived from StrainGE and newly derived whole-genome sequencing data. After optimizing the analytical parameters, we compared StrainPhlAn 3 results to culture gold standard methods and achieved sensitivity values of 87% (Streptococcus pneumoniae), 80% (Moraxella catarrhalis), 75% (Haemophilus influenzae), and 57% (Staphylococcus aureus) for 420 nasopharyngeal samples and 75% (H. influenzae) and 46% (S. aureus) for 260 oropharyngeal samples. Comparing the phylogenetic tree of the core genome of 50 S. aureus isolates with a corresponding marker gene tree generated by StrainPhlAn 3 revealed a striking similarity in tree topology for all but three samples, indicating adequate strain resolution. In conclusion, a comparison of StrainPhlAn 3 results to data from bacterial cultures revealed that strain-level tracking of the respiratory microbiome is feasible despite the high content of host DNA when parameters are carefully optimized to fit low biomass microbiomes.

IMPORTANCE The usage of 16S rRNA gene sequencing has become the state-of-the-art method for the characterization of the microbiota in health and respiratory disease. The method is reliable for low biomass samples due to prior amplification of the 16S rRNA gene but has limitations, as species and certainly strain identification is not possible. However, the usage of metagenomic tools for the analyses of microbiome data from low biomass samples is not straightforward, and careful optimization is needed. In this work, we show that by validating StrainPhlAn 3 results with the data from bacterial cultures, the strain-level tracking of the respiratory microbiome is feasible despite the high content of host DNA when parameters are carefully optimized to fit low biomass microbiomes. This work further proposes that strain retention analyses are feasible, at least for more abundant species. This will help to better understand the longitudinal dynamics of the upper respiratory microbiome during health and disease.

Keywords