Microbiome (Aug 2023)
metaSpectraST: an unsupervised and database-independent analysis workflow for metaproteomic MS/MS data using spectrum clustering
Abstract
Background The high diversity and complexity of the microbial community make it a formidable challenge to identify and quantify the large number of proteins expressed in the community. Conventional metaproteomics approaches largely rely on accurate assignment of the MS/MS spectra to their corresponding short peptides in the digested samples, followed by protein inference and subsequent taxonomic and functional analysis of the detected proteins. These approaches depend on the availability of protein sequence databases derived either from sample-specific metagenomic data or from public repositories. Due to the incompleteness and imperfections of these protein sequence databases, and the preponderance of homologous proteins expressed by different bacterial species in the community, this computational process of peptide identification and protein inference is challenging and error-prone, which hinders the comparison of metaproteomes across multiple samples.
Results We developed metaSpectraST, an unsupervised and database-independent metaproteomics workflow, which quantitatively profiles and compares metaproteomics samples by clustering experimentally observed MS/MS spectra based on their spectral similarity. We applied metaSpectraST to fecal samples collected from littermates of two different mother mice right after weaning. Quantitative proteome profiles of the microbial communities of different mice were obtained without any peptide-spectrum identification and used to evaluate the overall similarity between samples and highlight any differentiating markers. Compared to the conventional database-dependent metaproteomics analysis, metaSpectraST is more successful in classifying the samples and detecting subtle changes in the mouse gut microbiome post-weaning. metaSpectraST could also be used as a tool to select suitable biological replicates from samples with wide inter-individual variation.
Conclusions metaSpectraST enables rapid quantitative profiling of metaproteomic samples, without the need to construct a protein sequence database or identify the MS/MS spectra. It maximally preserves the information contained in the experimental MS/MS spectra by clustering all of them first, and is thus able to better profile complex microbial communities and highlight their functional changes, as compared with conventional approaches.
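To make the idea of identification-free profiling concrete, the following is a minimal sketch and not the metaSpectraST implementation. It assumes a simple greedy clustering on binned cosine similarity in place of the workflow's actual clustering algorithm, and uses spectral counts per cluster as the quantitative profile. All names (`bin_spectrum`, `cluster_spectra`, `sample_profiles`), the bin width, the similarity threshold, and the toy peak lists are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (bin_width is an assumed value)."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return dict(binned)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster_spectra(spectra, threshold=0.7):
    """Greedy clustering (an assumed stand-in for the real algorithm).

    Each spectrum joins the first cluster whose representative has a
    cosine similarity of at least `threshold`; otherwise it starts a
    new cluster and becomes that cluster's representative.
    """
    representatives = []
    labels = []
    for _, peaks in spectra:
        vec = bin_spectrum(peaks)
        for idx, rep in enumerate(representatives):
            if cosine(vec, rep) >= threshold:
                labels.append(idx)
                break
        else:
            representatives.append(vec)
            labels.append(len(representatives) - 1)
    return labels, representatives


def sample_profiles(spectra, labels):
    """Count, for every sample, how many of its spectra fall into each cluster."""
    profiles = defaultdict(Counter)
    for (sample_id, _), label in zip(spectra, labels):
        profiles[sample_id][label] += 1
    return dict(profiles)


# Toy input: (sample_id, [(m/z, intensity), ...]); values are invented.
spectra = [
    ("mouse_A", [(100.1, 50.0), (200.2, 100.0), (300.3, 25.0)]),
    ("mouse_A", [(100.2, 48.0), (200.1, 95.0), (300.4, 30.0)]),
    ("mouse_B", [(100.3, 52.0), (200.3, 90.0), (300.2, 20.0)]),
    ("mouse_B", [(150.5, 80.0), (250.5, 60.0)]),
]

labels, _ = cluster_spectra(spectra)
profiles = sample_profiles(spectra, labels)

# Overall similarity between two samples, computed on their cluster profiles.
similarity = cosine(profiles["mouse_A"], profiles["mouse_B"])
print(labels, similarity)
```

In this sketch no spectrum is ever matched to a peptide or protein sequence: samples are compared purely through how their spectra distribute across clusters, and a cluster that is abundant in one sample but absent in another would be a candidate differentiating marker.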
Keywords