mSystems (Feb 2023)
Comparative Analysis of Core Microbiome Assignments: Implications for Ecological Synthesis
Abstract
The concept of a core microbiome has been broadly used to refer to the consistent presence of a set of taxa across multiple samples within a given habitat. The assignment of taxa to core microbiomes can be performed by several methods based on the abundance and occupancy (i.e., detection across samples) of individual taxa. These approaches have led to methodological inconsistencies, with direct implications for ecological interpretation. Here, we reviewed a set of methods most commonly used to infer core microbiomes in divergent systems. We applied these methods using large data sets and analyzed simulations to determine their accuracy in core microbiome assignments. Our results show that core taxa assignments vary significantly across methods and data set types, with occupancy-based methods most accurately defining true core membership. We also found the ability of these methods to accurately capture core assignments to be contingent on the distribution of taxon abundance and occupancy in the data set. Finally, we provide specific recommendations for further studies using core taxa assignments and discuss the need for unifying methodological approaches toward data processing to advance ecological synthesis.

IMPORTANCE Different methods are commonly used to assign core microbiome membership, leading to methodological inconsistencies across studies. In this study, we review a set of the most commonly used core microbiome assignment methods and compare their core assignments using both simulated and empirical data. We report inconsistent classifications from commonly applied core microbiome assignment methods. Furthermore, we demonstrate the implications that variable core assignments may have for downstream ecological interpretations. Although we still lack a standardized approach to core taxa assignments, our study provides a direction to properly test core assignment methods and offers advances in model parameterization and method choice across distinct data types.
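To make the distinction between occupancy-based and abundance-based assignment concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the function names, the toy count table, and the thresholds (90% occupancy, 1% mean relative abundance) are illustrative assumptions chosen only to show how the two criteria can return different core sets from the same data.

```python
# Illustrative sketch only: names, thresholds, and data are assumptions,
# not the methods or parameters evaluated in the study.

def occupancy(counts):
    """Fraction of samples in which each taxon is detected (count > 0).

    counts: list of samples, each a list of per-taxon read counts.
    """
    n_samples = len(counts)
    n_taxa = len(counts[0])
    return [
        sum(1 for sample in counts if sample[j] > 0) / n_samples
        for j in range(n_taxa)
    ]

def mean_relative_abundance(counts):
    """Mean per-sample relative abundance of each taxon.

    Samples with zero total reads contribute 0 to the mean.
    """
    n_samples = len(counts)
    n_taxa = len(counts[0])
    totals = [sum(sample) for sample in counts]
    return [
        sum(s[j] / t for s, t in zip(counts, totals) if t > 0) / n_samples
        for j in range(n_taxa)
    ]

def core_by_occupancy(counts, min_occupancy=0.9):
    """Taxon indices detected in at least `min_occupancy` of samples."""
    return {j for j, occ in enumerate(occupancy(counts)) if occ >= min_occupancy}

def core_by_abundance(counts, min_abundance=0.01):
    """Taxon indices whose mean relative abundance is at least `min_abundance`."""
    return {
        j for j, ab in enumerate(mean_relative_abundance(counts))
        if ab >= min_abundance
    }

# Toy table: 4 samples (rows) x 4 taxa (columns), 100 reads per sample.
toy_counts = [
    [50, 0, 5, 45],
    [40, 1, 0, 59],
    [30, 0, 0, 70],
    [60, 2, 0, 38],
]

occupancy_core = core_by_occupancy(toy_counts)   # taxa 0 and 3
abundance_core = core_by_abundance(toy_counts)   # taxa 0, 2 and 3
print(sorted(occupancy_core), sorted(abundance_core))
```

In this toy table, taxon 2 is detected in only one of four samples yet passes the abundance cutoff because of a single high count, so the two criteria disagree on its core status. This is the kind of method-dependent discrepancy the abstract describes.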
Keywords