mSystems (Apr 2020)

Multiple-Disease Detection and Classification across Cohorts via Microbiome Search

  • Xiaoquan Su,
  • Gongchao Jing,
  • Zheng Sun,
  • Lu Liu,
  • Zhenjiang Xu,
  • Daniel McDonald,
  • Zengbin Wang,
  • Honglei Wang,
  • Antonio Gonzalez,
  • Yufeng Zhang,
  • Shi Huang,
  • Gavin Huttley,
  • Rob Knight,
  • Jian Xu

DOI
https://doi.org/10.1128/mSystems.00150-20
Journal volume & issue
Vol. 5, no. 2

Abstract

Microbiome-based disease classification depends on well-validated disease-specific models or a priori organismal markers. These are lacking for many diseases. Here, we present an alternative, search-based strategy for disease detection and classification, which detects diseased samples via their outlier novelty versus a database of samples from healthy subjects and then compares these to databases of samples from patients. Our strategy’s precision, sensitivity, and speed outperform model-based approaches. In addition, it is more robust to platform heterogeneity and to contamination in 16S rRNA gene amplicon data sets. This search-based strategy shows promise as an important first step in microbiome big-data-based diagnosis.

IMPORTANCE Here, we present a search-based strategy for disease detection and classification, which detects diseased samples via their outlier novelty versus a database of samples from healthy subjects and then compares them to databases of samples from patients. This approach enables the identification of microbiome states associated with disease even in the presence of different cohorts, multiple sequencing platforms, or significant contamination.
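
The two-step idea described above (flag a sample as an outlier against a healthy database, then search disease databases for the closest match) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure or the novelty score, so Bray-Curtis similarity is swapped in here, novelty is assumed to be one minus the best similarity to any healthy sample, and the names `novelty`, `classify`, and `novelty_threshold` (with its default of 0.5) are all hypothetical.

```python
from typing import Dict, List, Optional

# A profile maps taxon name -> relative abundance (illustrative representation).
Profile = Dict[str, float]


def bray_curtis_similarity(a: Profile, b: Profile) -> float:
    """Return 1 minus the Bray-Curtis dissimilarity of two profiles."""
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in taxa)
    total = sum(a.values()) + sum(b.values())
    return 2.0 * shared / total if total > 0 else 0.0


def novelty(query: Profile, healthy_db: List[Profile]) -> float:
    """Assumed novelty score: 1 minus the best similarity to any healthy sample."""
    best = max((bray_curtis_similarity(query, h) for h in healthy_db), default=0.0)
    return 1.0 - best


def classify(
    query: Profile,
    healthy_db: List[Profile],
    disease_dbs: Dict[str, List[Profile]],
    novelty_threshold: float = 0.5,  # hypothetical cutoff, not from the paper
) -> Optional[str]:
    """Step 1: outlier detection versus healthy samples.
    Step 2: nearest-match search across the disease databases."""
    if novelty(query, healthy_db) < novelty_threshold:
        return "healthy"
    best_label, best_sim = None, -1.0
    for label, samples in disease_dbs.items():
        for sample in samples:
            sim = bray_curtis_similarity(query, sample)
            if sim > best_sim:
                best_label, best_sim = label, sim
    return best_label


# Toy data, invented purely for illustration.
healthy_db = [{"A": 0.5, "B": 0.5}, {"A": 0.6, "B": 0.4}]
disease_dbs = {
    "disease_1": [{"C": 0.8, "A": 0.2}],
    "disease_2": [{"D": 0.9, "B": 0.1}],
}
print(classify({"A": 0.55, "B": 0.45}, healthy_db, disease_dbs))
print(classify({"C": 0.7, "A": 0.3}, healthy_db, disease_dbs))
```

In this toy setup the first query closely resembles a healthy sample and is not searched further, while the second falls outside the healthy database and is assigned the label of its closest disease sample. A real system would need a validated similarity measure, a calibrated threshold, and far larger reference databases.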

Keywords