Gut Microbes (Dec 2023)

Microbial genes outperform species and SNVs as diagnostic markers for Crohn’s disease on multicohort fecal metagenomes empowered by artificial intelligence

  • Sheng Gao,
  • Xiang Gao,
  • Ruixin Zhu,
  • Dingfeng Wu,
  • Zhongsheng Feng,
  • Na Jiao,
  • Ruicong Sun,
  • Wenxing Gao,
  • Qing He,
  • Zhanju Liu,
  • Lixin Zhu

DOI
https://doi.org/10.1080/19490976.2023.2221428
Journal volume & issue
Vol. 15, no. 1

Abstract

Read online

ABSTRACTDysbiosis of gut microbial community is associated with the pathogenesis of CD and may serve as a promising noninvasive diagnostic tool. We aimed to compare the performances of the microbial markers of different biological levels by conducting a multidimensional analysis on the microbial metagenomes of CD. We collected fecal metagenomic datasets generated from eight cohorts that altogether include 870 CD patients and 548 healthy controls. Microbial alterations in CD patients were assessed at multidimensional levels including species, gene, and SNV level, and then diagnostic models were constructed using artificial intelligence algorithm. A total of 227 species, 1047 microbial genes, and 21,877 microbial SNVs were identified that differed between CD and controls. The species, gene, and SNV models achieved an average AUC of 0.97, 0.95, and 0.77, respectively. Notably, the gene model exhibited superior diagnostic capability, achieving an average AUC of 0.89 and 0.91 for internal and external validations, respectively. Moreover, the gene model was specific for CD against other microbiome-related diseases. Furthermore, we found that phosphotransferase system (PTS) contributed substantially to the diagnostic capability of the gene model. The outstanding performance of PTS was mainly explained by genes celB and manY, which demonstrated high predictabilities for CD with metagenomic datasets and was validated in an independent cohort by qRT-PCR analysis. Our global metagenomic analysis unravels the multidimensional alterations of the microbial communities in CD and identifies microbial genes as robust diagnostic biomarkers across geographically and culturally distinct cohorts.

Keywords