Gut Pathogens (Dec 2022)

Metagenomic analysis of the interaction between the gut microbiota and colorectal cancer: a paired-sample study based on the GMrepo database

  • Han Chen,
  • Jianhua Jiao,
  • Min Wei,
  • Xingzhou Jiang,
  • Ruoyun Yang,
  • Xin Yu,
  • Guoxin Zhang,
  • Xiaoying Zhou

DOI
https://doi.org/10.1186/s13099-022-00527-8
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 14

Abstract

Background: Previous evidence has shown that the gut microbiota plays a role in the development and progression of colorectal cancer (CRC). This study aimed to provide quantitative analysis and visualization of the interaction between the gut microbiota and CRC in order to establish a more precise microbiota panel for CRC diagnosis.

Method: A paired-sample study was designed by retrieving original metagenomic data from the GMrepo database. The differences in the distribution of the gut microbiota between CRCs and controls were analysed at the species level. A co-occurrence network was established, and the microbial interactions with environmental factors were assessed. Random forest models were used to determine significant biomarkers for differentiating CRC and control samples.

Results: A total of 709 metagenomic samples from 6 projects were identified. After matching, 86 CRC patients and 86 matched healthy controls from six countries were enrolled. A total of 484 microbial species and 166 related genera were analysed. In addition to previously recognized associations between Fusobacterium nucleatum and species belonging to the genera Peptostreptococcus, Porphyromonas, and Prevotella and CRC, we found new associations with the novel species of Parvimonas micra and Collinsella tanakaei. In CRC patients, Bacteroides uniformis and Collinsella tanakaei were positively correlated with age, whereas Dorea longicatena, Adlercreutzia equolifaciens, and Eubacterium hallii had positive associations with body mass index (BMI). Finally, a random forest model was established by integrating different numbers of species with the highest model-building importance and lowest inner subcategory bias. The median value of the area under the receiver operating characteristic curve (AUC) was 0.812 in the training cohort and 0.790 in the validation set.

Conclusions: Our study provides a novel bioinformatics approach for investigating the interaction between the gut microbiota and CRC using an online free database. The identification of key species and their associated genes should be further emphasized to determine the relative causality of microbial organisms in the development of CRC.
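For readers less familiar with the AUC metric reported in the Results, the following is a minimal sketch of how an AUC can be computed from classifier scores. It uses the rank-based definition: the probability that a randomly chosen CRC sample receives a higher score than a randomly chosen control. It is not the authors' pipeline. The function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """Return the AUC as the fraction of (positive, negative) pairs in which
    the positive sample has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = CRC, 0 = control; scores stand in for the
# predicted CRC probabilities a classifier such as a random forest might emit.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation. A median AUC, as reported in the abstract, would be obtained by repeating this calculation over several model fits and taking the median of the resulting values.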

Keywords