European Journal of Medical Research (Jul 2023)

Predicting diagnostic biomarkers associated with immune infiltration in Crohn's disease based on machine learning and bioinformatics

  • Wenhui Bao,
  • Lin Wang,
  • Xiaoxiao Liu,
  • Ming Li

DOI: https://doi.org/10.1186/s40001-023-01200-9
Journal volume & issue: Vol. 28, no. 1, pp. 1–17

Abstract

Objective: To identify potential biomarkers of Crohn's disease (CD) using machine learning and to assess the pathological importance of the associated immune cell infiltration in disease development.

Methods: Three publicly accessible CD gene expression profiles were obtained from the GEO database. Inflammatory tissue samples were selected and separated into colonic and ileal tissues. To identify the differentially expressed genes (DEGs) between CD and healthy controls, the datasets with the larger sample sizes were merged to form the training set. The function of the DEGs was characterized by Disease Ontology (DO) enrichment and gene set enrichment analysis (GSEA). Candidate biomarkers were identified using support vector machine-recursive feature elimination (SVM-RFE) and LASSO regression models. To assess their value as diagnostic genes, the area under the ROC curve was evaluated in the validation group. In addition, immune cell fractions in CD patients were estimated with the CIBERSORT approach and correlated with the candidate biomarkers.

Results: Thirty-four DEGs were identified in colon tissue, of which 26 were up-regulated and 8 were down-regulated. In ileal tissue, 50 up-regulated and 50 down-regulated DEGs were observed. Disease enrichment of the colonic and ileal DEGs centered on immunity, inflammatory bowel disease, and related pathways. Based on machine learning methods combined with external dataset validation, CXCL1, S100A8, REG3A, and DEFA6 in colon tissue and LCN2 and NAT8 in ileal tissue demonstrated excellent diagnostic value and could be employed as CD gene biomarkers. Compared with controls, antigen processing and presentation, the chemokine signaling pathway, cytokine–cytokine receptor interaction, and natural killer cell-mediated cytotoxicity were activated in colonic tissue, while cytokine–cytokine receptor interaction, NOD-like receptor signaling, and Toll-like receptor signaling pathways were activated in ileal tissue. NAT8 was associated with CD8 T cells, while CXCL1, S100A8, REG3A, LCN2, and DEFA6 were associated with neutrophils, indicating that these biomarkers are closely linked to immune cell infiltration in CD.

Conclusion: CXCL1, S100A8, REG3A, and DEFA6 in colonic tissue and LCN2 and NAT8 in ileal tissue can be employed as CD biomarkers. In addition, immune cell infiltration plays a crucial role in CD development.
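As a rough illustration of the validation step mentioned in the Methods, the area under the ROC curve for a single candidate gene can be computed from its rank-based (Mann-Whitney) definition: the probability that a randomly chosen CD sample has a higher expression value than a randomly chosen control. The sketch below is not the authors' code; the function name `roc_auc` and all expression values are hypothetical and chosen only for illustration.

```python
def roc_auc(case_scores, control_scores):
    """Rank-based AUC: P(case score > control score), ties counted as 0.5."""
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    # Normalise by the number of case/control pairs
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical expression values for one up-regulated gene (not real data)
cd_expr = [8.1, 7.4, 9.0, 6.2]
control_expr = [5.0, 6.5, 4.8, 5.9]

auc = roc_auc(cd_expr, control_expr)
print(f"AUC = {auc:.4f}")
```

An AUC near 1.0 means the gene separates the two groups almost perfectly, while 0.5 corresponds to chance. For a down-regulated gene the raw value falls below 0.5, so the sign of the score (or the group order) would need to be flipped before interpretation.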

Keywords