Autoimmunity (Dec 2024)

Analysis and validation of diagnostic biomarkers and immune cell infiltration characteristics in Crohn’s disease by integrating bioinformatics and machine learning

  • Xiao-Jun Ren,
  • Man-Ling Zhang,
  • Zhao-Hong Shi,
  • Pei-Pei Zhu

DOI
https://doi.org/10.1080/08916934.2024.2422352
Journal volume & issue
Vol. 57, no. 1

Abstract

Crohn’s disease (CD) presents significant diagnostic and therapeutic challenges due to its unclear etiology, frequent relapses, and limited treatment options. Traditional monitoring often relies on invasive and costly gastrointestinal procedures. This study aimed to identify specific diagnostic markers for CD using advanced computational approaches. Four gene expression datasets from the Gene Expression Omnibus (GEO) were analyzed in R, and differentially expressed genes (DEGs) were identified through gene set enrichment analysis. Key biomarkers were selected using machine learning algorithms, including LASSO logistic regression, SVM‑RFE, and Random Forest, and their diagnostic accuracy was assessed using receiver operating characteristic (ROC) curves and nomogram models. Immune cell infiltration was analyzed using the CIBERSORT algorithm, which helped reveal associations between the diagnostic markers and immune cell patterns in CD. From a training set of 605 CD samples and 82 normal controls, we identified eight significant biomarkers: LCN2, FOLH1, CXCL1, FPR1, S100P, IGFBP5, CHP2, and AQP9. The diagnostic model showed high predictive power (AUC = 0.954) and performed well in external validation (AUC = 1). Immune cell infiltration analysis highlighted various immune cells involved in CD, and all eight diagnostic markers were strongly linked to immune cell interactions. This study proposes candidate hub genes and a nomogram for CD diagnosis, offering potential diagnostic biomarkers for clinical application in CD.
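
For readers unfamiliar with the AUC values quoted above, the following minimal Python sketch shows one standard way to compute the area under a ROC curve: as the fraction of (case, control) pairs in which the case receives the higher score, with ties counted as one half. It is an illustration only and not the authors' pipeline (the abstract states the analysis was done in R); the function name `roc_auc` and the toy labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC via pairwise comparison.

    labels: 1 for a case (e.g. CD), 0 for a control.
    scores: model output, higher meaning "more likely a case".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above control
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study): 4 cases, 4 controls.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.35, 0.7, 0.4, 0.2, 0.1, 0.3]

auc = roc_auc(labels, scores)
print(auc)  # 15 of 16 pairs are ranked correctly -> 0.9375
```

Under this definition, an AUC of 1 means every case in the evaluated set scored higher than every control, and an AUC of 0.5 corresponds to chance-level ranking.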

Keywords