BMC Gastroenterology (Oct 2021)

Screening of characteristic genes in ulcerative colitis by integrating gene expression profiles

  • Yingbo Han,
  • Xiumin Liu,
  • Hongmei Dong,
  • Dacheng Wen

DOI
https://doi.org/10.1186/s12876-021-01940-0
Journal volume & issue
Vol. 21, no. 1
pp. 1 – 12

Abstract

Background: This study aimed to screen the feature modules and characteristic genes related to ulcerative colitis (UC) and to construct a support vector machine (SVM) classifier that distinguishes UC patients from controls.

Methods: Four datasets containing UC and control samples were obtained from the Gene Expression Omnibus database. Consistently differentially expressed genes (DEGs) were screened using the MetaDE method. Weighted gene coexpression network analysis (WGCNA) was used to identify significant modules across the four datasets. A protein–protein interaction network was established from the intersection genes. Gene Ontology (GO) biological process (BP) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed using DAVID. An SVM combined with recursive feature elimination was applied to construct a classifier for the diagnosis of UC. The performance of the SVM classifier was evaluated using receiver operating characteristic curves.

Results: Twelve highly preserved modules were obtained using WGCNA, and 2009 DEGs with significant consistency were selected using the MetaDE method. Sixteen significantly related GO BPs and 12 KEGG pathways were obtained, including cytokine-cytokine receptor interaction, cell adhesion molecules, and leukocyte transendothelial migration. Subsequently, 41 genes, including CXCL1, CCR2, IL1B, and IL1A, were used to construct the SVM classifier. The area under the curve (AUC) was 0.999 in the training dataset and 0.886, 0.790, and 0.819 in the validation datasets GSE65114, GSE37283, and GSE36807, respectively.

Conclusions: An SVM classifier based on feature genes might correctly distinguish UC patients from healthy people.
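As an illustration of the AUC metric reported above, the following minimal sketch computes the area under the receiver operating characteristic curve from classifier scores. It uses the pairwise-ranking definition: the probability that a randomly chosen UC sample receives a higher score than a randomly chosen control, with ties counted as one half. The function name `roc_auc` and the labels and scores are hypothetical and are not taken from the study; the authors' own evaluation code is not described in the abstract.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC for binary labels (1 = UC, 0 = control).

    Computed as the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; tied pairs count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical decision scores for six samples (illustrative only).
example_labels = [1, 1, 1, 0, 0, 0]
example_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
example_auc = roc_auc(example_labels, example_scores)
```

An AUC of 1.0 means every UC sample is scored above every control, while 0.5 corresponds to chance-level ranking. The pairwise loop is quadratic in the number of samples, which is acceptable for small cohorts; a rank-based implementation would scale better.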

Keywords