Informatics in Medicine Unlocked (Jan 2023)
Integrative transcriptomics analysis of early-onset and late-onset colorectal cancer
Abstract
Colorectal cancer (CRC) is the third most prevalent cancer, accounting for approximately 7% of all cancers. The primary objective of this study is to identify and characterize the differentially expressed genes (DEGs) in early-onset CRC (EOCRC) compared with late-onset CRC (LOCRC). We retrieved RNA-seq data from the GEO database and analyzed it with the GEO2R tool. We then performed gene and pathway enrichment analysis, identified protein-protein interactions, and predicted key transcription factors. We also examined survival rates and responses to chemotherapy. We identified 250 DEGs, of which 235 were down-regulated and 15 were up-regulated. The top enriched categories were extracellular structure organization among biological processes, collagen-containing extracellular matrix among cellular components, platelet-derived growth factor among molecular functions, and protein digestion and absorption among KEGG pathways. We further identified ten hub genes: COL1A1, VWF, COL3A1, EGF, IGF1, COL1A2, ITGB3, COL11A2, COL6A1, and CD163. Among these, COL1A1 and EGF stand out: COL1A1 serves as a predictive biomarker, while EGF holds potential as a prognostic biomarker. Our predictions also highlight FOXC1, GATA2, YY1, TFAP2A, and PPARG as the most influential transcription factors regulating these hub genes. In summary, our findings suggest that these hub genes play a crucial role in differentiating EOCRC from LOCRC and warrant further investigation.
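The split of DEGs into up- and down-regulated sets can be illustrated with a minimal sketch. The abstract does not report the cutoffs, the column names, or the direction of the contrast, so the thresholds (`LOGFC_CUTOFF`, `ADJ_P_CUTOFF`), the column names (`Symbol`, `log2FoldChange`, `padj`), and the function `classify_degs` below are all assumptions for illustration. The rows are made-up placeholder values, not data from the study.

```python
import csv
import io

# Hypothetical cutoffs: the abstract does not state the thresholds used.
LOGFC_CUTOFF = 1.0
ADJ_P_CUTOFF = 0.05


def classify_degs(rows, logfc_cutoff=LOGFC_CUTOFF, adj_p_cutoff=ADJ_P_CUTOFF):
    """Split differential-expression rows into up- and down-regulated genes.

    Each row is a mapping with the assumed keys 'Symbol', 'log2FoldChange'
    and 'padj'. Rows with missing or non-numeric statistics are skipped.
    """
    up, down = [], []
    for row in rows:
        try:
            symbol = row["Symbol"]
            logfc = float(row["log2FoldChange"])
            adj_p = float(row["padj"])
        except (KeyError, ValueError):
            continue  # incomplete row, e.g. an 'NA' adjusted p-value
        if adj_p >= adj_p_cutoff:
            continue  # not significant after multiple-testing adjustment
        if logfc >= logfc_cutoff:
            up.append(symbol)
        elif logfc <= -logfc_cutoff:
            down.append(symbol)
    return up, down


# Placeholder rows with invented values, for illustration only.
EXAMPLE_TSV = (
    "Symbol\tlog2FoldChange\tpadj\n"
    "GENE_A\t2.3\t0.001\n"
    "GENE_B\t-1.8\t0.01\n"
    "GENE_C\t-0.4\t0.0005\n"
    "GENE_D\t-3.1\t0.2\n"
    "GENE_E\t-1.2\tNA\n"
)

if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(EXAMPLE_TSV), delimiter="\t")
    up_genes, down_genes = classify_degs(reader)
    print(f"up-regulated: {up_genes}")
    print(f"down-regulated: {down_genes}")
```

With a real exported results table, the same function could be applied to a `csv.DictReader` opened on the downloaded file, after adjusting the column names to match the export.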