Informatics in Medicine Unlocked (Jan 2024)
Interplay of machine learning and bioinformatics approaches to identify genetic biomarkers that affect survival of patients with glioblastoma
Abstract
Glioblastoma, also known as grade IV astrocytoma, is an aggressive, fast-growing brain tumor with an estimated median survival of 12 to 18 months. Patients with glioblastoma are at high risk of developing comorbidities such as leukemia, atherosclerosis, autism, sudden cardiac death, and pancreatic neoplasms. Identifying influential biomarker genes is crucial for diagnosing cancer and designing therapeutic targets. To this end, we pre-processed the glioblastoma dataset from The Cancer Genome Atlas (TCGA) and applied the Kruskal-Wallis test with Bonferroni correction to select significant biomarker genes. Of 16,261 genes, 26 were identified as significantly dysregulated, of which 19 are up-regulated and 7 are down-regulated. We analyzed functional and ontological pathways, protein-protein interactions (PPI), and protein-drug interactions (PDI) to predict the functions of these influential genes. Comorbidity associations were validated against gold-standard benchmark databases. Furthermore, the Cox proportional hazards model and the product-limit (PL) estimator were used to examine the clinical and genetic variables that influence the survival of glioblastoma patients. This study provides a basis for identifying cancer-influencing genes and for understanding the impact of glioblastoma on the progression of comorbidities.
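The gene-selection step described above (a Kruskal-Wallis test per gene, followed by a Bonferroni correction) can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes only two sample groups per gene, which lets the p-value be computed from the chi-square distribution with one degree of freedom using the standard library alone. The abstract does not state how samples were grouped. The function names, the gene labels, and the toy expression values are all hypothetical.

```python
import math

def kruskal_wallis_two_groups(a, b):
    """Return (H, p) for a tie-corrected Kruskal-Wallis test on two samples.

    Assumption: exactly two groups, so H is compared against a chi-square
    distribution with 1 degree of freedom, whose survival function is
    erfc(sqrt(H / 2)).
    """
    pooled = sorted((v, g) for g, sample in enumerate((a, b)) for v in sample)
    n = len(pooled)
    rank_sums = [0.0, 0.0]
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values at sorted positions i..j-1 share the average of ranks i+1..j.
        avg_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = 12.0 / (n * (n + 1)) * sum(
        r * r / m for r, m in zip(rank_sums, (len(a), len(b)))
    ) - 3.0 * (n + 1)
    correction = 1.0 - tie_term / (n ** 3 - n)
    if correction == 0.0:
        # Every observation is identical: no evidence of a group difference.
        return 0.0, 1.0
    h = max(h / correction, 0.0)
    return h, math.erfc(math.sqrt(h / 2.0))

def bonferroni_select(expression_by_gene, alpha=0.05):
    """Keep genes whose p-value is below alpha divided by the number of tests."""
    threshold = alpha / len(expression_by_gene)
    return [
        gene
        for gene, (group_a, group_b) in expression_by_gene.items()
        if kruskal_wallis_two_groups(group_a, group_b)[1] < threshold
    ]

# Hypothetical toy data: gene -> (expression in group A, expression in group B).
toy_expression = {
    "GENE_A": (list(range(1, 9)), list(range(9, 17))),         # fully separated
    "GENE_B": (list(range(1, 16, 2)), list(range(2, 17, 2))),  # interleaved
    "GENE_C": ([5] * 8, [5] * 8),                              # constant
}

print(bonferroni_select(toy_expression))  # prints ['GENE_A']
```

With three genes tested, the Bonferroni threshold in this toy example is 0.05 / 3. Applied to all 16,261 genes, the same rule would give a far stricter per-gene threshold of about 3.1e-6, assuming a family-wise alpha of 0.05, which the abstract does not specify.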