Frontiers in Molecular Biosciences (Jan 2022)

Large-Scale Gastric Cancer Susceptibility Gene Identification Based on Gradient Boosting Decision Tree

  • Qing Chen,
  • Ji Zhang,
  • Banghe Bao,
  • Fan Zhang,
  • Jie Zhou

DOI
https://doi.org/10.3389/fmolb.2021.815243
Journal volume & issue
Vol. 8

Abstract

The early clinical symptoms of gastric cancer are not obvious, so metastasis may already have occurred by the time treatment begins. Poor prognosis is one of the main reasons for the high mortality of gastric cancer. Identifying gastric cancer-related genes can therefore provide markers that improve diagnostic precision and guide personalized treatment. To further reveal the pathogenesis of gastric cancer at the gene level, we propose a method based on the Gradient Boosting Decision Tree (GBDT) that identifies gastric cancer susceptibility genes through a gene interaction network. Starting from genes known to be related to gastric cancer, we collected additional genes that interact with them and constructed a gene interaction network. A random walk was used to extract the network association of each gene, and GBDT was then used to identify gastric cancer-related genes. We evaluated the algorithm with 10-fold cross-validation, measuring the area under the ROC curve (AUC) and the area under the precision-recall curve (AUPR). GBDT achieved an AUC of 0.89 and an AUPR of 0.81. Compared with four other methods, GBDT performed best.
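As an illustration of the network feature step, the sketch below runs a random walk on a toy gene interaction network. The abstract does not say which random walk variant was used, so this assumes random walk with restart, a common choice for this kind of task. The gene names (`G1` to `G5`), the `restart` value, and the function name are hypothetical and are not taken from the paper.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walk that restarts at seeds.

    adj   : dict mapping each gene to the set of its neighbours (undirected network).
    seeds : set of genes the walker jumps back to (e.g. known disease genes).
    Assumption: every gene in adj has at least one neighbour.
    """
    nodes = sorted(adj)
    # Restart distribution: uniform over the seed genes, zero elsewhere.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for n in nodes:
            # Probability flowing into n: each neighbour m splits its mass over its edges.
            inflow = sum(p[m] / len(adj[m]) for m in adj[n])
            nxt[n] = (1 - restart) * inflow + restart * p0[n]
        diff = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if diff < tol:
            break
    return p


# Toy network with hypothetical gene names (not data from the paper).
toy_network = {
    "G1": {"G2", "G3"},
    "G2": {"G1", "G3"},
    "G3": {"G1", "G2", "G4"},
    "G4": {"G3", "G5"},
    "G5": {"G4"},
}
known_genes = {"G1"}  # stand-in for a known gastric cancer-related gene
scores = random_walk_with_restart(toy_network, known_genes)

for gene in sorted(scores, key=scores.get, reverse=True):
    print(gene, round(scores[gene], 4))
```

Genes closer to the seed receive higher scores. Score vectors of this kind, computed per gene, could serve as the input features for a gradient boosting classifier; scikit-learn's `GradientBoostingClassifier` is one possible choice, though the abstract does not name the implementation the authors used.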

Keywords