IEEE Access (Jan 2020)

Screening of Pathogenic Genes for Colorectal Cancer and Deep Learning in the Diagnosis of Colorectal Cancer

  • Yanke Li,
  • Fuqiang Zhang,
  • Chengzhong Xing

DOI
https://doi.org/10.1109/ACCESS.2020.3003999
Journal volume & issue
Vol. 8
pp. 114916 – 114929

Abstract

Using complex-network and machine learning methods, this paper studies the mining of colorectal cancer treatment genes. It combines several feature extraction and comparative analysis methods to characterize genes from three perspectives: gene network features, gene attribute features, and integrated network and attribute features. Comparative analyses from different perspectives are used to demonstrate the feasibility of the study. The first problems addressed are constructing a colorectal cancer gene network, analyzing how the network structure changes during the development of colorectal cancer, and mining the network features of genes. The structural analysis compares the network structure and edge mechanism of the driver genes in the Normal network and the Tumor network, and the distribution of the eigenvalues of driver genes and non-driver genes in the Tumor network. The network structure of genes changes significantly during the development of colorectal cancer, and predictions based on network-structure features outperform those based on non-structural features. These findings support the feasibility of the paper's research direction, and the corresponding analysis results are given in the article. However, because the research method is itself network-based, comparing structural features with non-structural features shows only that structural features have good classification ability; it does not directly show that modeling with gene networks is better than modeling without them. Finally, an optimized random forest classifier is used to reveal the important factors affecting the diagnosis of colorectal cancer and to identify potential colorectal cancer driver genes, providing guidance for clinical research on colorectal cancer and for driver gene mining.
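
The comparison of a gene's structure in the Normal network and the Tumor network can be illustrated with a very small sketch. It is not the authors' method: it assumes each network is an undirected edge list and uses node degree as a stand-in for the network features the paper mines. The gene names (G1 to G6), the edges, and the function names are hypothetical placeholders, not data from the study.

```python
from collections import defaultdict


def degree_table(edges):
    """Return {gene: degree} for an undirected edge list (self-loops ignored)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {gene: len(adj) for gene, adj in neighbours.items()}


def degree_shift(normal_edges, tumor_edges):
    """Return {gene: tumor degree minus normal degree} over all genes seen."""
    normal = degree_table(normal_edges)
    tumor = degree_table(tumor_edges)
    genes = set(normal) | set(tumor)
    return {g: tumor.get(g, 0) - normal.get(g, 0) for g in genes}


# Hypothetical toy networks; placeholder gene names, not real measurements.
normal_edges = [("G1", "G2"), ("G3", "G4")]
tumor_edges = [
    ("G1", "G2"),
    ("G3", "G4"),
    ("G5", "G6"),
    ("G5", "G3"),
    ("G5", "G1"),
]

shift = degree_shift(normal_edges, tumor_edges)

# Rank genes by how much their connectivity grew in the Tumor network.
ranked = sorted(shift.items(), key=lambda kv: (-kv[1], kv[0]))
for gene, delta in ranked:
    print(gene, delta)
```

In a fuller pipeline, per-gene values like these would be one column of a feature table passed to a classifier such as a random forest; that step is omitted here to keep the sketch free of third-party dependencies.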

Keywords