IEEE Access (Jan 2020)
Classification of Cancers Based on a Comprehensive Pathway Activity Inferred by Genes and Their Interactions
Abstract
Cancers, a group of multifactorial complex diseases, are generally caused by mutations in multiple genes or by dysregulation of gene interactions. Applying machine learning methods to microarray gene expression profiles is a popular way to predict disease state or outcome. Traditional computational methods that detect genes differentially expressed between cancer and normal samples are ineffective in independent cohorts of patients. Pathway-based methods offer an alternative; however, current methods either treat pathways as simple gene sets or include pathway topological information, but ignore significant individual genes and the interactions between genes, both of which are essential for inferring a more robust pathway activity. In this study, we propose a novel approach to describing the activity of a pathway that incorporates both the degree of differential expression of genes between case and control samples and the interaction strength between genes. We applied the method to the classification of seven cancers. Within-dataset and cross-dataset experiments demonstrated that our method achieved robust and superior performance compared with five existing methods.
Keywords