PLoS ONE (Jan 2021)
Improved cancer biomarkers identification using network-constrained infinite latent feature selection.
Abstract
Identifying biomarkers that are associated with different types of cancer is an important goal in the field of bioinformatics. Different researcher groups have analyzed the expression profiles of many genes and found some certain genetic patterns that can promote the improvement of targeted therapies, but the significance of some genes is still ambiguous. More reliable and effective biomarkers identification methods are then needed to detect candidate cancer-related genes. In this paper, we proposed a novel method that combines the infinite latent feature selection (ILFS) method with the functional interaction (FIs) network to rank the biomarkers. We applied the proposed method to the expression data of five cancer types. The experiments indicated that our network-constrained ILFS (NCILFS) provides an improved prediction of the diagnosis of the samples and locates many more known oncogenes than the original ILFS and some other existing methods. We also performed functional enrichment analysis by inspecting the over-represented gene ontology (GO) biological process (BP) terms and applying the gene set enrichment analysis (GSEA) method on selected biomarkers for each feature selection method. The enrichments analysis reports show that our network-constraint ILFS can produce more biologically significant gene sets than other methods. The results suggest that network-constrained ILFS can identify cancer-related genes with a higher discriminative power and biological significance.