IEEE Access (Jan 2022)
Influential Gene Selection From High-Dimensional Genomic Data Using a Bio-Inspired Algorithm Wrapped Broad Learning System
Abstract
The classification of high-dimensional gene expression (microarray) data plays an important role in disease diagnosis and drug discovery. To avoid the curse of dimensionality, the most influential genes must be selected, which remains a challenging task for machine learning researchers. Because the extraction of influential features is a non-deterministic polynomial-time hard (NP-hard) problem, there is always scope for applying new bio-inspired algorithms. In this work, a recently developed bio-inspired algorithm, Monarch Butterfly Optimization (MBO), is wrapped around the Broad Learning System (BLS); the resulting model, called MBO-BLS, selects the most influential features and classifies the microarray data simultaneously. In the first stage, a pre-selection method (Relief) is used to select a feature subset. This subset is then passed to the MBO-BLS model for further selection and classification. To estimate the efficacy of the presented model, six cancer microarray datasets are used. Sensitivity, specificity, precision, F-score, Kappa, and Matthews correlation coefficient (MCC) are used as measures for an impartial comparison. To assess the presented method, it is compared with the basic BLS, Genetic Algorithm wrapped BLS (GA-BLS), Particle Swarm Optimization wrapped BLS (PSO-BLS), and ten existing models. An analysis of variance (ANOVA) test is also performed to examine the designed model statistically. From this qualitative and quantitative analysis, it is concluded that the proposed MBO-BLS model outperforms the other models considered.
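The six comparison measures named above have standard definitions for a two-class problem. The following is a minimal sketch of how they can be computed from confusion-matrix counts, not the authors' evaluation code. The function name `binary_metrics`, the helper `_ratio`, and the example counts are hypothetical, chosen only for illustration, and the sketch assumes a binary (positive vs. negative) setting.

```python
import math


def _ratio(num, den):
    """Hypothetical helper: return num / den, or 0.0 when den is zero."""
    return num / den if den else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute six standard measures from binary confusion-matrix counts.

    tp, fp, tn, fn: true positives, false positives,
    true negatives, false negatives.
    """
    n = tp + fp + tn + fn

    sensitivity = _ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = _ratio(tn, tn + fp)   # true negative rate
    precision = _ratio(tp, tp + fp)     # positive predictive value
    f_score = _ratio(2 * precision * sensitivity, precision + sensitivity)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = _ratio(tp + tn, n)
    p_chance = _ratio((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    kappa = _ratio(p_observed - p_chance, 1 - p_chance)

    # Matthews correlation coefficient (MCC).
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _ratio(tp * tn - fp * fn, mcc_den)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
        "kappa": kappa,
        "mcc": mcc,
    }


# Illustrative counts only; these are not results from the paper.
example = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Returning 0.0 for an undefined ratio is one convention among several; for multi-class datasets the per-class values would additionally need to be averaged, which this sketch does not cover.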
Keywords