Journal of King Saud University: Computer and Information Sciences (Feb 2020)

A two stage grading approach for feature selection and classification of microarray data using Pareto based feature ranking techniques: A case study

  • Rasmita Dash

Journal volume & issue
Vol. 32, no. 2
pp. 232 – 247

Abstract

Microarray data combine a large number of genes with only a few dozen samples, and this high-dimensional search space increases the complexity of analysis. Not all genes are significant, so the informative genes must be extracted, which makes dimension reduction necessary. Ranking approaches are often used in the literature for feature selection. However, different ranking techniques may assign different ranks to the same gene, and a selection based on one set of ranks may not suit different problems. Using a single ranking technique may therefore reject some important genes and select some insignificant ones, which may degrade the performance of the classifier. To overcome this problem, a bi-objective rank-based Pareto front technique is proposed. In this technique, two rank-based techniques are used to generate a Pareto-optimal solution consisting of a set of features. For the experimental work, 21 models based on 7 feature ranking strategies are considered. Eight microarray datasets are used to find a suitable ranking combination. A grading method is used to rank the models, and a statistical test is performed to validate the findings.

Keywords: Feature ranking technique, Statistical analysis, Pareto front, Multi-objective optimization, Classification technique, Microarray database
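
The bi-objective idea in the abstract can be illustrated with a minimal sketch. It is not the author's implementation. It assumes that each feature (gene) receives a rank from two ranking techniques, that a lower rank is better, and that a feature is kept when no other feature is at least as good on both rankings and strictly better on one. The function name, the ranker labels, and the rank values are hypothetical. The count of 21 models matches the number of unordered pairs of 7 ranking strategies, although the abstract does not state that the models were formed this way.

```python
from itertools import combinations


def pareto_front(rank_a, rank_b):
    """Return indices of features not dominated under two rankings.

    Assumption: lower rank means a more informative feature.
    Feature j dominates feature i if j is no worse on both rankings
    and strictly better on at least one.
    """
    n = len(rank_a)
    front = []
    for i in range(n):
        dominated = any(
            rank_a[j] <= rank_a[i]
            and rank_b[j] <= rank_b[i]
            and (rank_a[j] < rank_a[i] or rank_b[j] < rank_b[i])
            for j in range(n)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Hypothetical ranks of five genes under two ranking techniques.
rank_a = [1, 2, 3, 4, 5]
rank_b = [4, 1, 5, 2, 3]
front = pareto_front(rank_a, rank_b)

# Hypothetical labels for 7 ranking strategies; pairing them gives 21 combinations.
rankers = [f"ranker_{k}" for k in range(1, 8)]
pairs = list(combinations(rankers, 2))

print(front)       # indices of the non-dominated genes
print(len(pairs))  # number of two-ranker combinations
```

In this toy input, genes 0 and 1 are each best under one of the two rankings, so neither dominates the other and both stay on the front. The quadratic scan is chosen for readability; for thousands of genes a sort-based front computation would be more practical.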