Applied Artificial Intelligence (Nov 2017)
Feature Selection via Pareto Multi-objective Genetic Algorithms
Abstract
Feature selection, an important combinatorial optimization problem in data mining, aims to find a reduced subset of features of high quality in a dataset. Different categories of importance measures can be used to estimate the quality of a feature subset. Since each measure provides a distinct perspective of data and of which are their important features, in this article we investigate the simultaneous optimization of importance measures from different categories using multi-objective genetic algorithms grounded in the Pareto theory. An extensive experimental evaluation of the proposed method is presented, including an analysis of the performance of predictive models built using the selected subsets of features. The results show the competitiveness of the method in comparison with six feature selection algorithms. As an additional contribution, we conducted a pioneer, rigorous, and replicable systematic review on related work. As a result, a summary of 93 related papers strengthens features of our method.