Journal of Universal Computer Science (May 2022)

An Enhanced Evolutionary Based Feature Selection Approach Using Grey Wolf Optimizer for the Classification of High-dimensional Biological Data

  • Thaer Thaher,
  • Mohammed Awad,
  • Mohammed Aldasht,
  • Alaa Sheta,
  • Hamza Turabieh,
  • Hamouda Chantar

DOI
https://doi.org/10.3897/jucs.78218
Journal volume & issue
Vol. 28, no. 5
pp. 499 – 539

Abstract

Feature selection (FS) is a pre-processing step that eliminates redundant and less-informative features to enhance the performance of data mining techniques. It is also considered one of the key success factors for classification problems on high-dimensional datasets. This paper proposes an efficient wrapper feature selection method based on the Grey Wolf Optimizer (GWO). GWO is a recent metaheuristic algorithm that has been widely employed to solve diverse optimization problems. However, GWO mainly steers its search toward the leading wolves, which makes it prone to falling into local optima, especially on high-dimensional problems such as many biological datasets. An enhanced variant of GWO, called EGWO, which adopts two enhancements, is introduced to overcome this shortcoming. First, the transition parameter concept is incorporated to move GWO from the exploration phase to the exploitation phase, and several adaptive non-linear decreasing formulas are introduced to control the transition parameters. Second, a random-based search strategy is exploited to increase diversity during the search process. Two binarization schemes using S-shaped and V-shaped transfer functions are incorporated to map the continuous search space into a binary one for FS. The efficiency of the proposed EGWO is validated on ten high-dimensional, low-sample biological datasets. Our experiments show the promising performance of EGWO compared to the original GWO approach and other state-of-the-art techniques in terms of dimensionality reduction and the enhancement of classification performance.
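
As an illustration of the binarization step mentioned in the abstract, the sketch below shows how S-shaped and V-shaped transfer functions are commonly used to turn a continuous position vector into a binary feature mask. The abstract does not give the exact formulas, so the sigmoid and |tanh(x)| forms, the bit-update rules, and the function names (`s_shaped`, `v_shaped`, `binarize_s`, `binarize_v`) are assumptions chosen for illustration, not the paper's definitions.

```python
import math
import random


def s_shaped(x: float) -> float:
    """Sigmoid transfer function (assumed form): maps a real value into [0, 1]."""
    # Branch on the sign of x so math.exp never overflows.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def v_shaped(x: float) -> float:
    """V-shaped transfer function (assumed form |tanh(x)|): maps into [0, 1]."""
    return abs(math.tanh(x))


def binarize_s(position, rng):
    """S-shaped rule: set each bit to 1 with probability s_shaped(x)."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


def binarize_v(position, current_bits, rng):
    """V-shaped rule: flip each current bit with probability v_shaped(x)."""
    return [1 - b if rng.random() < v_shaped(x) else b
            for x, b in zip(position, current_bits)]


if __name__ == "__main__":
    rng = random.Random(0)                   # seeded for repeatability
    position = [-2.0, -0.5, 0.0, 0.5, 2.0]   # hypothetical continuous wolf position
    current = [1, 0, 1, 0, 1]                # hypothetical current feature mask
    print("S-shaped mask:", binarize_s(position, rng))
    print("V-shaped mask:", binarize_v(position, current, rng))
```

In a mask produced this way, a 1 at index i means feature i is kept and a 0 means it is dropped. The S-shaped rule assigns bits directly from the position value, whereas the V-shaped rule depends on the magnitude of the value and modifies the existing mask.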

Keywords