Alexandria Engineering Journal (Apr 2025)
ReliefF guided variable spiral tuna swarm optimization algorithm with somersault foraging for feature selection
Abstract
Feature selection (FS) is a powerful knowledge discovery tool for understanding complex problems by identifying the most relevant features. With the rapid development of high-throughput technologies, high-dimensional, multi-text and multi-classification data have become increasingly common, and FS is considered an effective method for dimension reduction. This paper therefore proposes a ReliefF-guided variable spiral tuna swarm optimization (TSO) algorithm with somersault foraging to solve the FS problem. First, inspired by the sudden flipping behavior of manta rays when capturing plankton, a novel somersault foraging strategy is introduced to help the TSO algorithm escape from local optima. Second, a ReliefF-guided strategy is incorporated to add and remove features so as to improve classification accuracy. Third, seven different mathematical spirals are employed to replace the original spiral foraging pattern of the TSO algorithm; by adjusting the search scope of the spiral foraging strategy, this enhances the search performance of the algorithm and reduces the likelihood of getting trapped in local optima. The proposed algorithm is tested on 18 UCI datasets. The first set of experiments demonstrates the effectiveness of the somersault foraging, ReliefF-guided and variable spiral strategies: the RReTSO CY algorithm reduces the average fitness value, achieves higher classification accuracy and selects fewer features. In the second set of experiments, RReTSO CY is compared with other binary swarm intelligence optimization algorithms, and the results indicate that the proposed method effectively reduces the feature subset size, improves classification accuracy and achieves the lowest average fitness value. Finally, Wilcoxon rank-sum tests are conducted to statistically validate the effectiveness of the proposed algorithm.
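To make the somersault foraging idea concrete, the following is a minimal sketch of a somersault-style position update. It assumes the update rule commonly used in manta ray foraging optimization, x_new = x + S * (r2 * x_best - r3 * x), with somersault factor S = 2; the abstract does not give the exact formula used in the proposed algorithm, so the function name `somersault_step`, its parameters, and the clamping to [0, 1] are illustrative assumptions rather than the authors' method.

```python
import random


def somersault_step(position, best, factor=2.0, lower=0.0, upper=1.0, rng=random):
    """Flip a candidate solution around the best-known solution.

    Assumed rule (borrowed from manta ray foraging optimization):
        x_new = x + factor * (r2 * best - r3 * x)
    where r2 and r3 are uniform random numbers in [0, 1).
    """
    r2, r3 = rng.random(), rng.random()
    moved = [x + factor * (r2 * b - r3 * x) for x, b in zip(position, best)]
    # Clamp each coordinate back into the search bounds (an assumption).
    return [min(upper, max(lower, v)) for v in moved]


if __name__ == "__main__":
    rng = random.Random(0)
    candidate = [0.2, 0.8, 0.5]
    best_so_far = [0.6, 0.4, 0.9]
    print(somersault_step(candidate, best_so_far, rng=rng))
```

In a wrapper FS setting, a continuous position such as this would typically be thresholded (for example, selecting feature i when coordinate i exceeds 0.5) before a classifier is evaluated on the chosen subset; that thresholding is likewise an assumption and not stated in the abstract.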