IEEE Access (Jan 2024)

Handling the Class Imbalance Problem With an Improved Sine Cosine Algorithm for Optimal Instance Selection

  • Rajalakshmi Shenbaga Moorthy,
  • Arikumar K. Selvaraj,
  • Sahaya Beni Prathiba,
  • Gokul Yenduri,
  • Sachi Nandan Mohanty,
  • Janjhyam Venkata Naga Ramesh

DOI
https://doi.org/10.1109/ACCESS.2024.3417822
Journal volume & issue
Vol. 12
pp. 87131 – 87151

Abstract

Class imbalance is a significant research problem: classifiers trained on imbalanced data become biased, performing well on the majority classes in the dataset while performing poorly on the minority classes. In real-world problems, this bias degrades classification accuracy. This work uses the Improved Binary Sine Cosine Algorithm (IBSCA) to select an optimal subset of majority-class instances, with the aim of improving classification accuracy on imbalanced datasets. The proposed IBSCA enhances the conventional Binary Sine Cosine Algorithm (BSCA) to address premature convergence to locally optimal solutions. It uses the positions of the alpha agent, the beta agent, and a random agent, which leads it to devote considerable time to exploration when searching for the best possible set of instances. By defining the fitness function in terms of the geometric mean (G-mean) and F-score, the proposed IBSCA addresses a multi-objective optimization problem. Experiments are conducted on 18 datasets with different imbalance ratios taken from the KEEL repository. The proposed IBSCA is compared with the conventional BSCA, Binary Particle Swarm Optimization (BPSO), and Binary Grey Wolf Optimization (BGWO). Its performance is also evaluated against the best results reported in other research papers. In terms of sensitivity, F-score, G-mean, and area under the curve (AUC), the proposed IBSCA outperforms the state-of-the-art algorithms. Statistical analysis using the Wilcoxon signed-rank test and the Friedman test also indicates that the proposed IBSCA performs better than the other conventional algorithms.
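
The two ideas the abstract relies on, a binary mask over majority-class instances and a fitness built from G-mean and F-score, can be illustrated with a minimal Python sketch. The abstract does not say how the two metrics are combined, so the weighted sum below (and its `weight` parameter) is an assumption, not the authors' formulation. All function names are hypothetical, and the minority class is assumed to be the positive class.

```python
import math

def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN, treating `positive` as the minority class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def g_mean(tp, fp, tn, fn):
    """Geometric mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)

def f_score(tp, fp, tn, fn):
    """F1 score: harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def select_instances(majority, minority, mask):
    """Keep majority instances whose mask bit is 1, plus all minority instances."""
    kept = [x for x, bit in zip(majority, mask) if bit == 1]
    return kept + list(minority)

def fitness(y_true, y_pred, weight=0.5):
    """Assumed scalarization: weighted sum of G-mean and F-score."""
    counts = confusion_counts(y_true, y_pred)
    return weight * g_mean(*counts) + (1 - weight) * f_score(*counts)
```

In a search loop, each candidate mask would be passed to `select_instances`, a classifier would be trained on the reduced set, and its predictions on held-out data would be scored with `fitness`.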

Keywords