PeerJ Computer Science (Aug 2024)
Migrating birds optimization-based feature selection for text classification
Abstract
Text classification tasks, particularly those involving a large number of features, make effective feature selection difficult. This research introduces MBO-NB, a methodology that integrates the Migrating Birds Optimization (MBO) approach with naïve Bayes as an internal classifier to address this difficulty. The study is motivated by the recognized limitations of existing techniques in handling extensive feature sets efficiently: traditional approaches often fail to streamline the feature selection process adequately, resulting in suboptimal classification accuracy and increased computational overhead. Our primary objective is therefore to propose a scalable and effective solution that improves both computational efficiency and classification accuracy in text classification systems. To this end, we first preprocess the raw data using the Information Gain algorithm, reducing the feature count from an average of 62,221 to 2,089. Through extensive experiments, we demonstrate that MBO-NB is more effective at feature reduction than other existing techniques, resulting in significantly improved classification accuracy. Furthermore, integrating naïve Bayes within MBO offers a well-rounded solution to the feature selection problem. In individual comparisons with Particle Swarm Optimization (PSO), MBO-NB consistently outperforms PSO by an average of 6.9% across four setups. This research provides insights into enhancing feature selection methods and thereby contributes to the development of more robust and efficient text classification systems.
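The Information Gain preprocessing step mentioned in the abstract can be illustrated with a minimal sketch. It assumes binary term-presence features and documents represented as sets of tokens. The function names (`entropy`, `information_gain`, `select_top_k`) and the toy corpus are hypothetical and are not taken from the paper, whose exact filtering procedure and cutoff may differ.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(docs, labels, term):
    """IG of a binary term-presence feature: H(C) - H(C | term present or absent)."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = ((len(with_term) / n) * entropy(with_term)
                   + (len(without_term) / n) * entropy(without_term))
    return entropy(labels) - conditional

def select_top_k(docs, labels, k):
    """Keep the k terms with the highest IG (ties broken alphabetically)."""
    vocab = set().union(*docs)
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]

# Toy corpus (illustrative only): each document is a set of tokens.
docs = [
    {"goal", "match", "the"},
    {"goal", "team", "the"},
    {"vote", "party", "the"},
    {"vote", "election", "the"},
]
labels = ["sport", "sport", "politics", "politics"]

top_terms = select_top_k(docs, labels, 2)
print(top_terms)
```

In this toy corpus, terms that occur in every document (such as "the") carry no information about the class and rank last, while terms that split the classes cleanly rank first. A wrapper method such as MBO-NB would then search over subsets of the retained terms.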
Keywords