Computer Science Journal of Moldova (Jul 2024)
Novel feature selection method for accurate breast cancer classification using Correlation coefficient and Modified GWO Algorithm
Abstract
Breast cancer is regarded as the most common cause of mortality among women globally. Early detection of the disease is critical to significantly reducing the risk of death. Machine learning techniques have proven efficient and highly successful for accurate breast cancer diagnosis. In this paper, an efficient hybrid Feature Selection (FS) method, named Correlation technique-Modified Grey Wolf Optimizer (CMGWO), is proposed for accurate breast cancer classification based on dimensionality reduction. The proposed technique consists of two stages: a feature selection step and a classification step. Feature selection, the process of picking the most significant characteristics from a dataset, is a crucial stage in machine learning. First, we apply a filter method that uses a correlation technique for dimensionality reduction; it reduces the number of features by keeping only one feature from each group of correlated features. Second, we use the Modified Grey Wolf Optimization (MGWO) algorithm to determine the most significant features among the remaining uncorrelated features. We then use multiple classifiers to classify breast cancer based on the selected features. The Wisconsin Diagnostic Breast Cancer (WDBC) database was used to evaluate the performance of the proposed work. The experimental results show that combining the correlation method with MGWO for feature selection increases classification accuracy while using a minimal number of features. For the classification step, the performance of several machine learning algorithms was evaluated, including the Random Forest (RF) classifier, the Support Vector Machine (SVM) classifier, and the Naïve Bayes (NB) classifier.
The proposed technique proves to be the best and most reliable of all the approaches studied: CMGWO with the Random Forest classifier raises classification accuracy to 99.12%, demonstrating its value in detecting breast cancer.
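As an illustration of the first (filter) stage only, the following minimal sketch keeps one feature from each group of strongly correlated features. It is not the authors' implementation: the greedy keep-first rule, the function names `pearson` and `correlation_filter`, the threshold of 0.9, and the toy values are all assumptions made for this example, since the abstract does not specify them.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: treated as uncorrelated (assumption)
    return cov / sqrt(vx * vy)


def correlation_filter(columns, threshold=0.9):
    """Return the names of features kept after dropping correlated ones.

    columns:   dict mapping feature name -> list of values.
    threshold: hypothetical cut-off on |r|; not taken from the paper.
    """
    kept = []
    for name, values in columns.items():
        # Keep a feature only if it is not strongly correlated
        # with any feature that has already been kept.
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy data (made up): "perimeter" is almost a linear function of "radius".
toy = {
    "radius": [1.0, 2.0, 3.0, 4.0, 5.0],
    "perimeter": [6.3, 12.6, 18.8, 25.1, 31.4],
    "texture": [2.0, 1.0, 4.0, 3.0, 0.5],
}
selected = correlation_filter(toy)
```

On this toy input, `perimeter` is discarded because it is nearly collinear with `radius`, while `texture` is retained. In the two-stage scheme described above, the retained features would then be passed to the MGWO search.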
Keywords