International Journal of Computational Intelligence Systems (Mar 2024)
Improving Breast Cancer Diagnosis Accuracy by Particle Swarm Optimization Feature Selection
Abstract
Breast cancer is one of the leading causes of death among women worldwide. Early detection of this disease can save patients’ lives and reduce mortality. Because a large number of features are involved in diagnosing this disease, the diagnosis process can be time consuming. To reduce cost and time and to improve the accuracy of breast cancer diagnosis, this paper proposes a feature selection algorithm based on particle swarm optimization (PSO), combined with machine learning methods, to select the most effective features for breast cancer diagnosis. To evaluate the proposed feature selection method, it was tested on the three most common breast cancer datasets available in the University of California, Irvine (UCI) repository: the Coimbra dataset (CD), the Wisconsin Diagnostic Breast Cancer dataset (WDBC) and the Wisconsin Prognostic Breast Cancer dataset (WPBC). On the Coimbra dataset, with all 9 features and without PSO feature selection, the highest accuracy obtained was 87%, by the Support Vector Machine method; with PSO feature selection, the accuracy reached 91% and the number of features was reduced from 9 to 4. On the WDBC dataset, with all 30 features and without PSO feature selection, the highest accuracy obtained was 99%, by the Random Forest method; with PSO feature selection, the accuracy reached 100% and the number of features was reduced from 30 to 19. On the WPBC dataset, with all 33 features and without PSO feature selection, the highest accuracy obtained was 94%, by the Support Vector Machine method; with PSO feature selection, the accuracy reached 96% and the number of features was reduced from 33 to 17. These results indicate that the proposed PSO-based feature selection algorithm can improve the accuracy of breast cancer diagnosis while selecting fewer, more effective features than the full feature sets of the original datasets.
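The idea of PSO-based feature selection described in the abstract can be illustrated with a minimal sketch. The abstract does not state the PSO variant, the fitness function, or the parameter settings, so everything below is an assumption made for illustration: a binary PSO with a sigmoid transfer function, a nearest-centroid classifier standing in for the machine learning methods, a small penalty on the number of selected features, and synthetic data in place of the UCI datasets. The names `binary_pso`, `make_toy_data`, `centroid_accuracy` and `fitness` are hypothetical.

```python
import math
import random

def binary_pso(fitness, n_features, n_particles=12, n_iters=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO (assumed variant): each particle is a 0/1 feature mask; maximizes fitness(mask)."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_particles)]
    vel = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + cognitive + social terms.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-4.0, min(4.0, vel[i][d]))  # clamp velocity
                # Sigmoid transfer: velocity becomes the probability of selecting feature d.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

def make_toy_data(n=200, n_features=8, n_informative=3, seed=1):
    """Synthetic two-class data: only the first n_informative features carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(2.0 * label if j < n_informative else 0.0, 1.0)
                  for j in range(n_features)])
        y.append(label)
    return X, y

def centroid_accuracy(mask, X_tr, y_tr, X_te, y_te):
    """Hold-out accuracy of a nearest-centroid classifier on the selected columns."""
    cols = [j for j, m in enumerate(mask) if m]
    if not cols:
        return 0.0
    cents = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X_tr, y_tr) if t == c]
        cents[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_te, y_te):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, cents[c])) for c in (0, 1)}
        correct += min(dist, key=dist.get) == t
    return correct / len(y_te)

X, y = make_toy_data()
X_tr, y_tr, X_te, y_te = X[:150], y[:150], X[150:], y[150:]

def fitness(mask, alpha=0.02):
    """Assumed objective: accuracy minus a small penalty on the fraction of features kept."""
    acc = centroid_accuracy(mask, X_tr, y_tr, X_te, y_te)
    return acc - alpha * sum(mask) / len(mask) if any(mask) else 0.0

best_mask, best_fit = binary_pso(fitness, n_features=8)
print("selected features:", [j for j, m in enumerate(best_mask) if m])
print("fitness:", round(best_fit, 3))
```

In this sketch the penalty weight `alpha` controls the trade-off between accuracy and the number of features kept; replacing `centroid_accuracy` with a cross-validated score from another classifier would not change the PSO loop.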
Keywords