Database Systems Journal (Feb 2023)

A Correlation Based Way to Predict the Type of Breast Cancer for Diagnosis

  • Shahidul Islam KHAN

Journal volume & issue
Vol. XIII, no. 1
pp. 19 – 26

Abstract

Breast cancer is considered one of the most common causes of death among adult women. At the same time, among all types of cancer, it is one of the more curable if diagnosed at an early stage. This paper proposes a correlation-based method for diagnosing breast cancer using the fewest possible features. We use correlation to measure the strength of the relationship between each input feature and the target feature, and then construct a new subset consisting of only the most relevant features. The experiments use the Wisconsin breast cancer data set (WBCD). Model performance is evaluated using classification accuracy and the F-score. The results show that the proposed method achieved its highest classification accuracy (95.26%) with a Random Forest classifier using only 4 of the 29 available features, which reduced the data set size by 86% (25 of 29 features removed).
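
The feature-selection step described in the abstract can be illustrated with a minimal sketch. It assumes Pearson correlation as the correlation measure and a simple "keep the top k by absolute correlation" rule; the abstract does not specify either, so both are assumptions. The function names (`pearson`, `select_top_k`, `size_reduction`) and the toy data are hypothetical and are not taken from the paper or from WBCD. The Random Forest classification step is not shown.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / (sd_x * sd_y)

def select_top_k(columns, target, k):
    """Rank features by |correlation with target| and keep the k strongest.

    Assumption: a top-k rule; the paper may use a threshold instead.
    """
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]

def size_reduction(n_kept, n_total):
    """Fraction of features removed, e.g. keeping 4 of 29."""
    return 1.0 - n_kept / n_total

# Toy data (hypothetical, not from WBCD): three features, binary target.
toy_columns = {
    "feature_a": [1, 2, 3, 4, 5, 6],   # rises with the target
    "feature_b": [3, 1, 4, 1, 5, 9],   # weakly related
    "feature_c": [2, 2, 2, 2, 2, 2],   # constant, uninformative
}
toy_target = [0, 0, 0, 1, 1, 1]

selected = select_top_k(toy_columns, toy_target, k=2)
reduction = size_reduction(4, 29)  # the 4-of-29 case reported in the abstract
```

Keeping 4 of 29 features gives a reduction of 1 - 4/29, or about 86%, which matches the figure reported above. The selected columns would then be passed to a classifier in place of the full feature set.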

Keywords