Jurnal Riset Informatika (Mar 2023)
Comparison of Breast Cancer Classification Using the Decision Tree ID3 Algorithm and K-Nearest Neighbors Algorithm
Abstract
Cancer is one of the leading causes of death, and breast cancer is the most common cancer in women. Breast cancer (Carcinoma mammae) is a malignant neoplasm originating from the parenchyma. Breast cancer ranks first among cancers in Indonesia by number of cases and is among the leading contributors to cancer deaths. Globocan data for 2020 show that new breast cancer cases reached 68,858 (16.6%) of the total 396,914 new cancer cases in Indonesia, while deaths exceeded 22 thousand cases (Romkom, 2022). The death rate is increasing because of insufficient information about the early symptoms and dangers of breast cancer. Because of this lack of information, a system is needed that can provide information about breast cancer, such as early diagnosis. Several parameters and data mining classification techniques can be used to predict which patients will develop breast cancer and which will not. This study compares the classification of breast cancer using the Decision Tree ID3 algorithm and the K-Nearest Neighbors algorithm. The attribute data consist of Menopause, Tumor-Size, Node-Caps, Deg-Malig, Breast-Squad, and Irradiant. The main objective of this study is to improve classification performance in breast cancer diagnosis by applying feature selection to several classification algorithms. The Decision Tree ID3 algorithm achieved an accuracy of 93.333%, and the K-Nearest Neighbors algorithm achieved an accuracy of 76.6667%.
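To make the two compared approaches concrete, the following is a minimal Python sketch of the core idea behind each: the entropy-based information gain that ID3 uses to choose a splitting attribute, and a K-Nearest Neighbors majority vote over categorical attributes. It is not the study's implementation. The toy records, the class labels `positive` and `negative`, the use of Hamming distance, and the choice of `k = 3` are all assumptions made for illustration only.

```python
from collections import Counter
from math import log2

# Hypothetical toy records, NOT the study's data.
# Feature order (names borrowed from the abstract): Node-Caps, Deg-Malig, Irradiant.
ROWS = [
    (("yes", "3", "yes"), "positive"),
    (("yes", "3", "no"), "positive"),
    (("yes", "2", "yes"), "positive"),
    (("no", "1", "no"), "negative"),
    (("no", "2", "no"), "negative"),
    (("no", "1", "yes"), "negative"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attr_index):
    """Entropy reduction from splitting rows on one categorical attribute (ID3 criterion)."""
    labels = [label for _, label in rows]
    groups = {}
    for features, label in rows:
        groups.setdefault(features[attr_index], []).append(label)
    # Weighted entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def knn_predict(rows, query, k=3):
    """Majority vote among the k records closest to query (assumed Hamming distance)."""
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))

    nearest = sorted(rows, key=lambda row: hamming(row[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# ID3 would split first on the attribute with the highest information gain.
gains = [information_gain(ROWS, i) for i in range(3)]
best_attribute = max(range(3), key=lambda i: gains[i])

# KNN classifies a new record by looking at its nearest labelled neighbours.
prediction = knn_predict(ROWS, ("yes", "3", "no"))
```

In this toy setting, `best_attribute` is the index of the attribute an ID3 tree would place at its root, and `prediction` is the KNN class for one unseen record. A real comparison would also need a held-out test set to measure accuracy for each classifier.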
Keywords