Jurnal Sisfokom (Jun 2024)
Comparison of the Performance of Random Forest and K-Nearest Neighbor in Classifying Leukemia Using Principal Component Analysis
Abstract
Leukemia is the most common blood cancer in Asia, including Indonesia. It can affect blood cells, bone marrow, lymph nodes, and other parts of the lymphatic system. One way to detect leukemia is to analyze gene expression data obtained with microarray technology. Microarray data contain a very large number of genes, so the number of genes must be reduced in order to eliminate irrelevant features and improve classification accuracy. In this study, feature (gene) reduction was carried out using Principal Component Analysis (PCA), and classification was carried out using Random Forest (RF) and K-Nearest Neighbor (KNN). RF with 100 trees (n_estimators = 100) achieved an accuracy of 78.57%. KNN achieved 78.57% with K=1, 85.71% with K=3 and K=5, and 71.42% with K=7 and K=9. The best accuracy was therefore obtained by KNN with K=3 and K=5.
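The workflow summarized in the abstract (PCA for feature reduction, followed by RF and KNN classification at several values of K) can be sketched roughly as follows. This is a minimal illustration rather than the authors' code. It assumes scikit-learn, which the abstract does not name. The synthetic data generator, the standardization step, the 80/20 stratified split, and the choice of 10 principal components are all assumptions made for the example; only n_estimators = 100 and K = 1, 3, 5, 7, 9 come from the abstract.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_synthetic_expression(n_samples=72, n_genes=2000, seed=0):
    """Hypothetical stand-in for a microarray matrix (samples x genes)."""
    rng = np.random.default_rng(seed)
    y = np.arange(n_samples) % 2              # two balanced classes (assumption)
    X = rng.normal(size=(n_samples, n_genes))
    X[:, :50] += 1.5 * y[:, None]             # make the first 50 "genes" informative
    return X, y


def compare_classifiers(X, y, n_components=10, ks=(1, 3, 5, 7, 9), seed=0):
    """Fit PCA + RF and PCA + KNN pipelines; return test accuracy for each model."""
    # Assumed split: 80% training, 20% testing, stratified by class.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )

    def reducer():
        # Scaling and PCA are fitted on the training data only, inside each pipeline.
        return [StandardScaler(), PCA(n_components=n_components, random_state=seed)]

    results = {}
    rf = make_pipeline(
        *reducer(), RandomForestClassifier(n_estimators=100, random_state=seed)
    )
    results["RF"] = rf.fit(X_train, y_train).score(X_test, y_test)

    for k in ks:
        knn = make_pipeline(*reducer(), KNeighborsClassifier(n_neighbors=k))
        results[f"KNN(K={k})"] = knn.fit(X_train, y_train).score(X_test, y_test)
    return results


if __name__ == "__main__":
    X, y = make_synthetic_expression()
    for name, acc in compare_classifiers(X, y).items():
        print(f"{name}: {acc:.2%}")
```

Wrapping PCA inside each pipeline keeps the test samples out of the component fitting, so the reported accuracy is not inflated by information leaking from the test set. The numbers printed for the synthetic data are not comparable to the accuracies reported in the abstract.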
Keywords