Open Engineering (May 2021)

Utilization of K-nearest neighbor algorithm for classification of white blood cells in AML M4, M5, and M7

  • Prakisya Nurcahya Pradana Taufik,
  • Liantoni Febri,
  • Hatta Puspanda,
  • Aristyagama Yusfia Hafid,
  • Setiawan Andika

DOI
https://doi.org/10.1515/eng-2021-0065
Journal volume & issue
Vol. 11, no. 1
pp. 662 – 668

Abstract

Acute myeloid leukemia (AML) M4, M5, and M7 are subtypes of leukemia that originate from myeloid cell derivatives, including myeloblasts, monoblasts, and megakaryoblasts, which influence the results of AML identification. These cells are further divided into more specific types, including myeloblasts, promyelocytes, monoblasts, promonocytes, monocytes, and megakaryoblasts, which must be clearly identified in order to calculate their ratios in the blood. This research therefore aims to classify these cell types using the K-nearest neighbor (KNN) algorithm. Three distance metrics were tested, namely Euclidean, Chebyshev, and Minkowski, each in weighted and unweighted form. The features used as parameters are area, nucleus ratio, circularity, perimeter, mean, and standard deviation, and about 1,450 objects were used as training and testing data. To check that the classifier was not overfitting, K-fold cross validation was conducted. The unweighted Minkowski distance gave the best result, correctly classifying about 240 of 290 objects at K = 19, and was therefore selected for further analysis. The accuracy, recall, and precision of KNN with the unweighted Minkowski distance under fivefold cross validation are 80.552%, 44.145%, and 42.592%, respectively.
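
The procedure described in the abstract (KNN with a choice of distance metric, optional distance weighting, and K-fold cross validation) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the Minkowski exponent p = 3, the inverse-distance weighting scheme, and the synthetic six-feature data are all assumptions made for illustration; the abstract does not specify them.

```python
import random
from collections import defaultdict


def minkowski(a, b, p=3):
    # Minkowski distance of order p (p = 3 is an assumed value).
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def euclidean(a, b):
    # Euclidean distance is the Minkowski distance with p = 2.
    return minkowski(a, b, p=2)


def chebyshev(a, b):
    # Chebyshev distance is the largest per-feature difference.
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k, dist, weighted=False):
    # Keep the k training points closest to the query.
    neighbors = sorted(
        (dist(x, query), label) for x, label in zip(train_X, train_y)
    )[:k]
    votes = defaultdict(float)
    for d, label in neighbors:
        # Unweighted: one vote per neighbor.
        # Weighted (assumed scheme): vote scaled by inverse distance.
        votes[label] += 1.0 / (d + 1e-9) if weighted else 1.0
    # Ties are broken by alphabetical label order for determinism.
    return max(sorted(votes), key=votes.get)


def cross_val_accuracy(X, y, k, dist, weighted=False, folds=5, seed=0):
    # Shuffle indices once, then hold out every folds-th index per fold.
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test_idx = idx[f::folds]
        held_out = set(test_idx)
        train_X = [X[i] for i in idx if i not in held_out]
        train_y = [y[i] for i in idx if i not in held_out]
        for i in test_idx:
            pred = knn_predict(train_X, train_y, X[i], k, dist, weighted)
            correct += pred == y[i]
    return correct / len(X)


# Hypothetical data: two well-separated clusters with six features each,
# standing in for (area, nucleus ratio, circularity, perimeter, mean, std).
rng = random.Random(1)
X, y = [], []
for center, label in ((0.0, "monoblast"), (10.0, "myeloblast")):
    for _ in range(30):
        X.append([rng.gauss(center, 1.0) for _ in range(6)])
        y.append(label)

for name, dist in (("euclidean", euclidean),
                   ("chebyshev", chebyshev),
                   ("minkowski", minkowski)):
    for weighted in (False, True):
        acc = cross_val_accuracy(X, y, k=5, dist=dist, weighted=weighted)
        print(name, "weighted" if weighted else "unweighted", acc)
```

In practice the six features have very different scales (area versus circularity, for example), so a real pipeline would likely normalize them before computing distances; the abstract does not say whether this was done.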

Keywords