Indonesian Journal of Data and Science (Jul 2023)

Comparison of Performance of Four Distance Metric Algorithms in K-Nearest Neighbor Method on Diabetes Patient Data

  • Dewi Ratnasari

DOI
https://doi.org/10.56705/ijodas.v4i2.71
Journal volume & issue
Vol. 4, no. 2

Abstract

Diabetes is a chronic disease that occurs when the pancreas no longer produces insulin or when the body cannot effectively use the insulin it produces. This study analyzes and compares classification performance on a diabetes patient dataset using four distance metrics in the K-Nearest Neighbor (K-NN) method. In previous research, the performance values obtained were not sufficiently high, none exceeding 80%. This study therefore revisits the classification in the hope of obtaining new performance values and comparing them with those of previous studies. Based on test results evaluated with a confusion matrix, the Euclidean distance achieved its best performance at k=17 with 10-fold cross-validation: accuracy of 85.71%, precision of 86.24%, recall of 85.71%, and F-measure of 85.12%. The Manhattan distance achieved its best performance at k=25 with 10-fold cross-validation: accuracy of 85.53%, precision of 85.54%, recall of 85.53%, and F-measure of 85.10%. The Minkowski distance achieved its best performance at k=17 with 10-fold cross-validation, with values identical to those of the Euclidean distance: accuracy of 85.71%, precision of 86.24%, recall of 85.71%, and F-measure of 85.12%. The Hamming distance, by contrast, achieved its best performance at k=23 with 10-fold cross-validation: accuracy of 75.32%, precision of 79.27%, recall of 75.32%, and F-measure of 71.45%.
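
For readers unfamiliar with the four distance metrics compared above, the following is a minimal Python sketch of how each could be computed and plugged into a K-NN majority vote. It is an illustration only, not the study's implementation: the function names, the toy data, and the choice of k=3 are all assumptions, and the abstract does not state which Minkowski order p was used. By definition, Minkowski distance with p=1 equals Manhattan distance and with p=2 equals Euclidean distance.

```python
from collections import Counter


def minkowski(a, b, p=2):
    """Minkowski distance of order p (p=1: Manhattan, p=2: Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def euclidean(a, b):
    return minkowski(a, b, p=2)


def manhattan(a, b):
    return minkowski(a, b, p=1)


def hamming(a, b):
    """Fraction of positions at which the two vectors differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def knn_predict(train_X, train_y, query, k, distance):
    """Label the query by majority vote among its k nearest training points."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data (not the diabetes dataset used in the study).
train_X = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5), (1.0, 0.5)]
train_y = [0, 0, 1, 1, 0]
query = (5.5, 5.0)

for name, dist in [("euclidean", euclidean), ("manhattan", manhattan),
                   ("minkowski", minkowski), ("hamming", hamming)]:
    print(name, knn_predict(train_X, train_y, query, k=3, distance=dist))
```

Hamming distance only counts whether each feature matches exactly, so on continuous-valued features it discards how far apart two values are. This sketch does not reproduce the study's figures; it only shows how the metrics differ in what they measure.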

Keywords