Journal of Information Systems and Informatics (Sep 2024)
Manhattan Metric Technique in K-Means Clustering for Data Grouping
Abstract
Clustering is a data mining method that groups objects so that members of the same cluster are similar to one another and different from objects in other clusters. K-Means is an unsupervised learning method that performs this grouping based on similarity. Researchers have frequently used various distance and similarity metrics for clustering, such as Euclidean, Manhattan, and Minkowski distance. This study analyzes the use of the Manhattan metric in K-Means clustering, and the research problem is how to apply that metric for effective data grouping. The data used are records of KIP scholarship recipients for the 2016-2023 period. The elbow method is used to determine the optimal number of clusters. Once the number of clusters is determined, K-Means is applied to group the records in the dataset, and the data are successfully divided into four clusters. Of the 3172 records, 774 fall in cluster 0, 417 in cluster 1, 1244 in cluster 2, and 737 in cluster 3. The clustering process yields a Davies-Bouldin index of 1.4568.
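The procedure summarized above can be illustrated with a minimal sketch. It is not the study's implementation: the function names, the synthetic data, the random seed, and the choice to update centroids with the mean (rather than the median, as in k-medians) are assumptions made only for illustration. Points are assigned to the nearest centroid by Manhattan distance, and the total within-cluster distance is computed for several values of k, which is the quantity an elbow plot would show.

```python
import numpy as np

def manhattan_distances(X, centroids):
    # Pairwise Manhattan (L1) distances, shape (n_samples, k).
    return np.abs(X[:, None, :] - centroids[None, :, :]).sum(axis=2)

def kmeans_manhattan(X, k, n_iter=100, seed=0):
    # Illustrative K-Means variant: L1 assignment, mean centroid update.
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumption: initial centroids are k distinct rows chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = manhattan_distances(X, centroids).argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    distances = manhattan_distances(X, centroids)
    labels = distances.argmin(axis=1)
    cost = distances.min(axis=1).sum()  # total L1 distance to assigned centroid
    return labels, centroids, cost

# Synthetic stand-in data (hypothetical; the scholarship data is not reproduced here).
rng = np.random.default_rng(42)
centers = np.array([[0.0, 0.0], [8.0, 0.0], [0.0, 8.0], [8.0, 8.0]])
X = np.vstack([c + rng.normal(scale=0.7, size=(50, 2)) for c in centers])

# Elbow-style scan: cost for each candidate number of clusters.
costs = {k: kmeans_manhattan(X, k)[2] for k in range(1, 7)}

# Final grouping with four clusters, as in the study.
labels, centroids, cost = kmeans_manhattan(X, k=4)
print("cluster sizes:", np.bincount(labels, minlength=4))
```

Plotting `costs` against k and looking for the point where the curve flattens is one common way to read off the elbow. A cluster-quality score such as the Davies-Bouldin index could then be computed on `labels`.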
Keywords