E3S Web of Conferences (Jan 2024)

Clustering of k-means based on Euclidean distance metric and Mahalanobis metric

  • Akhmatshin Farid,
  • Egarmin Pavel,
  • Gerasimova Marina,
  • Petrova Irina,
  • Mikitchak Sergey

DOI
https://doi.org/10.1051/e3sconf/202453103002
Journal volume & issue
Vol. 531
p. 03002

Abstract

K-means clustering identifies clusters using one of several variants of the k-means algorithm. This paper studies the performance of the algorithm with the Euclidean distance metric and with the Mahalanobis metric. The choice of the k initial values used as first estimates of the cluster means is examined at the second and fifth iterations. The BIRCH-3 and Mopsi-Finland datasets [1] serve as input data for comparing the metrics. The study shows that k-means with the Euclidean metric is highly efficient in the initial iterations of the algorithm, with the result depending on the random selection of the k initial values. The Mahalanobis metric becomes more effective as the number of iterations increases.
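
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`euclidean_sq`, `mahalanobis_sq`, `kmeans`), the per-cluster covariance estimate, the small regularization term, and the random seeding are all assumptions made here for illustration; the abstract does not state how the covariance matrices were estimated.

```python
import numpy as np


def euclidean_sq(X, center):
    """Squared Euclidean distance from every row of X to one center."""
    diff = X - center
    return np.einsum("ij,ij->i", diff, diff)


def mahalanobis_sq(X, center, cov):
    """Squared Mahalanobis distance from every row of X to one center."""
    diff = X - center
    inv = np.linalg.pinv(cov)  # pseudo-inverse tolerates near-singular covariances
    return np.einsum("ij,jk,ik->i", diff, inv, diff)


def kmeans(X, k, n_iter, metric="euclidean", seed=0):
    """Hypothetical k-means loop with a switchable distance metric."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Assumption: initial means are k distinct data points drawn at random.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    # Assumption: each cluster starts with an identity covariance, so the
    # first Mahalanobis assignment coincides with the Euclidean one.
    covs = [np.eye(d) for _ in range(k)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        if metric == "euclidean":
            dist = np.stack([euclidean_sq(X, c) for c in centers], axis=1)
        else:
            dist = np.stack(
                [mahalanobis_sq(X, c, S) for c, S in zip(centers, covs)], axis=1
            )
        labels = dist.argmin(axis=1)  # assignment step
        for j in range(k):  # update step
            members = X[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
            if len(members) > d:
                # Re-estimate the covariance only with enough points;
                # the 1e-6 ridge is an assumed regularizer.
                covs[j] = np.cov(members, rowvar=False) + 1e-6 * np.eye(d)
    return centers, labels
```

Calling `kmeans(X, k, n_iter=2, metric="euclidean")` and `kmeans(X, k, n_iter=5, metric="mahalanobis")` on the same array `X` would give a rough way to compare the two metrics at the iteration counts named in the abstract. Loading BIRCH-3 or Mopsi-Finland is left out of the sketch.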