Clustering of k-means based on Euclidean distance metric and Mahalanobis metric

Akhmatshin Farid; Egarmin Pavel; Gerasimova Marina; Petrova Irina; Mikitchak Sergey

doi:10.1051/e3sconf/202453103002

E3S Web of Conferences (Jan 2024)

Affiliations

Akhmatshin Farid: Siberian University of Science and Technology
Egarmin Pavel: Siberian University of Science and Technology
Gerasimova Marina: Siberian University of Science and Technology
Petrova Irina: Siberian University of Science and Technology
Mikitchak Sergey: Siberian University of Science and Technology

Journal volume & issue: Vol. 531, p. 03002

Abstract

K-means clustering uses different variants of the k-means algorithm to identify clusters. This paper studies the performance of the clustering algorithm with the Euclidean distance metric and with the Mahalanobis metric. The choice of the k values used as initial estimates of the means is considered at the second and fifth iterations. The BIRCH-3 and Mopsi-Finland datasets [1] are used as input data to investigate the performance of the metrics. The study shows that k-means clustering with the Euclidean metric is highly efficient in the initial iterations of the algorithm, depending on the random selection of the initial k values. The Mahalanobis metric becomes more effective as the number of iterations increases.
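The abstract does not reproduce the distance formulas, so the following is only a minimal sketch of how the two metrics can differ in the assignment step of k-means, using the standard textbook definitions. It assumes 2-D points (both named datasets are two-dimensional) and one symmetric 2x2 covariance matrix per cluster. The function names `euclidean`, `mahalanobis`, and `nearest` are hypothetical, and how the paper estimates the covariance is not stated here.

```python
import math

def euclidean(p, c):
    """Euclidean distance between 2-D points p and c."""
    return math.hypot(p[0] - c[0], p[1] - c[1])

def mahalanobis(p, c, cov):
    """Mahalanobis distance from p to centre c.

    Assumption: cov is a symmetric, invertible 2x2 matrix [[a, b], [b, d]].
    """
    a, b = cov[0]
    d = cov[1][1]
    det = a * d - b * b
    dx, dy = p[0] - c[0], p[1] - c[1]
    # Quadratic form (p - c)^T cov^{-1} (p - c) with the 2x2 inverse written out.
    q = (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det
    return math.sqrt(q)

def nearest(point, centres, covs=None):
    """Index of the closest centre; Euclidean if covs is None, else Mahalanobis."""
    if covs is None:
        dists = [euclidean(point, c) for c in centres]
    else:
        dists = [mahalanobis(point, c, s) for c, s in zip(centres, covs)]
    return dists.index(min(dists))

# Illustrative values only: cluster 0 is stretched along x, cluster 1 is round.
centres = [(0.0, 0.0), (4.0, 0.0)]
covs = [[[9.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
point = (2.5, 0.0)

by_euclid = nearest(point, centres)              # 1: centre 1 is closer in raw distance
by_mahalanobis = nearest(point, centres, covs)   # 0: cluster 0's spread along x covers the point
```

With these made-up values the same point is assigned to different clusters depending on the metric, because the Mahalanobis distance scales each direction by the cluster's spread. That dependence on a covariance estimate is one plausible reason the metric would need several iterations before it pays off, though the abstract itself does not give the mechanism.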

Published in E3S Web of Conferences

ISSN: 2267-1242 (Online)
Publisher: EDP Sciences
Country of publisher: France
LCC subjects: Geography. Anthropology. Recreation: Environmental sciences
Website: http://www.e3s-conferences.org/
