An Improved K-Means Algorithm Based on Contour Similarity

Jing Zhao; Yanke Bao; Dongsheng Li; Xinguo Guan

doi:10.3390/math12142211

Mathematics (Jul 2024)

Affiliations

Jing Zhao: Key Laboratory of Industrial Automation and Machine Vision of Qiannan, School of Mathematics and Statistics, Qiannan Normal University for Nationalities, Duyun 558000, China
Yanke Bao: College of Science, Liaoning Technical University, Fuxin 123000, China
Dongsheng Li: Key Laboratory of Industrial Automation and Machine Vision of Qiannan, School of Mathematics and Statistics, Qiannan Normal University for Nationalities, Duyun 558000, China
Xinguo Guan: Key Laboratory of Industrial Automation and Machine Vision of Qiannan, School of Mathematics and Statistics, Qiannan Normal University for Nationalities, Duyun 558000, China

DOI: https://doi.org/10.3390/math12142211
Journal volume & issue: Vol. 12, no. 14, p. 2211

Abstract

The traditional k-means algorithm is widely used for large-scale data clustering because it is easy to implement and computationally efficient, but it tends to converge to local optima and has poor robustness. In this study, a Csk-means algorithm based on contour similarity is proposed to overcome these drawbacks. The traditional k-means algorithm converges to local optima for two reasons: the influence of outliers or noisy data, and the random selection of the initial clustering centers. The Csk-means algorithm addresses both by combining data lattice transformation and dissimilar interpolation. In particular, it applies Fisher optimal partitioning to the similarity vectors between samples to determine the number of clusters. To improve the robustness of k-means to the shape of the clusters, the Csk-means algorithm uses contour similarity to compute the similarity between samples during clustering. Experimental results show that the Csk-means algorithm provides better clustering results than the traditional k-means algorithm and the other algorithms compared.
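
The dependence of traditional k-means on its initial centers, which the abstract names as a cause of local optimality, can be illustrated with a minimal sketch. This is plain Lloyd-style k-means with squared Euclidean distance, not the Csk-means algorithm; the function names `sq_dist` and `kmeans`, the four-point data set, and the two sets of starting centers are all assumptions chosen for illustration and do not come from the paper.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
        for p in points
    ]


def kmeans(points, centers, max_iter=100):
    """Traditional k-means started from explicitly supplied centers.

    Returns (labels, centers, sse), where sse is the within-cluster
    sum of squared distances.
    """
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Move the center to the mean of its assigned points.
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Keep an empty cluster's center where it was.
                new_centers.append(old)
        if new_centers == centers:
            break  # centers stopped moving: converged
        centers = new_centers
    labels = assign(points, centers)
    sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, sse


# Hypothetical toy data: two tight vertical pairs, far apart horizontally.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Starting centers inside each pair recover the natural left/right split.
good_labels, good_centers, good_sse = kmeans(points, [(0, 0), (4, 0)])

# Starting centers between the pairs converge to a top/bottom split instead.
bad_labels, bad_centers, bad_sse = kmeans(points, [(2, 0), (2, 1)])

print(good_labels, good_sse)
print(bad_labels, bad_sse)
```

Both runs stop at a fixed point of the assign-and-update loop, but the second one has a much larger sum of squared distances. That gap is the kind of local optimum that motivates replacing random initialization with a deterministic procedure.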

Published in Mathematics

ISSN: 2227-7390 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics
Website: http://www.mdpi.com/journal/mathematics
