Information (Dec 2019)

A Fast Method for Estimating the Number of Clusters Based on Score and the Minimum Distance of the Center Point

  • Zhenzhen He,
  • Zongpu Jia,
  • Xiaohong Zhang

DOI
https://doi.org/10.3390/info11010016
Journal volume & issue
Vol. 11, no. 1
p. 16

Abstract

Clustering is a widely used unsupervised learning technique. However, the number of clusters often has to be entered manually, and this number strongly affects the clustering result. Researchers have proposed algorithms to determine the number of clusters, but these perform poorly on data sets with complex, scattered shapes. To address these problems, this paper proposes using the Gaussian kernel density estimation function to determine the maximum number of clusters, using the change in center point score to obtain a candidate set of center points, and then using the change in the minimum distance between center points to obtain the number of clusters. Experiments demonstrate the validity and practicality of the proposed algorithm.
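
The abstract does not give the formulas, so the following Python sketch is only an illustration of the general idea, not the authors' procedure. It assumes a density-peaks-style score (Gaussian kernel density multiplied by the distance to the nearest denser point) and picks the number of clusters at the largest drop in the minimum pairwise distance among the top-scoring candidates. The function names, the `bandwidth` parameter, and the fixed `max_k` (which the paper derives from the density estimate) are all hypothetical.

```python
import numpy as np

def center_scores(points, bandwidth):
    """Assumed score: Gaussian kernel density times distance to the nearest denser point."""
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=-1)
    dists = np.sqrt(sq_dists)
    # Unnormalised Gaussian kernel density estimate at every point.
    density = np.exp(-sq_dists / (2.0 * bandwidth ** 2)).sum(axis=1)
    order = np.argsort(-density)          # densest point first
    delta = np.zeros(len(points))
    for rank in range(1, len(points)):
        i = order[rank]
        delta[i] = dists[i, order[:rank]].min()
    # The densest point has no denser neighbour; give it the largest delta.
    delta[order[0]] = delta.max()
    return density * delta, dists

def estimate_cluster_count(points, bandwidth, max_k):
    """Pick k where the minimum distance between candidate centers drops the most.

    Assumes max_k >= 3 and at least two clusters (a simplification of this sketch).
    """
    scores, dists = center_scores(points, bandwidth)
    candidates = np.argsort(-scores)[:max_k]   # candidate set of center points
    min_dist = []
    for k in range(2, max_k + 1):
        sub = dists[np.ix_(candidates[:k], candidates[:k])]
        min_dist.append(sub[np.triu_indices(k, 1)].min())
    min_dist = np.array(min_dist)              # min_dist[j] corresponds to k = j + 2
    drops = min_dist[:-1] - min_dist[1:]
    k = int(np.argmax(drops)) + 2
    return k, candidates[:k]

# Illustrative data: three well-separated Gaussian blobs of 60 points each.
rng = np.random.default_rng(0)
blob_centers = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 8.66]])
demo_points = np.vstack([rng.normal(loc=c, scale=0.5, size=(60, 2)) for c in blob_centers])
k, centers = estimate_cluster_count(demo_points, bandwidth=1.0, max_k=8)
print(k, centers)
```

On well-separated blobs like these, the fourth candidate falls close to an existing center, so the minimum distance is expected to collapse after the third candidate. Real data with complex or scattered shapes, which the paper targets, would need the authors' actual score and stopping rule.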

Keywords