Dianzi Jishu Yingyong (Mar 2023)

An initial centers selection method serving K-means

  • Li Qiuyun,
  • Liu Yanwu

DOI
https://doi.org/10.16157/j.issn.0258-7998.223066
Journal volume & issue
Vol. 49, no. 3
pp. 134 – 138

Abstract

Clustering is one of the most important data mining techniques, and K-means is the best-known and most commonly used clustering algorithm. However, the performance of K-means depends heavily on its initial centers: both how many centers are chosen and which data points serve as them. To address this, an initial center selection method called DPCC (density peak clustering centers) is proposed. DPCC generates a selection decision graph based on density and distance in order to highlight all density peak points in the dataset. These density peak points are the initial centers that DPCC provides to K-means. Experiments show that DPCC not only provides decision support for choosing the number of initial centers, but also improves the accuracy of K-means and reduces its running time.
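
The abstract does not give DPCC's formulas, so the following is only a minimal sketch of the general idea, assuming the standard density peak formulation: a local density rho per point, a distance delta to the nearest point of higher density, and a ranking by the product rho * delta. The function name density_peak_centers, the Gaussian kernel, the dc_percentile parameter, and the rho * delta ranking are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def density_peak_centers(X, k, dc_percentile=2.0):
    """Pick k candidate initial centers as the points with the largest rho * delta.

    Hypothetical helper: the kernel, cutoff rule, and ranking score are
    assumptions based on generic density peak clustering, not on DPCC itself.
    """
    n = X.shape[0]
    # Pairwise Euclidean distances (O(n^2) memory; fine for small datasets).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Cutoff distance dc: a low percentile of all pairwise distances (assumption).
    dc = np.percentile(dist[np.triu_indices(n, k=1)], dc_percentile)
    # Local density rho with a Gaussian kernel; subtract 1 to exclude the point itself.
    rho = np.exp(-(dist / dc) ** 2).sum(axis=1) - 1.0
    # delta: distance to the nearest point with higher density.
    order = np.argsort(-rho)
    delta = np.empty(n)
    # The densest point has no denser neighbor; use its largest distance instead.
    delta[order[0]] = dist[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        delta[i] = dist[i, order[:rank]].min()
    # Density peaks combine high rho with high delta.
    gamma = rho * delta
    idx = np.argsort(-gamma)[:k]
    return X[idx], rho, delta

# Usage on synthetic data: three well-separated 2-D blobs.
rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([rng.normal(m, 0.5, size=(50, 2)) for m in means])
centers, rho, delta = density_peak_centers(X, k=3)
```

Plotting delta against rho gives a decision graph in which points that are large on both axes stand apart, which is one way to read off a plausible number of centers. The returned centers could then be passed to a K-means implementation that accepts explicit initial centers, for example scikit-learn's KMeans(n_clusters=3, init=centers, n_init=1).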

Keywords