ITM Web of Conferences (Jan 2017)

Discovering Movie Categories Based on SPHK-means Clustering Algorithm

  • Li Dong-Yuan,
  • Cao Cai-Feng

DOI
https://doi.org/10.1051/itmconf/20171104002
Journal volume & issue
Vol. 11
p. 04002

Abstract

Based on SPHK-means, an improved K-means clustering algorithm, we design an experiment using the dataset provided by MovieLens. First, we reduce the dimensionality of the movie-user rating matrix. Then, we sample the movies multiple times and run agglomerative hierarchical clustering on the samples to determine an appropriate value of k and the initial centers. Finally, with k and the initial centers fixed, we partition the movies into groups through K-means clustering. Evaluated by precision, recall, and the number of groups found, the experiment indicates that the SPHK-means clustering algorithm produces better results than the classical K-means clustering algorithm.
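
The last two steps described in the abstract (hierarchical clustering on a sample to obtain k and the initial centers, then K-means seeded with them) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not say how k is read off the hierarchy, so the centroid linkage, the merge-distance `threshold`, the single sample (the paper samples several times), and the toy 2-D points standing in for dimension-reduced movie vectors are all assumptions made for illustration. The function names are hypothetical.

```python
import math
import random


def euclid(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def agglomerate(sample, threshold):
    """Assumed stopping rule: merge the two clusters with the closest
    centroids until that closest distance exceeds `threshold`.
    Returns the centroids of the remaining clusters."""
    clusters = [[p] for p in sample]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = euclid(mean(clusters[i]), mean(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [mean(c) for c in clusters]


def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda c: euclid(p, centers[c]))


def kmeans(points, centers, max_iter=100):
    """Classical K-means, started from the given centers."""
    centers = list(centers)
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Keep the old center if a group happens to be empty.
        new_centers = [mean(g) if g else centers[i]
                       for i, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, [nearest(p, centers) for p in points]


# Toy data (assumption): three well-separated groups of 10 points each.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
          for cx, cy in blob_centers for _ in range(10)]

# Step 2: hierarchical clustering on a sample gives k and initial centers.
sample = rng.sample(points, 21)
init_centers = agglomerate(sample, threshold=5.0)
k = len(init_centers)

# Step 3: K-means on all points with k and the initial centers fixed.
centers, labels = kmeans(points, init_centers)
print(k, labels)
```

Seeding K-means this way removes both the need to guess k in advance and the dependence on random initial centers, which is the motivation the abstract gives for the hierarchical step. Running the hierarchy on a sample rather than on all movies keeps its quadratic pairwise-distance cost manageable.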