ITM Web of Conferences (Jan 2017)
Discovering Movie Categories Based on SPHK-means Clustering Algorithm
Abstract
Based on SPHK-means, an improved K-means clustering algorithm, we design an experiment using the dataset provided by MovieLens. First, we reduce the dimensionality of the movie-user rating matrix. Then, we draw multiple samples of movies and apply agglomerative hierarchical clustering to them in order to determine an appropriate value of k and the initial centers. Finally, with k and the initial centers fixed, we partition the movies into groups by K-means clustering. Evaluated by precision, recall, and the number of groups found, the experiment indicates that SPHK-means produces better results than the classical K-means clustering algorithm.
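The sampling and seeding steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how the hierarchical clustering is stopped or how the results from several samples are combined, so the centroid-linkage rule, the `merge_threshold` parameter, the majority vote over k, and all function names below are assumptions made for illustration. The dimension-reduction step is assumed to have been done already, and synthetic 2-D points stand in for reduced movie vectors.

```python
import math
import random
from collections import Counter


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def agglomerate(points, merge_threshold):
    """Centroid-linkage agglomerative clustering (assumed variant).

    Repeatedly merges the two clusters with the closest centroids and
    stops once that closest distance exceeds merge_threshold.
    Returns the centroids of the remaining clusters.
    """
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        cents = [mean(c) for c in clusters]
        d, i, j = min(
            (dist(cents[i], cents[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > merge_threshold:
            break
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    return [mean(c) for c in clusters]


def seed_from_samples(points, n_samples, sample_size, merge_threshold, rng):
    """Cluster several random samples; keep the most frequent k.

    Combining samples by majority vote is an assumption, not a rule
    taken from the paper. Returns the centroids of the first run
    that produced the winning k.
    """
    runs = [
        agglomerate(rng.sample(points, sample_size), merge_threshold)
        for _ in range(n_samples)
    ]
    k = Counter(len(r) for r in runs).most_common(1)[0][0]
    return next(r for r in runs if len(r) == k)


def kmeans(points, centers, max_iter=100):
    """Plain K-means (Lloyd's iterations) from the given initial centers."""
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # leave an empty cluster's center unchanged
                centers[c] = mean(members)
    return labels, centers


# Toy data: three well-separated blobs of 40 points each (hypothetical).
rng = random.Random(0)
true_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [
    (cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
    for cx, cy in true_centers
    for _ in range(40)
]

init = seed_from_samples(points, n_samples=5, sample_size=30,
                         merge_threshold=4.0, rng=rng)
labels, centers = kmeans(points, init)
print("k chosen from samples:", len(centers))
```

In this sketch k is never supplied by hand: it is simply the number of clusters left when merging stops. The result therefore depends on the chosen threshold, which would need tuning on real rating data.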