Research on User Similarity Calculation of Collaborative Filtering for Sparse Data

WU Sen, DONG Yaxian, WEI Guiying, GAO Xiaonan

doi:10.3778/j.issn.1673-9418.2011062

Jisuanji kexue yu tansuo (May 2022)

Affiliations

WU Sen, DONG Yaxian, WEI Guiying, GAO Xiaonan: School of Economics and Management, University of Science and Technology Beijing, Beijing 100083, China

DOI: https://doi.org/10.3778/j.issn.1673-9418.2011062
Journal volume & issue: Vol. 16, no. 5
pp. 1043 – 1052

Abstract

User-based collaborative filtering recommends items to a target user based on the preferences of that user's nearest neighbors, so the calculation of user similarity is critical. Traditional rating similarity relies on the scores of items that both users have rated. As the user-item rating matrix becomes sparser, such measures struggle to capture the similarity between users accurately and therefore fail to select reliable nearest neighbors for the target user, which degrades the final recommendation performance. Structural similarity is another commonly used approach in recommendation tasks; it is mostly measured by the proportion of items that the users have rated in common. This kind of method is easy to compute and less affected by data sparsity. However, its outputs tend to be close to one another, so different user pairs cannot be clearly distinguished. To address the difficulty that data sparsity poses for similarity calculation in collaborative filtering, this paper proposes a sparse cosine similarity. First, it formulates a new structural similarity, the sparse set similarity, to divide users into two groups: high-correlation users and low-correlation users. Then, it develops different rating similarity calculations for the different kinds of users, which eliminates the misleading results that traditional rating similarity produces when the data is sparse. Finally, the sparse cosine similarity is constructed by combining the proposed rating similarity and structural similarity. Experimental results show that, compared with seven similarity calculation methods, the proposed sparse cosine similarity yields more accurate user similarity and improves recommendation performance, overcoming two limitations: traditional rating methods are severely affected by data sparsity, and the results of structural methods are not sufficiently distinct.
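
The sparsity problem described in the abstract can be illustrated with a small sketch. The code below is not the paper's sparse cosine similarity or sparse set similarity, whose formulas are not given here. It only contrasts two standard baselines: cosine similarity restricted to co-rated items (a traditional rating similarity) and the Jaccard coefficient (a simple structural similarity). The function names, user names, and ratings are hypothetical and chosen for illustration only.

```python
import math


def co_rated_cosine(u, v):
    """Cosine similarity computed only over items that both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def jaccard(u, v):
    """Share of co-rated items among all items rated by either user."""
    union = set(u) | set(v)
    if not union:
        return 0.0
    return len(set(u) & set(v)) / len(union)


# Hypothetical ratings: item id -> score on a 1-5 scale.
alice = {"i1": 5, "i2": 3, "i3": 4, "i4": 1}
bob = {"i1": 5, "i2": 3, "i3": 4, "i4": 2}   # rates the same items, similar taste
carol = {"i1": 2, "i9": 5}                   # shares a single item with alice

for name, other in (("bob", bob), ("carol", carol)):
    print(name,
          round(co_rated_cosine(alice, other), 4),
          round(jaccard(alice, other), 4))
```

With a single co-rated item, the co-rated cosine between alice and carol is exactly 1, higher than the value for alice and bob, even though alice and carol disagree on that item. The Jaccard coefficient ranks bob above carol but ignores the rating values entirely. This is the kind of gap that motivates combining a structural measure with a rating measure, as the abstract proposes.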

Keywords: similarity measure, collaborative filtering, sparse data, recommendation system

Published in Jisuanji kexue yu tansuo

ISSN: 1673-9418 (Print)
Publisher: Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press
Country of publisher: China
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: http://fcst.ceaj.org
