Mathematics (Jan 2023)

Improving Data Sparsity in Recommender Systems Using Matrix Regeneration with Item Features

  • Sang-Min Choi,
  • Dongwoo Lee,
  • Kiyoung Jang,
  • Chihyun Park,
  • Suwon Lee

DOI
https://doi.org/10.3390/math11020292
Journal volume & issue
Vol. 11, no. 2
p. 292

Abstract

With the growth of the Web, users spend more time searching for the information they need. Recommendation systems have therefore emerged to filter abundant information and provide users with preferred content, and to expose search results to users more effectively. These systems operate on user reactions to items or on various user or item features. Because recommender systems rely on user responses, recommendation results based on sparse datasets are known to be less reliable. We therefore propose a method that reduces dataset sparsity and increases prediction accuracy by combining item features with user responses. First, following the content-based filtering concept, we extract category rates from the user–item matrix according to user preferences and organize them into vectors. We then filter the user–item matrix using the extracted vectors and regenerate the input matrix for collaborative filtering (CF). We compare the prediction results of our approach and conventional CF using the mean absolute error and root mean square error. Moreover, we calculate the sparsity of the regenerated matrix and of the existing input matrix, and demonstrate that the regenerated matrix is denser than the existing one. By computing the Jaccard similarity between the item sets in the regenerated and existing matrices, we verify that the two matrices are distinct. The results confirm that using the regenerated matrix as the CF input yields a denser matrix with higher predictive accuracy than conventional methods. The validity of the proposed method was verified by analyzing the effect of an input matrix composed of high average ratings on the CF prediction performance. The low sparsity and high prediction accuracy of the proposed method are verified by comparison with the results of conventional methods. We obtain improvements of approximately 16% with K-nearest neighbor and 15% with singular value decomposition, and a threefold improvement in sparsity between the regenerated and original matrices. In summary, we propose a matrix reconstruction method that can improve recommendation performance.
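
The four quantities named in the abstract (matrix sparsity, Jaccard similarity of item sets, mean absolute error, and root mean square error) can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact formulas, so the standard textbook definitions are assumed here. All function names and the toy matrices are hypothetical and are not taken from the paper.

```python
import math


def sparsity(matrix):
    """Fraction of cells in a non-empty user-item matrix without a rating (None)."""
    total = sum(len(row) for row in matrix)
    missing = sum(1 for row in matrix for rating in row if rating is None)
    return missing / total


def jaccard(items_a, items_b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two item sets."""
    a, b = set(items_a), set(items_b)
    union = a | b
    # Two empty sets are treated as identical (an assumption of this sketch).
    return len(a & b) / len(union) if union else 1.0


def mae(actual, predicted):
    """Mean absolute error over paired ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error over paired ratings."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical toy data: rows are users, columns are items, None means "not rated".
original = [
    [5, None, None, 1],
    [None, 4, None, None],
    [None, None, 3, None],
]
# A hypothetical filtered matrix that keeps fewer users and items.
regenerated = [
    [5, None],
    [None, 4],
]

original_items = ["i1", "i2", "i3", "i4"]
regenerated_items = ["i1", "i2"]

actual_ratings = [4, 3, 5]
predicted_ratings = [3.5, 3, 4]

print("sparsity (original):   ", sparsity(original))
print("sparsity (regenerated):", sparsity(regenerated))
print("Jaccard of item sets:  ", jaccard(original_items, regenerated_items))
print("MAE: ", mae(actual_ratings, predicted_ratings))
print("RMSE:", rmse(actual_ratings, predicted_ratings))
```

In this toy case the filtered matrix has a lower sparsity value than the original, which is the sense in which a regenerated matrix is called denser. Lower MAE and RMSE indicate more accurate predictions.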

Keywords