Heliyon (Jul 2024)
Scientific paper recommender system using deep learning and link prediction in citation network
Abstract
The number of published scientific articles grows daily, which makes searching for relevant articles increasingly difficult. Searching only by matching the titles or content of other articles is inefficient, so there is a strong need for recommender systems (RSs) dedicated to scientific articles. This research combines two approaches, content analysis and citation network analysis, to design an RS for scientific articles (RECSA). RECSA uses natural language processing and deep learning techniques to process article titles and extract the content attributes of the articles. First, the titles are pre-processed, and the importance of each word in a title is estimated using the Term Frequency-Inverse Document Frequency (TF-IDF) criterion. The dimensionality of the resulting attributes is then reduced using a convolutional neural network (CNN), and the content similarity matrix of the articles is computed from the attribute vectors using cosine similarity. Second, a link prediction approach is used to analyze the connections in the articles' citation network, yielding a citation-based similarity matrix. Finally, in the third step of RECSA, the two similarity matrices are combined using an influence coefficient parameter to obtain the final similarity matrix, and recommendations are made based on the highest similarity values. The efficiency of RECSA has been evaluated from different aspects, and the results have been compared with previous works. According to the results, combining TF-IDF and CNN to analyze content-based features improves precision by at least 0.32 % compared to previous works.
Also, by integrating citation and content-based data, RECSA achieves a precision of 99.01 % for the first suggestion, an improvement of at least 0.9 % over the compared methods. The results show that RECSA can produce recommendations with higher accuracy and efficiency.
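
The three steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the CNN dimensionality reduction is omitted, the Jaccard coefficient over citation neighbors stands in for the unspecified link prediction score, the influence coefficient is treated as a weight `alpha` in a convex combination, and all function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(titles):
    """Return one TF-IDF vector per title over a shared, sorted vocabulary."""
    docs = [t.lower().split() for t in titles]
    vocab = sorted({w for d in docs for w in d})
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    # Smoothed IDF (an assumption; the abstract does not give the exact formula).
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1.0 for w in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append([tf[w] / len(d) * idf[w] for w in vocab])
    return vectors


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def content_similarity(titles):
    """Step 1: content similarity matrix from TF-IDF title vectors (CNN step omitted)."""
    vecs = tfidf_vectors(titles)
    n = len(vecs)
    return [[cosine(vecs[i], vecs[j]) for j in range(n)] for i in range(n)]


def citation_similarity(n, citations):
    """Step 2: Jaccard coefficient on neighbor sets of an undirected citation graph.

    Jaccard is an assumed stand-in for the paper's link prediction measure.
    """
    neighbors = [set() for _ in range(n)]
    for a, b in citations:
        neighbors[a].add(b)
        neighbors[b].add(a)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            union = neighbors[i] | neighbors[j]
            sim[i][j] = len(neighbors[i] & neighbors[j]) / len(union) if union else 0.0
    return sim


def combine(content, citation, alpha):
    """Step 3: weight the two matrices with an assumed influence coefficient alpha."""
    n = len(content)
    return [
        [alpha * content[i][j] + (1 - alpha) * citation[i][j] for j in range(n)]
        for i in range(n)
    ]


def recommend(sim, query, k=1):
    """Return the k articles most similar to `query`, excluding the query itself."""
    others = [j for j in range(len(sim)) if j != query]
    return sorted(others, key=lambda j: sim[query][j], reverse=True)[:k]


titles = [
    "deep learning for paper recommendation",
    "paper recommendation with deep learning models",
    "link prediction in citation networks",
    "protein folding simulation",
]
citations = [(0, 2), (1, 2)]  # toy data: articles 0 and 1 both cite article 2

final = combine(content_similarity(titles), citation_similarity(4, citations), alpha=0.5)
print(recommend(final, query=0, k=1))
```

In this toy data, articles 0 and 1 share title words and a common cited article, so both similarity matrices favor article 1 as the first suggestion for article 0. Setting `alpha` closer to 1 weights the content signal more heavily, and closer to 0 the citation signal.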