IEEE Access (Jan 2023)
Scientific Documents Retrieval Based on Graph Convolutional Network and Hesitant Fuzzy Set
Abstract
Previous scientific literature retrieval methods based on mathematical expressions ignore document attributes and the associations between documents, which reduces retrieval accuracy. In this study, a literature retrieval model based on a Graph Convolutional Network (GCN) is proposed. Document attributes are extracted from a structured document dataset to construct an Attribute Relation Graph (ARG). The GCN captures the dependencies among literature nodes and generates literature representations through information aggregation, realizing graph-based literature modeling. Hesitant Fuzzy Set (HFS) theory, which is well suited to multi-attribute decision-making, is used to evaluate the similarity between the mathematical query expression and the mathematical expressions in the retrieval results. Finally, the similarity of literature features and the similarity of mathematical expressions are integrated to produce a ranked list of scientific literature retrieval results. Experiments on the public arXiv dataset yielded an average precision of 0.892 for the top 10 retrieval results and an average NDCG of 0.875 for the top 10 rankings.
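For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how precision and NDCG over the top 10 results are commonly computed for a single query. It assumes that "precision of the top 10" means precision@10, that relevance is graded on a numeric scale, and that NDCG uses a linear gain with a log2 discount; the function names and the sample relevance list are hypothetical and are not taken from the paper's evaluation code.

```python
import math


def precision_at_k(relevance, k=10):
    """Fraction of the top-k positions holding a relevant result (grade > 0)."""
    top = relevance[:k]
    return sum(1 for r in top if r > 0) / k


def dcg_at_k(relevance, k=10):
    """Discounted cumulative gain: grade at rank i (0-based) divided by log2(i + 2)."""
    return sum(r / math.log2(i + 2) for i, r in enumerate(relevance[:k]))


def ndcg_at_k(relevance, k=10):
    """DCG normalized by the DCG of the same grades sorted in descending order.

    Simplification (assumption): the ideal ranking is built only from the
    grades in `relevance`, not from every relevant document in the corpus.
    """
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    return dcg_at_k(relevance, k) / ideal if ideal > 0 else 0.0


# Hypothetical graded relevance of one query's top 10 results (0 = irrelevant).
ranked = [3, 2, 3, 0, 1, 2, 0, 0, 1, 0]
p10 = precision_at_k(ranked)
n10 = ndcg_at_k(ranked)
```

Averaging `p10` and `n10` over all test queries would give figures comparable in kind to the reported averages, provided the same gain and relevance conventions are used.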
Keywords