PLoS ONE (Jan 2021)
KG2Vec: A node2vec-based vectorization model for knowledge graph.
Abstract
Since the word2vec model was proposed, many researchers have used it to vectorize the data in their own research fields. In social network analysis, the Node2Vec model, which builds on word2vec, can vectorize the nodes and edges of a social network and thereby support tasks such as link prediction and community detection. However, a social network is a homogeneous network. When applied to heterogeneous networks such as knowledge graphs, Node2Vec leads to inaccurate predictions and unreasonable vector representations. Specifically, the walk strategy that Node2Vec uses for homogeneous networks is not suitable for heterogeneous networks, because in the latter the nodes and edges have distinguishing features. This paper proposes a vector representation method for heterogeneous networks based on random walks and Node2Vec, called KG2Vec (Heterogeneous Network to Vector). It addresses the inadequate consideration of full-text semantics and contextual relations in traditional vector representations of knowledge graphs. First, the knowledge graph is reconstructed and a new random walk strategy is applied. Then, two training models and optimization strategies are proposed to capture the context between entities and relations, yielding a semantically complete vector representation of the heterogeneous network. The experimental results show that KG2Vec addresses two weaknesses of traditional knowledge graph vectorization: insufficient consideration of context and unsatisfactory results on one-to-many relationships. The experiments also show that KG2Vec achieves higher accuracy than traditional methods.
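To illustrate why a walk over a knowledge graph differs from a walk over a homogeneous network, the following is a minimal sketch of a relation-aware random walk, not the KG2Vec walk strategy itself, which the abstract does not specify. It assumes the graph is given as (head, relation, tail) triples and that each walk should emit entity and relation tokens alternately, so that a word2vec-style model could later learn vectors for both. The function names, the uniform choice among outgoing edges, and the toy triples are all hypothetical.

```python
import random
from collections import defaultdict


def build_adjacency(triples):
    """Map each head entity to its outgoing (relation, tail) pairs."""
    adjacency = defaultdict(list)
    for head, relation, tail in triples:
        adjacency[head].append((relation, tail))
    return adjacency


def relation_aware_walk(adjacency, start, max_hops, rng):
    """Walk from `start`, recording entities and relations alternately.

    Assumption: the next edge is chosen uniformly at random. Node2Vec
    instead biases this choice with its return and in-out parameters.
    """
    walk = [start]
    current = start
    for _ in range(max_hops):
        edges = adjacency.get(current)
        if not edges:  # entity has no outgoing relation, so the walk ends
            break
        relation, tail = rng.choice(edges)
        walk.extend([relation, tail])
        current = tail
    return walk


# Hypothetical toy knowledge graph, used only for illustration.
triples = [
    ("Paris", "capital_of", "France"),
    ("France", "member_of", "EU"),
    ("Berlin", "capital_of", "Germany"),
    ("Germany", "member_of", "EU"),
]

adjacency = build_adjacency(triples)
rng = random.Random(0)  # fixed seed so repeated runs give the same walks
walks = [relation_aware_walk(adjacency, entity, 3, rng) for entity in sorted(adjacency)]
```

Each walk is a token sequence such as entity, relation, entity, and the resulting sequences could be passed to a skip-gram trainer as sentences. Keeping relations in the sequence is one simple way to expose edge types to the model; whether KG2Vec does this in the same way is not stated in the abstract.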