IEEE Access (Jan 2021)
A Schema-Driven Synthetic Knowledge Graph Generation Approach With Extended Graph Differential Dependencies (GDD<sup>x</sup>s)
Abstract
Knowledge Graphs (KGs), one of the key trends driving the next wave of technologies, have become a new form of knowledge representation and a cornerstone for applications ranging from generic to specific industrial use cases. However, in some domains, such as law enforcement, a real, large, domain-oriented KG is often unavailable due to data privacy concerns. In such domains it is necessary to generate a synthetic KG that mimics the properties of a real KG in the domain. Although a variety of graph data generators have been proposed over the last two decades to generate different kinds of networks, state-of-the-art synthetic graph data generators are not suitable for generating realistic synthetic KGs, because KGs always contain data characteristics with specified semantics. In this work, we propose a schema-driven synthetic KG generation approach with extended graph differential dependencies (GDD<sup>x</sup>s), which extend the recently developed graph entity/differential dependencies and represent formal constraints on graph data, enabling the generation of desired graph patterns in a synthetic KG. Next, we develop an effective KG generation algorithm that employs the schema and the pre-defined GDD<sup>x</sup>s. Finally, we evaluate our synthetic KG generator and compare it with several state-of-the-art synthetic graph generators. The experimental results show that our KG generation method can generate KGs that exhibit the desired graph patterns, node attributes, and degree distributions associated with each entity type in the graph's schema.
Keywords