Information (Dec 2021)

CIDOC2VEC: Extracting Information from Atomized CIDOC-CRM Humanities Knowledge Graphs

  • Hassan El-Hajj,
  • Matteo Valleriani

DOI
https://doi.org/10.3390/info12120503
Journal volume & issue
Vol. 12, no. 12
p. 503

Abstract

The growth of the digital humanities in recent years has led to increased use of knowledge graphs within the community. Many digital humanities projects model their data on the CIDOC-CRM ontology, which offers a wide array of classes suited to storing humanities and cultural heritage data. The CIDOC-CRM model leads to a knowledge graph structure in which entities are often linked to each other through chains of relations, so relevant information often lies many hops away from the entity it describes. In this paper, we present a method based on graph walks and text processing to extract entity information and provide semantically relevant embeddings. In the process, we were able to generate similarity recommendations as well as explore the underlying data structure. We demonstrate this approach on the Sphaera Dataset, which was modeled according to the CIDOC-CRM data structure.
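
The idea of collecting information that lies several hops away from an entity can be illustrated with a minimal sketch. It is not the authors' implementation: the triples are a hypothetical toy graph (not taken from the Sphaera Dataset), the function names are invented for illustration, walks are enumerated exhaustively rather than sampled, and a simple Jaccard token overlap stands in for the learned embeddings the abstract refers to.

```python
from collections import defaultdict

# Hypothetical toy triples in CIDOC-CRM style (assumption: not real Sphaera data).
TRIPLES = [
    ("book:1", "P108i_was_produced_by", "production:1"),
    ("production:1", "P14_carried_out_by", "actor:printer_A"),
    ("production:1", "P7_took_place_at", "place:Venice"),
    ("book:2", "P108i_was_produced_by", "production:2"),
    ("production:2", "P7_took_place_at", "place:Venice"),
]


def build_adjacency(triples):
    """Map each subject to its outgoing (predicate, object) pairs."""
    adjacency = defaultdict(list)
    for subject, predicate, obj in triples:
        adjacency[subject].append((predicate, obj))
    return adjacency


def walks_from(entity, adjacency, max_hops):
    """Enumerate every outgoing path of at most max_hops edges as a token list."""
    walks = []
    stack = [(entity, [entity], 0)]
    while stack:
        node, path, hops = stack.pop()
        neighbours = adjacency.get(node, [])
        if hops == max_hops or not neighbours:
            if len(path) > 1:  # skip the trivial zero-edge walk
                walks.append(path)
            continue
        for predicate, obj in neighbours:
            stack.append((obj, path + [predicate, obj], hops + 1))
    return walks


def entity_document(entity, adjacency, max_hops=2):
    """Flatten all walks of one entity into a token list, dropping the entity itself."""
    tokens = []
    for walk in walks_from(entity, adjacency, max_hops):
        tokens.extend(walk[1:])
    return tokens


def jaccard(tokens_a, tokens_b):
    """Token-set overlap; a stand-in for comparing learned embeddings."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


adjacency = build_adjacency(TRIPLES)
doc1 = entity_document("book:1", adjacency)
doc2 = entity_document("book:2", adjacency)
similarity = jaccard(doc1, doc2)
```

With `max_hops=1` the walk from `book:1` stops at the production event and never reaches `place:Venice`; only at two hops does the shared place of production enter both token lists and contribute to the similarity score. In a fuller pipeline the token lists would presumably be passed to a text embedding model instead of being compared directly.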

Keywords