International Journal of Digital Curation (Oct 2022)

Synchronic Curation for Assessing Reuse and Integration Fitness of Multiple Data Collections

  • Maria Esteva,
  • Weijia Xu,
  • Nevan Simone,
  • Kartik Nagpal,
  • Amit Gupta,
  • Moriba Jah

DOI
https://doi.org/10.2218/ijdc.v17i1.847
Journal volume & issue
Vol. 17, no. 1

Abstract


Data-driven applications often require data integrated from different, large, and continuously updated collections. These collections may contain gaps, overlap, conflict with one another, or complement each other. A recurring curation need is therefore to continuously assess whether data from multiple collections are fit for integration and reuse. To assess different large data collections at the same time, we present the Synchronic Curation (SC) framework. SC involves processing steps that map the different collections to a unifying data model representing research problems in a scientific area. The data model, which includes the collections' provenance and a data dictionary, is implemented in a graph database where collections are continuously ingested and can be queried. SC has a collection analysis and comparison module to track updates and to identify gaps, changes, and irregularities within and across collections. Assessment results can be explored through a web-based interactive graph. In this paper we introduce SC as an interdisciplinary enterprise and illustrate its capabilities through its implementation in ASTRIAGraph, a space sustainability knowledge system.
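
To make the mapping and comparison steps described in the abstract more concrete, the following is a minimal Python sketch. It is not the SC or ASTRIAGraph implementation: the field names (object_id, name, orbit_class), the two toy catalogs, and the helpers to_unified and compare are all hypothetical, and plain dictionaries stand in for the graph database. It only illustrates the general idea of mapping each collection to one shared model with provenance attached, then reporting gaps and conflicting values across collections.

```python
def to_unified(record, field_map, source):
    """Rename source-specific fields to the shared model's names (hypothetical
    schema) and attach the originating collection as provenance."""
    unified = {target: record[src] for src, target in field_map.items() if src in record}
    unified["source"] = source
    return unified


def compare(coll_a, coll_b, key="object_id"):
    """Report records present in only one collection (gaps) and shared
    records whose common fields disagree (conflicts)."""
    a = {r[key]: r for r in coll_a}
    b = {r[key]: r for r in coll_b}
    report = {
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
        "conflicts": [],
    }
    for k in sorted(a.keys() & b.keys()):
        # Compare only fields both records carry, ignoring the key and provenance.
        shared = (a[k].keys() & b[k].keys()) - {key, "source"}
        for field in sorted(shared):
            if a[k][field] != b[k][field]:
                report["conflicts"].append((k, field, a[k][field], b[k][field]))
    return report


# Two toy collections with different native field names (invented data).
raw_a = [
    {"id": "OBJ-1", "name": "Alpha", "orbit": "LEO"},
    {"id": "OBJ-2", "name": "Beta", "orbit": "GEO"},
]
raw_b = [
    {"obj": "OBJ-2", "label": "Beta", "regime": "MEO"},
    {"obj": "OBJ-3", "label": "Gamma", "regime": "LEO"},
]
map_a = {"id": "object_id", "name": "name", "orbit": "orbit_class"}
map_b = {"obj": "object_id", "label": "name", "regime": "orbit_class"}

unified_a = [to_unified(r, map_a, "catalog_a") for r in raw_a]
unified_b = [to_unified(r, map_b, "catalog_b") for r in raw_b]

report = compare(unified_a, unified_b)
print(report)
```

In this toy run, OBJ-1 and OBJ-3 each appear in only one catalog, and OBJ-2 appears in both with disagreeing orbit_class values. A real deployment would run such checks as queries against the graph database and repeat them as collections are updated.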