IEEE Access (Jan 2023)

Data Enrichment Toolchain: A Data Linking and Enrichment Platform for Heterogeneous Data

  • Luis Sanchez,
  • Jorge Lanza,
  • Juan Ramon Santana,
  • Pablo Sotres,
  • Victor Gonzalez,
  • Laura Martin,
  • Gurkan Solmaz,
  • Erno Kovacs,
  • Maren Dietzel,
  • Anja Summa,
  • Amir Reza Jafari,
  • Roberto Minerva,
  • Noel Crespi

DOI
https://doi.org/10.1109/ACCESS.2023.3317705
Journal volume & issue
Vol. 11
pp. 103079 – 103091

Abstract

The proliferation of data sources associated with Internet of Things (IoT) deployments, together with those bound to Open Data Portals (e.g., the European Data Portal and municipal Open Data Portals) and social media platforms, is creating an abundance of information that is expected to benefit both the private and public sectors through the development of added-value services, greater transparency and availability of administrations, and more efficient public services. However, pieces of information without context are significantly less valuable. Raw data lacks semantics and is highly heterogeneous from one data source to another, which makes it challenging to use. To turn all this data into valuable information, it must be possible to combine it so that meaningful context can be created. Moreover, it is fundamental to define the mechanisms that enable the adoption and orchestration of advanced (typically AI-enabled) data processing techniques over the harmonized datasets and data streams. This paper presents the Data Enrichment Toolchain (DET), which provides the necessary harmonization and enrichment for datasets and data streams coming from heterogeneous sources. The value of the enriched data lies, on the one hand, in the transfer of the data into a semantically grounded knowledge graph and, on the other hand, in the creation of new data through linking, aggregating, and reasoning on the data. In both cases, the benefit of employing linked-data modelling and semantics comes from the extension of the metadata associated with every piece of information. The paper also presents the experimental evaluation that we have carried out of the DET implementation.

Keywords