Applied Sciences (May 2024)

Graph Stream Compression Scheme Based on Pattern Dictionary Using Provenance

  • Hyeonbyeong Lee
  • Bokyoung Shin
  • Dojin Choi
  • Jongtae Lim
  • Kyoungsoo Bok
  • Jaesoo Yoo

DOI: https://doi.org/10.3390/app14114553
Journal volume & issue: Vol. 14, no. 11, p. 4553

Abstract

Advances in network technology and the growing popularity of the internet have driven widespread use of social network services and Internet of Things devices, which continuously generate large volumes of graph stream data in which vertices and edges are added or deleted over time. Because applications must use storage space efficiently and meet security requirements, graph stream compression has become essential in many of them. Although graph compression has been studied extensively, most existing methods do not fully reflect the dynamic characteristics of graph streams or the complexity of large graphs. In this paper, we propose a compression scheme that uses provenance data to process and analyze large graph stream data efficiently. The scheme obtains provenance data by analyzing the graph stream and builds a pattern dictionary from that provenance data, which it then uses for dictionary-based compression. It improves on existing dictionary-based graph compression methods by using provenance to track pattern changes and evaluate their importance, which makes dictionary management more efficient. It also uses an FP-tree to account for the relationships among sub-patterns and updates pattern scores over time as part of pattern dictionary management. Our experiments show that the proposed scheme outperforms existing graph compression methods in key performance metrics, such as compression rate and processing time.
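
The idea of a pattern dictionary whose scores are updated over time can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `PatternDictionary`, the exponential decay rule, the `capacity` and `decay` parameters, and the eviction policy are all assumptions made for illustration, and the sketch omits both the provenance analysis and the FP-tree described in the abstract.

```python
class PatternDictionary:
    """Toy pattern dictionary with time-decayed scores (hypothetical design)."""

    def __init__(self, capacity=2, decay=0.5):
        self.capacity = capacity  # maximum number of stored patterns (assumed)
        self.decay = decay        # score multiplier per elapsed time step (assumed)
        self.entries = {}         # pattern -> [code, score, last_seen]
        self.next_code = 0        # next unused dictionary code

    def _score(self, pattern, now):
        # Stored score, decayed by the time elapsed since the last occurrence.
        _, score, last_seen = self.entries[pattern]
        return score * self.decay ** (now - last_seen)

    def observe(self, pattern, now):
        """Record one occurrence of `pattern` at time step `now`."""
        if pattern in self.entries:
            entry = self.entries[pattern]
            entry[1] = self._score(pattern, now) + 1.0
            entry[2] = now
            return
        if len(self.entries) >= self.capacity:
            # Evict the pattern whose decayed score is currently lowest.
            victim = min(self.entries, key=lambda p: self._score(p, now))
            del self.entries[victim]
        self.entries[pattern] = [self.next_code, 1.0, now]
        self.next_code += 1

    def encode(self, pattern):
        """Return a short reference for a stored pattern, else the raw pattern."""
        if pattern in self.entries:
            return ("ref", self.entries[pattern][0])
        return ("raw", pattern)


# Patterns are modeled here as tuples of (source, target) edges.
triangle = (("a", "b"), ("b", "c"), ("c", "a"))
dictionary = PatternDictionary()
dictionary.observe(triangle, now=0)
encoded = dictionary.encode(triangle)  # a ("ref", code) pair instead of the edges
```

In this sketch a pattern that recurs keeps a high score and stays in the dictionary, while a pattern that stops appearing decays and is eventually evicted, so frequent recent patterns are replaced by short references during compression.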

Keywords