IEEE Access (Jan 2019)

Efficiently Approximating Top-<inline-formula> <tex-math notation="LaTeX">$k$ </tex-math></inline-formula> Sequential Patterns in Transactional Graphs

  • Mingtao Lei,
  • Xi Zhang,
  • Jincui Yang,
  • Binxing Fang

DOI
https://doi.org/10.1109/ACCESS.2019.2916811
Journal volume & issue
Vol. 7
pp. 62817 – 62832

Abstract

Read online

In real-world networks, an edge (or vertex) may be associated with a transaction database, where each transaction consists of a set of items to describe the attributes or behaviors of the edge (or vertex). This type of graph is named as a transactional graph, and the problem of mining sequential patterns from transactional graphs is to discover the subsequences of transaction sequences that frequently appear in many paths of the graph. Solving this problem can facilitate broad applications ranging from social network analysis to urban computing. However, this task is not trivial due to a large number of sequences existed in the graph. In particular, each path of the graph may induce multiple sequences of transactions, and gathering the transaction sequences induced by all the paths may lead to an extremely large volume. To efficiently approximate the top-k patterns, we propose a Parallelized Sampling-based Approach For Mining Top-k Sequential Patterns, PSMSP, which involves two key techniques: (a) a parallelized unbiased sequence sampling approach based on a sequence balanced graph partition strategy, and (b) a novel PSP-Tree structure to efficiently mine the patterns based on the anti-monotonicity properties. The experimental results on synthetic and real-world datasets demonstrate that the PSMSP can successfully find the sequential patterns with superior efficiency.

Keywords