IEEE Access (Jan 2021)

Large Graph Sampling Algorithm for Frequent Subgraph Mining

  • Tianyu Zheng,
  • Li Wang

DOI
https://doi.org/10.1109/ACCESS.2021.3089699
Journal volume & issue
Vol. 9
pp. 88970 – 88980

Abstract

Large graph networks appear frequently in recent applications. Their structures are very large, and the interactions among vertices make it difficult to split them into multiple separate structures, which increases the difficulty of frequent subgraph mining. Computing subgraph isomorphism is also computationally expensive. Removing nonessential structure from the graph is an effective way to improve efficiency. We therefore propose a large graph sampling algorithm (RASI) based on random areas selection sampling, and incorporate graph induction techniques to reduce the structure of the original graph. In addition, we find that constraining the weight of the number of vertices in the entire graph is essential to reducing the number of subgraph isomorphism calculations; this parameter is constrained during sampling to improve the efficiency of frequent subgraph mining. Experimental results show that RASI has more stable performance, and performs better than other sampling algorithms on non-connected graphs. Mining frequent subgraphs on a sampled graph significantly improves mining efficiency, and the number of subgraphs found before and after sampling is very similar.
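
The abstract does not give the details of RASI, so the following is only a minimal illustrative sketch of the general idea it describes: select several random areas, cap the share of vertices that may be sampled, and then apply graph induction. The function name `sample_random_areas`, its parameters, and the use of breadth-first growth to form each area are assumptions made for illustration; they are not taken from the paper.

```python
import random
from collections import deque


def sample_random_areas(adj, num_areas, area_size, max_vertex_fraction, seed=None):
    """Hypothetical sketch (not the paper's RASI): random areas plus induction.

    adj maps each vertex to the set of its neighbours (undirected graph).
    """
    rng = random.Random(seed)
    vertices = sorted(adj)
    # Assumed constraint: sample at most this many vertices overall.
    budget = int(max_vertex_fraction * len(vertices))
    sampled = set()

    for _ in range(num_areas):
        if len(sampled) >= budget:
            break
        # Assumed area selection: grow a region by BFS from a random start.
        start = rng.choice(vertices)
        queue = deque([start])
        area = set()
        while queue and len(area) < area_size and len(sampled | area) < budget:
            v = queue.popleft()
            if v in area:
                continue
            area.add(v)
            queue.extend(u for u in adj[v] if u not in area)
        sampled |= area

    # Graph induction: keep every original edge whose endpoints were both sampled.
    return {v: {u for u in adj[v] if u in sampled} for v in sampled}
```

Because each area starts from an independently chosen vertex, a sampler of this kind can reach several components of a non-connected graph, which a single random walk cannot. The induced result can then be passed to any frequent subgraph miner in place of the original graph.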

Keywords