IEEE Access (Jan 2020)
R<sup>3</sup>MAT: A Rapid and Robust Graph Generator
Abstract
One of the main problems when developing graph-based applications is the availability of large and representative datasets. The lack of real graphs has motivated the development of software tools for generating synthetic graphs. R-MAT is a data generation method designed to produce synthetic graphs whose characteristics resemble those of real networks. Although the generation model defined by R-MAT is easy to understand, its implementation is not trivial, and it has intrinsic memory restrictions that make the generation of very large graphs difficult. This paper studies the practical implementation of R-MAT. We discuss the issues of the original implementation, which works with the adjacency matrix of the graph, and analyze the size of the graph produced by the R-MAT model. Then, we introduce and experimentally evaluate R3MAT, an alternative implementation of R-MAT based on an array of degrees. These experiments show that (i) R3MAT is able to generate graphs on the order of a hundred million nodes and a billion edges on a single machine, (ii) our method preserves the characteristic power-law degree distribution present in real-world graphs, and (iii) R3MAT outperforms the current state-of-the-art generators when run sequentially on a single modest computer.
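For readers unfamiliar with the generation model, the following is a minimal sketch of the standard R-MAT edge-sampling procedure, in which each edge is placed by recursively choosing one of the four quadrants of the adjacency matrix. It is not the R3MAT implementation described in this paper. The function names (`rmat_edge`, `rmat_edges`, `out_degrees`) are hypothetical, and the default probabilities (0.57, 0.19, 0.19, with the remainder 0.05 for the fourth quadrant) are an assumed, commonly used setting rather than values taken from this abstract.

```python
import random
from collections import Counter


def rmat_edge(scale, a=0.57, b=0.19, c=0.19, rng=random):
    """Sample one directed edge (src, dst) for a graph with 2**scale nodes.

    At each of the `scale` recursion levels, one quadrant of the adjacency
    matrix is chosen with probabilities a, b, c and d = 1 - a - b - c.
    The defaults are an assumption, not values from the paper.
    """
    src = dst = 0
    for _ in range(scale):
        r = rng.random()
        src <<= 1
        dst <<= 1
        if r < a:
            pass              # top-left quadrant: both bits stay 0
        elif r < a + b:
            dst |= 1          # top-right quadrant
        elif r < a + b + c:
            src |= 1          # bottom-left quadrant
        else:
            src |= 1          # bottom-right quadrant
            dst |= 1
    return src, dst


def rmat_edges(scale, num_edges, seed=0, **probs):
    """Sample `num_edges` edges independently; duplicates are not removed."""
    rng = random.Random(seed)
    return [rmat_edge(scale, rng=rng, **probs) for _ in range(num_edges)]


def out_degrees(edges):
    """Count out-degrees using one counter per source node, not a matrix."""
    return Counter(src for src, _ in edges)
```

Because edges are sampled independently, the same pair can be drawn more than once, so the number of distinct edges may be smaller than the number requested. Counting degrees per node, as in `out_degrees`, avoids materializing the full adjacency matrix in this sketch.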
Keywords