IEEE Access (Jan 2020)

Optimizing Distance Computation in Distributed Graph Systems

  • Qing Wang,
  • Shengyi Ji,
  • Peng Peng,
  • Mingdao Li,
  • Ping Huang,
  • Zheng Qin

DOI
https://doi.org/10.1109/ACCESS.2020.3032727
Journal volume & issue
Vol. 8
pp. 191673 – 191682

Abstract

Read online

Given a large graph, such as a social network or a knowledge graph, a fundamental query is how to find the distance from a source vertex to another vertex in the graph. As real graphs become very large and many distributed graph systems, such as Pregel, Pregel+, Giraph, and GraphX, are proposed, how to employ distributed graph systems to process single-source distance queries should attract more attention. In this paper, we propose a landmark-based framework to optimize the distance computation over distributed graph systems. We also use a measure called set betweenness to select the optimal set of landmarks for distance computation. Although we can prove that selecting the optimal set of landmarks is NP-hard, we propose a heuristic distributed algorithm that can guarantee the approximation ratio. Experiments on large real graphs confirm the superiority of our methods.

Keywords