IEEE Access (Jan 2019)

Cluster-Scheduling Big Graph Traversal Task for Parallel Processing in Heterogeneous Cloud Based on DAG Transformation

  • Kekun Hu,
  • Guosun Zeng,
  • Shuang Ding,
  • Huowen Jiang

DOI
https://doi.org/10.1109/ACCESS.2019.2921477
Journal volume & issue
Vol. 7
pp. 77070 – 77082

Abstract

Task scheduling is the key to fully utilizing heterogeneous cloud capabilities for the parallel processing of big graphs. Most graph processing systems adopt single-granularity scheduling mechanisms that do not consider the heterogeneity of the cloud, which leads to poor performance. To alleviate this problem, we draw on the directed acyclic graph (DAG)-based scheduling techniques accumulated in traditional parallel computing. We first present a streaming DAG-construction heuristic that transforms a big graph, together with the graph traversal algorithms to be carried out on it, into a DAG. We then propose a three-phase heterogeneous-aware cluster-scheduling algorithm to schedule the DAG onto a heterogeneous cloud for parallel processing. In the first phase, we design a parallel linear clustering algorithm that groups the DAG into a series of linear clusters with different granularities. In the second phase, we design a heterogeneous-aware load balancing algorithm that maps these clusters to different computational nodes of the cloud. In the last phase, we design a task ordering algorithm that assigns these clusters as-early-as-possible start times. The experimental results show that our scheme can generate high-quality schedules and improve the efficiency and performance of parallel processing of big graphs in the heterogeneous cloud.
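
The three phases can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract gives no pseudocode, so every function name, the toy DAG, the task costs, and the node speeds below are hypothetical. Phase 1 is replaced by a sequential heaviest-path peeling (the paper's version is parallel), phase 2 by a greedy largest-cluster-first placement that accounts for node speed, and phase 3 by a topological pass that gives each task its earliest feasible start time. Communication costs are ignored for brevity.

```python
from collections import defaultdict


def topo_order(succ, nodes):
    """Kahn's algorithm; ready tasks are served first-in, first-out."""
    indeg = {v: 0 for v in nodes}
    for u in nodes:
        for v in succ.get(u, ()):
            indeg[v] += 1
    ready = sorted(v for v in nodes if indeg[v] == 0)
    order = []
    while ready:
        u = ready.pop(0)
        order.append(u)
        for v in succ.get(u, ()):
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return order


def linear_clusters(succ, cost):
    """Phase 1 stand-in: repeatedly peel off the heaviest remaining path."""
    remaining = set(cost)
    clusters = []
    while remaining:
        # Restrict the DAG to tasks that are not yet clustered.
        sub = {u: [v for v in succ.get(u, ()) if v in remaining] for u in remaining}
        best = {u: cost[u] for u in remaining}  # heaviest path ending at u
        prev = {u: None for u in remaining}
        for u in topo_order(sub, remaining):
            for v in sub[u]:
                if best[u] + cost[v] > best[v]:
                    best[v], prev[v] = best[u] + cost[v], u
        end = max(sorted(remaining), key=best.get)
        path = []
        while end is not None:  # walk back along the heaviest path
            path.append(end)
            end = prev[end]
        path.reverse()
        clusters.append(path)
        remaining -= set(path)
    return clusters


def map_clusters(clusters, cost, speed):
    """Phase 2 stand-in: heaviest cluster first, to the node finishing it soonest."""
    load = {n: 0.0 for n in speed}
    placement = {}
    for cluster in sorted(clusters, key=lambda c: -sum(cost[t] for t in c)):
        work = sum(cost[t] for t in cluster)
        node = min(speed, key=lambda n: (load[n] + work) / speed[n])
        load[node] += work
        for t in cluster:
            placement[t] = node
    return placement


def asap_times(succ, cost, speed, placement):
    """Phase 3 stand-in: earliest start given predecessors and node availability."""
    preds = defaultdict(list)
    for u, vs in succ.items():
        for v in vs:
            preds[v].append(u)
    free = {n: 0.0 for n in speed}
    start, finish = {}, {}
    for t in topo_order(succ, list(cost)):
        node = placement[t]
        ready = max((finish[p] for p in preds[t]), default=0.0)
        start[t] = max(free[node], ready)
        finish[t] = start[t] + cost[t] / speed[node]
        free[node] = finish[t]
    return start, finish


# Hypothetical toy input: a six-task DAG and two nodes of different speed.
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d", "e"], "d": ["f"], "e": ["f"]}
cost = {"a": 2, "b": 3, "c": 1, "d": 4, "e": 2, "f": 1}
speed = {"fast": 2.0, "slow": 1.0}

clusters = linear_clusters(succ, cost)
placement = map_clusters(clusters, cost, speed)
start, finish = asap_times(succ, cost, speed, placement)
print(clusters)
print(placement)
print(max(finish.values()))
```

On this toy input the sketch produces two linear clusters, `a, b, d, f` and `c, e`, places the heavier one on the faster node, and reaches a makespan of 5.0 time units. The clusters differ in size, which loosely mirrors the "different granularities" mentioned in the abstract.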

Keywords