IEEE Access (Jan 2022)

JHTD: An Efficient Joint Scheduling Framework Based on Hypergraph for Task Placement and Data Transfer Across Geographically Distributed Data Centers

  • Chao Jing,
  • Penggao Dan

DOI
https://doi.org/10.1109/ACCESS.2022.3219873
Journal volume & issue
Vol. 10
pp. 116302 – 116316

Abstract

With the explosive growth of data volume, data centers play a critical role in storing and processing huge amounts of data. A traditional single data center can no longer keep up with such fast-growing data. Recently, some studies have extended tasks such as data processing to geographically distributed data centers. However, because task placement and data transfer must be considered jointly, it is complex and difficult to design a scheduling approach that minimizes makespan under constraints such as task dependencies, processing capability, and the network. Therefore, this work proposes $JHTD$: an efficient joint scheduling framework based on hypergraphs for task placement and data transfer across geographically distributed data centers. $JHTD$ consists of two crucial stages. In the first stage, because hypergraphs are well suited to modeling complex problems, a hypergraph-based model is used to establish the relationships among tasks, data files, and data centers, and a hypergraph-based partition method is then developed for task placement. In the second stage, a task reallocation scheme is devised based on each task-to-data dependency, and a data-dependency-aware transfer scheme is designed to minimize the makespan. Finally, a real-world model from the China-VO project is used to conduct a variety of simulation experiments. The results demonstrate that $JHTD$ effectively optimizes task placement and data transfer across geographically distributed data centers. Compared with three other state-of-the-art algorithms, $JHTD$ reduces the makespan by up to 20.6%. The impacts of data transfer volume and load balancing are also examined to discuss the effectiveness of $JHTD$.
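
To make the hypergraph view concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that each data file acts as a hyperedge joining the tasks that read it, and it replaces the hypergraph partitioning step with a simple greedy rule (place each task where most of its input data already resides). The task names, file names, sizes, and data center names are all hypothetical, and task dependencies, load balancing, and bandwidth are ignored.

```python
from collections import defaultdict

# Hypothetical toy instance: each data file is a hyperedge over the tasks that read it.
file_to_tasks = {
    "f1": {"t1", "t2"},
    "f2": {"t2", "t3"},
    "f3": {"t4"},
}
file_size = {"f1": 10, "f2": 5, "f3": 8}              # assumed sizes, arbitrary units
file_home = {"f1": "dcA", "f2": "dcB", "f3": "dcB"}   # assumed initial file locations


def place_tasks(file_to_tasks, file_size, file_home):
    """Greedy stand-in for partitioning: put each task where most of its input data lives."""
    data_at = defaultdict(lambda: defaultdict(int))
    for f, tasks in file_to_tasks.items():
        for t in tasks:
            data_at[t][file_home[f]] += file_size[f]
    # Sorting the candidates makes ties resolve to the alphabetically first data center.
    return {t: max(sorted(dcs), key=lambda dc: dcs[dc]) for t, dcs in data_at.items()}


def transfer_volume(placement, file_to_tasks, file_size, file_home):
    """Total data moved: a file is sent once to each remote data center hosting a reader."""
    total = 0
    for f, tasks in file_to_tasks.items():
        remote = {placement[t] for t in tasks} - {file_home[f]}
        total += file_size[f] * len(remote)
    return total


placement = place_tasks(file_to_tasks, file_size, file_home)
volume = transfer_volume(placement, file_to_tasks, file_size, file_home)
```

In this toy instance, only file `f2` crosses data centers, because its two readers end up in different locations. A real partitioner would also balance load across data centers, which this greedy rule does not attempt.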

Keywords