Journal of Electrical and Computer Engineering (Jan 2016)
Heuristic Data Placement for Data-Intensive Applications in Heterogeneous Cloud
Abstract
Data placement is an important issue which aims at reducing the cost of internode data transfers in cloud especially for data-intensive applications, in order to improve the performance of the entire cloud system. This paper proposes an improved data placement algorithm for heterogeneous cloud environments. In the initialization phase, a data clustering algorithm based on data dependency clustering and recursive partitioning has been presented, and both the factor of data size and fixed position are incorporated. And then a heuristic tree-to-tree data placement strategy is advanced in order to make frequent data movements occur on high-bandwidth channels. Simulation results show that, compared with two classical strategies, this strategy can effectively reduce the amount of data transmission and its time consumption during execution.