Mathematical and Computational Applications (May 2018)

The Impact of the Implementation Cost of Replication in Data Grid Job Scheduling

  • Babar Nazir
  • Faiza Ishaq
  • Shahaboddin Shamshirband
  • Anthony T. Chronopoulos

DOI
https://doi.org/10.3390/mca23020028
Journal volume & issue
Vol. 23, no. 2
p. 28

Abstract

Data grids support geographically distributed, large-scale, data-intensive applications. Scheduling schemes for data grids aim not only to improve data access time but also to improve the availability of data at the node where the data requests are generated. Data replication techniques manage large data sets by storing multiple copies of data files efficiently. In this paper, we propose the centralized dynamic scheduling strategy-replica placement strategies (CDSS-RPS). CDSS-RPS schedules data and tasks so as to minimize the implementation cost and the data transfer time. CDSS-RPS consists of two algorithms: (a) centralized dynamic scheduling (CDS) and (b) replica placement strategy (RPS). CDS considers the computing capacity of each node and finds an appropriate location for the job. RPS attempts to improve file access time by replicating files on the basis of the number of accesses, the storage capacity of a computing node, and the response time of a requested file. Extensive simulations are carried out to demonstrate the effectiveness of the proposed strategy. The simulation results show that the replication and scheduling strategies significantly improve the implementation cost and the average access time.
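
The two decisions described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract names only the inputs (computing capacity for CDS; number of accesses, storage capacity, and response time for RPS), so the scoring rule, the thresholds, and every name below (Node, schedule_job, should_replicate, min_accesses, max_response_time) are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical grid node; field names and units are illustrative only."""
    name: str
    capacity: float              # computing capacity (arbitrary units)
    queued_jobs: int = 0         # jobs already assigned to this node
    free_storage: float = 0.0    # free storage (arbitrary units, e.g. MB)
    files: set = field(default_factory=set)


def schedule_job(nodes):
    """CDS-like step (assumed rule): pick the node with the most
    computing capacity per job, counting the job about to be assigned."""
    best = max(nodes, key=lambda n: n.capacity / (n.queued_jobs + 1))
    best.queued_jobs += 1
    return best


def should_replicate(node, file_name, file_size, access_count,
                     response_time, min_accesses=3, max_response_time=1.0):
    """RPS-like test (assumed thresholds): replicate a file to `node` when
    it is requested often, is slow to fetch, and fits in free storage."""
    if file_name in node.files:
        return False             # a local replica already exists
    if file_size > node.free_storage:
        return False             # not enough storage on this node
    return access_count >= min_accesses and response_time > max_response_time


def place_replica(node, file_name, file_size):
    """Record the new replica and the storage it consumes."""
    node.files.add(file_name)
    node.free_storage -= file_size
```

In this sketch, a central scheduler would call schedule_job for each incoming job and then call should_replicate for each file the job requests that is not already on the chosen node. The actual cost model and replica selection rules are defined in the full paper.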

Keywords