Information (May 2023)
Online Task Scheduling of Big Data Applications in the Cloud Environment
Abstract
The development of big data has generated data-intensive tasks that are usually time-consuming, with a high demand on cloud data centers for hosting big data applications. It becomes necessary to consider both data and task management to find the optimal resource allocation scheme, which is a challenging research issue. In this paper, we address the problem of online task scheduling combined with data migration and replication in order to reduce the overall response time as well as ensure that the available resources are efficiently used. We introduce a new scheduling technique, named Online Task Scheduling algorithm based on Data Migration and Data Replication (OTS-DMDR). The main objective is to efficiently assign online incoming tasks to the available servers while considering the access time of the required datasets and their replicas, the execution time of the task in different machines, and the computational power of each machine. The core idea is to achieve better data locality by performing an effective data migration while handling replicas. As a result, the overall response time of the online tasks is reduced, and the throughput is improved with enhanced machine resource utilization. To validate the performance of the proposed scheduling method, we run in-depth simulations with various scenarios and the results show that our proposed strategy performs better than the other existing approaches. In fact, it reduces the response time by 78% when compared to the First Come First Served scheduler (FCFS), by 58% compared to the Delay Scheduling, and by 46% compared to the technique of Li et al. Consequently, the present OTS-DMDR method is very effective and convenient for the problem of online task scheduling.
Keywords