Mathematics (May 2022)

Improving the Performance of MapReduce for Small-Scale Cloud Processes Using a Dynamic Task Adjustment Mechanism

  • Tzu-Chi Huang,
  • Guo-Hao Huang,
  • Ming-Fong Tsai

DOI
https://doi.org/10.3390/math10101736
Journal volume & issue
Vol. 10, no. 10
p. 1736

Abstract

The MapReduce architecture can reliably distribute massive datasets to cloud worker nodes for processing. As each worker node processes its input data, the Map program generates intermediate data that the Reduce program then integrates. However, the amount of intermediate data created differs from node to node because of variation in the operating-system environments and in the input data. This produces laggard nodes, which delay the completion of each small-scale cloud application task. In this paper, we propose a dynamic task adjustment mechanism for an intermediate-data processing cycle prediction algorithm, with the aim of improving the execution performance of small-scale cloud applications. The mechanism dynamically adjusts the number of Map and Reduce program tasks according to the intermediate-data processing capability of each cloud worker node, in order to mitigate the performance degradation that laggards cause under the limitations of the Google Cloud platform (Hadoop cluster). In a performance analysis, the proposed mechanism was compared with a simulated Hadoop system and improved processing efficiency by at least 5% for a small-scale cloud application.
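
The abstract does not give the adjustment rule itself, so the following is only an illustrative sketch of the general idea of sizing each node's task count to its measured capability. It assumes a simple proportional split with largest-remainder rounding; the function name `allocate_tasks`, the throughput figures, and the node names are hypothetical and are not taken from the paper.

```python
def allocate_tasks(total_tasks, throughput):
    """Split total_tasks across nodes in proportion to measured throughput.

    throughput maps a node name to its observed intermediate-data
    processing rate (any consistent unit). This proportional rule is an
    assumption made for illustration, not the paper's algorithm.
    """
    if total_tasks < 0:
        raise ValueError("total_tasks must be non-negative")
    total_rate = sum(throughput.values())
    if total_rate <= 0:
        raise ValueError("at least one node must have positive throughput")

    # Ideal (fractional) share of tasks for each node.
    shares = {node: total_tasks * rate / total_rate
              for node, rate in throughput.items()}

    # Round down first, then hand out the leftover tasks to the nodes
    # with the largest fractional remainders so the total is preserved.
    alloc = {node: int(share) for node, share in shares.items()}
    leftover = total_tasks - sum(alloc.values())
    by_remainder = sorted(shares, key=lambda n: shares[n] - alloc[n],
                          reverse=True)
    for node in by_remainder[:leftover]:
        alloc[node] += 1
    return alloc


# Hypothetical rates: node "a" processes intermediate data three times
# as fast as "b" or "c", so it receives three times as many tasks.
plan = allocate_tasks(10, {"a": 3.0, "b": 1.0, "c": 1.0})
```

Re-running such a split whenever fresh throughput measurements arrive would move work away from a node that starts to lag, which is the effect the abstract attributes to the proposed mechanism.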

Keywords