Applied Artificial Intelligence (Jan 2021)

Modeling and Analysis of Hadoop MapReduce Systems for Big Data Using Petri Nets

  • Dai-Lun Chiang,
  • Sheng-Kuan Wang,
  • Yu-Ying Wang,
  • Yi-Nan Lin,
  • Tsang-Yen Hsieh,
  • Cheng-Ying Yang,
  • Victor R. L. Shen,
  • Hung-Wei Ho

DOI
https://doi.org/10.1080/08839514.2020.1842111
Journal volume & issue
Vol. 35, no. 1
pp. 80 – 104

Abstract

Advances in information technology have significantly increased the volume of corporate datasets, creating a wide range of business opportunities related to big data and cloud computing. Hadoop is a popular programming framework for setting up cloud computing systems. The MapReduce framework forms the core of Hadoop's parallel computing, and its parallel design can greatly increase the efficiency of big data analysis. This paper adopts a Petri net (PN) to create a visual model of the MapReduce framework and to analyze its reachability property. We present a real big data analysis system to demonstrate the feasibility of the PN model, describe the internal procedure of the MapReduce framework in detail, list common errors, and propose an error prevention mechanism based on the PN models to make system development more efficient.
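
To illustrate what analyzing the reachability property of a Petri net involves, the following is a minimal sketch in Python. It is not the model from the paper: the places (input, mapped, reduced), the two transitions (map, reduce), and their token weights are hypothetical and chosen only to resemble a MapReduce-style flow. The sketch enumerates every marking reachable from an initial marking by breadth-first search, which terminates only because this toy net is bounded.

```python
from collections import deque

# Hypothetical toy net; the places and transitions are illustrative
# assumptions, not the Petri net model published in the paper.
PLACES = ("input", "mapped", "reduced")

# Each transition: (name, tokens consumed per place, tokens produced per place).
TRANSITIONS = [
    ("map", {"input": 1}, {"mapped": 1}),
    ("reduce", {"mapped": 2}, {"reduced": 1}),
]


def enabled(marking, consume):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking[place] >= count for place, count in consume.items())


def fire(marking, consume, produce):
    """Return the new marking obtained by firing an enabled transition."""
    new_marking = dict(marking)
    for place, count in consume.items():
        new_marking[place] -= count
    for place, count in produce.items():
        new_marking[place] += count
    return new_marking


def reachable_markings(initial):
    """Breadth-first enumeration of all markings reachable from `initial`.

    Terminates only for bounded nets such as this toy example.
    """
    start = tuple(initial[place] for place in PLACES)
    seen = {start}
    queue = deque([start])
    while queue:
        current = dict(zip(PLACES, queue.popleft()))
        for _name, consume, produce in TRANSITIONS:
            if enabled(current, consume):
                nxt = fire(current, consume, produce)
                key = tuple(nxt[place] for place in PLACES)
                if key not in seen:
                    seen.add(key)
                    queue.append(key)
    return seen


def is_reachable(initial, target):
    """Check whether `target` is in the reachability set of `initial`."""
    return tuple(target[place] for place in PLACES) in reachable_markings(initial)


if __name__ == "__main__":
    start = {"input": 2, "mapped": 0, "reduced": 0}
    done = {"input": 0, "mapped": 0, "reduced": 1}
    print(is_reachable(start, done))
    print(sorted(reachable_markings(start)))
```

In this sketch a marking such as "all input consumed, one reduced result" is reachable from two input tokens, while a marking with two reduced results is not. Checks of this kind are one way a designer might detect an unreachable goal state or a stuck job before implementation, under the assumptions stated above.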