Open Computer Science (Jul 2022)

An alternative C++-based HPC system for Hadoop MapReduce

  • Srinivasakumar Vignesh,
  • Vanamoorthy Muthumanikandan,
  • Sairaj Siddarth,
  • Ganesh Sainath

DOI
https://doi.org/10.1515/comp-2022-0246
Journal volume & issue
Vol. 12, no. 1
pp. 238 – 247

Abstract

MapReduce (MR) is a technique for distributed data processing that can substantially speed up computation. Hadoop's MR implementation relies on Java and the memory-intensive JVM. An MR framework based on High-Performance Computing (HPC) could be both more memory-efficient and faster than standard MR. This article explores a C++-based approach to MR and assesses its feasibility in terms of developer friendliness, deployment interface, efficiency, and scalability. It also introduces Eager Reduction and Delayed Reduction techniques to speed up MR.
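
As a rough illustration of the MR model the abstract refers to, the following is a minimal single-process word-count sketch in C++. It is not the framework described in the article: the names `KeyValue`, `map_words`, `reduce_counts`, and `word_count` are hypothetical, nothing is distributed, and the Eager Reduction and Delayed Reduction techniques are not attempted, since the abstract names them without defining them.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical intermediate record type: a (key, value) pair.
using KeyValue = std::pair<std::string, int>;

// Map phase: emit (word, 1) for every whitespace-separated token in a line.
std::vector<KeyValue> map_words(const std::string& line) {
    std::vector<KeyValue> out;
    std::istringstream in(line);
    std::string word;
    while (in >> word) {
        out.emplace_back(word, 1);
    }
    return out;
}

// Shuffle and reduce phases combined: group pairs by key and sum the values.
// std::map keeps keys ordered, standing in for the sort-by-key step.
std::map<std::string, int> reduce_counts(const std::vector<KeyValue>& pairs) {
    std::map<std::string, int> counts;
    for (const auto& kv : pairs) {
        counts[kv.first] += kv.second;
    }
    return counts;
}

// Driver: run the map phase over every input line, then reduce once.
// In a real deployment the lines would be split across worker nodes.
std::map<std::string, int> word_count(const std::vector<std::string>& lines) {
    std::vector<KeyValue> intermediate;
    for (const auto& line : lines) {
        std::vector<KeyValue> pairs = map_words(line);
        intermediate.insert(intermediate.end(), pairs.begin(), pairs.end());
    }
    return reduce_counts(intermediate);
}
```

Calling `word_count` on a vector of input lines returns a map from each word to its number of occurrences. The map and reduce steps are kept as separate functions so that either could, in principle, be moved onto separate workers.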

Keywords