TASK Quarterly (Oct 2018)

INFLUENCE OF YARN SCHEDULERS ON POWER CONSUMPTION AND PROCESSING TIME FOR VARIOUS BIG DATA BENCHMARKS

  • KRZYSZTOF DRYPCZEWSKI,
  • JERZY PROFICZ,
  • ANDRZEJ STEPNOWSKI

DOI
https://doi.org/10.17466/tq2018/22.4/c
Journal volume & issue
Vol. 22, no. 4

Abstract

Climate change caused by human activities can influence the lives of everybody on the planet. Environmental concerns must be taken into consideration by all fields of study, including ICT. Green Computing aims to reduce the negative effects of IT on the environment while maintaining all of the benefits it provides. Several Big Data platforms, such as Apache Spark and YARN, have become widely used in analytics and High-Performance Computing systems due to the reliability and usability of MapReduce implementations. The authors investigate the power consumption and energy efficiency of Hadoop YARN schedulers using Apache Spark under three different workloads. The test cases include sorting large binary files, counting unique words in large text files, and processing satellite imagery from the Sentinel-2 mission. The presented results show small (2%–11%) but distinct differences in the power consumption of the FIFO and FAIR schedulers.

Keywords