Evaluation of high-level query languages based on MapReduce in Big Data

Marouane Birjali; Abderrahim Beni-Hssane; Mohammed Erritali

doi:10.1186/s40537-018-0146-3

Journal of Big Data (Oct 2018)

Affiliations

Marouane Birjali: LAROSERI Laboratory, Department of Computer Science, Faculty of Sciences, University of Chouaib Doukkali
Abderrahim Beni-Hssane: LAROSERI Laboratory, Department of Computer Science, Faculty of Sciences, University of Chouaib Doukkali
Mohammed Erritali: TIAD Laboratory, Department of Computer Science, Faculty of Sciences and Technologies, University of Sultan Moulay Slimane

DOI: https://doi.org/10.1186/s40537-018-0146-3
Journal volume & issue: Vol. 5, no. 1
pp. 1 – 21

Abstract

MapReduce (MR) is a standard Big Data processing model for the parallel and distributed processing of large datasets. The model has known difficulties that stem from its low-level, batch-oriented nature, which has given rise to an abstraction layer on top of MR. Several High-Level MapReduce Query Languages built on top of MR therefore provide more abstract query languages and extend the MR programming model. These languages take the burden of MR programming away from developers and ease the migration of existing SQL skills to Big Data. This paper investigates the most commonly used High-Level MapReduce Query Languages built directly on top of MR, which translate queries into executable native MR jobs. It evaluates the performance of four of them, JAQL, Hive, Big SQL and Pig, with regard to their insightful perspectives and ease of programming. The baseline metrics reported are increasing input size, scaling out the number of nodes and controlling the number of reducers. The experimental results examine the technical advantages and limitations of each High-Level MapReduce Query Language. Finally, the paper provides a summary to help developers choose the High-Level MapReduce Query Language that fulfills their needs and interests.

Published in Journal of Big Data

ISSN: 2196-1115 (Online)
Publisher: SpringerOpen
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware; Technology: Technology (General): Industrial engineering. Management engineering: Information technology; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://journalofbigdata.springeropen.com
