Employing Vertical Elasticity for Efficient Big Data Processing in Container-Based Cloud Environments

Jin-young Choi; Minkyoung Cho; Jik-Soo Kim

doi:10.3390/app11136200

Applied Sciences (Jul 2021)

Affiliations

Jin-young Choi: Gabia Inc., Seongnam 13494, Korea
Minkyoung Cho: Department of Computer Engineering, Myongji University, Yongin 17058, Korea
Jik-Soo Kim: Department of Computer Engineering, Myongji University, Yongin 17058, Korea

DOI: https://doi.org/10.3390/app11136200
Journal volume & issue: Vol. 11, no. 13, p. 6200

Abstract

Recently, “Big Data” platform technologies have become crucial for the distributed processing of diverse unstructured or semi-structured data as the amount of data generated increases rapidly. To manage this data effectively, Cloud Computing has played an important role by providing scalable data storage and computing resources for competitive and economical Big Data processing. Accordingly, server virtualization technologies, which are the cornerstone of Cloud Computing, have attracted considerable research interest. However, conventional hypervisor-based virtualization can degrade performance because of its heavyweight guest operating systems and rigid resource allocations. Container-based virtualization, by contrast, can provide the same level of service faster and with a lighter footprint by eliminating the guest OS layer. In addition, container-based virtualization enables efficient cloud resource management by dynamically adjusting the allocated computing resources (e.g., CPU and memory) at runtime through “Vertical Elasticity”. In this paper, we present our practice and experience of employing an adaptive resource utilization scheme for Big Data workloads in container-based cloud environments by leveraging the vertical elasticity of Docker, a representative container-based virtualization technique. We perform extensive experiments running several Big Data workloads on representative Big Data platforms: Apache Hadoop and Spark. During workload execution, our adaptive resource utilization scheme periodically monitors the resource usage patterns of running containers and dynamically adjusts their allocated computing resources, which can result in substantial improvements in overall system throughput.
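
The monitor-and-adjust loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Allocation` type, the function names, the utilization thresholds, the scaling step, and the resource bounds are all hypothetical values chosen for illustration. The only assumption about Docker itself is that `docker update` accepts the `--cpus` and `--memory` options for changing a running container's limits.

```python
from dataclasses import dataclass


@dataclass
class Allocation:
    cpus: float      # CPU cores currently allotted to the container
    memory_mb: int   # memory limit in MiB


def next_allocation(current, cpu_util, mem_util,
                    low=0.3, high=0.8, step=1.5,
                    min_cpus=0.5, max_cpus=8.0,
                    min_memory_mb=512, max_memory_mb=16384):
    """Return the allocation for the next monitoring period.

    cpu_util and mem_util are the observed fractions (0.0 to 1.0) of the
    current allocation in use. All thresholds and bounds are hypothetical.
    """
    def scale(value, util):
        if util > high:      # container is starved: grow its share
            return value * step
        if util < low:       # container is idle: give resources back
            return value / step
        return value         # within the target band: leave unchanged

    cpus = min(max_cpus, max(min_cpus, scale(current.cpus, cpu_util)))
    memory = min(max_memory_mb,
                 max(min_memory_mb, int(scale(current.memory_mb, mem_util))))
    return Allocation(cpus=cpus, memory_mb=memory)


def docker_update_command(container, alloc):
    """Build (but do not run) the CLI call that applies the new limits."""
    return ["docker", "update",
            "--cpus", f"{alloc.cpus:g}",
            "--memory", f"{alloc.memory_mb}m",
            container]
```

In a real deployment, a periodic loop would read utilization for each container (for example from `docker stats`), call `next_allocation`, and run the resulting command with `subprocess.run`. Shrinking a memory limit below what a container currently uses may be rejected, so a production scheme would need additional safeguards.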

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
