Towards autoscaling of Apache Flink jobs

Varga Balázs; Balassi Márton; Kiss Attila

doi:10.2478/ausi-2021-0003

Acta Universitatis Sapientiae: Informatica (Jun 2021)

Affiliations

Varga Balázs: ELTE Eötvös Loránd University Budapest, Hungary
Balassi Márton: Cloudera, Budapest, Hungary
Kiss Attila: J. Selye University, Komárno, Slovakia

DOI: https://doi.org/10.2478/ausi-2021-0003
Journal volume & issue: Vol. 13, no. 1
pp. 39 – 59

Abstract

Data stream processing has been gaining attention in the past decade. Apache Flink is an open-source distributed stream processing engine that is able to process a large amount of data in real time with low latency. Computations are distributed among a cluster of nodes. Currently, provisioning the appropriate amount of cloud resources must be done manually ahead of time. A dynamically varying workload may exceed the capacity of the cluster, or leave resources underutilized. In our paper, we describe an architecture that enables the automatic scaling of Flink jobs on Kubernetes based on custom metrics, and describe a simple scaling policy. We also measure the effects of state size and target parallelism on the duration of the scaling operation, which must be considered when designing an autoscaling policy, so that the Flink job respects a Service Level Agreement.

Published in Acta Universitatis Sapientiae: Informatica

ISSN: 2066-7760 (Online)
Publisher: Scientia Publishing House
Country of publisher: Romania
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://acta.sapientia.ro/en/series/informatica
