Big Data and Cognitive Computing (Jul 2018)
Adaptive Provisioning of Heterogeneous Cloud Resources for Big Data Processing
Abstract
Efficient utilization of resources plays an important role in the performance of large scale task processing. In cases where heterogeneous types of resources are used within the same application, it is hard to achieve good utilization of all of the different types of resources. By taking advantage of recent developments in cloud infrastructure that enable the use of dynamic clusters of resources, and by dynamically altering the size of the available resources for all the different resource types, the overall utilization of resources, however, can be improved. Starting from this premise, this paper discusses a solution that aims to provide a generic algorithm to estimate the desired ratios of instance processing tasks as well as ratios of the resources that are used by these instances, without the necessity for trial runs or a priori knowledge of the execution steps. These ratios are then used as part of an adaptive system that is able to reconfigure itself to maximize utilization. To verify the solution, a reference framework which adaptively manages clusters of functionally different VMs to host a calculation scenario is implemented. Experiments are conducted based on a compute-heavy use case in which the probability of underground pipeline failures is determined based on the settlement of soils. These experiments show that the solution is capable of eliminating large amounts of under-utilization, resulting in increased throughput and lower lead times.
Keywords