PLoS ONE (Jan 2017)
Speeding up profiling program's runtime characteristics for workload consolidation.
Abstract
Workload consolidation is a common method to increase resource utilization of the clusters or data centers while still trying to ensure the performance of the workloads. In order to get the maximum benefit from workload consolidation, the task scheduler has to understand the runtime characteristics of the individual program and schedule the programs with less resource conflict onto the same server. We propose a set of metrics to comprehensively depict the runtime characteristics of programs. The metrics set consists of two types of metrics: resource usage and resource sensitivity. The resource sensitivity refers to the performance degradation caused by insufficient resources. The resource usage of a program is easy to get by common performance analysis tools, but the resource sensitivity can not be obtained directly. The simplest and the most intuitive way to obtain the resource sensitivity of a program is to run the program in an environment with controllable resources and record the performance achieved under all possible resource conditions. However, such a process is very much time consuming when multiple resources are involved and each resource is controlled in fine granularity. In order to obtain the resource sensitivity of a program quickly, we propose a method to speed up the resource sensitivity profiling process. Our method is realized based on two level profiling acceleration strategies. First, taking advantage of the resource usage information, we set up the maximum resource usage of the program as the upper bound of the controlled resource. In this way, the range of controlling resource levels can be narrowed, and the number of experiments can be significantly reduced. Secondly, using a prediction model achieved by interpolation, we can reduce the time spent on profiling even further because the resource sensitivity in most of the resource conditions is obtained by interpolation instead of real program execution. These two profiling acceleration strategies have been implemented and applied in profiling program runtime characteristics. Our experiment results show that the proposed two-level profiling acceleration strategy not only shortens the process of profiling, but also guarantees the accuracy of the resource sensitivity. With the fast profiling method, the average absolute error of the resource sensitivity can be controlled within 0.05.