Applied Sciences (Jun 2021)
Concerto: Dynamic Processor Scaling for Distributed Data Systems with Replication
Abstract
A surge of interest in data-intensive computing has led to a drastic increase in the demand for data centers. Given this growing popularity, data centers are becoming a primary contributor to the increased consumption of energy worldwide. To mitigate this problem, this paper revisits DVFS (Dynamic Voltage Frequency Scaling), a well-known technique to reduce the energy usage of processors, from the viewpoint of distributed systems. Distributed data systems typically adopt a replication facility to provide high availability and short latency. In this type of architecture, the replicas are maintained in an asynchronous manner, while the master synchronously operates via user requests. Based on this relaxation constraint of replica, we present a novel DVFS technique called Concerto, which intentionally scales down the frequency of processors operating for the replicas. This mechanism can achieve considerable energy savings without an increase in the user-perceived latency. We implemented Concerto on Redis 6.0.1, a commercial-level distributed key-value store, demonstrating that all associated performance issues were resolved. To prevent a delay in read queries assigned to the replicas, we offload the independent part of the read operation to the fast-running thread. We also empirically demonstrate that the decreased performance of the replica does not cause an increase of the replication lag because the inherent load unbalance between the master and replica hides the increased latency of the replica. Performance evaluations with micro and real-world benchmarks show that Redis saves 32% on average and up to 51% of energy with Concerto under various workloads, with minor performance losses in the replicas. Despite numerous studies of the energy saving in data centers, to the best of our best knowledge, Concerto is the first approach that considers clock-speed scaling at the aggregate level, exploiting heterogeneous performance constraints across data nodes.
Keywords