EAI Endorsed Transactions on Energy Web (Apr 2018)

Improving ZooKeeper Atomic Broadcast Performance When a Server Quorum Never Crashes

  • Ibrahim EL-Sanosi
  • Paul Ezhilchelvan

DOI
https://doi.org/10.4108/eai.10-4-2018.154455
Journal volume & issue
Vol. 5, no. 17
pp. 1 – 9

Abstract

At the core of the highly available ZooKeeper system is the ZooKeeper atomic broadcast (Zab) protocol, which imposes a total order on service requests that seek to modify the replicated system state. Zab is designed with the weakest assumptions possible under the crash-recovery fault model; e.g., any number of servers, even all of them, can crash simultaneously, and the system will continue or resume its service provisioning as long as a server quorum remains or becomes operative again. Our aim is to explore ways of improving Zab performance without modifying its easy-to-implement structure. To this end, we first assume that server crashes are independent and that a server quorum remains operative at all times. Under these restrictive, yet practical, assumptions, we propose three variations of Zab and compare their performance. The first variation offers excellent performance but can be used only for 3-server systems; the other two do not have this limitation. One of them reduces the leader overhead further by conditioning the sending of acknowledgements on the outcomes of coin tosses. Owing to its superb performance, it is re-designed to operate under the least-restricted Zab fault assumptions. Further performance comparisons confirm the potential of coin-tossing to offer better performance than Zab, particularly at high workloads.
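The abstract does not say how the coin tosses are configured, so the following is only a minimal illustrative sketch and not the authors' protocol. It assumes that a server quorum is a simple majority and that each follower independently acknowledges a proposal with some probability `p_ack`. The names `quorum_size`, `should_ack`, `expected_acks` and `p_ack` are hypothetical and do not come from the paper.

```python
import random


def quorum_size(n_servers: int) -> int:
    """Smallest majority of n_servers (assumes a majority quorum)."""
    return n_servers // 2 + 1


def should_ack(p_ack: float, rng: random.Random) -> bool:
    """Hypothetical coin toss: acknowledge with probability p_ack."""
    return rng.random() < p_ack


def expected_acks(n_followers: int, p_ack: float) -> float:
    """Mean number of acknowledgements the leader receives per proposal
    if every follower tosses an independent coin (an assumption)."""
    return n_followers * p_ack


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs toss the same coins
    n = 5
    print("quorum for", n, "servers:", quorum_size(n))
    print("follower acks this proposal:", should_ack(0.5, rng))
    print("expected acks per proposal:", expected_acks(n - 1, 0.5))
```

Under these assumptions, lowering `p_ack` reduces the number of acknowledgement messages the leader has to process per proposal, which is the kind of leader overhead reduction the abstract refers to.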

Keywords