Brief Announcement: Zab: A Practical Totally Ordered Broadcast Protocol

  • Flavio P. Junqueira
  • Benjamin C. Reed
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5805)


At Yahoo!, we have developed a fault-tolerant coordination service called ZooKeeper [4] that allows large-scale applications to implement coordination tasks such as leader election, status propagation, and rendezvous. ZooKeeper forgoes locks [2] and instead implements simple wait-free data objects [3] along with a consistency model that guarantees linearizable updates and FIFO order for client operations. We have found the service to be flexible, with performance that meets the production demands of the Web-scale applications of Yahoo!.

The ZooKeeper service comprises n ZooKeeper replicas (n ≥ 2f + 1, where f is a threshold on the number of faulty replicas). Among these replicas, there is a distinguished, elected replica: the leader. The remaining replicas are followers. Clients of the ZooKeeper service can connect and submit requests through any ZooKeeper replica. If a request only reads the state of ZooKeeper, the replica serves it locally. Otherwise, the replica forwards the request to the leader. The leader receives ZooKeeper requests and transforms them into idempotent transactions. The transformation corresponds to generating the state modifications for the given request, as with primary-backup protocols [1]. The leader then sends the transactions as messages using atomic broadcast. Because a leader can crash, there must be an additional leader election protocol. To elect a leader, ZooKeeper requires at least ⌈(n + 1)/2⌉ non-faulty replicas.
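
The arithmetic and the request flow described above can be illustrated with a minimal sketch. It is not ZooKeeper or Zab code: the names max_faulty, election_quorum, SetValue, Replica, and Leader are hypothetical, the state is a plain in-memory dictionary, and the "increment" request is an assumed example of a non-idempotent operation. Atomic broadcast and leader election are not modeled; the sketch only shows the bounds n ≥ 2f + 1 and ⌈(n + 1)/2⌉, and how a leader might turn a request into a transaction that is safe to apply more than once.

```python
from dataclasses import dataclass, field
from typing import Dict


def max_faulty(n: int) -> int:
    """Largest f that satisfies n >= 2f + 1."""
    return (n - 1) // 2


def election_quorum(n: int) -> int:
    """ceil((n + 1) / 2), written with integer arithmetic."""
    return n // 2 + 1


@dataclass(frozen=True)
class SetValue:
    """Hypothetical idempotent transaction: it carries the resulting state,
    not the operation, so applying it twice has the same effect as once."""
    key: str
    value: int


@dataclass
class Replica:
    state: Dict[str, int] = field(default_factory=dict)

    def read(self, key: str) -> int:
        # Read requests are served from the local state of the replica.
        return self.state.get(key, 0)

    def apply(self, txn: SetValue) -> None:
        # Applying a delivered transaction overwrites the stored value.
        self.state[txn.key] = txn.value


class Leader(Replica):
    def to_transaction(self, key: str, delta: int) -> SetValue:
        # Assumed example request: "increment key by delta". The leader
        # computes the resulting value against its own state and emits
        # the state modification rather than the increment itself.
        return SetValue(key, self.read(key) + delta)


if __name__ == "__main__":
    n = 5
    print("replicas:", n)
    print("tolerated faults:", max_faulty(n))
    print("replicas needed to elect a leader:", election_quorum(n))

    leader, follower = Leader(), Replica()
    txn = leader.to_transaction("counter", 5)
    for replica in (leader, follower):
        replica.apply(txn)
    follower.apply(txn)  # duplicate delivery leaves the state unchanged
    print("follower state:", follower.state)
```

Carrying the resulting value instead of the increment is what makes a repeated application harmless in this sketch, which is the property the text attributes to the leader's transformation.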




  1. Budhiraja, N., et al.: The primary-backup approach. In: Mullender, S. (ed.) Distributed Systems, vol. 8, pp. 199–216. Addison-Wesley, Reading (1993)
  2. Burrows, M.: The chubby lock service for loosely-coupled distributed systems. In: OSDI 2006, pp. 335–350 (2006)
  3. Herlihy, M.: Wait-free synchronization. ACM Trans. Program. Lang. Syst. 13(1), 124–149 (1991)
  4. Zookeeper project (2008),
  5. Lamport, L.: The part-time parliament. ACM Trans. Comput. Syst. 16(2), 133–169 (1998)

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Flavio P. Junqueira (1)
  • Benjamin C. Reed (1)
  1. Yahoo! Research, USA
