ZooKeeper Atomic Broadcast (Zab) is an atomic broadcast protocol designed specifically for ZooKeeper that additionally supports crash recovery. Although the protocol has been widely adopted by major Internet companies, few studies have examined its correctness and credibility, so we use formal methods to study its correctness. In this paper, we first analyze and compare Zab, Paxos, and Raft to give a better understanding of the Zab protocol. We then model the Zab protocol in TLA+ and use the model checker TLC to verify three properties abstracted from the specification: two liveness properties and one safety property. The experimental results show that the design of the protocol conforms to the original requirements. This paper thus fills a gap in the formal analysis of the Zab protocol.
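To make concrete what checking a safety property by state exploration involves, here is a minimal Python sketch. It is not the paper's TLA+ specification and it does not use TLC. The toy model (one leader, FIFO channels, no crashes or recovery), the bounds `N_TXNS` and `N_FOLLOWERS`, and every function name are assumptions introduced here for illustration only. It checks a single safety invariant, prefix consistency of delivered logs, and does not address liveness.

```python
from collections import deque

# Hypothetical bounds for the toy model (not taken from the paper).
N_TXNS = 2        # number of transactions the leader may broadcast
N_FOLLOWERS = 2   # number of followers receiving the broadcast


def initial_state():
    """State = (next transaction id, per-follower channels, per-follower logs)."""
    empty = tuple(() for _ in range(N_FOLLOWERS))
    return (1, empty, empty)


def successors(state):
    """Yield every state reachable from `state` in one step."""
    next_txn, chans, logs = state
    # Broadcast: the leader appends the next transaction to every FIFO channel.
    if next_txn <= N_TXNS:
        yield (next_txn + 1, tuple(c + (next_txn,) for c in chans), logs)
    # Deliver: one follower moves the head of its channel into its log.
    for i in range(N_FOLLOWERS):
        if chans[i]:
            new_chans = chans[:i] + (chans[i][1:],) + chans[i + 1:]
            new_logs = logs[:i] + (logs[i] + (chans[i][0],),) + logs[i + 1:]
            yield (next_txn, new_chans, new_logs)


def is_prefix(a, b):
    """True if tuple `a` is a prefix of tuple `b`."""
    return a == b[:len(a)]


def prefix_consistent(state):
    """Safety invariant: any two delivered logs agree on their common prefix."""
    logs = state[2]
    return all(is_prefix(a, b) or is_prefix(b, a) for a in logs for b in logs)


def check(invariant):
    """Breadth-first search over reachable states.

    Returns (number of distinct states seen, first violating state or None).
    """
    start = initial_state()
    seen = {start}
    frontier = deque([start])
    while frontier:
        state = frontier.popleft()
        if not invariant(state):
            return len(seen), state
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return len(seen), None


if __name__ == "__main__":
    count, violation = check(prefix_consistent)
    print("states seen:", count, "violation:", violation)
```

Calling `check(prefix_consistent)` explores every reachable state of the bounded toy model and returns `None` as the violating state when the invariant holds everywhere. A model checker such as TLC performs a far more capable version of this search over a TLA+ specification and can also check temporal (liveness) properties, which this sketch does not attempt.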
Cite this article
Yin, JQ., Zhu, HB. & Fei, Y. Specification and Verification of the Zab Protocol with TLA+. J. Comput. Sci. Technol. 35, 1312–1323 (2020). https://doi.org/10.1007/s11390-020-0538-7
- Zab protocol