Specification and Verification of the Zab Protocol with TLA+

Abstract

ZooKeeper Atomic Broadcast (Zab) is an atomic broadcast protocol designed specifically for ZooKeeper that additionally supports crash recovery. Although the protocol has been widely adopted by Internet companies, few studies have examined its correctness and reliability, so we apply formal methods to study its correctness. In this paper, we first analyze and compare Zab, Paxos, and Raft to aid understanding of the Zab protocol. We then model the Zab protocol in TLA+ and use the model checker TLC to verify three properties abstracted from the specification: two liveness properties and one safety property. The experimental results show that the design of the protocol conforms to the original requirements. This paper fills a gap in the formal analysis of the Zab protocol.
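To make the idea of checking a safety property with a model checker concrete, the following is a minimal sketch in Python of explicit-state invariant checking, the general technique TLC applies to finite models. It is not the paper's TLA+ specification: the toy model (one leader, two followers, FIFO channels), the constants `MAX_TXN` and `FOLLOWERS`, and the names `next_states`, `prefix_consistent`, and `check_invariant` are all hypothetical and chosen only for illustration. Liveness checking is not covered by this sketch.

```python
from collections import deque

# Hypothetical toy model, NOT the paper's specification: one leader
# broadcasts transactions 1..MAX_TXN in order to FOLLOWERS followers
# over FIFO channels.
MAX_TXN = 2
FOLLOWERS = 2


def initial_state():
    # State = (next txn to propose, per-follower channel, per-follower delivered history)
    return (1, ((),) * FOLLOWERS, ((),) * FOLLOWERS)


def next_states(state):
    nxt, channels, delivered = state
    # Action 1: the leader proposes the next transaction to every follower.
    if nxt <= MAX_TXN:
        yield (nxt + 1, tuple(ch + (nxt,) for ch in channels), delivered)
    # Action 2: a follower delivers the oldest message in its FIFO channel.
    for i in range(FOLLOWERS):
        if channels[i]:
            new_channels = channels[:i] + (channels[i][1:],) + channels[i + 1:]
            new_history = delivered[i] + (channels[i][0],)
            new_delivered = delivered[:i] + (new_history,) + delivered[i + 1:]
            yield (nxt, new_channels, new_delivered)


def prefix_consistent(state):
    # Safety invariant: any two delivered histories agree on their common prefix.
    _, _, delivered = state
    for a in delivered:
        for b in delivered:
            n = min(len(a), len(b))
            if a[:n] != b[:n]:
                return False
    return True


def check_invariant(init, successors, invariant):
    # Breadth-first search over all reachable states.
    # Returns (holds, number of states seen, violating state or None).
    seen = {init}
    queue = deque([init])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            return False, len(seen), state
        for s in successors(state):
            if s not in seen:
                seen.add(s)
                queue.append(s)
    return True, len(seen), None


holds, explored, bad = check_invariant(initial_state(), next_states, prefix_consistent)
print("invariant holds:", holds, "| reachable states:", explored)
```

Because the search visits every reachable state of the bounded model, a passing result means the invariant holds for that model size only; it says nothing about larger configurations, which is the usual caveat for results obtained by model checking.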

Author information

Corresponding authors

Correspondence to Hui-Biao Zhu or Yuan Fei.

Cite this article

Yin, JQ., Zhu, HB. & Fei, Y. Specification and Verification of the Zab Protocol with TLA+. J. Comput. Sci. Technol. 35, 1312–1323 (2020). https://doi.org/10.1007/s11390-020-0538-7

Keywords

  • Zab protocol
  • TLA+
  • specification
  • verification