
Cutoff Bounds for Consensus Algorithms

  • Ognjen Marić
  • Christoph Sprenger
  • David Basin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10427)

Abstract

Consensus algorithms are fundamental building blocks for fault-tolerant distributed systems, and their correctness is critical. However, there are currently no fully automated methods for their verification. The main difficulty is that the algorithms are parameterized: they should work for any given number of processes. We provide an expressive language for consensus algorithms targeting the benign asynchronous setting. For this language, we give algorithm-dependent cutoff bounds. A cutoff bound B reduces the parameterized verification of consensus to a setting with B processes. For the algorithms in our case studies, we obtain bounds of 5 or 7, enabling us to model check them efficiently. This is the first cutoff result for fault-tolerant distributed systems.
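
To illustrate what checking consensus for a fixed number of processes can look like, the following is a minimal sketch, not the paper's language or tool. It assumes a OneThirdRule-style voting algorithm in a round-based model where each process hears from an arbitrary subset of processes per round; the function names, the threshold parameters, and the chosen process counts are all hypothetical. It explores every reachable state for a given n and reports whether two different values can ever be decided. It says nothing about which n would be a valid cutoff.

```python
from collections import Counter
from itertools import combinations, product


def round_options(own, xs, n, num=2, den=3):
    """All (new value, decision or None) pairs one process can reach in a round.

    Assumed rule (OneThirdRule-style): a process that hears from more than
    num/den of the n processes adopts the smallest most frequent value it
    received, and decides v if more than num/den * n received values equal v.
    """
    options = set()
    for k in range(n + 1):
        for heard in combinations(range(n), k):
            x, d = own, None
            if den * k > num * n:
                counts = Counter(xs[j] for j in heard)
                top = max(counts.values())
                x = min(v for v, c in counts.items() if c == top)
                for v, c in counts.items():
                    if den * c > num * n:
                        d = v
            options.add((x, d))
    return options


def agreement_holds(n, num=2, den=3, values=(0, 1)):
    """Exhaustively search all reachable states for an agreement violation."""
    start = [(xs, frozenset()) for xs in product(values, repeat=n)]
    seen, frontier = set(start), list(start)
    while frontier:
        xs, decided = frontier.pop()
        if len(decided) > 1:  # two different values were decided
            return False
        per_process = [round_options(xs[i], xs, n, num, den) for i in range(n)]
        for choice in product(*per_process):
            nxt = (
                tuple(x for x, _ in choice),
                decided | {d for _, d in choice if d is not None},
            )
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return True


if __name__ == "__main__":
    print(agreement_holds(4))        # two-thirds thresholds, 4 processes
    print(agreement_holds(3, 1, 2))  # deliberately weakened to simple majority
```

The second call lowers both thresholds to a simple majority, a deliberately broken variant, so that the search has a violation to find. A cutoff result is what would justify stopping such a search at a fixed number of processes instead of trying ever larger n.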

Notes

We would like to thank the anonymous reviewers and Ralf Sasse for their useful feedback on the paper.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Computer Science, Institute of Information Security, ETH Zurich, Zurich, Switzerland
