Revisiting the Paxos algorithm
This paper develops a new I/O automaton model called the Clock General Timed Automaton (Clock GTA) model. The Clock GTA is based on the General Timed Automaton (GTA) of Lynch and Vaandrager. The Clock GTA provides a systematic way of describing timing-based systems in which there is a notion of “normal” timing behavior, but that do not necessarily always exhibit this “normal” behavior. It can be used for practical time performance analysis based on the stabilization of the physical system.
We use the Clock GTA automaton to model, verify and analyze the paxos algorithm. The paxos algorithm is an efficient and highly fault-tolerant algorithm, devised by Lamport, for reaching consensus in a distributed system. Although it appears to be practical, it is not widely known or understood. This paper contains a new presentation of the paxos algorithm, based on a formal decomposition into several interacting components. It also contains a correctness proof and a time performance and fault, tolerance analysis.
KeywordsI/O automata models formal verification distributed consensus partially synchronous systems fault-tolerance
Unable to display preview. Download preview PDF.
- 1.T.D. Chandra, V. Hadzilacos, S. Toueg, The weakest failure detector for solving consensus, in Proceedings of the 11th Annual ACM Symposium on Principles of Distributed Computing, pages 147–158, Vancouver, British Columbia, Canada, August 1992.Google Scholar
- 2.T.D. Chandra, S. Toueg, Unreliable failure detector for asynchronous distributed systems, Journal of the ACM, Vol. 43 (2), pp. 225–267. A preliminary version appeared in the Proceedings of the loth Annual ACM Symposium on Principles of Distributed Computing, pages 325–340, August 1991.CrossRefGoogle Scholar
- 3.F. Cristian and C. Fetzer, The timed asynchronous system model, Dept. of Computer Science, UCSD, La Jolla, CA. Technical Report CSE97-519.Google Scholar
- 4.R. De Prisco, Revisiting the Paxos algorithm, M.S. Thesis, Massachusetts Institute of Technology, Laboratory for Computer Science, Cambridge, MA, June 1997. Technical Report MIT-LCS-TR-717, Lab. for Computer Science, MIT, Cambridge, MA, USA, June 1997.Google Scholar
- 6.A. Fekete, N. Lynch, A. Shvartsman, Specifying and using a partitionable group communication service, to appear in Proceedings of the 16th Annual ACM Symposium on Principles of Distributed Computing, August 1997.Google Scholar
- 7.M.J. Fischer, The consensus problem in unreliable distributed systems (a brief survey). Rep. YALEU/DSC/RR-273. Dept. of Computer Science, Yale Univ., New Have, Conn., June 1983.Google Scholar
- 8.L. Lamport, The part-time parliament, Research Report 49, Digital Equipment Corporation Systems Research Center, Palo Alto, CA, September 1989.Google Scholar
- 9.B. Lampson, How to build a highly available system using consensus, in Proceedings of the 10th International Workshop on Distributed Algorithms WDAG 96, Bologna, Italy, pages 1–15, 1996.Google Scholar
- 10.B. Liskov, B. Oki, Viewstamped replication: A new primary copy method to support highly-available distributed systems, in Proceedings of the 7th Annual ACM Symposium on Principles of Distributed Computing, pages 8–17, August 1988.Google Scholar
- 11.N. Lynch, Distributed Algorithms, Morgan Kaufmann Publishers, San Francisco, 1996.Google Scholar
- 12.N. Lynch, F. Vaandrager, Forward and backward simulations for timing-based systems. in Real-Time: Theory in Practice, Vol. 600 of Lecture Notes in Computer Science, Springer-Verlag, pp. 397–446, 1992.Google Scholar
- 13.N. Lynch, F. Vaandrager, Forward and backward simulations—Part II: Timing-based systems. Technical Memo MIT-LCS-TM-487.b, Lab. for Computer Science, MIT Cambridge, MA, USA, April 1993.Google Scholar
- 14.N. Lynch, F. Vaandrager. Actions transducers and timed automata. Technical Memo MIT-LCS-TM-480.b, Lab. for Computer Science, MIT, Cambridge, MA, USA, October 1994.Google Scholar
- 15.B. Oki, Viewstamped replication for highly-available distributed systems, Ph.D. Thesis, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, 1988.Google Scholar
- 16.B. Patt-Shamir, A theory of clock synchronization, Ph.D. Thesis, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, October 1994.Google Scholar
- 17.D. Skeen, Nonblocking Commit Protocols, Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 133–142, May 1981.Google Scholar