Seamless Paxos coordinators

Abstract

The Paxos algorithm requires a single correct coordinator process to operate. After a failure, the replacement of the coordinator may lead to a temporary unavailability of the application implemented atop Paxos. So far, this unavailability has been addressed by reducing the coordinator replacement rate through the use of stable coordinator selection algorithms. We have observed that the cost of recovery of the newly elected coordinator’s state is at the core of this unavailability problem. In this paper we present a new technique to manage coordinator replacement that allows the recovery to occur concurrently with new consensus rounds. Experimental results show that our seamless approach effectively solves the temporary unavailability problem: its adoption entails uninterrupted execution of the application. Our solution removes the restriction that coordinator replacements are something to be avoided, decoupling application execution from the accuracy of the mechanism used to choose a coordinator. This result increases the performance of the application even in the presence of failures, and it is of special importance to the autonomous operation of replicated applications that have to adapt to varying network conditions and partial failures.

Acknowledgements

Gustavo M.D. Vieira was partially supported by CNPq grant 142638/2005-6. Luiz E. Buzato was partially supported by CNPq grant 473340/2009-7 and FAPESP grant 2009/06859-8.

The authors thank Prof. W. Zwaenepoel and Olivier Cramieri, both from EPFL, Switzerland, for their support in the earlier stages of this research, and Daniel Cason for his support with cluster management at IC-UNICAMP.

Author information

Corresponding author

Correspondence to Gustavo M. D. Vieira.

About this article

Cite this article

Vieira, G.M.D., Garcia, I.C. & Buzato, L.E. Seamless Paxos coordinators. Cluster Comput 17, 463–473 (2014). https://doi.org/10.1007/s10586-013-0264-9

Keywords

  • Consensus
  • Failure detector
  • Fault tolerance
  • Paxos
  • Replication