On Barriers and the Gap between Active and Passive Replication

  • Flavio P. Junqueira
  • Marco Serafini
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8205)

Abstract

Active replication is commonly built on top of the atomic broadcast primitive. Passive replication, which has recently been used in the popular ZooKeeper coordination system, can be naturally built on top of the primary-order atomic broadcast primitive. Passive replication differs from active replication in that it requires processes to cross a barrier before they become primaries and start broadcasting messages. In this paper, we propose a barrier function τ that explains and encapsulates the differences between existing primary-order atomic broadcast algorithms. We also show that implementing primary-order atomic broadcast on top of a generic consensus primitive and τ inherently results in higher time complexity than atomic broadcast, as witnessed by existing algorithms. We overcome this problem by presenting an alternative primary-order atomic broadcast implementation that builds on top of a generic consensus primitive and uses consensus itself to form a barrier. This algorithm is modular and matches the time complexity of existing τ-based algorithms.
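
The need for a barrier in passive replication can be illustrated with a toy sketch. It is not the paper's algorithm and does not implement τ; it only assumes that a primary executes requests against its local state and broadcasts the resulting state updates. The names `Replica`, `cross_barrier`, and `run` are hypothetical, and a shared Python list stands in for the broadcast layer.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Toy replica holding a counter (illustrative, not from the paper)."""
    name: str
    value: int = 0
    applied: int = 0  # number of log entries applied so far

    def apply_up_to(self, log, upto):
        # Apply state updates in log order; each entry is a new counter value.
        while self.applied < upto:
            self.value = log[self.applied]
            self.applied += 1

    def cross_barrier(self, log):
        # Assumed barrier: catch up on every update already in the log
        # before acting as primary.
        self.apply_up_to(log, len(log))

    def handle_increment(self, log):
        # A primary computes the update from its LOCAL state, broadcasts it
        # (appends to the log), and then applies the log.
        log.append(self.value + 1)
        self.apply_up_to(log, len(log))


def run(use_barrier: bool) -> int:
    log = []                 # stand-in for a totally ordered broadcast
    a, b = Replica("A"), Replica("B")
    a.handle_increment(log)  # log == [1]
    b.apply_up_to(log, 1)    # backup B has applied only the first update
    a.handle_increment(log)  # log == [1, 2]; assume A crashes afterwards
    if use_barrier:
        b.cross_barrier(log)
    b.handle_increment(log)  # B acts as the new primary
    return b.value


print(run(use_barrier=True))   # 3: all three increments are reflected
print(run(use_barrier=False))  # 2: B's update was computed from stale state
```

In the run without the barrier, the new primary broadcasts an update derived from a state that misses an earlier update, so one increment is lost. An actively replicated system that broadcasts the operations themselves would not have this problem, which is the gap the abstract refers to.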

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Flavio P. Junqueira (1)
  • Marco Serafini (2)
  1. Microsoft Research, Cambridge, UK
  2. Yahoo! Research, Barcelona, Spain
