On Barriers and the Gap between Active and Passive Replication
Active replication is commonly built on top of the atomic broadcast primitive. Passive replication, which has recently been used in the popular ZooKeeper coordination system, can be naturally built on top of the primary-order atomic broadcast primitive. Passive replication differs from active replication in that it requires processes to cross a barrier before they become primaries and start broadcasting messages. In this paper, we propose a barrier function τ that explains and encapsulates the differences between existing primary-order atomic broadcast algorithms. We also show that implementing primary-order atomic broadcast on top of a generic consensus primitive and τ inherently results in higher time complexity than atomic broadcast, as witnessed by existing algorithms. We overcome this problem by presenting an alternative primary-order atomic broadcast implementation that builds on top of a generic consensus primitive and uses consensus itself to form a barrier. This algorithm is modular and matches the time complexity of existing τ-based algorithms.
Keywords: Barrier Function, Correct Process, Primary Order, Failure Detector, State Update
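The abstract's point that a process must cross a barrier before it becomes a primary can be illustrated with a small toy model. This is a simplified sketch under stated assumptions, not the paper's barrier function τ or either of its algorithms. A shared Python list stands in for the broadcast layer, each state update carries the full new value of a counter x, and the names `Replica`, `deliver_pending`, `broadcast_as_primary`, and `run` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Replica:
    """Toy replica; `log` stands in for the totally ordered broadcast layer."""
    log: List[int]      # shared sequence of state updates (full values of x)
    applied: int = 0    # number of log entries this replica has delivered
    state: int = 0      # local copy of x

    def deliver_pending(self) -> None:
        """Apply every update in the log that was not delivered locally yet."""
        while self.applied < len(self.log):
            self.state = self.log[self.applied]
            self.applied += 1

    def broadcast_as_primary(self, op: Callable[[int], int],
                             cross_barrier: bool) -> None:
        """Execute `op` on the local state and broadcast the resulting update."""
        if cross_barrier:
            self.deliver_pending()  # barrier: catch up before acting as primary
        self.log.append(op(self.state))


def run(cross_barrier: bool) -> int:
    """Return the value of x seen by a backup after a primary change."""
    log: List[int] = []

    def increment(x: int) -> int:
        return x + 1

    old_primary = Replica(log)
    old_primary.broadcast_as_primary(increment, cross_barrier=True)  # x := 1
    old_primary.deliver_pending()
    old_primary.broadcast_as_primary(increment, cross_barrier=True)  # x := 2

    # The new primary has not delivered any of the earlier updates yet.
    new_primary = Replica(log)
    new_primary.broadcast_as_primary(increment, cross_barrier=cross_barrier)

    backup = Replica(log)
    backup.deliver_pending()
    return backup.state
```

In this toy model, `run(cross_barrier=True)` leaves the backup with x = 3, because the new primary computes its update from a state that reflects both earlier increments. `run(cross_barrier=False)` leaves the backup with x = 1, because the new primary's update was computed from a stale state and overwrites the earlier ones. Active replication, which broadcasts the operations themselves rather than state updates, does not have this dependency on the primary's local state.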