Modeling and Validating the Performance of Atomic Broadcast Algorithms in High Latency Networks

  • Richard Ekwall
  • André Schiper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4641)


The performance of consensus and atomic broadcast algorithms using failure detectors is often affected by a trade-off between the number of communication steps and the number of messages needed to reach a decision.

In this paper, we model the performance of three consensus and atomic broadcast algorithms using failure detectors in the oft-neglected setting of wide area networks and validate this model by experimentally evaluating the algorithms in several different setups.


Keywords (machine-generated, not supplied by the authors): Local Area Network; Average Latency; Failure Detector; Network Latency; Wide Area Network





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Richard Ekwall (1)
  • André Schiper (1)
  1. École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
