Run-Time Switching Between Total Order Algorithms

  • José Mocito
  • Luís Rodrigues
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4128)


Total order broadcast protocols are a fundamental building block in the construction of many fault-tolerant distributed applications. Unfortunately, total order is an intrinsically expensive operation. Moreover, different algorithms perform better under different scenarios and network properties. This paper proposes and evaluates an adaptive protocol that dynamically switches between total order algorithms. The protocol makes it possible to obtain the best performance among the available algorithms by selecting, at each moment, the one most appropriate to the current network conditions. Experimental results show that, using our protocol, adaptation can be achieved with negligible interference with the data flow.
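As a rough illustration of the selection step only, the sketch below picks between two total order algorithms from observed network conditions. Everything in it is an assumption made for illustration: the metrics (`mean_interarrival_ms`, `round_trip_ms`), the placeholder names `algorithm_a` and `algorithm_b`, and the rule in `example_policy` are hypothetical and are not taken from the paper. It does not model the switching protocol itself, which must also preserve total order while the change takes place.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class NetworkConditions:
    # Hypothetical metrics; the abstract does not state which ones are observed.
    mean_interarrival_ms: float
    round_trip_ms: float


# A policy maps observed conditions to the name of the preferred algorithm.
Policy = Callable[[NetworkConditions], str]


def example_policy(conditions: NetworkConditions) -> str:
    # Assumed rule of thumb, for illustration only: prefer one algorithm
    # when messages arrive faster than a round trip, the other otherwise.
    if conditions.mean_interarrival_ms < conditions.round_trip_ms:
        return "algorithm_a"
    return "algorithm_b"


class AdaptiveSelector:
    """Tracks which algorithm is active and decides when a switch is needed."""

    def __init__(self, policy: Policy, initial: str) -> None:
        self.policy = policy
        self.current = initial
        self.switches = 0

    def observe(self, conditions: NetworkConditions) -> bool:
        """Feed a new measurement; return True if a switch was triggered."""
        preferred = self.policy(conditions)
        if preferred == self.current:
            return False
        # A real protocol would run its switching procedure here so that
        # all processes still deliver messages in the same order.
        self.current = preferred
        self.switches += 1
        return True


selector = AdaptiveSelector(example_policy, initial="algorithm_a")
selector.observe(NetworkConditions(mean_interarrival_ms=1.0, round_trip_ms=10.0))
selector.observe(NetworkConditions(mean_interarrival_ms=50.0, round_trip_ms=10.0))
```

Keeping the policy as a plain function separates the decision of when to switch from the mechanism that carries out the switch, so either part can be replaced without touching the other.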


Total Order, Correct Process, Failure Detector, Interarrival Time, Fundamental Building Block
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • José Mocito (1)
  • Luís Rodrigues (1)
  1. University of Lisbon
