Distributed Computing, Volume 22, Issue 1, pp 29–47

The Theta-Model: achieving synchrony without clocks

Article

Abstract

We present a novel partially synchronous system model that augments the asynchronous model with a (possibly unknown) bound Θ on the ratio of the longest to the shortest end-to-end delay of messages simultaneously in transit. An upper bound on those delays need not exist, however, and even Θ may hold only after some unknown global stabilization time. Θ-algorithms are fully message-driven and have no access to bounded-drift local clocks, which makes them particularly suitable for VLSI Systems-on-Chip, for example. In this model, we provide a simulation of (eventually achieved) lock-step rounds that works even in the presence of Byzantine failures. It follows that most problems in distributed computing have a solution in our model: for example, Byzantine consensus can be solved using the basic consensus algorithm for partially synchronous systems by Dwork et al. (J ACM 35(2):288–323, 1988). We also introduce a timing transformation technique that facilitates simple correctness proofs and performance analyses of Θ-algorithms, and we relate the Θ-Model in detail to other partially synchronous system models.
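
The delay-ratio condition can be made concrete with a small sketch. It is an illustration only, not code from the article: the names `Message`, `max_delay_ratio` and `satisfies_theta`, the sample trace, and the convention that a message is in transit during the half-open interval from its send time to its receive time are all assumptions. The sketch computes the largest ratio of longest to shortest delay among messages that are simultaneously in transit and compares it with a candidate Θ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical record of one message: send and receive timestamps."""
    sent: float
    received: float

    @property
    def delay(self) -> float:
        return self.received - self.sent


def max_delay_ratio(messages):
    """Largest (longest delay / shortest delay) over sets of messages that
    are in transit at the same time.

    Assumption: a message is in transit at time t when sent <= t < received.
    The in-transit set only grows at send events, so probing at each send
    time is enough to find the worst ratio.
    """
    if any(m.delay <= 0 for m in messages):
        raise ValueError("every message needs a strictly positive delay")
    worst = 1.0
    for probe in messages:
        t = probe.sent
        delays = [m.delay for m in messages if m.sent <= t < m.received]
        worst = max(worst, max(delays) / min(delays))
    return worst


def satisfies_theta(messages, theta):
    """True if the trace respects the candidate bound theta."""
    return max_delay_ratio(messages) <= theta


# Sample trace (made up): delays grow from about 2 to about 110 time units,
# yet messages that overlap in time always have similar delays.
trace = [
    Message(0.0, 2.0),
    Message(1.0, 2.5),
    Message(10.0, 110.0),
    Message(20.0, 130.0),
]

if __name__ == "__main__":
    print(max_delay_ratio(trace))
    print(satisfies_theta(trace, 1.5))
```

In the sample trace the absolute delays differ by a factor of more than fifty, but the ratio among overlapping messages stays small. This is the sense in which a bound on the ratio can hold even though no upper bound on the delays themselves is assumed.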

Keywords

Computing models; Fault-tolerant distributed algorithms; Partially synchronous systems; Clocks and time

References

  1. Aguilera, M.K., Delporte-Gallet, C., Fauconnier, H., Toueg, S.: Stable leader election. In: DISC ’01: Proceedings of the 15th International Conference on Distributed Computing, pp. 108–122. Springer, Berlin (2001)
  2. Aguilera, M.K., Delporte-Gallet, C., Fauconnier, H., Toueg, S.: On implementing Omega with weak reliability and synchrony assumptions. In: Proceedings of the 22nd Annual ACM Symposium on Principles of Distributed Computing (PODC’03), pp. 306–314. ACM Press, New York (2003)
  3. Aguilera, M.K., Delporte-Gallet, C., Fauconnier, H., Toueg, S.: Communication-efficient leader election and consensus with limited link synchrony. In: Proceedings of the 23rd ACM Symposium on Principles of Distributed Computing (PODC’04), pp. 328–337. ACM Press, St. John’s (2004)
  4. Aguilera, M.K., Delporte-Gallet, C., Fauconnier, H., Toueg, S.: Consensus with Byzantine failures and little system synchrony. In: Proceedings of the International Conference on Dependable Systems and Networks (DSN’06), pp. 147–155 (2006)
  5. Albeseder, D.: Evaluation of message delay correlation in distributed systems. In: Proceedings of the Third Workshop on Intelligent Solutions for Embedded Systems. Hamburg, Germany (2005)
  6. Ammar, Y., Buhrig, A., Marzencki, M., Charlot, B., Basrour, S., Matou, K., Renaudin, M.: Wireless sensor network node with asynchronous architecture and vibration harvesting micro power generator. In: sOc-EUSAI ’05: Proceedings of the 2005 joint conference on Smart objects and ambient intelligence, pp. 287–292. ACM, New York (2005)
  7. Anceaume, E., Fernández, A., Mostéfaoui, A., Neiger, G., Raynal, M.: A necessary and sufficient condition for transforming limited accuracy failure detectors. J. Comput. Syst. Sci. 68(1), 123–133 (2004)
  8. Attiya, H., Dwork, C., Lynch, N., Stockmeyer, L.: Bounds on the time to reach agreement in the presence of timing uncertainty. J. ACM 41(1), 122–152 (1994)
  9. Beauquier, J., Kekkonen-Moneta, S.: Fault-tolerance and self-stabilization: impossibility results and solutions using self-stabilizing failure detectors. Int. J. Syst. Sci. 28(11), 1177–1187 (1997)
  10. Biely, M., Widder, J.: Optimal message-driven implementations of omega with mute processes. ACM Trans. Auton. Adaptive Syst. 4(1), Article 4, 22 pages (2009)
  11. Chandra, T.D., Toueg, S.: Unreliable failure detectors for reliable distributed systems. J. ACM 43(2), 225–267 (1996)
  12. Cristian, F., Fetzer, C.: The timed asynchronous distributed system model. IEEE Trans. Parallel Distrib. Syst. 10(6), 642–657 (1999)
  13. Dolev, D., Dwork, C., Stockmeyer, L.: On the minimal synchronism needed for distributed consensus. J. ACM 34(1), 77–97 (1987)
  14. Dolev, S.: Self-stabilization. MIT Press, Cambridge (2000)
  15. Dwork, C., Lynch, N., Stockmeyer, L.: Consensus in the presence of partial synchrony. J. ACM 35(2), 288–323 (1988)
  16. Ebergen, J.C.: A formal approach to designing delay-insensitive circuits. Distrib. Comput. 5, 107–119 (1991)
  17. Ekanayake, V., Kelly IV, C., Manohar, R.: An ultra low-power processor for sensor networks. SIGOPS Oper. Syst. Rev. 38(5), 27–36 (2004)
  18. Ferringer, M., Fuchs, G., Steininger, A., Kempf, G.: VLSI implementation of a fault-tolerant distributed clock generation. In: IEEE International Symposium on Defect and Fault-Tolerance in VLSI Systems (DFT2006), pp. 563–571 (2006)
  19. Fetzer, C., Schmid, U.: Brief announcement: On the possibility of consensus in asynchronous systems with finite average response times. In: Proceedings of the 23rd ACM Symposium on Principles of Distributed Computing (PODC’04), p. 402. Boston, Massachusetts (2004)
  20. Fetzer, C., Schmid, U., Süßkraut, M.: On the possibility of consensus in asynchronous systems with finite average response times. In: Proceedings of the 25th International Conference on Distributed Computing Systems (ICDCS’05), pp. 271–280. IEEE Computer Society, Washington (2005)
  21. Fischer, M.J., Lynch, N.A., Paterson, M.S.: Impossibility of distributed consensus with one faulty process. J. ACM 32(2), 374–382 (1985)
  22. Fuegger, M., Schmid, U., Fuchs, G., Kempf, G.: Fault-tolerant distributed clock generation in VLSI systems-on-chip. In: Proceedings of the Sixth European Dependable Computing Conference (EDCC-6), pp. 87–96. IEEE Computer Society Press (2006)
  23. Gafni, E.: Round-by-round fault detectors (extended abstract): unifying synchrony and asynchrony. In: Proceedings of the Seventeenth Annual ACM Symposium on Principles of Distributed Computing, pp. 143–152. ACM Press, Puerto Vallarta (1998)
  24. Hermant, J.F., Le Lann, G.: Fast asynchronous uniform consensus in real-time distributed systems. IEEE Trans. Comput. 51(8), 931–944 (2002)
  25. Hermant, J.F., Widder, J.: Implementing reliable distributed real-time systems with the Θ-model. In: Proceedings of the 9th International Conference on Principles of Distributed Systems (OPODIS 2005). LNCS, vol. 3974, pp. 334–350. Springer, Pisa (2005)
  26. Hutle, M., Malkhi, D., Schmid, U., Zhou, L.: Brief announcement: Chasing the weakest system model for implementing omega and consensus. In: Proceedings of the Eighth International Symposium on Stabilization, Safety, and Security of Distributed Systems (formerly Symposium on Self-stabilizing Systems) (SSS 2006). LNCS, pp. 576–577. Springer, Dallas (2006)
  27. Hutle, M., Malkhi, D., Schmid, U., Zhou, L.: Chasing the weakest system model for implementing omega and consensus. IEEE Trans. Dependable Secure Comput. (2009, to appear)
  28. Hutle, M., Widder, J.: On the possibility and the impossibility of message-driven self-stabilizing failure detection. In: Proceedings of the Seventh International Symposium on Self Stabilizing Systems (SSS 2005). LNCS, vol. 3764, pp. 153–170. Springer, Barcelona (2005). Appeared also as brief announcement in Proceedings of the 24th ACM Symposium on Principles of Distributed Computing (PODC’05)
  29. Hutle, M., Widder, J.: Self-stabilizing failure detector algorithms. In: Proceedings of the IASTED International Conference on Parallel and Distributed Computing and Networks (PDCN’05), pp. 485–490. IASTED/ACTA Press, Innsbruck (2005)
  30. Lamport, L., Shostak, R., Pease, M.: The Byzantine generals problem. ACM Trans. Program. Lang. Syst. 4(3), 382–401 (1982)
  31. Le Lann, G., Schmid, U.: How to implement a timer-free perfect failure detector in partially synchronous systems. Tech. Rep. 183/1-127, Department of Automation, Technische Universität Wien (2003) (Replaced by Research Report 28/2005, Institut für Technische Informatik, TU Wien, 2005)
  32. Lundelius-Welch, J., Lynch, N.A.: A new fault-tolerant algorithm for clock synchronization. Inform. Comput. 77(1), 1–36 (1988)
  33. Lynch, N.: Distributed Algorithms. Morgan Kaufmann Publishers, Inc., San Francisco (1996)
  34. Malkhi, D., Oprea, F., Zhou, L.: Ω meets Paxos: Leader election and stability without eventual timely links. In: Proceedings of the 19th Symposium on Distributed Computing (DISC’05). LNCS, vol. 3724, pp. 199–213. Springer, Cracow (2005)
  35. Mostefaoui, A., Mourgaya, E., Raynal, M.: Asynchronous implementation of failure detectors. In: Proceedings of the International Conference on Dependable Systems and Networks (DSN’03). San Francisco, CA (2003)
  36. Mostefaoui, A., Powell, D., Raynal, M.: A hybrid approach for building eventually accurate failure detectors. In: Proceedings of the 10th International Pacific Rim Dependable Computing Symposium (PRDC’04), pp. 57–65. IEEE Computer Society (2004)
  37. Mostéfaoui, A., Raynal, M.: Solving consensus using Chandra-Toueg’s unreliable failure detectors: A general quorum-based approach. In: Jayanti, P. (ed.) Distributed Computing: 13th International Symposium (DISC’99). Lecture Notes in Computer Science, vol. 1693, pp. 49–63. Springer-Verlag GmbH, Bratislava (1999)
  38. Mostefaoui, A., Raynal, M., Travers, C.: Crash-resilient time-free eventual leadership. In: Proceedings of the 23rd IEEE Symposium on Reliable Distributed Systems (SRDS 2004), pp. 208–217. IEEE Computer Society (2004)
  39. Parkes, S., Armbruster, P.: SpaceWire: a spacecraft onboard network for real-time communications. In: Proceedings of the 14th IEEE-NPSS Real Time Conference, pp. 6–10 (2005)
  40. Ponzio, S.: The real-time cost of timing uncertainty: Consensus and failure detection. Master’s thesis, Massachusetts Institute of Technology (1991)
  41. Ponzio, S., Strong, R.: Semisynchrony and real time. In: Proceedings of the 6th International Workshop on Distributed Algorithms (WDAG’92), pp. 120–135. Haifa, Israel (1992)
  42. Schmid, U., Fetzer, C.: Randomized asynchronous consensus with imperfect communications. In: 22nd Symposium on Reliable Distributed Systems (SRDS’03), pp. 361–370. Florence, Italy (2003)
  43. Schmid, U., Steininger, A.: Dezentrale Fehlertolerante Taktgenerierung in VLSI Chips. Research Report 69/2004, Technische Universität Wien, Institut für Technische Informatik (2004). (Österr. Patentanmeldung A 1223/2004)
  44. Schmid, U., Weiss, B., Rushby, J.: Formally verified Byzantine agreement in presence of link faults. In: 22nd International Conference on Distributed Computing Systems (ICDCS’02), pp. 608–616. Vienna, Austria (2002)
  45. Srikanth, T.K., Toueg, S.: Optimal clock synchronization. J. ACM 34(3), 626–645 (1987)
  46. Sutherland, I.E., Ebergen, J.: Computers without clocks. Sci. Am. 287(2), 62–69 (2002)
  47. Veríssimo, P., Casimiro, A.: The timely computing base model and architecture. IEEE Trans. Comput. 51(8), 916–930 (2002)
  48. Vitányi, P.M.: Time-driven algorithms for distributed control. Report CS-R8510, C.W.I. (1985)
  49. Widder, J.: Booting clock synchronization in partially synchronous systems. In: Proceedings of the 17th International Symposium on Distributed Computing (DISC’03). LNCS, vol. 2848, pp. 121–135. Springer, Sorrento (2003)
  50. Widder, J.: Distributed computing in the presence of bounded asynchrony. Ph.D. thesis, Vienna University of Technology, Fakultät für Informatik (2004)
  51. Widder, J., Le Lann, G., Schmid, U.: Failure detection with booting in partially synchronous systems. In: Proceedings of the 5th European Dependable Computing Conference (EDCC-5). LNCS, vol. 3463, pp. 20–37. Springer, Budapest (2005)
  52. Widder, J., Schmid, U.: Booting clock synchronization in partially synchronous systems with hybrid process and link failures. Distrib. Comput. 20(2), 115–140 (2007)

Copyright information

© Springer-Verlag 2009

Authors and Affiliations

  1. Embedded Computing Systems Group (E182/2), Technische Universität Wien, Vienna, Austria
  2. Laboratoire d’Informatique LIX, École polytechnique, Palaiseau Cedex, France
