Implementing Reliable Distributed Real-Time Systems with the Θ-Model

  • Jean-François Hermant
  • Josef Widder
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3974)

Abstract

A widely accepted viewpoint is that designs for distributed real-time systems should be based on synchronous computational models. Safety in such designs, however, requires that the target system behaves as the synchronous model postulates. We believe that this approach is rather risky, as it rests on solving distributed scheduling problems which are known to be NP-hard. We therefore advocate the use of more relaxed system models, namely asynchronous models equipped with unreliable failure detectors.

To this end, we introduce a novel implementation of the perfect failure detector, resting on an abstract model without upper bounds on end-to-end message delays. Then, we demonstrate how this algorithm can be transferred from the abstract model into a real network/system architecture. Finally, we prove that this solution exhibits real-time behavior.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jean-François Hermant (1)
  • Josef Widder (2)

  1. INRIA Rocquencourt, Projet Novaltis, Le Chesnay, France
  2. Embedded Computing Systems Group E182/2, Technische Universität Wien, Vienna, Austria
