The design of the Transis system

  • Group Communication
  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 938)

Abstract

Transis is a high-availability distributed system under development at the Hebrew University. It supports reliable group communication for high-availability applications. The system provides enhanced services for information dissemination and replication in a dynamic environment in which machines may crash for arbitrarily long periods and later recover, and in which the network may partition and re-merge. Transis contains novel protocols for reliable message delivery, optimizes performance for existing network hardware, and offers a variety of handles to upper-layer applications. The paper presents the experience gained in the design and implementation of the Transis communication subsystem.

This work was supported in part by GIF I-207-199.6/91

Editor information

Kenneth P. Birman, Friedemann Mattern, André Schiper

Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dolev, D., Malki, D. (1995). The design of the Transis system. In: Birman, K.P., Mattern, F., Schiper, A. (eds) Theory and Practice in Distributed Systems. Lecture Notes in Computer Science, vol 938. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60042-6_6

  • DOI: https://doi.org/10.1007/3-540-60042-6_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-60042-8

  • Online ISBN: 978-3-540-49409-6

  • eBook Packages: Springer Book Archive
