Abstract
In this paper we introduce the novel election-based fault tolerance mechanisms recently incorporated into the Collaborative Computing Transport Layer (CCTL). CCTL offers the atomic reliable multicast facilities used in the Collaborative Computing Framework (CCF). Our approach utilizes a reliable IP multicast primitive to implement two electoral algorithms that not only form consensus, but also efficiently deliver a compact matrix-based view of the network. This matrix can subsequently be analyzed to identify specific network failures (e.g. partitioning). The underlying premise of the approach is that, by basing fault tolerance on a reliable multicast primitive, we eliminate the need for specific keep-alive packets such as heartbeats.
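To illustrate how a matrix view of the network might be analyzed for partitioning, the following is a minimal sketch and not the CCTL implementation. The abstract does not specify the matrix layout, so the sketch assumes a square matrix in which entry `view[i][j]` is truthy when member `i` reported hearing from member `j`. The function name `find_partitions` and the rule that a link counts only when both ends report it are likewise assumptions made for illustration.

```python
from collections import deque


def find_partitions(view):
    """Group member indices into sets that can reach one another.

    `view` is a hypothetical square matrix: view[i][j] is truthy when
    member i reported hearing from member j. As an assumption of this
    sketch, a link is used only if both ends report it.
    """
    n = len(view)
    seen = [False] * n
    groups = []
    for start in range(n):
        if seen[start]:
            continue
        # Breadth-first search over mutually confirmed links.
        seen[start] = True
        group = []
        queue = deque([start])
        while queue:
            i = queue.popleft()
            group.append(i)
            for j in range(n):
                if not seen[j] and view[i][j] and view[j][i]:
                    seen[j] = True
                    queue.append(j)
        groups.append(sorted(group))
    return groups


# Illustrative data only: members 0 and 1 hear each other, as do 2 and 3,
# but nothing crosses between the two pairs.
view = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
partitions = find_partitions(view)
is_partitioned = len(partitions) > 1
```

Under these assumptions, more than one group in the result suggests that the network has split, and the groups themselves indicate which members remain mutually reachable.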
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Loader, R.J., Pascoe, J.S., Sunderam, V.S. (2001). Introducing Fault-Tolerant Group Membership Into The Collaborative Computing Transport Layer. In: Alexandrov, V.N., Dongarra, J.J., Juliano, B.A., Renner, R.S., Tan, C.J.K. (eds) Computational Science — ICCS 2001. ICCS 2001. Lecture Notes in Computer Science, vol 2073. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45545-0_41
DOI: https://doi.org/10.1007/3-540-45545-0_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42232-7
Online ISBN: 978-3-540-45545-5
eBook Packages: Springer Book Archive