Collaborative Group Membership

Pascoe, James S.; Loader, Roger J.; Sunderam, Vaidy S.

doi:10.1023/A:1014306520634

Published: May 2002

Volume 22, pages 55–68, (2002)

The Journal of Supercomputing

James S. Pascoe¹,
Roger J. Loader¹ &
Vaidy S. Sunderam²

Abstract

In this paper we present a novel approach to fault-tolerant group membership, intended primarily for collaborative computing environments. As an exemplar, we use the Collaborative Computing Transport Layer, which offers reliable atomic multicast capabilities for collaborative environments such as the Collaborative Computing Frameworks (CCF). The specific design goals of the approach are the elimination of processing overhead due to heartbeats, support for partial failures, and extensibility. These goals are satisfied by an approach, termed Collaborative Group Membership (CGM), which uses a quiescent weak failure detector and two election-based algorithms to reach consensus on the membership of a failing group. Failure detection operates through a reliable multicast primitive and therefore needs no explicit keep-alive packets; thus, in a failure-free environment, CGM imposes no overhead.
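
As a rough illustration of the idea described above, the following sketch shows how suspicion of a failed member can be derived from missing acknowledgements of ordinary multicast traffic rather than from heartbeats. It is not the CGM algorithm from the paper: the `Group` class, its `multicast` and `elect_view` methods, the retry count, and the lowest-name coordinator rule are all hypothetical simplifications, and the network is simulated in memory.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Group:
    """Toy group (hypothetical): `alive` models who can still acknowledge."""

    members: set[str]
    alive: set[str]
    control_packets: int = 0  # membership-only traffic, sent only after a failure

    def multicast(self, sender: str, payload: str, retries: int = 3) -> set[str]:
        """Deliver `payload` to every member and return the set of suspects.

        Acknowledgements are part of reliable delivery itself, so a member
        that never acknowledges is suspected without any keep-alive packet.
        """
        pending = self.members - {sender}
        for _ in range(retries):
            # Simulated round: members that are alive acknowledge and drop out.
            pending = {m for m in pending if m not in self.alive}
            if not pending:
                break
        return pending

    def elect_view(self, suspects: set[str]) -> tuple[str, set[str]]:
        """Assumed election: each survivor casts one vote on the new view and
        the lowest-named survivor acts as coordinator."""
        survivors = self.members - suspects
        self.control_packets += len(survivors)  # one vote per survivor
        coordinator = min(survivors)
        self.members = survivors
        return coordinator, survivors
```

With all members alive, `multicast` returns an empty suspect set and `control_packets` stays at zero, which mirrors the claim that a failure-free run imposes no membership overhead. Election traffic appears only once a multicast goes unacknowledged.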

Author information

Authors and Affiliations

Department of Computer Science, The University of Reading, United Kingdom, RG6 6AY
James S. Pascoe & Roger J. Loader
Math & Computer Science, Emory University, Atlanta, GA, 30322
Vaidy S. Sunderam

About this article

Cite this article

Pascoe, J.S., Loader, R.J. & Sunderam, V.S. Collaborative Group Membership. The Journal of Supercomputing 22, 55–68 (2002). https://doi.org/10.1023/A:1014306520634

Issue Date: May 2002
DOI: https://doi.org/10.1023/A:1014306520634
