Abstract
Group communication is now a well-established topic in distributed computing. Over the years it has emerged as a topic with a strong synergy between theory and practice: group communication is highly relevant for building distributed systems, and it is also of theoretical importance because of the difficult problems it addresses. The paper presents an (inevitably subjective) retrospective of the main milestones that led to our current understanding of group communication, with a focus on theoretical contributions of practical relevance. Some open issues are discussed at the end of the paper.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Schiper, A. (2003). Practical Impact of Group Communication Theory. In: Schiper, A., Shvartsman, A.A., Weatherspoon, H., Zhao, B.Y. (eds) Future Directions in Distributed Computing. Lecture Notes in Computer Science, vol 2584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-37795-6_1
DOI: https://doi.org/10.1007/3-540-37795-6_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00912-2
Online ISBN: 978-3-540-37795-5
eBook Packages: Springer Book Archive