Abstract
Active replication is an effective means of enhancing fault tolerance in distributed systems. A fault-tolerant group is composed of replicas of key system components. This paper analyzes three types of leave semantics for group members and identifies the activities in which a group member takes part. From these it derives the requirements a group member must meet to leave safely. For quick-leave semantics, the paper proposes a solution and discusses the non-empty protocol and the relay protocol in detail. It also proves the correctness and termination properties of the protocols. The solution is a building block for a practical, operational group membership module.
Cite this article
Wang, Y. Active leave behavior of members in a fault-tolerant group. Sci China Ser F 47, 260–272 (2004). https://doi.org/10.1360/03yf0280