Abstract
The power of an object type T can be measured as the maximum number n of processes that can solve consensus using only objects of type T and registers. This number, denoted cons(T), is called the consensus power of T. This paper addresses the question of the weakest failure detector to solve consensus among a number k > n of processes that communicate using shared objects of a type T with consensus power n. In other words, we seek a failure detector that is both sufficient and necessary to “boost” the consensus power of a type T from n to k. It was shown in Neiger (Proceedings of the 14th annual ACM symposium on principles of distributed computing (PODC), pp. 100–109, 1995) that a certain failure detector, denoted Ω_n, is sufficient to boost the power of a type T from n to k, and it was conjectured that Ω_n is also necessary. In this paper, we prove this conjecture for one-shot deterministic types. We first show that, for any one-shot deterministic type T with cons(T) ≤ n, Ω_n is necessary to boost the power of T from n to n + 1. We then go a step further and show that Ω_n is also the weakest failure detector to boost the power of (n + 1)-ported one-shot deterministic types from n to any k > n. Our result generalizes, in a precise sense, the result on the weakest failure detector to solve consensus in asynchronous message-passing systems (Chandra et al. in J ACM 43(4):685–722, 1996). As a corollary, we show that Ω_t is the weakest failure detector to boost the resilience level of a distributed shared memory system, i.e., to solve consensus among n > t processes using (t − 1)-resilient objects of consensus power t.
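The abstract names Ω_n without restating its definition. As a rough illustration only, the sketch below assumes the definition commonly used in the literature: at every process the detector outputs a set of at most n processes, and eventually all correct processes permanently output the same set, which contains at least one correct process. The function name `satisfies_omega_n`, the trace encoding, and the finite-trace reading of "eventually" are assumptions made for this example and are not taken from the paper.

```python
from typing import Dict, Iterable, List, Set


def satisfies_omega_n(outputs: Dict[str, List[Iterable[str]]],
                      correct: Set[str], n: int) -> bool:
    """Finite-trace approximation of the (assumed) Omega_n property.

    outputs[p][t] is the set of process ids output at process p at step t.
    """
    # Every output, at every process and step, may name at most n processes.
    for trace in outputs.values():
        if any(len(set(s)) > n for s in trace):
            return False
    if not correct:
        return True  # no stabilization requirement without correct processes
    # Stabilization must be observable at every correct process.
    horizon = min(len(outputs[p]) for p in correct)
    for t in range(horizon):
        # All distinct sets output by correct processes from step t onward.
        suffix = {frozenset(s) for p in correct for s in outputs[p][t:]}
        # One stable set that contains at least one correct process.
        if len(suffix) == 1 and next(iter(suffix)) & correct:
            return True
    return False


# Hypothetical trace: p3 crashes, p1 and p2 are correct.
trace = {
    "p1": [{"p1", "p2"}, {"p2", "p3"}, {"p2", "p3"}],
    "p2": [{"p3"}, {"p2", "p3"}, {"p2", "p3"}],
    "p3": [{"p1"}, {"p1"}],
}
print(satisfies_omega_n(trace, correct={"p1", "p2"}, n=2))  # True
print(satisfies_omega_n(trace, correct={"p1", "p2"}, n=1))  # False
```

A finite trace can only suggest, never establish, an "eventually" property; the check treats the end of the trace as the stable suffix. Under the assumed definition, the case n = 1 corresponds to the eventual leader detector Ω.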
References
Afek, Y., Attiya, H., Dolev, D., Gafni, E., Merritt, M., Shavit, N.: Atomic snapshots of shared memory. J. ACM 40(4), 873–890 (1993)
Attie, P., Lynch, N.A., Rajsbaum, S.: Boosting fault-tolerance in asynchronous message passing systems is impossible. Technical report MIT-LCS-TR-877, MIT Laboratory for Computer Science (2002)
Attie, P.C., Guerraoui, R., Kouznetsov, P., Lynch, N.A., Rajsbaum, S.: The impossibility of boosting distributed service resilience. In: Proceedings of the 25th IEEE International Conference on Distributed Computing Systems (ICDCS’05), June (2005)
Attiya, H., Welch, J.L.: Distributed Computing: Fundamentals, Simulations and Advanced Topics, 2nd edn. Wiley, New York (2004)
Borowsky, E., Gafni, E., Afek, Y.: Consensus power makes (some) sense! In: Proceedings of the 13th Annual ACM Symposium on Principles of Distributed Computing (PODC), pp. 363–372, August (1994)
Chandra, T.D., Hadzilacos, V., Jayanti, P., Toueg, S.: Generalized irreducibility of consensus and the equivalence of t-resilient and wait-free implementations of consensus. SIAM J. Comput. 34(2), 333–357 (2004)
Chandra, T.D., Hadzilacos, V., Toueg, S.: The weakest failure detector for solving consensus. J. ACM 43(4), 685–722 (1996)
Chandra, T.D., Toueg, S.: Unreliable failure detectors for reliable distributed systems. J. ACM 43(2), 225–267 (1996)
Dolev, D., Dwork, C., Stockmeyer, L.J.: On the minimal synchronism needed for distributed consensus. J. ACM 34(1), 77–97 (1987)
Dwork, C., Lynch, N.A., Stockmeyer, L.J.: Consensus in the presence of partial synchrony. J. ACM 35(2), 288–323 (1988)
Fischer, M.J., Lynch, N.A., Paterson, M.S.: Impossibility of distributed consensus with one faulty process. J. ACM 32(3), 374–382 (1985)
Guerraoui, R., Herlihy, M., Kouznetsov, P., Lynch, N., Newport, C.: On the weakest failure detector ever. Technical report, Max Planck Institute for Software Systems
Guerraoui, R., Kouznetsov, P.: On failure detectors and type boosters. In: Proceedings of the 17th International Symposium on Distributed Computing (DISC’03), October (2003)
Herlihy, M.: Wait-free synchronization. ACM Trans. Program. Lang. Syst. 13(1), 124–149 (1991)
Herlihy, M., Ruppert, E.: On the existence of booster types. In: Proceedings of the 41st IEEE Symposium on Foundations of Computer Science (FOCS), pp. 653–663 (2000)
Herlihy, M., Wing, J.M.: Linearizability: a correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst. 12(3), 463–492 (1990)
Jayanti, P.: Robust wait-free hierarchies. J. ACM 44(4), 592–614 (1997)
Jayanti, P., Toueg, S.: Some results on the impossibility, universality and decidability of consensus. In: Proceedings of the 6th International Workshop on Distributed Algorithms (WDAG’92). LNCS, vol. 647. Springer, Heidelberg (1992)
Lo, W.-K., Hadzilacos, V.: Using failure detectors to solve consensus in asynchronous shared-memory systems. In: Proceedings of the 8th International Workshop on Distributed Algorithms (WDAG’94). LNCS, vol. 857, pp. 280–295. Springer, Heidelberg (1994)
Lo, W.-K., Hadzilacos, V.: All of us are smarter than any of us: Nondeterministic wait-free hierarchies are not robust. SIAM J. Comput. 30(3), 689–728 (2000)
Loui, M.C., Abu-Amara, H.H.: Memory requirements for agreement among unreliable asynchronous processes. Adv. Comput. Res., pp. 163–183 (1987)
Lynch, N.A.: Distributed Algorithms. Morgan Kaufmann Publishers, San Francisco (1996)
Neiger, G.: Failure detectors and the wait-free hierarchy. In: Proceedings of the 14th Annual ACM Symposium on Principles of Distributed Computing (PODC), pp. 100–109, August (1995)
Ruppert, E.: Determining consensus numbers. SIAM J. Comput. 30(4), 1156–1168 (2000)
Additional information
This paper is a revised and extended version of a paper that appeared in the Proceedings of the 17th International Symposium on Distributed Computing (DISC 2003), entitled “On failure detectors and type boosters.”
Rights and permissions
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License ( https://creativecommons.org/licenses/by-nc/2.0 ), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Guerraoui, R., Kouznetsov, P. Failure detectors as type boosters. Distrib. Comput. 20, 343–358 (2008). https://doi.org/10.1007/s00446-007-0043-z
DOI: https://doi.org/10.1007/s00446-007-0043-z
Keywords
- Correct Process
- Reduction Algorithm
- Failure Detector
- Failure Pattern
- Shared Object