Distributed Computing, Volume 20, Issue 5, pp 343–358

Failure detectors as type boosters

  • Rachid Guerraoui
  • Petr Kouznetsov
Open Access
Article

Abstract

The power of an object type T can be measured as the maximum number n of processes that can solve consensus using only objects of T and registers. This number, denoted cons(T), is called the consensus power of T. This paper addresses the question of the weakest failure detector to solve consensus among a number k > n of processes that communicate using shared objects of a type T with consensus power n. In other words, we seek a failure detector that is both sufficient and necessary to “boost” the consensus power of a type T from n to k. It was shown in Neiger (Proceedings of the 14th annual ACM symposium on principles of distributed computing (PODC), pp. 100–109, 1995) that a certain failure detector, denoted Ω_n, is sufficient to boost the power of a type T from n to k, and it was conjectured that Ω_n was also necessary. In this paper, we prove this conjecture for one-shot deterministic types. We first show that, for any one-shot deterministic type T with cons(T) ≤ n, Ω_n is necessary to boost the power of T from n to n + 1. Then we go a step further and show that Ω_n is also the weakest failure detector to boost the power of (n + 1)-ported one-shot deterministic types from n to any k > n. Our result generalizes, in a precise sense, the result on the weakest failure detector to solve consensus in asynchronous message-passing systems (Chandra et al. in J ACM 43(4):685–722, 1996). As a corollary, we show that Ω_t is the weakest failure detector to boost the resilience level of a distributed shared memory system, i.e., to solve consensus among n > t processes using (t − 1)-resilient objects of consensus power t.
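To make the notion of consensus power concrete, here is a minimal sketch of a standard textbook construction (it is not taken from this paper): two processes solve consensus using one test-and-set object and two registers, which is why test-and-set has consensus power at least 2. The class names `TestAndSet` and `TwoProcessConsensus` and the helper `run_once` are hypothetical, and a lock stands in for the atomicity that the shared object is assumed to provide.

```python
import threading


class TestAndSet:
    """One-shot test-and-set: the first caller gets 0 (wins), later callers get 1."""

    def __init__(self):
        self._lock = threading.Lock()  # assumption: models the object's atomicity
        self._set = False

    def test_and_set(self):
        with self._lock:
            old = self._set
            self._set = True
            return 1 if old else 0


class TwoProcessConsensus:
    """Consensus for process ids 0 and 1 from one test-and-set object and two registers."""

    def __init__(self):
        self._proposals = [None, None]  # one single-writer register per process
        self._tas = TestAndSet()

    def propose(self, pid, value):
        # Announce the proposal first, so the loser can always read the winner's value.
        self._proposals[pid] = value
        if self._tas.test_and_set() == 0:
            return value  # winner decides its own proposal
        return self._proposals[1 - pid]  # loser adopts the winner's proposal


def run_once(v0, v1):
    """Run both processes concurrently and return their two decisions."""
    cons = TwoProcessConsensus()
    decisions = [None, None]

    def worker(pid, value):
        decisions[pid] = cons.propose(pid, value)

    threads = [
        threading.Thread(target=worker, args=(0, v0)),
        threading.Thread(target=worker, args=(1, v1)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return decisions
```

Both processes always decide the same proposed value, whichever one wins the test-and-set. The same trick does not extend to a third process, since a loser cannot tell which of the other two won. That is the kind of gap the abstract proposes to close with a failure detector.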

Keywords

Correct Process; Reduction Algorithm; Failure Detector; Failure Pattern; Shared Object
These keywords were added by machine and not by the authors.

References

  1. Afek, Y., Attiya, H., Dolev, D., Gafni, E., Merritt, M., Shavit, N.: Atomic snapshots of shared memory. J. ACM 40(4), 873–890 (1993)
  2. Attie, P., Lynch, N.A., Rajsbaum, S.: Boosting fault-tolerance in asynchronous message passing systems is impossible. Technical report. MIT Laboratory for Computer Science, MIT-LCS-TR-877 (2002)
  3. Attie, P.C., Guerraoui, R., Kouznetsov, P., Lynch, N.A., Rajsbaum, S.: The impossibility of boosting distributed service resilience. In: Proceedings of the 25th IEEE International Conference on Distributed Computing Systems (ICDCS’05), June (2005)
  4. Attiya, H., Welch, J.L.: Distributed Computing: Fundamentals, Simulations and Advanced Topics, 2nd edn. Wiley, New York (2004)
  5. Borowsky, E., Gafni, E., Afek, Y.: Consensus power makes (some) sense! In: Proceedings of the 13th Annual ACM Symposium on Principles of Distributed Computing (PODC), pp. 363–372, August (1994)
  6. Chandra, T.D., Hadzilacos, V., Jayanti, P., Toueg, S.: Generalized irreducibility of consensus and the equivalence of t-resilient and wait-free implementations of consensus. SIAM J. Comput. 34(2), 333–357 (2004)
  7. Chandra, T.D., Hadzilacos, V., Toueg, S.: The weakest failure detector for solving consensus. J. ACM 43(4), 685–722 (1996)
  8. Chandra, T.D., Toueg, S.: Unreliable failure detectors for reliable distributed systems. J. ACM 43(2), 225–267 (1996)
  9. Dolev, D., Dwork, C., Stockmeyer, L.J.: On the minimal synchronism needed for distributed consensus. J. ACM 34(1), 77–97 (1987)
  10. Dwork, C., Lynch, N.A., Stockmeyer, L.J.: Consensus in the presence of partial synchrony. J. ACM 35(2), 288–323 (1988)
  11. Fischer, M.J., Lynch, N.A., Paterson, M.S.: Impossibility of distributed consensus with one faulty process. J. ACM 32(3), 374–382 (1985)
  12. Guerraoui, R., Herlihy, M., Kouznetsov, P., Lynch, N., Newport, C.: On the weakest failure detector ever. Technical report, Max Planck Institute for Software Systems
  13. Guerraoui, R., Kouznetsov, P.: On failure detectors and type boosters. In: Proceedings of the 17th International Symposium on Distributed Computing (DISC’03), October (2003)
  14. Herlihy, M.: Wait-free synchronization. ACM Trans. Program. Lang. Syst. 13(1), 124–149 (1991)
  15. Herlihy, M., Ruppert, E.: On the existence of booster types. In: Proceedings of the 41st IEEE Symposium on Foundations of Computer Science (FOCS), pp. 653–663 (2000)
  16. Herlihy, M., Wing, J.M.: Linearizability: a correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst. 12(3), 463–492 (1990)
  17. Jayanti, P.: Robust wait-free hierarchies. J. ACM 44(4), 592–614 (1997)
  18. Jayanti, P., Toueg, S.: Some results on the impossibility, universality and decidability of consensus. In: Proceedings of the 6th International Workshop on Distributed Algorithms (WDAG’92). LNCS, vol. 647. Springer, Heidelberg (1992)
  19. Lo, W.-K., Hadzilacos, V.: Using failure detectors to solve consensus in asynchronous shared-memory systems. In: Proceedings of the 8th International Workshop on Distributed Algorithms (WDAG’94). LNCS, vol. 857, pp. 280–295. Springer, Heidelberg (1994)
  20. Lo, W.-K., Hadzilacos, V.: All of us are smarter than any of us: Nondeterministic wait-free hierarchies are not robust. SIAM J. Comput. 30(3), 689–728 (2000)
  21. Loui, M.C., Abu-Amara, H.H.: Memory requirements for agreement among unreliable asynchronous processes. Adv. Comput. Res., pp. 163–183 (1987)
  22. Lynch, N.A.: Distributed Algorithms. Morgan Kaufmann Publishers, San Francisco (1996)
  23. Neiger, G.: Failure detectors and the wait-free hierarchy. In: Proceedings of the 14th Annual ACM Symposium on Principles of Distributed Computing (PODC), pp. 100–109, August (1995)
  24. Ruppert, E.: Determining consensus numbers. SIAM J. Comput. 30(4), 1156–1168 (2000)

Copyright information

© Springer-Verlag 2007

Authors and Affiliations

  1. Distributed Programming Laboratory, EPFL, Lausanne, Switzerland
  2. Max Planck Institute for Software Systems, Saarbrücken, Germany