Randomization Can Be a Healer: Consensus with Dynamic Omission Failures

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 5805)

Abstract

Wireless ad-hoc networks are being increasingly used in diverse contexts, ranging from casual meetings to disaster recovery operations. A promising approach is to model these networks as distributed systems prone to dynamic communication failures. This captures transitory disconnections in communication due to phenomena like interference and collisions, and permits an efficient use of the wireless broadcasting medium. This model, however, is bound by the impossibility result of Santoro and Widmayer, which states that, even with strong synchrony assumptions, there is no deterministic solution to any non-trivial form of agreement if n − 1 or more messages can be lost per communication round in a system with n processes. In this paper we propose a novel way to circumvent this impossibility result by employing randomization. We present a consensus protocol that ensures safety in the presence of an unrestricted number of omission faults, and guarantees progress in rounds where such faults are bounded by \(f \leq \lceil \frac{n}{2} \rceil (n-k)+k-2\), where k is the number of processes required to decide, eventually assuring termination with probability 1.
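
As a rough illustration of the progress bound stated in the abstract, the sketch below evaluates \(\lceil \frac{n}{2} \rceil (n-k)+k-2\) for a few values of n and k and sets it next to the n − 1 threshold of the Santoro and Widmayer impossibility result. The function names and the input check `1 <= k <= n` are assumptions made for this example; the abstract does not state the admissible range of k, and this is not the authors' code.

```python
def max_tolerated_omissions(n: int, k: int) -> int:
    """Evaluate ceil(n/2) * (n - k) + k - 2, the per-round bound on f from the abstract.

    The name and the range check on k are illustrative assumptions.
    """
    if n < 2 or not (1 <= k <= n):
        raise ValueError("expected n >= 2 and 1 <= k <= n")
    half_up = (n + 1) // 2  # integer form of ceil(n / 2)
    return half_up * (n - k) + k - 2


def deterministic_threshold(n: int) -> int:
    """n - 1 lost messages per round: the level at which the abstract says
    no deterministic solution to non-trivial agreement exists."""
    return n - 1


if __name__ == "__main__":
    for n, k in [(4, 4), (4, 3), (10, 6)]:
        f = max_tolerated_omissions(n, k)
        print(f"n={n}, k={k}: progress bound f <= {f}, "
              f"deterministic threshold = {deterministic_threshold(n)}")
```

With k = n the expression reduces to n − 2, just under the deterministic threshold; for smaller k the bound grows, for instance to 24 for n = 10 and k = 6.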

Keywords

  • Impossibility Result
  • Transmission Failure
  • Consensus Protocol
  • Communication Round
  • Message Loss

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

This work was partially supported by the FCT through the Multiannual and the CMU-Portugal Programs.

References

  1. Aguilera, M., Chen, W., Toueg, S.: Failure detection and consensus in the crash-recovery model. Distributed Computing 13(2), 99–125 (2000)

  2. Akkoyunlu, E.A., Ekanadham, K., Huber, R.V.: Some constraints and tradeoffs in the design of network communications. In: Proceedings of the 5th ACM Symposium on Operating Systems Principles, pp. 67–74 (1975)

  3. Ben-Or, M.: Another advantage of free choice: Completely asynchronous agreement protocols. In: Proceedings of the 2nd ACM Symposium on Principles of Distributed Computing, pp. 27–30 (1983)

  4. Biely, M., Widder, J., Charron-Bost, B., Gaillard, A., Hutle, M., Schiper, A.: Tolerating corrupted communication. In: Proceedings of the 26th ACM Symposium on Principles of Distributed Computing, pp. 244–253 (2007)

  5. Bracha, G.: An asynchronous \(\lfloor(n-1)/3\rfloor\)-resilient consensus protocol. In: Proceedings of the 3rd ACM Symposium on Principles of Distributed Computing, pp. 154–162 (1984)

  6. Cachin, C., Kursawe, K., Shoup, V.: Random oracles in Constantinople: Practical asynchronous Byzantine agreement using cryptography. Journal of Cryptology 18(3), 219–246 (2005)

  7. Chandra, T., Toueg, S.: Unreliable failure detectors for reliable distributed systems. Journal of the ACM 43(2), 225–267 (1996)

  8. Charron-Bost, B., Schiper, A.: The heard-of model: Computing in distributed systems with benign failures. Technical Report LSR-REPORT-2007-001, EPFL (2007)

  9. Chockler, G., Demirbas, M., Gilbert, S., Lynch, N., Newport, C., Nolte, T.: Consensus and collision detectors in radio networks. Distributed Computing 21(1), 55–84 (2008)

  10. Dolev, D., Dwork, C., Stockmeyer, L.: On the minimal synchronism needed for distributed consensus. Journal of the ACM 34(1), 77–97 (1987)

  11. Dolev, D., Friedman, R., Keidar, I., Malkhi, D.: Failure detectors in omission failure environments. In: Proceedings of the 16th ACM Symposium on Principles of Distributed Computing, pp. 286–295 (1997)

  12. Fischer, M.J.: The consensus problem in unreliable distributed systems (A brief survey). In: Karpinski, M. (ed.) FCT 1983. LNCS, vol. 158, pp. 127–140. Springer, Heidelberg (1983)

  13. Fischer, M.J., Lynch, N.A., Paterson, M.S.: Impossibility of distributed consensus with one faulty process. Journal of the ACM 32(2), 374–382 (1985)

  14. Gray, J.: Notes on data base operating systems. In: Bayer, R., Graham, R.M., Seegmüller, G. (eds.) Operating Systems. LNCS, vol. 60. Springer, Heidelberg (1978)

  15. Hurfin, M., Mostefaoui, A., Raynal, M.: Consensus in asynchronous systems where processes can crash and recover. In: Proceedings of the 17th IEEE Symposium on Reliable Distributed Systems, pp. 280–286 (1998)

  16. Lamport, L.: Lower bounds for asynchronous consensus. Distributed Computing 19(2), 104–125 (2006)

  17. Lamport, L., Shostak, R., Pease, M.: The Byzantine generals problem. ACM Transactions on Programming Languages and Systems 4(3), 382–401 (1982)

  18. Lynch, N.A.: Distributed Algorithms. Morgan Kaufmann, San Francisco (1997)

  19. Moniz, H., Neves, N.F., Correia, M., Veríssimo, P.: Experimental comparison of local and shared coin randomized consensus protocols. In: Proceedings of the 25th IEEE Symposium on Reliable Distributed Systems, pp. 235–244 (2006)

  20. Moniz, H., Neves, N.F., Correia, M., Veríssimo, P.: RITAS: Services for randomized intrusion tolerance. In: IEEE Transactions on Dependable and Secure Computing (to appear, 2009)

  21. Neves, N.F., Correia, M., Veríssimo, P.: Solving vector consensus with a wormhole. IEEE Transactions on Parallel and Distributed Systems 16(12), 1120–1131 (2005)

  22. Oliveira, R., Guerraoui, R., Schiper, A.: Consensus in the crash-recover model. Technical Report 97-239, EPFL (1997)

  23. Pease, M., Shostak, R., Lamport, L.: Reaching agreement in the presence of faults. Journal of the ACM 27(2), 228–234 (1980)

  24. Perry, K.J., Toueg, S.: Distributed agreement in the presence of processor and communication faults. IEEE Transactions on Software Engineering 12(3), 477–482 (1986)

  25. Rabin, M.O.: Randomized Byzantine generals. In: Proceedings of the 24th Annual IEEE Symposium on Foundations of Computer Science, pp. 403–409 (1983)

  26. Raynal, M., Roy, M.: A note on a simple equivalence between round-based synchronous and asynchronous models. In: Proceedings of the 11th IEEE Pacific Rim International Symposium on Dependable Computing, pp. 387–392 (2005)

  27. Santoro, N., Widmayer, P.: Agreement in synchronous networks with ubiquitous faults. Theoretical Computer Science 384(2-3), 232–249 (2007)

  28. Santoro, N., Widmayer, P.: Time is not a healer. In: Proceedings of the 6th Symposium on Theoretical Aspects of Computer Science, pp. 304–313 (1989)

  29. Schmid, U., Weiss, B., Keidar, I.: Impossibility results and lower bounds for consensus under link failures. SIAM Journal on Computing 38(5), 1912–1951 (2009)

  30. Varghese, G., Lynch, N.A.: A tradeoff between safety and liveness for randomized coordinated attack. Information and Computation 128(1), 57–71 (1996)

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Moniz, H., Neves, N.F., Correia, M., Veríssimo, P. (2009). Randomization Can Be a Healer: Consensus with Dynamic Omission Failures. In: Keidar, I. (eds) Distributed Computing. DISC 2009. Lecture Notes in Computer Science, vol 5805. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04355-0_10

  • DOI: https://doi.org/10.1007/978-3-642-04355-0_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04354-3

  • Online ISBN: 978-3-642-04355-0

  • eBook Packages: Computer Science, Computer Science (R0)