Abstract
Wireless ad-hoc networks are being increasingly used in diverse contexts, ranging from casual meetings to disaster recovery operations. A promising approach is to model these networks as distributed systems prone to dynamic communication failures. This captures transitory disconnections in communication due to phenomena like interference and collisions, and permits an efficient use of the wireless broadcasting medium. This model, however, is bound by the impossibility result of Santoro and Widmayer, which states that, even with strong synchrony assumptions, there is no deterministic solution to any non-trivial form of agreement if n − 1 or more messages can be lost per communication round in a system with n processes. In this paper we propose a novel way to circumvent this impossibility result by employing randomization. We present a consensus protocol that ensures safety in the presence of an unrestricted number of omission faults, and guarantees progress in rounds where such faults are bounded by \(f \leq \lceil \frac{n}{2} \rceil (n-k)+k-2\), where k is the number of processes required to decide, eventually assuring termination with probability 1.
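As a rough illustration of the progress bound stated above, the following sketch evaluates \(\lceil \frac{n}{2} \rceil (n-k)+k-2\) for a few values of n and k and compares it with the n − 1 losses per round at which the Santoro and Widmayer result rules out deterministic agreement. The function names and the range check on k are assumptions made for this example; they are not taken from the paper.

```python
def progress_fault_bound(n: int, k: int) -> int:
    """Evaluate ceil(n/2) * (n - k) + k - 2, the per-round omission bound
    under which the abstract says progress is guaranteed.

    Assumption for this sketch: 1 <= k <= n. The paper may restrict k
    further; the abstract does not say.
    """
    if not 1 <= k <= n:
        raise ValueError("expected 1 <= k <= n")
    ceil_half_n = (n + 1) // 2  # integer form of ceil(n / 2)
    return ceil_half_n * (n - k) + k - 2


def deterministic_loss_threshold(n: int) -> int:
    """Losses per round (n - 1) at which deterministic agreement is impossible."""
    return n - 1


if __name__ == "__main__":
    for n, k in [(4, 3), (4, 4), (5, 3), (10, 6)]:
        print(n, k, progress_fault_bound(n, k), deterministic_loss_threshold(n))
```

When every process must decide (k = n), the expression reduces to n − 2, which sits just below the n − 1 threshold; for smaller k it grows, for example to 24 when n = 10 and k = 6.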
Keywords
- Impossibility Result
- Transmission Failure
- Consensus Protocol
- Communication Round
- Message Loss
This work was partially supported by the FCT through the Multiannual and the CMU-Portugal Programs.
References
Aguilera, M., Chen, W., Toueg, S.: Failure detection and consensus in the crash-recovery model. Distributed Computing 13(2), 99–125 (2000)
Akkoyunlu, E.A., Ekanadham, K., Huber, R.V.: Some constraints and tradeoffs in the design of network communications. In: Proceedings of the 5th ACM Symposium on Operating Systems Principles, pp. 67–74 (1975)
Ben-Or, M.: Another advantage of free choice: Completely asynchronous agreement protocols. In: Proceedings of the 2nd ACM Symposium on Principles of Distributed Computing, pp. 27–30 (1983)
Biely, M., Widder, J., Charron-Bost, B., Gaillard, A., Hutle, M., Schiper, A.: Tolerating corrupted communication. In: Proceedings of the 26th ACM Symposium on Principles of Distributed Computing, pp. 244–253 (2007)
Bracha, G.: An asynchronous \(\lfloor(n-1)/3\rfloor\)-resilient consensus protocol. In: Proceedings of the 3rd ACM Symposium on Principles of Distributed Computing, pp. 154–162 (1984)
Cachin, C., Kursawe, K., Shoup, V.: Random oracles in Constantinople: Practical asynchronous Byzantine agreement using cryptography. Journal of Cryptology 18(3), 219–246 (2005)
Chandra, T., Toueg, S.: Unreliable failure detectors for reliable distributed systems. Journal of the ACM 43(2), 225–267 (1996)
Charron-Bost, B., Schiper, A.: The heard-of model: Computing in distributed systems with benign failures. Technical Report LSR-REPORT-2007-001, EPFL (2007)
Chockler, G., Demirbas, M., Gilbert, S., Lynch, N., Newport, C., Nolte, T.: Consensus and collision detectors in radio networks. Distributed Computing 21(1), 55–84 (2008)
Dolev, D., Dwork, C., Stockmeyer, L.: On the minimal synchronism needed for distributed consensus. Journal of the ACM 34(1), 77–97 (1987)
Dolev, D., Friedman, R., Keidar, I., Malkhi, D.: Failure detectors in omission failure environments. In: Proceedings of the 16th ACM Symposium on Principles of Distributed Computing, pp. 286–295 (1997)
Fischer, M.J.: The consensus problem in unreliable distributed systems (A brief survey). In: Karpinski, M. (ed.) FCT 1983. LNCS, vol. 158, pp. 127–140. Springer, Heidelberg (1983)
Fischer, M.J., Lynch, N.A., Paterson, M.S.: Impossibility of distributed consensus with one faulty process. Journal of the ACM 32(2), 374–382 (1985)
Gray, J.: Notes on data base operating systems. In: Bayer, R., Graham, R.M., Seegmüller, G. (eds.) Operating Systems. LNCS, vol. 60. Springer, Heidelberg (1978)
Hurfin, M., Mostefaoui, A., Raynal, M.: Consensus in asynchronous systems where processes can crash and recover. In: Proceedings of the 17th IEEE Symposium on Reliable Distributed Systems, pp. 280–286 (1998)
Lamport, L.: Lower bounds for asynchronous consensus. Distributed Computing 19(2), 104–125 (2006)
Lamport, L., Shostak, R., Pease, M.: The Byzantine generals problem. ACM Transactions on Programming Languages and Systems 4(3), 382–401 (1982)
Lynch, N.A.: Distributed Algorithms. Morgan Kaufmann, San Francisco (1997)
Moniz, H., Neves, N.F., Correia, M., Veríssimo, P.: Experimental comparison of local and shared coin randomized consensus protocols. In: Proceedings of the 25th IEEE Symposium on Reliable Distributed Systems, pp. 235–244 (2006)
Moniz, H., Neves, N.F., Correia, M., Veríssimo, P.: RITAS: Services for randomized intrusion tolerance. In: IEEE Transactions on Dependable and Secure Computing (to appear, 2009)
Neves, N.F., Correia, M., Veríssimo, P.: Solving vector consensus with a wormhole. IEEE Transactions on Parallel and Distributed Systems 16(12), 1120–1131 (2005)
Oliveira, R., Guerraoui, R., Schiper, A.: Consensus in the crash-recover model. Technical Report 97-239, EPFL (1997)
Pease, M., Shostak, R., Lamport, L.: Reaching agreement in the presence of faults. Journal of the ACM 27(2), 228–234 (1980)
Perry, K.J., Toueg, S.: Distributed agreement in the presence of processor and communication faults. IEEE Transactions on Software Engineering 12(3), 477–482 (1986)
Rabin, M.O.: Randomized Byzantine generals. In: Proceedings of the 24th Annual IEEE Symposium on Foundations of Computer Science, pp. 403–409 (1983)
Raynal, M., Roy, M.: A note on a simple equivalence between round-based synchronous and asynchronous models. In: Proceedings of the 11th IEEE Pacific Rim International Symposium on Dependable Computing, pp. 387–392 (2005)
Santoro, N., Widmayer, P.: Agreement in synchronous networks with ubiquitous faults. Theoretical Computer Science 384(2-3), 232–249 (2007)
Santoro, N., Widmayer, P.: Time is not a healer. In: Proceedings of the 6th Symposium on Theoretical Aspects of Computer Science, pp. 304–313 (1989)
Schmid, U., Weiss, B., Keidar, I.: Impossibility results and lower bounds for consensus under link failures. SIAM Journal on Computing 38(5), 1912–1951 (2009)
Varghese, G., Lynch, N.A.: A tradeoff between safety and liveness for randomized coordinated attack. Information and Computation 128(1), 57–71 (1996)
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Moniz, H., Neves, N.F., Correia, M., Veríssimo, P. (2009). Randomization Can Be a Healer: Consensus with Dynamic Omission Failures. In: Keidar, I. (eds) Distributed Computing. DISC 2009. Lecture Notes in Computer Science, vol 5805. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04355-0_10
DOI: https://doi.org/10.1007/978-3-642-04355-0_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04354-3
Online ISBN: 978-3-642-04355-0
eBook Packages: Computer Science, Computer Science (R0)