Abstract
We consider the problem of reaching agreement in distributed systems in which some processes may deviate from their prescribed behavior before they eventually crash. We call this failure model “mortal Byzantine”. After discussing some application examples where this model is justified, we provide matching upper and lower bounds on the number of faulty processes, and on the required number of rounds in synchronous systems. We then continue our study by varying different system parameters. On the one hand, we consider the failure model under weaker timing assumptions, namely for partially synchronous systems and asynchronous systems with unreliable failure detectors. On the other hand, we vary the failure model in that we limit the occurrences of faulty steps that actually lead to a crash in synchronous systems.
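As a rough illustration of the failure model described above, the following toy sketch models a process that may send arbitrary values for a finite number of steps and then crashes, staying silent from then on. The class name `MortalByzantineProcess`, the `crash_step` parameter, and the binary value domain are assumptions made for this example; they are not taken from the paper and do not represent its formal definition.

```python
import random
from typing import List, Optional


class MortalByzantineProcess:
    """Toy model: arbitrary behavior for a finite number of steps, then a crash."""

    def __init__(self, input_value: int, crash_step: int, seed: int = 0) -> None:
        self.input_value = input_value
        # Hypothetical parameter: number of steps taken before the process halts.
        self.crash_step = crash_step
        self.step = 0
        self._rng = random.Random(seed)

    @property
    def crashed(self) -> bool:
        return self.step >= self.crash_step

    def send(self) -> Optional[int]:
        """Return the message for the current step, or None once crashed."""
        if self.crashed:
            return None  # a crashed process stays silent forever
        self.step += 1
        # Before crashing, the process may send anything, including the correct value.
        return self._rng.choice([0, 1, self.input_value])


def run(process: MortalByzantineProcess, steps: int) -> List[Optional[int]]:
    """Collect the messages the process emits over the given number of steps."""
    return [process.send() for _ in range(steps)]
```

Calling `run(MortalByzantineProcess(1, crash_step=3), 6)` yields three arbitrary values followed by three `None` entries, which is the property that distinguishes this model from classic Byzantine faults: the misbehavior is guaranteed to end.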
References
Charron-Bost B., Schiper A.: Uniform consensus is harder than consensus. J. Algorithms 51(1), 15–37 (2004)
Delporte-Gallet, C., Fauconnier, H., Horn, S.L., Toueg, S.: Fast fault-tolerant agreement algorithms. In: Proceedings of the 24th ACM Symposium on Principles of Distributed Computing (PODC’05), pp. 169–178. ACM Press, New York, USA (2005)
Lynch N.: Distributed Algorithms. Morgan Kaufmann Publishers, San Francisco (1996)
Lamport L., Shostak R., Pease M.: The Byzantine generals problem. ACM Trans. Program. Lang. Syst. 4(3), 382–401 (1982)
Nesterenko, M., Arora, A.: Dining philosophers that tolerate malicious crashes. In: Proceedings of the 22nd International Conference on Distributed Computing Systems (ICDCS’02), pp. 191–198. Vienna, Austria (2002)
Fischer, M.J., Lynch, N.A., Merritt, M.: Easy impossibility proofs for distributed consensus problems. In: Proceedings of the Fourth Annual ACM Symposium on Principles of Distributed Computing, PODC ’85, pp. 59–70. ACM, New York, USA (1985)
Fischer M.J., Lynch N.: A lower bound for the time to assure interactive consistency. Inf. Process. Lett. 14(4), 198–202 (1982)
Dwork C., Lynch N., Stockmeyer L.: Consensus in the presence of partial synchrony. J. ACM 35(2), 288–323 (1988)
Chandra T.D., Toueg S.: Unreliable failure detectors for reliable distributed systems. J. ACM 43(2), 225–267 (1996)
Doudou, A., Garbinato, B., Guerraoui, R., Schiper, A.: Muteness failure detectors: specification and implementation. In: Proceedings 3rd European Dependable Computing Conference (EDCC-3). Lecture Notes in Computer Science, vol. 1667, pp. 71–87. Springer, Prague, Czech Republic (1999)
Doudou, A., Schiper, A.: Muteness detectors for consensus with Byzantine processes. In: Proceedings of the 17th ACM Symposium on Principles of Distributed Computing (PODC-17). Puerto Vallarta, Mexico (1998)
Bazzi, R.A., Herlihy, M.: Enhanced fault-tolerance through Byzantine failure detection. In: 13th International Conference on Principles of Distributed Systems (OPODIS), Lecture Notes in Computer Science, vol. 5923, pp. 129–143. Springer (2009)
Dijkstra, E.W.: On the role of scientific thought. In: Selected Writings on Computing: A Personal Perspective, pp. 60–66. Springer, New York (1982). (EWD 447)
Dolev D., Reischuk R., Strong H.R.: Early stopping in Byzantine agreement. J. ACM 37(4), 720–741 (1990)
Elrad T., Francez N.: Decomposition of distributed programs into communication-closed layers. Sci. Comput. Programm. 2(3), 155–173 (1982)
Aguilera, M.K., Delporte-Gallet, C., Fauconnier, H., Toueg, S.: Consensus with Byzantine failures and little system synchrony. In: DSN ’06: Proceedings of the International Conference on Dependable Systems and Networks, pp. 147–155. IEEE Computer Society, Washington, DC, USA (2006). doi:10.1109/DSN.2006.22
Bracha G., Toueg S.: Asynchronous consensus and broadcast protocols. J. ACM 32(4), 824–840 (1985)
Srikanth T., Toueg S.: Simulating authenticated broadcasts to derive simple fault-tolerant algorithms. Distrib. Comput. 2, 80–94 (1987)
Fischer M.J., Lynch N.A., Paterson M.S.: Impossibility of distributed consensus with one faulty process. J. ACM 32(2), 374–382 (1985)
Perry K.J., Toueg S.: Distributed agreement in the presence of processor and communication faults. IEEE Trans. Softw. Eng. SE-12(3), 477–482 (1986)
Dolev D.: The Byzantine generals strike again. J. Algorithms 3(1), 14–30 (1982)
Fitzi, M., Maurer, U.M.: From partial consistency to global broadcast. In: Proceedings of the 32nd Annual ACM Symposium on Theory of Computing (STOC), pp. 494–503 (2000)
Pease M., Shostak R., Lamport L.: Reaching agreement in the presence of faults. J. ACM 27(2), 228–234 (1980)
Castro, M., Liskov, B.: Practical Byzantine fault tolerance. In: 3rd Symposium on Operating Systems Design and Implementation (1999)
Correia M., Neves N.F., Lung L.C., Veríssimo P.: Low complexity Byzantine-resilient consensus. Distrib. Comput. 17, 237–249 (2005)
Doudou, A., Garbinato, B., Guerraoui, R.: Encapsulating failure detection: From crash to Byzantine failures. In: Reliable Software Technologies—Ada-Europe 2002. Lecture Notes in Computer Science, vol. 2361, pp. 24–50. Springer, Vienna, Austria (2002)
Malkhi, D., Reiter, M.: Unreliable intrusion detection in distributed computations. In: Proceedings of the 10th Computer Security Foundations Workshop (CSFW97), pp. 116–124. Rockport, MA, USA (1997)
Abd-El-Malek, M., Ganger, G.R., Goodson, G.R., Reiter, M.K., Wylie, J.J.: Fault-scalable Byzantine fault-tolerant services. In: 20th ACM Symposium on Operating Systems Principles (SOSP’05), pp. 59–74 (2005)
Correia M., Neves N.F., Veríssimo P.: From consensus to atomic broadcast: Time-free Byzantine-resistant protocols without signatures. Comput. J. 49(1), 82–96 (2006)
Martin J.P., Alvisi L.: Fast Byzantine consensus. IEEE Trans. Dependable Secur. Comput. 3(3), 202–215 (2006)
Anceaume, E., Delporte-Gallet, C., Fauconnier, H., Hurfin, M., Le Lann, G.: Designing modular services in the scattered Byzantine failure model. In: 3rd International Symposium on Parallel and Distributed Computing (ISPDC 2004), pp. 262–269. IEEE Computer Society (2004)
Anceaume, E., Delporte-Gallet, C., Fauconnier, H., Hurfin, M., Widder, J.: Clock synchronization in the Byzantine-recovery failure model. In: International Conference on Principles of Distributed Systems (OPODIS 2007). Lecture Notes in Computer Science, pp. 90–104. Springer, Guadeloupe, French West Indies (2007)
Azadmanesh M.H., Kieckhafer R.M.: New hybrid fault models for asynchronous approximate agreement. IEEE Trans. Comput. 45(4), 439–449 (1996)
Biely, M.: An optimal Byzantine agreement algorithm with arbitrary node and link failures. In: Proceedings of 15th Annual IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS’03), pp. 146–151. Marina Del Rey, USA (2003)
Thambidurai, P.M., Park, Y.K.: Interactive consistency with multiple failure modes. In: Proceedings of 7th Symposium on Reliable Distributed Systems, pp. 93–100 (1988)
Fischer, M., Lamport, L.: Byzantine generals and transaction commit protocols. Technical Report 62, SRI International (1982)
Hermant J.F., Le Lann G.: Fast asynchronous uniform consensus in real-time distributed systems. IEEE Trans. Comput. 51(8), 931–944 (2002)
Chandra T.D., Hadzilacos V., Toueg S.: The weakest failure detector for solving consensus. J. ACM 43(4), 685–722 (1996)
Charron-Bost B., Hutle M., Widder J.: In search of lost time. Inf. Process. Lett. 110(21), 928–933 (2010)
Baldoni R., Hélary J.M., Raynal M., Tanguy L.: Consensus in Byzantine asynchronous systems. J. Discret. Algorithms 1(2), 185–210 (2003)
Friedman R., Mostéfaoui A., Raynal M.: Simple and efficient oracle-based consensus protocols for asynchronous Byzantine systems. IEEE Trans. Dependable Secur. Comput. 2(1), 46–56 (2005)
Kihlstrom, K.P., Moser, L.E., Melliar-Smith, P.M.: Solving consensus in a Byzantine environment using an unreliable fault detector. In: Proceedings of the International Conference on Principles of Distributed Systems (OPODIS), pp. 61–75. Chantilly, France (1997)
Kihlstrom K.P., Moser L.E., Melliar-Smith P.M.: Byzantine fault detectors for solving consensus. Comput. J. 46(1), 16–35 (2003)
Aguilera M.K., Chen W., Toueg S.: Failure detection and consensus in the crash-recovery model. Distrib. Comput. 13(2), 99–125 (2000)
Delporte-Gallet, C., Fauconnier, H., Freiling, F.C., Penso, L.D., Tielmann, A.: From crash-stop to permanent omission: automatic transformation and weakest failure detectors. In: 21st International Symposium on Distributed Computing (DISC). Lecture Notes in Computer Science, vol. 4731, pp. 165–178. Springer (2007)
Widder, J., Gridling, G., Weiss, B., Blanquart, J.P.: Synchronous consensus with mortal Byzantines. In: Proceedings of the International Conference on Dependable Systems and Networks (DSN’07), pp. 102–111. Edinburgh, UK (2007)
Choy, M., Singh, A.K.: Efficient fault tolerant algorithms for resource allocation in distributed systems. In: Proceedings of the Twenty-fourth Annual ACM Symposium on Theory of Computing, STOC ’92, pp. 593–602. ACM, New York, USA (1992)
Yamauchi, Y., Masuzawa, T., Bein, D.: Adaptive containment of time-bounded Byzantine faults. In: 12th International Symposium on Stabilization, Safety, and Security of Distributed Systems (SSS 2010). Lecture Notes in Computer Science, vol. 6366, pp. 126–140. Springer (2010)
Turpin R., Coan B.A.: Extending binary Byzantine agreement to multivalued Byzantine agreement. Inf. Process. Lett. 18(2), 73–76 (1984)
Mostéfaoui A., Raynal M., Tronel F.: From binary consensus to multivalued consensus in asynchronous message-passing systems. Inf. Process. Lett. 73(5–6), 207–212 (2000)
Zhang J., Chen W.: Bounded cost algorithms for multivalued consensus using binary consensus instances. Inf. Process. Lett. 109(17), 1005–1009 (2009)
Acknowledgments
We are grateful to Danny Dolev for pointing out [22] to us, and for enlightening us about the relation between Crusader agreement and consensus. We thank Rida Bazzi and Maurice Herlihy for valuable discussions on their results [12]. We also thank the anonymous reviewers whose constructive comments helped to improve the organization of the paper significantly.
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
Additional information
J. Widder was supported by the Austrian FWF National Research Network RiSE (S11403-N23), by the PROSEED project (proj.no ICT10-050) of the Vienna Science and Technology Fund, by NSF grant 0964696, and by the FWF project THETA (proj.no. P17757). M. Biely was partially supported by the Austrian BM:vit FIT-IT project TRAFT (proj.no. 812205). G. Gridling was supported by the Austrian FWF project SPAWN (proj.no. P18264) and the Austrian BM:vit FIT-IT project FAME (proj.no. 816454). The work of Sect. 4 and parts of Sect. 6 was originally presented at the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2007.
Rights and permissions
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Widder, J., Biely, M., Gridling, G. et al. Consensus in the presence of mortal Byzantine faulty processes. Distrib. Comput. 24, 299–321 (2012). https://doi.org/10.1007/s00446-011-0147-3
DOI: https://doi.org/10.1007/s00446-011-0147-3