Distributed Computing, Volume 17, Issue 3, pp 237–249

Low complexity Byzantine-resilient consensus

  • Miguel Correia
  • Nuno Ferreira Neves
  • Lau Cheuk Lung
  • Paulo Veríssimo

Abstract.

The application of the tolerance paradigm to security, known as intrusion tolerance, has attracted considerable attention in the dependability and security communities. In this paper we present a novel approach to intrusion tolerance. The idea is to use privileged components, generically called wormholes, to support the execution of intrusion-tolerant protocols, often called Byzantine-resilient in the literature.

The paper introduces the design of wormhole-aware intrusion-tolerant protocols using a classical distributed systems problem: consensus. The system in which the consensus protocol runs is mostly asynchronous and can fail in an arbitrary way, except for the wormhole, which is secure and synchronous. By using the wormhole to execute a few critical steps, the protocol achieves a low time complexity: in the best case, it runs in two rounds, even if some processes are malicious. The protocol also shows how partial synchrony assumptions, which are often only theoretical, can be substantiated in practical distributed systems. The paper shows the significance of the Trusted Timely Computing Base (TTCB), the wormhole used here, as an engineering paradigm, since the protocol is simple compared with other protocols in the literature.

Keywords:

Byzantine fault tolerance, intrusion tolerance, distributed systems models, distributed algorithms, consensus

Copyright information

© Springer-Verlag Berlin/Heidelberg 2005

Authors and Affiliations

  • Miguel Correia (1)
  • Nuno Ferreira Neves (1)
  • Lau Cheuk Lung (2)
  • Paulo Veríssimo (1)

  1. Faculdade de Ciências da Universidade de Lisboa, Lisboa, Portugal
  2. Pontifícia Universidade Católica do Paraná, Prado Velho, Brasil
