On the Communication Surplus Incurred by Faulty Processors

  • Dariusz R. Kowalski
  • Michał Strojnowski
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4731)

Abstract

We study the impact of faulty processors on the communication cost of distributed algorithms in a message-passing model. The system is synchronous but prone to various kinds of processor failures: crashes, message omissions, and (authenticated) Byzantine faults. One of the basic communication tasks, called fault-tolerant gossip, or gossip for short, is to exchange the initial values among all non-faulty processors. In this paper we address the question of whether there is a gossip algorithm that is simultaneously fault-tolerant, fast, and communication-efficient. We answer this question in the affirmative in the model allowing only crash failures, and in some sense negatively when the other kinds of failures may occur. More precisely, in an execution by n processors of which f are faulty, each non-faulty processor contributes a constant to the message complexity, each crashed processor contributes Θ(f^ε) (where ε > 0 can be an arbitrarily small constant, independent of n and f but dependent on the algorithm), each omission (or authenticated Byzantine) processor contributes Θ(t), and each Byzantine failure, even a potential one, results in an additional Θ(n) messages sent.
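
As a rough reading of the per-processor accounting above (a back-of-envelope sketch, not a theorem quoted from the paper), summing the stated contributions over the n − f non-faulty and f faulty processors suggests the totals below. Here c is a hypothetical name for the constant contributed by each non-faulty processor, t is used exactly as in the abstract (which does not define it) and is assumed to be at least 1, and f is assumed to be at least 1.

```latex
% Crash failures: (n - f) non-faulty processors at a constant c each,
% plus f crashed processors at \Theta(f^{\varepsilon}) each.
\[
  M_{\mathrm{crash}}(n, f)
    = c\,(n - f) + f \cdot \Theta\bigl(f^{\varepsilon}\bigr)
    = \Theta\bigl(n + f^{1+\varepsilon}\bigr)
\]

% Omission (or authenticated Byzantine) failures:
% each of the f faulty processors contributes \Theta(t).
\[
  M_{\mathrm{omission}}(n, f)
    = c\,(n - f) + f \cdot \Theta(t)
    = \Theta\bigl(n + f\,t\bigr)
\]
```

The Byzantine case is left unsummed here because the abstract counts even potential failures and does not say which parameter bounds them.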

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Dariusz R. Kowalski (1)
  • Michał Strojnowski (2)

  1. Department of Computer Science, The University of Liverpool, UK
  2. Instytut Informatyki, Uniwersytet Warszawski, Poland
