Cheaper matrix clocks

  • Frédéric Ruget
Miscellaneous
Part of the Lecture Notes in Computer Science book series (LNCS, volume 857)

Abstract

Matrix clocks have useful properties that can be exploited in distributed database protocols and fault-tolerant protocols. Unfortunately, they are costly to implement, requiring storage and communication overhead of size \(\mathcal{O}(n^2)\) for a system of n sites. They are often considered infeasible when the number of sites is large.
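
For readers unfamiliar with the cost mentioned above, here is a minimal sketch of a textbook matrix clock. It is not taken from the paper: the class name `MatrixClock`, its methods, and the three-site demo are illustrative assumptions. It only shows why each site stores, and each message carries, n × n counters.

```python
class MatrixClock:
    """Textbook matrix clock for site `me` among `n` sites (illustrative only)."""

    def __init__(self, n, me):
        self.n = n
        self.me = me
        # m[i][j]: what this site knows about site i's knowledge of site j's progress.
        self.m = [[0] * n for _ in range(n)]

    def tick(self):
        # Count one local event.
        self.m[self.me][self.me] += 1

    def send(self):
        # Sending is an event; the whole n x n matrix is piggybacked on the message.
        self.tick()
        return [row[:] for row in self.m]

    def receive(self, sender, w):
        # Receiving is an event as well.
        self.tick()
        # Our own row absorbs everything the sender itself knew.
        for j in range(self.n):
            self.m[self.me][j] = max(self.m[self.me][j], w[sender][j])
        # Every row absorbs the sender's view of the corresponding site.
        for i in range(self.n):
            for j in range(self.n):
                self.m[i][j] = max(self.m[i][j], w[i][j])

    def known_by_all(self, site):
        # Lower bound on the progress of `site` that every site is known to have seen.
        return min(self.m[i][site] for i in range(self.n))


# Hypothetical three-site run: site 0 sends one message to site 1.
a = MatrixClock(3, me=0)
b = MatrixClock(3, me=1)
msg = a.send()
b.receive(sender=0, w=msg)
entries_sent = sum(len(row) for row in msg)  # n * n counters travel with the message
```

In this run a single message carries nine counters for three sites, which is the quadratic overhead the abstract refers to.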

In this paper, we first describe an efficient incremental algorithm to compute the matrix clock, which achieves storage and communication overhead of size \(\mathcal{O}(n)\) when the sites of the computation are “well synchronized”. Second, we introduce the k-matrix clock: an approximation to the genuine matrix clock that can be computed with a storage and communication overhead of size \(\mathcal{O}(kn)\). k-matrix clocks can be useful to implement fault-tolerant protocols for systems with crash failure semantics such that the maximum number of simultaneous faults is bounded by k−1.
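
The abstract does not spell out the incremental algorithm, so the following is only a generic sketch of the underlying idea of sending changes rather than the full matrix. It is not the paper's algorithm. The helper names `diff_entries` and `apply_entries`, the four-site example values, and the assumption of a reliable FIFO channel between the two sites are all hypothetical.

```python
def diff_entries(prev, curr):
    """Return {(row, col): value} for the entries of `curr` that differ from `prev`."""
    return {
        (r, c): v
        for r, row in enumerate(curr)
        for c, v in enumerate(row)
        if prev[r][c] != v
    }


def apply_entries(matrix, entries):
    """Merge received entries into `matrix` in place (counters never decrease)."""
    for (r, c), v in entries.items():
        matrix[r][c] = max(matrix[r][c], v)


n = 4
# What the sender last transmitted to this destination (assumed delivered in order).
last_sent = [[0] * n for _ in range(n)]
# The sender's matrix now: only two counters have moved since the last message.
current = [row[:] for row in last_sent]
current[0][0] = 3
current[0][2] = 1

delta = diff_entries(last_sent, current)  # only the changed entries are transmitted

# The receiver keeps a mirror of the sender's matrix and patches it with the delta.
mirror = [[0] * n for _ in range(n)]
apply_entries(mirror, delta)
```

Here two entries are transmitted instead of sixteen. The saving depends on how few entries change between consecutive messages, which is one way to read the abstract's condition that the sites be “well synchronized”.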

Key words

distributed systems, causality, logical time, matrix time, fault tolerance

Copyright information

© Springer-Verlag Berlin Heidelberg 1994

Authors and Affiliations

  • Frédéric Ruget
    • Chorus Systèmes, Montigny le Bx, France
