Guaranteed Mutually Consistent Checkpointing in Distributed Computations

  • Zhonghua Yang
  • Chengzheng Sun
  • Abdul Sattar
  • Yanyan Yang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1538)

Abstract

In this paper, we explore the isomorphism between vector time and causality to characterize the consistency of a set of checkpoints in a distributed computation. A necessary and sufficient condition for determining whether a set of checkpoints can form a consistent global checkpoint is presented and proved using the isomorphism between vector time and causality. To the best of our knowledge, this is the first attempt to use the isomorphism for this purpose. This condition leads to a simple and straightforward algorithm for guaranteed mutually consistent global checkpointing. In our approach, a process can take a checkpoint whenever and wherever it wants, while other related processes may be asked to take an additional checkpoint to ensure mutual consistency. We also show how this condition and the resulting algorithm can be used to obtain the maximum and minimum global checkpoints, another important paradigm for distributed applications.
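To make the idea of checking checkpoint consistency with vector time concrete, the following is a minimal sketch of the well-known textbook vector-clock test for a consistent cut. It is not taken from the paper, and the paper's own condition may be stated differently. The function name `is_consistent` and the data layout are hypothetical, and the sketch assumes that each checkpoint is a local event that increments its process's own clock component.

```python
from typing import Sequence


def is_consistent(checkpoints: Sequence[Sequence[int]]) -> bool:
    """Textbook vector-clock test for a consistent set of checkpoints.

    checkpoints[i] is the vector timestamp of the checkpoint chosen for
    process i (hypothetical layout, one checkpoint per process).
    Assumption: taking a checkpoint is a local event that ticks
    component i of process i's clock.

    The set is consistent when no checkpoint has seen more events of
    process i than process i's own checkpoint has recorded; otherwise
    some message would be received before a checkpoint but sent after
    the sender's checkpoint (an orphan message).
    """
    n = len(checkpoints)
    return all(
        checkpoints[j][i] <= checkpoints[i][i]
        for i in range(n)
        for j in range(n)
    )


# Two processes that checkpoint independently: consistent.
independent = [[1, 0], [0, 1]]

# Process 0 checkpoints at [1, 0] and then sends a message at [2, 0];
# process 1 receives it at [2, 1] and checkpoints at [2, 2].
# The receive is recorded but the send is not: inconsistent.
orphan = [[1, 0], [2, 2]]

print(is_consistent(independent))
print(is_consistent(orphan))
```

In the second case, a scheme of the kind the abstract describes would presumably ask one of the processes to take an additional checkpoint so that the resulting set becomes mutually consistent; how that checkpoint is chosen is not specified here.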

Keywords

Mobile Host, Vector Time, Application Message, Vector Clock, Execution History
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Zhonghua Yang (1)
  • Chengzheng Sun (1)
  • Abdul Sattar (1)
  • Yanyan Yang (2)
  1. School of Computing and Information Technology, Griffith University, Nathan, Australia
  2. Gintic Institute, Nanyang Technological University, Singapore
