Implementing Snapshot Objects on Top of Crash-Prone Asynchronous Message-Passing Systems

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10048)

Abstract

Distributed snapshots, as introduced by Chandy and Lamport in the context of asynchronous failure-free message-passing distributed systems, are consistent global states through which the observed distributed application might have passed. It appears that two such distributed snapshots cannot necessarily be compared (in the sense of determining which one of them is the “first”). By contrast, snapshots introduced in asynchronous crash-prone read/write distributed systems are totally ordered, which greatly simplifies their use by upper-layer applications.

In order to benefit from shared memory snapshot objects, it is possible to simulate a read/write shared memory on top of an asynchronous crash-prone message-passing system, and then build snapshot objects on top of it. This algorithm stacking is costly in both time and messages. To circumvent this drawback, this paper presents algorithms that build snapshot objects directly on top of asynchronous crash-prone message-passing systems. “Directly” means here “without building an intermediate layer such as a read/write shared memory”. To the authors' knowledge, the proposed algorithms are the first to provide such constructions. Interestingly enough, these algorithms are efficient and relatively simple.
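The contrast drawn above between incomparable distributed snapshots and totally ordered snapshot objects can be illustrated with a small sketch. It is not one of the paper's algorithms: `SnapshotObject` is a hypothetical sequential specification (one writer per entry, with a per-entry write counter added for illustration), and `comparable` is a helper introduced here. The cut example assumes two processes that exchange no messages, so that every vector of local event counts is a consistent global state.

```python
class SnapshotObject:
    """Sequential specification only: no concurrency, no message passing."""

    def __init__(self, n):
        self.values = [None] * n   # entry i is written only by process i
        self.seq = [0] * n         # number of writes applied to each entry

    def write(self, i, value):
        self.values[i] = value
        self.seq[i] += 1

    def snapshot(self):
        # Returns the whole array as it is at a single point in time.
        return tuple(self.values), tuple(self.seq)


def dominates(a, b):
    """True if vector a is component-wise greater than or equal to b."""
    return all(x >= y for x, y in zip(a, b))


def comparable(a, b):
    """True if one of the two vectors dominates the other."""
    return dominates(a, b) or dominates(b, a)


# Snapshots of one object: their write-counter vectors form a chain.
obj = SnapshotObject(2)
_, s1 = obj.snapshot()
obj.write(0, "a")
_, s2 = obj.snapshot()
obj.write(1, "b")
_, s3 = obj.snapshot()
chain_ok = comparable(s1, s2) and comparable(s2, s3) and comparable(s1, s3)

# Two consistent cuts of a two-process run without messages:
# neither is "before" the other.
cut_x, cut_y = (2, 1), (1, 2)
cuts_ok = comparable(cut_x, cut_y)
```

Here `chain_ok` evaluates to `True` and `cuts_ok` to `False`, which is the difference an upper-layer application sees: snapshot values returned by a snapshot object can always be sorted, whereas two consistent cuts may not be.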

M. Raynal—The French authors were partially supported by the French ANR project DESCARTES devoted to abstraction layers in distributed computing. The third author was supported in part by UNAM PAPIIT-DGAPA project IN107714.

Notes

  1. The main property of such a broadcast operation is that any message delivered by a (correct or faulty) process is delivered by all correct processes, and at least the messages broadcast by the correct processes are delivered. Hence all correct processes deliver the same set of messages S, and any faulty process delivers a subset of S. Algorithms implementing reliable broadcast in the presence of process crashes are described in many textbooks (e.g. [4, 17]).

  2. Let us notice that it is possible that several processes wrote snapshot values in repSnap[jm] to help \(p_j\) terminate its snapshot invocation. Any of these values is a correct snapshot value.
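The reliable broadcast property stated in note 1 can be sketched with the classic textbook rule "forward a message to everybody on first receipt, then deliver it". The code below is a minimal single-threaded simulation under stated assumptions, not the construction used in the paper: channels are assumed reliable, and `Process`, `Network`, `rb_broadcast`, `crash_after` and `demo` are hypothetical names introduced for illustration. The `crash_after` parameter makes the sender crash after that many point-to-point sends.

```python
from collections import deque


class Process:
    def __init__(self, pid, network):
        self.pid = pid
        self.network = network
        self.crashed = False
        self.delivered = []   # messages delivered, in delivery order
        self._seen = set()    # messages already relayed by this process

    def rb_broadcast(self, payload, crash_after=None):
        msg = (self.pid, payload)
        self._seen.add(msg)
        self._relay(msg, crash_after)
        if not self.crashed:
            self.delivered.append(msg)

    def _relay(self, msg, crash_after=None):
        for count, dest in enumerate(self.network.others(self.pid)):
            if crash_after is not None and count == crash_after:
                self.crashed = True   # simulated crash in the middle of the sends
                return
            self.network.send(dest, msg)

    def on_receive(self, msg):
        if self.crashed or msg in self._seen:
            return
        self._seen.add(msg)
        self._relay(msg)              # forward first ...
        self.delivered.append(msg)    # ... then deliver


class Network:
    """In-memory FIFO queue standing in for reliable asynchronous channels."""

    def __init__(self, n):
        self.queue = deque()
        self.procs = [Process(pid, self) for pid in range(n)]

    def others(self, pid):
        return [p.pid for p in self.procs if p.pid != pid]

    def send(self, dest, msg):
        self.queue.append((dest, msg))

    def run(self):
        while self.queue:
            dest, msg = self.queue.popleft()
            self.procs[dest].on_receive(msg)


def demo():
    net = Network(4)
    # Process 0 reaches only process 1 and then crashes.
    net.procs[0].rb_broadcast("m", crash_after=1)
    net.procs[2].rb_broadcast("x")
    net.run()
    return net
```

In `demo()`, process 0 crashes after a single send, yet processes 1, 2 and 3 all deliver both messages because process 1 relays `"m"` before delivering it. The crashed process delivers nothing, which is a subset of the common set S, as the note requires.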

References

  1. Afek, Y., Attiya, H., Dolev, D., Gafni, E., Merritt, M., Shavit, N.: Atomic snapshots of shared memory. J. ACM 40(4), 873–890 (1993)

  2. Attiya, H.: Efficient and robust sharing of memory in message-passing systems. J. Algorithms 34, 109–127 (2000)

  3. Attiya, H., Bar-Noy, A., Dolev, D.: Sharing memory robustly in message passing systems. J. ACM 42(1), 121–132 (1995)

  4. Attiya, H., Welch, J.: Distributed Computing: Fundamentals, Simulations and Advanced Topics, 2nd edn, 414 p. Wiley-Interscience (2004)

  5. Chandy, K.M., Lamport, L.: Distributed snapshots: determining global states of distributed systems. ACM Trans. Comput. Syst. 3(1), 63–75 (1985)

  6. Cooper, R., Marzullo, K.: Consistent detection of global predicates. In: Proceedings of Workshop on Parallel and Distributed Debugging. ACM press (1991)

  7. Delporte, C., Fauconnier, H., Rajsbaum, S., Raynal, M.: Implementing snapshot objects on top of crash-prone asynchronous message-passing systems, 15 p. Technical report 2037, IRISA, Université de Rennes (F) (2016)

  8. Dutta, P., Guerraoui, R., Levy, R., Vukolic, M.: Fast access to distributed atomic memory. SIAM J. Comput. 39(8), 3752–3783 (2010)

  9. Hélary, J.-M., Mostéfaoui, A., Raynal, M.: Communication-induced determination of consistent snapshots. IEEE TPDS 10(9), 865–877 (1999)

  10. Herlihy, M.P.: Wait-free synchronization. ACM Trans. Program. Lang. Syst. (TOPLAS) 13(1), 124–149 (1991)

  11. Herlihy, M.P., Wing, J.M.: Linearizability: a correctness condition for concurrent objects. ACM TOPLAS 12(3), 463–492 (1990)

  12. Imbs, D., Raynal, M.: Help when needed, but no more: efficient read/write partial snapshot. J. Parallel Distrib. Comput. 72(1), 1–12 (2012)

  13. Inoue, M., Masuzawa, T., Chen, W., Tokura, N.: Linear-time snapshot using multi-writer multi-reader registers. In: Tel, G., Vitányi, P. (eds.) WDAG 1994. LNCS, vol. 857, pp. 130–140. Springer, Heidelberg (1994). doi:10.1007/BFb0020429

  14. Lai, T.H., Yang, T.H.: On distributed snapshots. Inf. Process. Lett. 25, 153–158 (1987)

  15. Lamport, L.: Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21(7), 558–565 (1978)

  16. Mostéfaoui, A., Raynal, M.: Two-bit messages are sufficient to implement atomic read/write registers in crash-prone systems. In: Proceedings of 35th International ACM Symposium on Principles of Distributed Computing (PODC 2016), pp. 381–390. ACM Press (2016)

  17. Raynal, M.: Communication and Agreement Abstractions for Fault-Tolerant Asynchronous Distributed Systems, 251 p. Morgan & Claypool Publishers (2010). ISBN 978-1-60845-293-4

  18. Raynal, M.: Distributed Algorithms for Message-Passing Systems, 510 p. Springer (2013). ISBN 978-3-642-38122-5

  19. Raynal, M.: Concurrent Programming: Algorithms, Principles and Foundations, 515 p. Springer (2013). ISBN 978-3-642-32026-2

  20. Taubenfeld, G.: Synchronization Algorithms and Concurrent Programming, 423 p. Pearson Prentice-Hall (2006). ISBN 0-131-97259-6


Author information

Corresponding author

Correspondence to Michel Raynal.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Delporte-Gallet, C., Fauconnier, H., Rajsbaum, S., Raynal, M. (2016). Implementing Snapshot Objects on Top of Crash-Prone Asynchronous Message-Passing Systems. In: Carretero, J., Garcia-Blas, J., Ko, R., Mueller, P., Nakano, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2016. Lecture Notes in Computer Science, vol 10048. Springer, Cham. https://doi.org/10.1007/978-3-319-49583-5_26

  • DOI: https://doi.org/10.1007/978-3-319-49583-5_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49582-8

  • Online ISBN: 978-3-319-49583-5

  • eBook Packages: Computer Science (R0)
