Fault-tolerant shared memory simulations

Extended abstract

  • Conference paper
STACS 96 (STACS 1996)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1046))

Abstract

We consider the problem of simulating a PRAM on a faulty distributed memory machine (DMM). We focus on dynamic faults, i.e., each processor or memory module fails independently with a fixed probability during the simulation of a PRAM step and remains faulty for the rest of the simulation. We build upon the randomized hashing-based simulations on non-faulty DMMs from [14], which achieve delay O(log log n) with high probability. We design and analyze routines for handling faults that occur during the simulation. Based on these routines, we present simulations on faulty DMMs with the same delay O(log log n) as in the non-faulty case, provided that the failure probability of processors and modules is small enough to guarantee that an expected linear number of processors and modules survive the simulation. Thus, resilience to memory or processor faults increases the delay of the simulation by at most a constant factor.
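
To make the survival condition concrete, the following is a minimal Python sketch of the fault model only, not of the paper's simulation. It assumes each of n units (processors or memory modules) fails independently with probability p in every step and never recovers, so a unit survives T steps with probability (1 - p)^T and the expected number of survivors is n(1 - p)^T. The function names and parameters (`expected_survivors`, `simulate_survivors`, `n`, `p`, `steps`, `seed`) are hypothetical and do not come from the paper.

```python
import random


def expected_survivors(n: int, p: float, steps: int) -> float:
    """Expected number of units alive after `steps` steps.

    Assumption: each unit fails independently with probability p per step
    and stays faulty, so it survives all steps with probability (1 - p)**steps.
    """
    return n * (1.0 - p) ** steps


def simulate_survivors(n: int, p: float, steps: int, seed: int = 0) -> int:
    """Monte Carlo run of the same assumed fault model; returns the survivor count."""
    rng = random.Random(seed)
    alive = n
    for _ in range(steps):
        # Every unit that is still alive fails with probability p in this step.
        failures = sum(1 for _ in range(alive) if rng.random() < p)
        alive -= failures  # failed units never recover
    return alive


if __name__ == "__main__":
    n, steps = 10_000, 100
    p = 1.0 / steps  # illustrative choice: p proportional to 1/steps
    print("expected:", expected_survivors(n, p, steps))
    print("simulated:", simulate_survivors(n, p, steps))
```

For instance, with p = 1/T the survival probability (1 - 1/T)^T approaches 1/e as T grows, so a constant fraction of the units is expected to survive. The exact bound on p that the paper requires is not stated in the abstract.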

Supported in part by DFG-Graduiertenkolleg “Parallele Rechnernetzwerke in der Produktionstechnik”, ME 872/4-1, by DFG-SFB 376 “Massive Parallelität”, by the Esprit Basic Research Action Nr. 7141 (ALCOM II), and by the SICMA Project funded by the European Community within the program on “Advanced Communication Technologies and Services”.

References

  1. J.R. Anderson and G.L. Miller: Optical communication for pointer based algorithms. Technical Report CRI 88-14, Computer Science Department, University of Southern California, Los Angeles, CA 90089-0782 USA, 1988.

  2. Ö. Babaoglu, R. Drummond and P. Stephenson: The impact of communication network properties on reliable broadcast protocols. Technical Report, Department of Computer Science, Cornell University, Ithaca, New York 1988.

  3. P. Berenbrink, F. Meyer auf der Heide and V. Stemann: Fault-tolerant shared memory simulations. Technical Report, to appear.

  4. B.S. Chlebus, A. Gambin and P. Indyk: PRAM computations resilient to memory faults. In Proc. of the 2nd Annual European Symposium on Algorithms, pp 401–412, 1994.

  5. F. Cristian, H. Aghili, D. Dolev and R. Strong: Atomic broadcast: from simple message diffusion to Byzantine agreement. Computer Science, 1984.

  6. A. Czumaj, F. Meyer auf der Heide and V. Stemann: Shared memory simulations with triple logarithmic delay. In Proc. of the 3rd Annual European Symposium on Algorithms, pp 46–59, 1995.

  7. M. Dietzfelbinger and F. Meyer auf der Heide: Simple, efficient shared memory simulations. In Proc. of the 5th Annual ACM Symposium on Parallel Algorithms and Architectures, pp 110–119, 1993.

  8. M. Dietzfelbinger and F. Meyer auf der Heide: How to distribute a hash table in a complete network. In Proc. of the 22nd ACM Symposium on Theory of Computing, pp 117–127, 1990.

  9. L.A. Goldberg, M. Jerrum and T. Leighton: A doubly logarithmic communication algorithm for the completely connected optical communication parallel computer. In Proc. of the 5th Annual ACM Symposium on Parallel Algorithms and Architectures, pp 300–309, 1993.

  10. L.A. Goldberg, Y. Matias and S. Rao: An optical simulation of shared memory. In Proc. of the 6th Annual ACM Symposium on Parallel Algorithms and Architectures, pp 257–267, 1994.

  11. R. Karp, M. Luby and F. Meyer auf der Heide: Efficient PRAM simulation on a distributed memory machine. In Proc. of the 24th Annual ACM Symposium on Theory of Computing, pp 318–326, 1992.

  12. P.D. MacKenzie, C.G. Plaxton and R. Rajaraman: On contention resolution protocols and associated phenomena. University of Texas at Austin, Technical Report 94-06, 1994.

  13. F. Meyer auf der Heide: Hashing strategies for simulating shared memory on distributed memory machines. In Proc. of the 1st Heinz Nixdorf Symposium “Parallel Architectures and their Efficient Use”, F. Meyer auf der Heide, B. Monien, A.L. Rosenberg, eds., pp 20–29, 1992.

  14. F. Meyer auf der Heide, C. Scheideler and V. Stemann: Exploiting storage redundancy to speed up randomized shared memory simulations. In Proc. of the 12th Annual Symposium on Theoretical Aspects of Computer Science, pp 267–278, 1995.

  15. J.P. Schmidt, A. Siegel and A. Srinivasan: Chernoff-Hoeffding bounds for applications with limited independence. In Proc. of the 4th ACM-SIAM Symposium on Discrete Algorithms, pp 331–340, 1993.

  16. A. Siegel: On universal classes of fast high performance hash functions, their time-space tradeoff and their applications. In Proc. of the 30th IEEE Annual Symposium on Foundations of Computer Science, pp 20–25, 1989.

  17. E. Upfal and A. Wigderson: How to share memory in a distributed system. J. Assoc. Comput. Mach. 34, pp 116–127, 1987.

Author information

Corresponding author

Correspondence to Volker Stemann.

Editor information

Claude Puech, Rüdiger Reischuk

Copyright information

© 1996 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Berenbrink, P., Meyer auf der Heide, F., Stemann, V. (1996). Fault-tolerant shared memory simulations. In: Puech, C., Reischuk, R. (eds) STACS 96. STACS 1996. Lecture Notes in Computer Science, vol 1046. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60922-9_16

  • DOI: https://doi.org/10.1007/3-540-60922-9_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-60922-3

  • Online ISBN: 978-3-540-49723-3

  • eBook Packages: Springer Book Archive
