Introduction

Kanellakis, Paris Christos; Shvartsman, Alex Allister

doi:10.1007/978-1-4757-5210-6_1

Part of the book series: The Springer International Series in Engineering and Computer Science (SECS, volume 401)

Abstract

This study of fault-tolerant parallel computation uses models of computation based on the parallel random access machine, or PRAM. The PRAM model is generally accepted as a convenient abstraction for defining and analyzing parallel algorithms. However, it makes some assumptions that call its practicality into question. The main such assumptions are global synchronization of processors, high-bandwidth concurrent access to shared memory, and infallibility of processors, interconnections, and memory. In this monograph we pursue the goal of preserving the high-level PRAM abstraction that makes it attractive for programming parallel algorithms, while narrowing the gap between PRAMs and realizable parallel machines. Our primary focus is the removal of the assumption that the processors are failure-free. In some settings we also show how to relax the assumption of global synchrony and how to limit shared memory access concurrency in fault-tolerant algorithms while preserving their efficiency.

Author information

Authors and Affiliations

Brown University, Providence, Rhode Island, USA
Paris Christos Kanellakis
Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
Alex Allister Shvartsman

About this chapter

Cite this chapter

Kanellakis, P.C., Shvartsman, A.A. (1997). Introduction. In: Fault-Tolerant Parallel Computation. The Springer International Series in Engineering and Computer Science, vol 401. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-5210-6_1

DOI: https://doi.org/10.1007/978-1-4757-5210-6_1
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-5177-9
Online ISBN: 978-1-4757-5210-6
eBook Packages: Springer Book Archive
