On synchronisation in fault-tolerant data and compute intensive programs over a network of workstations

  • J. Smith
Workshop 16: Performance Evaluation and Prediction
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1300)


An application structured as a fault-tolerant bag of tasks adapts easily to changing resources. To be represented by a single bag of tasks, a computation must decompose into purely independent tasks. The work summarised here investigates the performance of structuring approaches that apply where this ideal is not possible, partly through analysis and partly through measurements of a realistic fault-tolerant computation.
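
The bag-of-tasks structure described above can be illustrated with a small sketch. This is not the paper's implementation: the function name `run_bag_of_tasks`, the round-robin scheduling, and the `fails_on` parameter used to simulate worker crashes are all assumptions made for illustration. It shows only the ideal case, in which tasks are purely independent, so a task taken by a crashed worker can simply be returned to the bag and picked up by any surviving worker.

```python
from collections import deque


def run_bag_of_tasks(tasks, workers, compute, fails_on=frozenset()):
    """Process independent tasks drawn from a shared bag (illustrative only).

    fails_on is a hypothetical set of (worker, task) pairs: the named worker
    is simulated to crash while holding that task, before committing a result.
    """
    bag = deque(tasks)
    results = {}
    failed_workers = set()
    turn = 0
    while bag:
        # Only surviving workers may take work; the worker set can shrink.
        alive = [w for w in workers if w not in failed_workers]
        if not alive:
            raise RuntimeError("no workers left to drain the bag")
        worker = alive[turn % len(alive)]
        turn += 1
        task = bag.popleft()
        if (worker, task) in fails_on:
            # Simulated crash: the task goes back into the bag, as if the
            # operation that removed it had been aborted.
            failed_workers.add(worker)
            bag.append(task)
            continue
        # Tasks are independent, so a result depends only on its own task.
        results[task] = compute(task)
    return results, failed_workers


if __name__ == "__main__":
    results, failed = run_bag_of_tasks(
        range(6), ["w0", "w1", "w2"], lambda t: t * t, fails_on={("w1", 1)}
    )
    print(results, failed)
```

Because no task depends on another, losing a worker costs only the re-execution of the task it held. The sketch does not cover computations whose tasks must synchronise with one another, which is the case the paper addresses.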





Copyright information

© Springer-Verlag Berlin Heidelberg 1997

Authors and Affiliations

  • J. Smith, Department of Computing Science, The University of Newcastle upon Tyne, Newcastle upon Tyne, UK
