Performance Evaluation of Parallel Systems Employing Roll-Forward Checkpoint Schemes

  • Conference paper
Computational Science and Its Applications - ICCSA 2006 (ICCSA 2006)

Abstract

High performance and reliability are the main goals of parallel and distributed computing systems. To improve both, various checkpoint schemes have been proposed in the literature for decades. However, the lack of a general analytical model has made it difficult to compare the performance of systems employing different checkpoint schemes. This paper develops an analytical model for evaluating the relative response time of systems that employ checkpoint schemes. The model is applied to systems employing the RFC (Roll-Forward Checkpoint), DMR-F (Double Modular Redundancy for Forward recovery), and DST (Duplex with Self-Test) schemes. The results demonstrate the feasibility of the proposed model.

This research was supported by the MIC, Korea, under the ITRC support program supervised by the IITA (IITA-2005-C1090-0502-0009).

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Park, GL. et al. (2006). Performance Evaluation of Parallel Systems Employing Roll-Forward Checkpoint Schemes. In: Gavrilova, M.L., et al. Computational Science and Its Applications - ICCSA 2006. ICCSA 2006. Lecture Notes in Computer Science, vol 3984. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11751649_20

  • DOI: https://doi.org/10.1007/11751649_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-34079-9

  • Online ISBN: 978-3-540-34080-5

  • eBook Packages: Computer Science, Computer Science (R0)
