Performance Evaluation of a Two Level Error Recovery Scheme for Distributed Systems

  • B. S. Panda
  • Sajal K. Das
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2571)


Rollback recovery schemes are used in fault-tolerant distributed systems to minimize the computation lost when failures occur. One-level recovery schemes do not distinguish between different types of failures or their relative frequencies of occurrence, and therefore tolerate all failures with the same overhead. Two-level recovery schemes aim to provide low-overhead protection against the more probable failures, while protecting against other failures with possibly higher overhead. In this paper, we analyze a two-level recovery scheme due to Vaidya, taking the probability of task completion on a system with limited repairs as the performance metric.
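The performance metric named in the abstract can be illustrated with a small Monte Carlo sketch. This is a toy model built on assumptions that do not come from the paper: time is discrete, each step either makes one unit of progress or suffers a frequent (level-1) or rare (level-2) failure, a failure rolls the task back to the most recent checkpoint of the matching level, and every failure consumes one of a fixed number of repairs. All names and parameters (`work`, `c1`, `c2`, `p1`, `p2`, `max_repairs`) are hypothetical, and the code is not Vaidya's scheme or the authors' analysis.

```python
import random


def run_once(rng, work, c1, c2, p1, p2, max_repairs):
    """Simulate one task execution; return True if it completes.

    Assumed toy model (not taken from the paper):
      work        -- units of useful work the task needs
      c1, c2      -- progress between level-1 / level-2 checkpoints
      p1, p2      -- per-step probability of a level-1 / level-2 failure
      max_repairs -- number of failures the system can repair
    """
    progress = 0
    repairs_used = 0
    while progress < work:
        u = rng.random()
        if u < p1:
            interval = c1          # frequent failure: cheap, recent checkpoint
        elif u < p1 + p2:
            interval = c2          # rare failure: older, costlier checkpoint
        else:
            progress += 1          # no failure in this step
            continue
        if repairs_used == max_repairs:
            return False           # no repairs left: the task is lost
        repairs_used += 1
        # Roll back to the last checkpoint of the matching level.
        progress = (progress // interval) * interval
    return True


def estimate_completion_probability(work, c1, c2, p1, p2, max_repairs,
                                    trials=10_000, seed=0):
    """Estimate the fraction of simulated runs that finish the task."""
    rng = random.Random(seed)
    completed = sum(
        run_once(rng, work, c1, c2, p1, p2, max_repairs)
        for _ in range(trials)
    )
    return completed / trials


if __name__ == "__main__":
    # Illustrative parameter values only.
    for repairs in (0, 1, 2, 4):
        p = estimate_completion_probability(
            work=200, c1=10, c2=50, p1=0.01, p2=0.002, max_repairs=repairs
        )
        print(f"max_repairs={repairs}: estimated P(completion) = {p:.3f}")
```

Under these assumptions, varying `max_repairs` or the two checkpoint intervals shows how a limited repair budget affects the chance of finishing the task. The estimates say nothing about the numerical results reported in the paper.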




References

  1. K.M. Chandy, J.C. Browne, C.W. Dissly, and W.R. Uhrig, Analytic Models for Rollback and Recovery Strategies in Data Base Systems, IEEE Trans. Software Eng. 1 (1975) 100–110.
  2. S. Garg and K.F. Wong, Analysis of an Improved Distributed Checkpointing Algorithm, Technical Report WUCS-93-37, Dept. of Computer Science, Washington Univ., June 1993.
  3. E. Gelenbe, A Model for Roll-Back Recovery with Multiple Checkpoints, Proc. Second Int’l Conf. Software Eng. (1976) 251–255.
  4. E. Gelenbe, Model of Information Recovery Using the Method of Multiple Checkpointing, Automation and Control 4 (1976) 251–255.
  5. V.F. Nicola, Checkpointing and the Modeling of Program Execution Time, in: M.R. Lyu (Ed.), Software Fault Tolerance, John Wiley & Sons (1995) 167–188.
  6. V.F. Nicola and J.M. van Spanje, Comparative Analysis of Different Models of Checkpointing and Recovery, IEEE Trans. Software Eng. 16 (1990) 807–821.
  7. A.N. Tantawi and M. Ruschitzka, Performance Analysis of Checkpointing Strategies, ACM Trans. Computer Systems 2 (1984) 123–144.
  8. N.H. Vaidya, A Case for Two-Level Recovery Schemes, IEEE Trans. Computers 47 (6) (1998) 656–666.
  9. J.W. Young, A First Order Approximation to the Optimum Checkpoint Interval, Comm. ACM 17 (1974) 530–531.

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • B. S. Panda (1)
  • Sajal K. Das (2)
  1. Department of Mathematics, Indian Institute of Technology, New Delhi, India
  2. Department of Computer Science and Engineering, The University of Texas at Arlington, Arlington, USA
