A Validated Model of a Fault-Tolerant System

  • Veena B. Mendiratta
  • Kishor S. Trivedi
Part of the Asset Analytics book series (ASAN)


The recovery and repair durations of large fault-tolerant systems generally span several orders of magnitude, and their distributions violate the common modeling assumption that recovery and repair times are exponentially distributed. A reward-based semi-Markov model is presented that can be used to predict the steady-state availability of such systems and to evaluate design trade-offs in terms of their impact on system availability. The model has been validated against field outage data from a large system.
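
To illustrate the kind of computation a reward-based semi-Markov model involves, the following is a minimal sketch and not the model from this chapter. It assumes a hypothetical three-state system (up, automatic recovery, manual repair), and the coverage probability, mean sojourn times, and reward rates are invented for illustration. It relies on the standard result that the steady-state probabilities of a semi-Markov process depend on the sojourn-time distributions only through their means, which is why non-exponential recovery and repair times can be accommodated.

```python
# Hypothetical 3-state semi-Markov availability sketch (all numbers are assumptions).
# States: 0 = up, 1 = automatic recovery, 2 = manual repair.

def embedded_stationary(P, iters=1000):
    """Stationary vector of the embedded DTMC, iterating the lazy chain (P + I) / 2.

    The lazy chain has the same stationary vector as P and converges for any
    irreducible P, even a periodic one.
    """
    n = len(P)
    nu = [1.0 / n] * n
    for _ in range(iters):
        stepped = [sum(nu[i] * P[i][j] for i in range(n)) for j in range(n)]
        nu = [0.5 * (nu[j] + stepped[j]) for j in range(n)]
    total = sum(nu)
    return [v / total for v in nu]

def steady_state_reward(P, mean_sojourn, reward):
    """Expected steady-state reward rate: sum_i r_i * pi_i, where
    pi_i = nu_i * h_i / sum_j nu_j * h_j (nu: embedded chain, h: mean sojourn times)."""
    nu = embedded_stationary(P)
    weights = [v * h for v, h in zip(nu, mean_sojourn)]
    total = sum(weights)
    return sum(r * w / total for r, w in zip(reward, weights))

coverage = 0.95                      # assumed probability that automatic recovery succeeds
P = [
    [0.0, 1.0, 0.0],                 # up -> recovery after a fault
    [coverage, 0.0, 1.0 - coverage], # recovery -> up, or escalate to manual repair
    [1.0, 0.0, 0.0],                 # repair -> up
]
mean_sojourn = [1000.0, 0.01, 4.0]   # assumed mean hours spent in each state
reward = [1.0, 0.0, 0.0]             # reward rate 1 only while the system is up

availability = steady_state_reward(P, mean_sojourn, reward)
print(f"steady-state availability = {availability:.6f}")
```

Changing the reward vector (for example, giving the recovery state a fractional reward for degraded service) or the assumed coverage value is one way such a sketch could be used to compare design trade-offs.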


Availability modeling, Markov models, Fault-tolerant system



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Nokia Bell Labs, Naperville, USA
  2. Duke University, Durham, USA
