
A Validated Model of a Fault-Tolerant System

  • Veena B. Mendiratta
  • Kishor S. Trivedi
Chapter
Part of the Asset Analytics book series (ASAN)

Abstract

The recovery and repair durations of large fault-tolerant systems generally span several orders of magnitude. Their distributions also violate the common modeling assumption that recovery and repair times are exponentially distributed. A reward-based semi-Markov model is presented that can be used to predict the steady-state availability of such systems and to evaluate design trade-offs with respect to their impact on system availability. The model has been validated against field outage data from a large system.
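
As a rough illustration of how a reward-based semi-Markov model yields steady-state availability, the sketch below uses a hypothetical three-state system (up, automatic recovery, manual repair). It is not the chapter's model: the states, the coverage probability, and all durations are made-up assumptions. It relies on the standard result that the long-run fraction of time spent in a state of a semi-Markov process depends on the sojourn-time distributions only through their means, so non-exponential recovery and repair times need no special treatment at this step.

```python
def embedded_stationary(P, iters=2000):
    """Stationary vector of the embedded jump chain with transition matrix P.

    Uses a 'lazy' power iteration (stay put with probability 1/2), which has
    the same stationary vector as P and converges even if P is periodic.
    """
    n = len(P)
    nu = [1.0 / n] * n
    for _ in range(iters):
        moved = [sum(nu[i] * P[i][j] for i in range(n)) for j in range(n)]
        nu = [0.5 * nu[j] + 0.5 * moved[j] for j in range(n)]
    total = sum(nu)
    return [x / total for x in nu]


def steady_state_availability(P, mean_sojourn, reward):
    """Expected steady-state reward rate of a semi-Markov reward model."""
    nu = embedded_stationary(P)
    # Weight each state's visit frequency by its mean sojourn time.
    weights = [v * h for v, h in zip(nu, mean_sojourn)]
    total = sum(weights)
    pi = [w / total for w in weights]  # long-run fraction of time per state
    return sum(r * p for r, p in zip(reward, pi))


# Hypothetical example values (assumptions, not taken from the chapter).
coverage = 0.95                     # chance that automatic recovery succeeds
P = [
    [0.0, 1.0, 0.0],                # up -> recovery
    [coverage, 0.0, 1.0 - coverage],  # recovery -> up, or escalate to repair
    [1.0, 0.0, 0.0],                # repair -> up
]
mean_sojourn = [1000.0, 0.01, 2.0]  # mean hours spent in each state per visit
reward = [1.0, 0.0, 0.0]            # reward 1 while up, 0 while down

availability = steady_state_availability(P, mean_sojourn, reward)
print(f"steady-state availability: {availability:.6f}")
```

Changing the reward vector (for example, giving partial credit to a degraded state) or the coverage probability is one simple way to explore a design trade-off in such a sketch.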

Keywords

Availability modeling; Markov models; Fault-tolerant system

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Nokia Bell Labs, Naperville, USA
  2. Duke University, Durham, USA
