On stratified sampling for high coverage estimations

  • David Powell
  • Michel Cukier
  • Jean Arlat
Session 2: Fault Injection
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1150)


This paper addresses the problem of estimating the coverage of a fault tolerance mechanism through statistical processing of observations collected in fault-injection experiments. In an earlier paper, several techniques for sampling the fault/activity input space of a fault tolerance mechanism were presented. Various estimators based on simple sampling in the whole space and stratified sampling in a partitioned space were studied; confidence limits were derived based on a normal approximation. In this paper, the validity of this approximation is analyzed, especially for high coverage systems. The theory of confidence regions is then introduced to estimate the coverage without approximation when, for practical reasons, stratification is used. Three statistics are considered for defining confidence regions. It is shown that one of these statistics, a vectorial statistic, is often more conservative than the other two. However, only the vectorial statistic is computationally tractable. Using three hypothetical example systems, the results obtained are compared with those based on the normal approximation.
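To make the normal-approximation approach mentioned in the abstract concrete, the following is a minimal sketch of a stratified coverage estimate with a one-sided lower confidence limit. It is not the authors' confidence-region method. The function name, the argument names, and the plug-in variance formula (each stratum's observed coverage substituted into the binomial variance) are assumptions made for illustration.

```python
from statistics import NormalDist


def stratified_coverage_estimate(weights, injected, covered, confidence=0.99):
    """Return (point estimate, normal-approximation lower limit) for coverage.

    weights[i]  : assumed relative occurrence probability of stratum i (sums to 1)
    injected[i] : number of faults injected in stratum i
    covered[i]  : number of those faults that were correctly handled
    All names here are hypothetical; they do not come from the paper.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("stratum weights must sum to 1")

    # One-sided standard normal quantile for the requested confidence level.
    z = NormalDist().inv_cdf(confidence)

    estimate = 0.0
    variance = 0.0
    for p, n, d in zip(weights, injected, covered):
        c = d / n                               # observed coverage in this stratum
        estimate += p * c                       # weighted point estimate
        variance += p * p * c * (1.0 - c) / n   # plug-in binomial variance (assumption)

    lower = max(0.0, estimate - z * variance ** 0.5)
    return estimate, lower


if __name__ == "__main__":
    w = [0.5, 0.3, 0.2]
    n = [1000, 1000, 1000]
    # A few uncovered faults: the lower limit sits below the point estimate.
    print(stratified_coverage_estimate(w, n, [998, 1000, 995]))
    # No uncovered fault observed: the estimated variance is zero, so the
    # "lower limit" collapses onto the point estimate.
    print(stratified_coverage_estimate(w, n, [1000, 1000, 1000]))
```

In the second call, every stratum shows perfect observed coverage, so the plug-in variance vanishes and the approximate limit carries no uncertainty at all. This degenerate behaviour is one simple way to see why the abstract questions the approximation for high coverage systems.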


Keywords (machine-generated, not supplied by the authors): Confidence limit, Stratified sampling, Confidence region, Point estimator, Coverage factor





Copyright information

© Springer-Verlag Berlin Heidelberg 1996

Authors and Affiliations

  • David Powell (1)
  • Michel Cukier (1)
  • Jean Arlat (1)
  1. LAAS-CNRS, Toulouse, France
