Finding fault with fault injection: an empirical exploration of distortion in fault injection experiments

Software Quality Journal

Abstract

It is well established that software will never be bug free, which has spurred research into mechanisms that contain faults and recover from them. Because such mechanisms deal with faults, fault injection is necessary to evaluate their effectiveness. However, little thought has been given to whether fault injection experiments faithfully represent the fault model designed by the user. Correspondence with the fault model is crucial for drawing strong and general conclusions from experimental results. The aim of this paper is twofold: to make a case for carefully evaluating whether activated faults match the fault model, and to gain a better understanding of which parameters affect the deviation of the activated faults from the fault model. To achieve the latter, we instrumented a number of programs with our LLVM-based fault injection framework. We investigated the biases introduced by limited coverage, by parts of the program being executed more often than others, and by the nature of the workload. We evaluated the key factors that cause activated faults to deviate from the model and, based on these results, provide recommendations on how to reduce such deviations.
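
To make the notion of deviation concrete, the following minimal sketch compares the fault-type distribution of all injected faults (standing in for the fault model) with that of the faults whose locations were actually executed. It is an illustration only: the fault-type names, the sample data, and the use of total variation distance as the deviation measure are assumptions made here, not the metric or the data used in the paper.

```python
from collections import Counter


def distribution(fault_types):
    """Return the relative frequency of each fault type (empty dict if no faults)."""
    counts = Counter(fault_types)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {fault_type: count / total for fault_type, count in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions, in [0, 1]."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Hypothetical experiment log: (fault type, was the faulty code executed?).
# Type names and values are invented for illustration.
injected = [
    ("missing-check", True),
    ("missing-check", False),
    ("wrong-operand", True),
    ("wrong-operand", True),
    ("missing-call", False),
    ("missing-call", False),
]

# Intended mix: every injected fault counts, activated or not.
model = distribution(fault_type for fault_type, _ in injected)
# Observed mix: only faults whose location the workload actually reached.
activated = distribution(fault_type for fault_type, ran in injected if ran)

distortion = total_variation(model, activated)
print(f"deviation of activated faults from the model: {distortion:.3f}")
```

In this toy log the workload never reaches the code containing the `missing-call` faults, so that fault type vanishes from the activated set even though it makes up a third of the injected faults. This is the kind of coverage-induced bias the abstract refers to.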



Acknowledgments

This research was supported in part by European Research Council Grant 227874.

Author information


Correspondence to Erik van der Kouwe.


Cite this article

van der Kouwe, E., Giuffrida, C. & Tanenbaum, A.S. Finding fault with fault injection: an empirical exploration of distortion in fault injection experiments. Software Qual J 24, 7–36 (2016). https://doi.org/10.1007/s11219-014-9261-3


  • DOI: https://doi.org/10.1007/s11219-014-9261-3
