Software Quality Journal, Volume 24, Issue 1, pp 7–36

Finding fault with fault injection: an empirical exploration of distortion in fault injection experiments

  • Erik van der Kouwe
  • Cristiano Giuffrida
  • Andrew S. Tanenbaum

Abstract

It has become well established that software will never become bug-free, which has spurred research into mechanisms to contain faults and recover from them. Since such mechanisms deal with faults, fault injection is necessary to evaluate their effectiveness. However, little thought has been given to the question of whether fault injection experiments faithfully represent the fault model designed by the user. Correspondence with the fault model is crucial for drawing strong and general conclusions from experimental results. The aim of this paper is twofold: to make a case for carefully evaluating whether activated faults match the fault model, and to gain a better understanding of which parameters affect the deviation of the activated faults from the fault model. To achieve the latter, we instrumented a number of programs with our LLVM-based fault injection framework. We investigated the biases introduced by limited coverage, by parts of the program being executed more often than others, and by the nature of the workload. We evaluated the key factors that cause activated faults to deviate from the model and, based on these results, provide recommendations on how to reduce such deviations.
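
As a rough illustration of how limited coverage can distort an experiment, the sketch below compares the distribution of fault types that were injected (the fault model) with the distribution of those that were actually activated by the workload. The fault type names, the execution counts, and the use of total variation distance as a deviation measure are all assumptions made for this example; they are not taken from the paper or its framework.

```python
from collections import Counter


def distribution(fault_types):
    """Return the relative frequency of each fault type (empty dict if none)."""
    counts = Counter(fault_types)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {t: n / total for t, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Hypothetical injected faults: (fault type, execution count of the location).
# A count of 0 means the workload never reached that location.
injected = [
    ("missing-branch", 0),
    ("missing-branch", 0),
    ("wrong-constant", 120),
    ("wrong-constant", 3),
    ("missing-call", 0),
    ("missing-call", 45),
    ("missing-branch", 7),
    ("wrong-constant", 0),
]

# Fault model: the mix of fault types over everything that was injected.
model = distribution([t for t, _ in injected])

# Activated faults: only those at locations the workload executed at least once.
activated = distribution([t for t, n in injected if n > 0])

# 0.0 means the activated faults match the model; 1.0 means no overlap at all.
distortion = total_variation(model, activated)
print(f"deviation from fault model: {distortion:.3f}")
```

In this made-up data set only half of the injected faults are activated, and the uncovered locations hold a disproportionate share of one fault type, so the activated mix no longer matches the intended one.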

Keywords

Fault injection · LLVM · Reliability

Notes

Acknowledgments

This research was supported in part by European Research Council Grant 227874.

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Erik van der Kouwe
  • Cristiano Giuffrida
  • Andrew S. Tanenbaum

Computer Science Department, Faculty of Sciences, VU University Amsterdam, Amsterdam, The Netherlands
