A Cooperative Approach to Virtual Machine Based Fault Injection

  • Thomas Naughton
  • Christian Engelmann
  • Geoffroy Vallée
  • Ferrol Aderholdt
  • Stephen L. Scott
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10104)


Resilience investigations often employ fault injection (FI) tools to study the effects of simulated errors on a target system. It is important to keep the system under test (SUT) isolated from the controlling environment in order to maintain control of the experiment. Virtual machines (VMs) have been used to aid these investigations due to the strong isolation properties of system-level virtualization. A key challenge in fault injection tools is to gain proper insight and context about the SUT. In VM-based FI tools, this challenge of target context is increased due to the separation between host and guest (VM). We discuss an approach to VM-based FI that leverages virtual machine introspection (VMI) methods to gain insight into the context of the target running within the VM. The key to this environment is the ability to provide basic information to the FI system that can be used to create a map of the target environment. We describe a proof-of-concept implementation and demonstrate its use to introduce simulated soft errors into an iterative solver benchmark running in the user space of a guest VM.
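To make the notion of a simulated soft error in an iterative solver concrete, the following is a minimal sketch in Python. It is not the VM-based tool described above and involves no VMI: it only illustrates flipping a single bit of a floating-point value inside a running solver. The names `flip_bit` and `jacobi`, the Jacobi method itself, the small 3x3 system, and the chosen iteration, index, and bit are all assumptions made for illustration.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value` with one bit of its IEEE 754 double encoding inverted.

    Bit 0 is the least significant mantissa bit; bit 63 is the sign bit.
    """
    if not 0 <= bit < 64:
        raise ValueError("bit must be in [0, 63]")
    (raw,) = struct.unpack("<Q", struct.pack("<d", value))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", raw ^ (1 << bit)))
    return flipped


def jacobi(A, b, tol=1e-10, max_iter=1000, inject=None):
    """Solve A x = b with Jacobi iteration; return (x, iterations_used).

    `inject` is an optional, hypothetical (iteration, index, bit) tuple.
    When given, one bit of x[index] is flipped once at that iteration to
    mimic a transient soft error in the solver's state.
    """
    n = len(b)
    x = [0.0] * n
    for it in range(1, max_iter + 1):
        x_new = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
        if inject is not None and it == inject[0]:
            _, idx, bit = inject
            x_new[idx] = flip_bit(x_new[idx], bit)
        # Largest change between successive iterates (includes any injected error).
        delta = max(abs(x_new[i] - x[i]) for i in range(n))
        x = x_new
        if delta < tol:
            return x, it
    return x, max_iter


# Illustrative diagonally dominant system whose exact solution is [1, 1, 1].
A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
b = [5.0, 6.0, 5.0]

if __name__ == "__main__":
    clean_x, clean_iters = jacobi(A, b)
    # Flip the lowest exponent bit of x[1] at iteration 5.
    faulty_x, faulty_iters = jacobi(A, b, inject=(5, 1, 52))
    print("clean :", clean_x, clean_iters)
    print("faulty:", faulty_x, faulty_iters)
```

In this sketch the fault is injected from inside the solver process, so the injector trivially knows where the solver state lives. A VM-based FI tool sits outside the guest and has to obtain that target context by other means, which is the challenge the abstract describes.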


Fault injection; Virtualization; Virtual machine introspection; Resilience tools



This material is based upon work supported by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research program.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Thomas Naughton (1), corresponding author
  • Christian Engelmann (1)
  • Geoffroy Vallée (1)
  • Ferrol Aderholdt (1)
  • Stephen L. Scott (1, 2)
  1. Oak Ridge National Laboratory, Computer Science and Mathematics Division, Oak Ridge, USA
  2. Computer Science, Tennessee Tech University, Cookeville, USA
