StealthWorks: Emulating Memory Errors

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 6418)

Abstract

A study of Google’s data center revealed that the incidence of main memory errors is surprisingly high. These errors can corrupt applications and the system itself, reducing reliability. The high error rate indicates that new resiliency techniques will be vital for future memories. Developing such techniques requires a framework for conducting flexible and repeatable experiments. This paper describes such a framework, StealthWorks, which facilitates research on software resilience by behaviorally emulating memory errors in a live system. We illustrate its use in studying program tolerance to random errors and in developing a new software technique that continuously tests memory for errors.

References

  1. Dell, T.J.: A white paper on the benefits of chipkill-correct ECC for PC server main memory. In: IBM Microelectronics Division (1997)

  2. Kumar, N., Childers, B.R., Soffa, M.L.: Low overhead program monitoring and profiling. In: ACM SIGPLAN/SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE 2005), pp. 28–34 (2005)

  3. Li, M.-L., Ramachandran, P., Sahoo, S.K., Adve, S.V., Adve, V.S., Zhou, Y.: SWAT: An error resilient system. In: 4th Workshop on Silicon Errors in Logic - System Effects (2008)

  4. Li, M.-L., Ramachandran, P., Sahoo, S.K., Adve, S.V., Adve, V.S., Zhou, Y.: Understanding the propagation of hard errors to software and its implications on resilient system design. In: Architecture Support for Programming Languages and Operating Systems (ASPLOS 2008), pp. 265–276 (2008)

  5. Li, X., Huang, M.C., Shen, K.: A realistic evaluation of memory hardware errors and software system susceptibility. In: USENIX Conference (2010)

  6. Luk, C.-K., Cohn, R., Muth, R., Patil, H., Klauser, A., Lowney, G., Wallace, S., Reddi, V.J., Hazelwood, K.: Pin: Building customized program analysis tools with dynamic instrumentation. In: ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2005), pp. 190–200 (2005)

  7. Schroeder, B., Pinheiro, E., Weber, W.-D.: DRAM errors in the wild: a large-scale field study. In: International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS 2009), pp. 193–204 (2009)

  8. Scott, K., Kumar, N., Velusamy, S., Childers, B.R., Davidson, J.W., Soffa, M.L.: Retargetable and reconfigurable software dynamic translation. In: International Conference on Code Generation and Optimization (CGO 2003), pp. 36–47 (2003)

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rahman, M., Childers, B.R., Cho, S. (2010). StealthWorks: Emulating Memory Errors. In: Barringer, H., et al. Runtime Verification. RV 2010. Lecture Notes in Computer Science, vol 6418. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16612-9_27

  • DOI: https://doi.org/10.1007/978-3-642-16612-9_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-16611-2

  • Online ISBN: 978-3-642-16612-9

  • eBook Packages: Computer Science, Computer Science (R0)
