European Conference on Parallel Processing

Euro-Par 2011: Parallel Processing Workshops, pp 272–281

Evaluating Application Vulnerability to Soft Errors in Multi-level Cache Hierarchy

  • Zhe Ma,
  • Trevor Carlson,
  • Wim Heirman &
  • Lieven Eeckhout

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 7156)

Abstract

As cache capacity increases dramatically with new processors, soft errors originating in cache memories have become a major reliability concern for high-performance processors. This paper presents an application-specific soft error vulnerability analysis to understand how an application responds to soft errors in different levels of caches. Based on a high-performance processor simulator called Graphite, we have implemented a fault injection framework that can selectively inject bit flips into different levels of caches. We simulated a wide range of relevant bit error patterns and measured the applications' vulnerabilities to bit errors. Our experimental results show that applications differ in their vulnerability to bit errors in different levels of caches (e.g., the application failure rate for one program is more than double that of another program for a given cache); the results also indicate the probabilities of different failure behaviors for the given applications.
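
To make the idea of bit-flip injection concrete, the following is a minimal Python sketch. It is not the paper's Graphite-based framework: the cache line is modeled as a plain byte array, the adjacent-bit burst pattern is only one assumed example of a bit error pattern, and the names `flip_bits`, `inject_burst`, `failure_rate`, and the outcome labels are hypothetical.

```python
import random


def flip_bits(data: bytearray, bit_positions) -> None:
    """Flip the given bit positions in place (bit 0 = least significant bit of byte 0)."""
    for pos in bit_positions:
        byte_index, bit_index = divmod(pos, 8)
        data[byte_index] ^= 1 << bit_index


def inject_burst(data: bytearray, rng: random.Random, burst_len: int = 1) -> int:
    """Flip `burst_len` adjacent bits at a random offset and return the start bit.

    Assumption: an adjacent-bit burst stands in for a multi-bit error pattern.
    """
    total_bits = len(data) * 8
    start = rng.randrange(total_bits - burst_len + 1)
    flip_bits(data, range(start, start + burst_len))
    return start


def failure_rate(outcomes) -> float:
    """Fraction of runs whose outcome is anything other than 'correct'.

    The outcome labels are hypothetical; the paper's failure categories may differ.
    """
    failures = sum(1 for outcome in outcomes if outcome != "correct")
    return failures / len(outcomes)


if __name__ == "__main__":
    rng = random.Random(0)           # fixed seed so an experiment can be repeated
    cache_line = bytearray(64)       # assumed 64-byte cache line, initially all zeros
    start = inject_burst(cache_line, rng, burst_len=2)
    print("flipped 2 adjacent bits starting at bit", start)
    print("failure rate:", failure_rate(["correct", "crash", "correct", "wrong_output"]))
```

In a real campaign the injection would target simulated cache state at a chosen cache level, and each outcome would come from comparing an injected run against a fault-free reference run.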

Keywords

  • Soft error
  • processor simulator
  • fault injection

This work is funded by Intel and by the Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT).

Author information

Authors and Affiliations

  1. Imec, Kapeldreef 75, 3000, Leuven, Belgium

    Zhe Ma

  2. Ghent University, Sint-Pietersnieuwstraat 41, 9000, Gent, Belgium

    Trevor Carlson, Wim Heirman & Lieven Eeckhout

  3. Intel ExaScience lab, Kapeldreef 75, 3000, Leuven, Belgium

    Zhe Ma, Trevor Carlson, Wim Heirman & Lieven Eeckhout

Editor information

Editors and Affiliations

  1. Scilytics, Koellnerhofgasse 3/15A, 1010, Vienna, Austria

    Michael Alexander

  2. ICAR-CNR, Via P. Castellino, 111, 80131, Napoli, Italy

    Pasqua D’Ambra

  3. University of Amsterdam, 1090, Amsterdam, Netherlands

    Adam Belloum

  4. Innovative Computing Laboratory, The University of Tennessee, US

    George Bosilca

  5. Department of Experimental Medicine and Clinic, University Magna Græcia, 88100, Catanzaro, Italy

    Mario Cannataro

  6. Computer Science Department, University of Pisa, Italy

    Marco Danelutto

  7. Second University of Naples, Italy

    Beniamino Di Martino

  8. TU München, Boltzmannstr. 3, 85748, Garching, Germany

    Michael Gerndt

  9. Equipe Runtime, INRIA Bordeaux Sud-Ouest, 33405, Talence Cedex, France

    Emmanuel Jeannot & Raymond Namyst

  10. Equipe HIEPACS, INRIA Bordeaux Sud-Ouest, 33405, Talence Cedex, France

    Jean Roman

  11. Computer Science and Mathematics Division, Oak Ridge National Laboratory, 37831-6164, Oak Ridge, TN, USA

    Stephen L. Scott

  12. Department of Scientific Computing, University of Vienna, Nordbergstr. 15/3C, 1090, Vienna, Austria

    Jesper Larsson Traff

  13. Computer Science and Mathematics Division, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA

    Geoffroy Vallée

  14. Technische Universität München, Germany

    Josef Weidendorfer

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ma, Z., Carlson, T., Heirman, W., Eeckhout, L. (2012). Evaluating Application Vulnerability to Soft Errors in Multi-level Cache Hierarchy. In: Alexander, M., et al. Euro-Par 2011: Parallel Processing Workshops. Euro-Par 2011. Lecture Notes in Computer Science, vol 7156. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29740-3_31

  • DOI: https://doi.org/10.1007/978-3-642-29740-3_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29739-7

  • Online ISBN: 978-3-642-29740-3

  • eBook Packages: Computer Science, Computer Science (R0)
