Memory-Aware Design Space Exploration for Reliability Evaluation in Computing Systems

  • Maha Kooli
  • Giorgio Di Natale
  • Alberto Bosio

Abstract

In this paper, we present an analytical methodology to measure the vulnerability of the memory components of a microprocessor-based computing system. The methodology is based on the lifetime and residence of data and instructions. The proposed approach considers only the software layer of the system, which makes it usable at an early design stage, when the hardware architecture is not yet fully defined. To account for the hardware memory hierarchy (i.e., RAM, caches, and register files) at the software level, we have developed a memory subsystem emulator that can be easily configured to support different features. The methodology enables a fast, simple, and low-cost cache-aware Design Space Exploration (DSE) that accurately evaluates the vulnerability of the RAM and the caches. A first set of experiments on MiBench benchmarks shows that such a DSE accurately evaluates the effects of faults in both the RAM and the caches. In addition, we validate the proposed approach on a real industrial test case, a Flight Management System for avionic applications. The results show that the proposed methodology gives precise results compared with a classical fault injection tool, and that it scales well with the complexity of the application.
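
To make the idea of lifetime-based vulnerability more concrete, the following Python sketch computes a vulnerability factor from a memory access trace. It is only an illustration and not the authors' tool: the names `Access`, `vulnerable_cycles`, and `vulnerability_factor`, the trace format, and the rule that a word is vulnerable only between an access and the next read of the same address are all assumptions made here. The abstract does not specify the exact metric.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Access:
    """One memory access in a hypothetical software-level trace."""
    cycle: int      # time of the access, in cycles
    address: int    # word address (assumed granularity: one word)
    is_write: bool  # True for a store, False for a load


def vulnerable_cycles(trace: Iterable[Access]) -> Dict[int, int]:
    """Return, per address, the number of cycles during which a fault matters.

    Assumed rule: the interval that ends with a read is vulnerable, because
    a corrupted value would be consumed. The interval that ends with a write
    is not, because the corrupted value is overwritten before use.
    """
    last_access: Dict[int, int] = {}
    vulnerable: Dict[int, int] = {}
    for acc in sorted(trace, key=lambda a: a.cycle):
        if not acc.is_write and acc.address in last_access:
            interval = acc.cycle - last_access[acc.address]
            vulnerable[acc.address] = vulnerable.get(acc.address, 0) + interval
        last_access[acc.address] = acc.cycle
    return vulnerable


def vulnerability_factor(trace: Iterable[Access],
                         memory_words: int,
                         total_cycles: int) -> float:
    """Fraction of (word, cycle) pairs in which a fault could be consumed."""
    if memory_words <= 0 or total_cycles <= 0:
        raise ValueError("memory_words and total_cycles must be positive")
    return sum(vulnerable_cycles(trace).values()) / (memory_words * total_cycles)


# Toy trace: two words in a 4-word memory observed for 20 cycles.
trace = [
    Access(cycle=0, address=0x10, is_write=True),
    Access(cycle=4, address=0x10, is_write=False),   # cycles 0..4 vulnerable
    Access(cycle=10, address=0x10, is_write=True),   # cycles 4..10 not vulnerable
    Access(cycle=12, address=0x10, is_write=False),  # cycles 10..12 vulnerable
    Access(cycle=2, address=0x20, is_write=True),
    Access(cycle=20, address=0x20, is_write=False),  # cycles 2..20 vulnerable
]
factor = vulnerability_factor(trace, memory_words=4, total_cycles=20)
```

In this sketch, changing `memory_words` or replaying the trace through a different cache model would be the software-level analogue of exploring one point of the design space. A real evaluation would also need to model residence in each level of the memory hierarchy.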

Keywords

  • Reliability
  • Design space exploration
  • Lifetime analysis

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. CEA, LETI, University Grenoble Alpes, Grenoble, France
  2. CNRS, Grenoble INP, TIMA, University Grenoble Alpes, Grenoble, France
  3. Ecole Centrale de Lyon, Lyon Institute of Nanotechnology, Lyon, France
