Static probabilistic worst case execution time estimation for architectures with faulty instruction caches

Real-Time Systems

Abstract

Semiconductor technology evolution suggests that permanent failure rates will increase dramatically with scaling, in particular for SRAM cells. While well-known approaches such as error correcting codes exist to recover from failures and provide fault-free chips, their growing cost will make them unaffordable in the future. Consequently, other approaches like fine-grained disabling and reconfiguration of hardware elements (e.g. individual functional units or cache blocks) will become economically necessary. This fine-grained disabling will degrade performance compared to a fault-free execution. To the best of our knowledge, all static worst-case execution time (WCET) estimation methods assume fault-free processors. Their results are no longer safe when fine-grained disabling of hardware components is used. In this paper, we provide the first method that statically calculates a probabilistic WCET bound in the presence of permanent faults in instruction caches. The proposed method derives a probabilistic WCET bound for a given program, cache configuration, and probability of cell failure. As our method relies on static analysis to bound the longest path, its probabilistic nature only stems from the probability that faults actually occur. Our method is computationally tractable because it does not require an exhaustive enumeration of all the possible combinations of faulty cache blocks. Experimental results show that it provides WCET estimates very close to, but never below, those of a method that derives probabilistic WCETs by enumerating all possible locations of faulty cache blocks. The proposed method not only makes it possible to quantify the impact of permanent faults on WCET estimates, but, most importantly, can be used in architectural exploration frameworks to select the most appropriate fault management mechanisms and design parameters for current and future chip designs.
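
The probabilistic ingredients named in the abstract (a probability of cell failure, disabled cache blocks, and a bound obtained without enumerating every fault combination) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that cell failures are independent, that a block is disabled as soon as one of its cells is faulty, and that each cache set contributes an independent timing penalty. The function names, the cache geometry, and the per-set penalty values are hypothetical; in the paper the penalties come from static cache analysis.

```python
from math import comb


def block_failure_probability(p_cell, bits_per_block):
    """P(block has at least one faulty cell), assuming independent cells."""
    return 1.0 - (1.0 - p_cell) ** bits_per_block


def faulty_ways_distribution(p_block, ways):
    """P(exactly k of `ways` blocks in one set are disabled), k = 0..ways."""
    return [comb(ways, k) * p_block ** k * (1.0 - p_block) ** (ways - k)
            for k in range(ways + 1)]


def convolve(dist_a, dist_b):
    """Distribution of the sum of two independent penalties.

    Each distribution is a dict mapping extra cycles -> probability.
    """
    out = {}
    for cycles_a, prob_a in dist_a.items():
        for cycles_b, prob_b in dist_b.items():
            total = cycles_a + cycles_b
            out[total] = out.get(total, 0.0) + prob_a * prob_b
    return out


def exceedance(dist, cycles):
    """P(penalty > cycles) for a penalty distribution."""
    return sum(prob for c, prob in dist.items() if c > cycles)


# Hypothetical configuration: 4-way cache, 8 sets, 32-byte blocks.
p_block = block_failure_probability(p_cell=1e-4, bits_per_block=32 * 8)
ways_dist = faulty_ways_distribution(p_block, ways=4)

# Hypothetical extra cycles when k ways of a set are disabled (k = 0..4).
penalty_per_k = [0, 100, 250, 500, 1200]
set_dist = {}
for k, prob in enumerate(ways_dist):
    set_dist[penalty_per_k[k]] = set_dist.get(penalty_per_k[k], 0.0) + prob

# Combine the 8 sets by convolution instead of enumerating fault maps.
total_dist = {0: 1.0}
for _ in range(8):
    total_dist = convolve(total_dist, set_dist)

print("P(block disabled) =", p_block)
print("P(penalty > 1000 cycles) =", exceedance(total_dist, 1000))
```

Convolving per-set distributions keeps the work proportional to the number of sets and distinct penalty values, whereas enumerating every combination of faulty blocks grows exponentially with the number of blocks.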

Notes

  1. In particular, an in-order pipeline with a constant latency per instruction is assumed, as in Slijepcevic et al. (2013).

  2. Due to the definition of the first-miss (FM) cache hit/miss classification (CHMC), variable \(x_b\) is split into two distinct variables: one for the first execution of the block (\(x^{first}_b\)) and another for subsequent executions (\(x^{next}_b\)).

  3. http://www.mrtc.mdh.se/projects/wcet/benchmarks.html.

  4. Nachos web site, http://www.cs.washington.edu/homes/tom/nachos/.

  5. In the conference version of the paper (Hardy and Puaut 2013) it was 270 s. The significant improvement comes from our complexity enhancement method consisting of injecting fault-insensitive frequency values into the different ILP systems (Sect. 4.2.1).

References

  • Borkar S, Karnik T, Narendra S, Tschanz J, Keshavarzi A, De V (2003) Parameter variations and impact on circuits and microarchitecture. In: DAC40, pp 338–342

  • Bowman K, Tschanz J, Wilkerson C, Lu SL, Karnik T, De V, Borkar S (2009) Circuit techniques for dynamic variation tolerance. In: DAC46. ACM, New York, pp 4–7. doi:10.1145/1629911.1629915

  • Cazorla FJ, Quiñones E, Vardanega T, Cucu L, Triquet B, Bernat G, Berger E, Abella J, Wartel F, Houston M, Santinelli L, Kosmidis L, Lo C, Maxim D (2013) Proartis: probabilistically analysable real-time systems. ACM Trans Embed Comput Syst 12:79

  • Cheng L, Gupta P, Spanos CJ, Qian K, He L (2011) Physically justifiable die-level modeling of spatial variation in view of systematic across wafer variability. IEEE Trans CAD Integr Circuits Syst 30(3):388–401

  • Chevochot P, Puaut I (1999) Scheduling fault-tolerant distributed hard real-time tasks independently of the replication strategies. In: 6th international conference on real-time computing and applications symposium, pp 356–363

  • Colin A, Puaut I (2001) A modular and retargetable framework for tree-based WCET analysis. In: Euromicro conference on real-time systems (ECRTS), Delft, pp 37–44

  • Engblom J (2002) Processor pipelines and static worst-case execution time analysis. PhD thesis, Uppsala University

  • Ferdinand C, Wilhelm R (1998) On predicting data cache behavior for real-time systems. In: LCTES ’98: proceedings of the ACM SIGPLAN workshop on languages, compilers, and tools for embedded systems, pp 16–30

  • Ghosh S, Melhem R, Mossé D (1997) Fault-tolerance through scheduling of aperiodic tasks in hard real-time multiprocessor systems. IEEE Trans Parallel Distrib Syst 8(3):272–284

  • Hamming R (1950) Error detecting and error correcting codes. Bell Syst Tech J 29(2):147–160

  • Hardy D, Puaut I (2008) WCET analysis of multi-level non-inclusive set-associative instruction caches. In: Proceedings of the 29th real-time systems symposium, pp 456–466

  • Hardy D, Puaut I (2013) Static probabilistic worst case execution time estimation for architectures with faulty instruction caches. In: 21st international conference on real-time networks and systems, RTNS 2013, Sophia Antipolis, October 17–18, pp 35–44

  • Hardy D, Sideris I, Ladas N, Sazeides Y (2012) The performance vulnerability of architectural and non-architectural arrays to permanent faults. In: Proceedings of the 45th annual IEEE/ACM international symposium on microarchitecture, MICRO’12, pp 48–59

  • Höfig K (2012) Failure-dependent timing analysis: a new methodology for probabilistic worst-case execution time analysis. In: Schmitt J (ed) Measurement, modelling, and evaluation of computing systems and dependability and fault tolerance. Lecture notes in computer science, vol 7201. Springer, Berlin, pp 61–75

  • Kosmidis L, Abella J, Quiñones E, Cazorla FJ (2013) A cache design for probabilistically analysable real-time systems. In: Proceedings of the conference on design, automation and test in Europe, DATE ’13, pp 513–518

  • Li X, Roychoudhury A, Mitra T (2006) Modeling out-of-order processors for WCET estimation. Real Time Syst J 34(3):195–227

  • Li YTS, Malik S (1995) Performance analysis of embedded software using implicit path enumeration. In: Gerber R, Marlowe T (eds) LCTES ’95: proceedings of the ACM SIGPLAN 1995 workshop on languages, compilers, & tools for real-time systems, vol 30, New York, pp 88–98

  • Maxim D, Houston M, Santinelli L, Bernat G, Davis RI, Cucu-Grosjean L (2012) Re-sampling for statistical timing analysis of real-time systems. In: Proceedings of the 20th international conference on real-time and network systems, RTNS ’12. ACM, New York, pp 111–120. doi:10.1145/2392987.2393001

  • McNairy C, Mayfield J (2005) Montecito error protection and mitigation. In: HPCRI ’05: 1st workshop on high performance computing reliability issues

  • Mueller F (2000) Timing analysis for instruction caches. Real Time Syst 18(2–3):217–247

  • Nassif SR, Mehta N, Cao Y (2010) A resilience roadmap. In: DATE, pp 1011–1016

  • Punnekkat S, Burns A, Davis R (2001) Analysis of checkpointing for real-time systems. Real Time Syst 20(1):83–102

  • Puschner P, Schedl AV (1997) Computing maximum task execution times: a graph based approach. Proc IEEE Real Time Syst Symp 13:67–91

  • Reineke J, Grund D, Berg C, Wilhelm R (2007) Timing predictability of cache replacement policies. Real Time Syst 37(2):99–122

  • Slijepcevic M, Kosmidis L, Abella J, Quiñones E, Cazorla F (2013) DTM: degraded test mode for fault-aware probabilistic timing analysis. In: 2013 25th Euromicro conference on real-time systems (ECRTS), pp 237–248

  • Theiling H, Ferdinand C, Wilhelm R (2000) Fast and precise WCET prediction by separated cache and path analyses. Real Time Syst 18(2–3):157–179

  • Wilhelm R, Engblom J, Ermedahl A, Holsti N, Thesing S, Whalley D, Bernat G, Ferdinand C, Heckmann R, Mitra T, Mueller F, Puaut I, Puschner P, Staschulat J, Stenström P (2008) The worst-case execution-time problem-overview of methods and survey of tools. ACM Trans Embed Comput Syst 7(3):36:1–36:53. doi:10.1145/1347375.1347389

  • Zhou ST, Katariya S, Ghasemi H, Draper S, Kim NS (2010) Minimizing total area of low-voltage SRAM arrays through joint optimization of cell size, redundancy, and ECC. In: 2010 IEEE international conference on computer design (ICCD), pp 112–117. doi:10.1109/ICCD.2010.5647605

Acknowledgments

The authors would like to thank Jaume Abella, Benjamin Lesage, Bastien Pasdeloup, Erven Rohou and André Seznec for their fruitful comments on earlier versions of this paper.

Author information

Corresponding author

Correspondence to Damien Hardy.

About this article

Cite this article

Hardy, D., Puaut, I. Static probabilistic worst case execution time estimation for architectures with faulty instruction caches. Real-Time Syst 51, 128–152 (2015). https://doi.org/10.1007/s11241-014-9212-x

  • DOI: https://doi.org/10.1007/s11241-014-9212-x
