Soft Error Mitigation in Soft-Core Processors
Abstract
This chapter presents the approaches and techniques available in the literature for fault mitigation in soft-core processors, with special emphasis on hybrid hardware/software solutions.
Keywords
Fault Injection; Transient Fault; Control Flow Graph; Register Transfer Level; Very Long Instruction Word
Copyright information
© Springer International Publishing Switzerland 2016