Circuit-Level Soft-Error Mitigation

Part of the Frontiers in Electronic Testing book series (FRET, volume 41)


In nanometric technologies, circuits are increasingly sensitive to various kinds of perturbations. Soft errors, once a concern for space applications, have become a reliability issue at ground level. Alpha particles and atmospheric neutrons induce single-event upsets (SEUs), which affect memory cells, latches, and flip-flops, and single-event transients (SETs), which are initiated in the combinational logic and captured by the associated latches and flip-flops. To face this challenge, a designer must have at their disposal a variety of soft-error mitigation schemes adapted to various circuit structures, design architectures, and design constraints. In this chapter, we describe several SEU and SET mitigation schemes that can help designers meet their reliability constraints.


Keywords: Clock Cycle; Error Correct Code; Soft Error; Linear Feedback Shift Register; Transient Fault


Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. TIMA Laboratory (CNRS, Grenoble INP, UJF), Grenoble, France