Mitigating Soft Error Risks through Protecting Critical Variables and Blocks

  • Muhammad Shaikh Sadi
  • Md. Nazim Uddin
  • Md. Mizanur Rahman Khan
  • Jan Jürjens
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 154)


Downscaling of CMOS technologies has brought higher clock frequencies, smaller feature sizes, and lower power consumption, but it also reduces the soft error tolerance of VLSI circuits. Safety-critical systems are very sensitive to soft errors: a single bit flip can change the value of a critical variable and, as a result, completely alter the system's control flow, which may lead to system failure. To minimize the risk of soft errors, this paper proposes a novel methodology that detects and recovers from soft errors by considering only ‘critical code blocks’ and ‘critical variables’ rather than all variables and/or blocks in the whole program. The proposed method reduces space and time overhead in comparison to the existing dominant approach.
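The abstract does not spell out the detection and recovery mechanism, so the following is only a minimal sketch of the general idea of selective protection, not the authors' method. It assumes a generic technique (three redundant copies with majority voting) applied to a single critical variable, while non-critical data is left unprotected. The names `ProtectedInt`, `UncorrectableError`, `inject_bit_flip`, and `control_step` are hypothetical and chosen for illustration.

```python
class UncorrectableError(Exception):
    """Raised when no two copies of a protected value agree."""


class ProtectedInt:
    """Hypothetical wrapper that keeps three copies of a critical integer."""

    def __init__(self, value: int) -> None:
        self._copies = [value, value, value]

    def write(self, value: int) -> None:
        # Every write updates all three copies.
        self._copies = [value, value, value]

    def read(self) -> int:
        a, b, c = self._copies
        # Detection: find a value that at least two copies agree on.
        if a == b or a == c:
            good = a
        elif b == c:
            good = b
        else:
            raise UncorrectableError("all three copies disagree")
        # Recovery: overwrite any corrupted copy with the majority value.
        self._copies = [good, good, good]
        return good

    def inject_bit_flip(self, copy_index: int, bit: int) -> None:
        """Simulate a soft error by flipping one bit in one copy."""
        self._copies[copy_index] ^= 1 << bit


def control_step(setpoint: ProtectedInt, reading: int) -> str:
    # Only the critical variable (the setpoint that steers the branch) is
    # protected; `reading` is treated as ordinary, unprotected data.
    return "open_valve" if reading > setpoint.read() else "hold"
```

For example, with `setpoint = ProtectedInt(100)`, calling `setpoint.inject_bit_flip(0, 6)` corrupts one copy to 36. An unprotected comparison of a reading of 50 against 36 would take the wrong branch, whereas `control_step(setpoint, 50)` votes the corrupted copy out and still returns `"hold"`. Protecting only such control-relevant variables is what keeps the space and time overhead below that of duplicating everything.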


Keywords: Soft Errors, Safety Critical System, Critical Variable, Critical Block, Criticality Analysis, Risk Mitigation





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Muhammad Shaikh Sadi (1)
  • Md. Nazim Uddin (1)
  • Md. Mizanur Rahman Khan (1)
  • Jan Jürjens (2)

  1. Khulna University of Engineering and Technology (KUET), Khulna, Bangladesh
  2. TU Dortmund, Germany
