Background and Related Work

Reliable Software for Unreliable Hardware

Abstract

This chapter presents background on the sources of emerging reliability threats (soft errors, process variation, and aging-induced effects) and reviews related work on soft error modeling and mitigation. Section 2.1 covers soft errors, starting with the basic transistor structure and its functionality, followed by the sources of soft errors and the mechanism by which they occur. Section 2.2 presents the basics of aging induced by negative-bias temperature instability (NBTI). Section 2.3 presents the sources of variability and the effects of manufacturing-induced process variation, along with the process variation model explained in Sect. 2.3.1. Section 2.4 discusses related work on soft error modeling and estimation at both the hardware and software layers. Section 2.5 presents state-of-the-art soft error mitigation techniques at both the hardware and software levels, moving from traditional to more advanced approaches. Because the focus of this work is on soft errors, most of the background concerns soft errors. Finally, Sect. 2.6 summarizes the related work.


Notes

  1. The Register Vulnerability Factor uses the live period of a register as a measure of its reliability.

  2. The Program Vulnerability Factor relates software reliability to the bits required for Architecturally Correct Execution in the programmer-visible architectural components (register file, ALU, etc.) while abstracting away the physical components (e.g., a processor may have 256 physical registers, of which only 32 are visible to the programmer).
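
As a rough illustration of the live-period idea behind the first note, the sketch below computes, for each register, the fraction of cycles during which it holds a value that will still be read. The trace format, the function name `register_live_fraction`, and the choice to count a value as live from its write to its last read are assumptions made for this example only; the chapter's actual formulation may differ.

```python
from collections import defaultdict


def register_live_fraction(trace, total_cycles):
    """Return, per register, the fraction of cycles it held a live value.

    trace: iterable of (cycle, op, reg) tuples sorted by cycle, where op is
    "W" (write) or "R" (read). This trace format is hypothetical.
    A value is treated as live from the write that creates it until its
    last read before the next write to the same register.
    """
    last_write = {}                 # reg -> cycle of the most recent write
    last_read = {}                  # reg -> cycle of the latest read of that value
    live_cycles = defaultdict(int)  # reg -> accumulated live cycles

    def close(reg):
        # Account for the current value's live interval if it was ever read.
        if reg in last_write and reg in last_read:
            live_cycles[reg] += last_read[reg] - last_write[reg]
        last_read.pop(reg, None)

    for cycle, op, reg in trace:
        if op == "W":
            close(reg)              # the previous value dies at this write
            last_write[reg] = cycle
        elif op == "R" and reg in last_write:
            last_read[reg] = cycle

    for reg in list(last_write):
        close(reg)                  # flush values still pending at trace end

    return {reg: live_cycles[reg] / total_cycles for reg in last_write}


if __name__ == "__main__":
    example = [
        (0, "W", "r1"), (4, "R", "r1"), (6, "W", "r1"),
        (10, "W", "r2"), (12, "R", "r2"), (18, "R", "r2"),
    ]
    print(register_live_fraction(example, total_cycles=20))
```

In this toy trace the second value written to `r1` is never read, so it adds nothing to the live time. That reflects the intuition that a fault in a dead value cannot affect program output.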



Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Rehman, S., Shafique, M., Henkel, J. (2016). Background and Related Work. In: Reliable Software for Unreliable Hardware. Springer, Cham. https://doi.org/10.1007/978-3-319-25772-3_2

  • DOI: https://doi.org/10.1007/978-3-319-25772-3_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25770-9

  • Online ISBN: 978-3-319-25772-3

  • eBook Packages: Engineering, Engineering (R0)
