Kernel-Level Tolerance

Abstract

Negative bias temperature instability (NBTI) degrades the reliability of a processor by introducing new delay-induced faults. The effect of these delay variations is not spread uniformly across functional units and instructions: some are affected more, and are hence less reliable, than others. For the massive number of kernels executing on the functional units of GPUs, we propose a preventive method to ensure the absence of NBTI-induced timing errors during the GPU lifetime. This chapter presents an NBTI-aware, compiler-directed very long instruction word (VLIW) assignment scheme that uniformly distributes the stress of instructions, with the aim of minimizing aging of the GP-GPU architecture without any performance penalty. The proposed solution is an entirely software technique based on static workload characterization and online execution with NBTI monitoring. It equalizes the expected lifetime of each processing element by regenerating aging-aware healthy kernels that respond to the specific health state of the GP-GPU. We demonstrate our approach on the AMD Evergreen architecture, where iso-throughput executions of the healthy kernels reduce the NBTI-induced threshold voltage shift by up to 49% with architectural support for power-gating, and by up to 11% without it, compared to naive kernel executions. The kernel adaptation flow takes an average of 13 ms on a typical host machine, making it suitable for practical implementation.
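
The idea of distributing instruction stress uniformly across processing elements can be illustrated with a small sketch. It is not the chapter's algorithm: it assumes that all VLIW lanes are interchangeable, that the number of operations issued to a lane is an adequate proxy for its NBTI stress, and that a per-lane stress estimate is available. The function names `naive_assign` and `balanced_assign` are hypothetical. Each bundle issues the same number of operations under both policies, so throughput is unchanged; only the lane that receives each operation differs.

```python
from typing import List, Optional


def naive_assign(bundle_sizes: List[int], num_lanes: int) -> List[int]:
    """Pack each bundle's operations into the lowest-numbered lanes.

    Returns the per-lane operation count (used here as a stress proxy).
    """
    stress = [0] * num_lanes
    for size in bundle_sizes:
        if not 0 <= size <= num_lanes:
            raise ValueError("bundle does not fit in the available lanes")
        for lane in range(size):
            stress[lane] += 1
    return stress


def balanced_assign(
    bundle_sizes: List[int],
    num_lanes: int,
    initial_stress: Optional[List[int]] = None,
) -> List[int]:
    """Send each bundle's operations to the currently least-stressed lanes.

    `initial_stress` stands in for a monitored health state (an assumption
    of this sketch); it defaults to all lanes being fresh.
    """
    stress = list(initial_stress) if initial_stress is not None else [0] * num_lanes
    if len(stress) != num_lanes:
        raise ValueError("initial_stress must have one entry per lane")
    for size in bundle_sizes:
        if not 0 <= size <= num_lanes:
            raise ValueError("bundle does not fit in the available lanes")
        # Rank lanes by accumulated stress, breaking ties by lane index.
        ranked = sorted(range(num_lanes), key=lambda i: (stress[i], i))
        for lane in ranked[:size]:
            stress[lane] += 1
    return stress


if __name__ == "__main__":
    sizes = [2, 3, 1, 2, 2]  # operations per VLIW bundle (made-up workload)
    print("naive   :", naive_assign(sizes, 5))
    print("balanced:", balanced_assign(sizes, 5))
```

On this made-up workload the naive policy concentrates the operations on the first two lanes, while the balanced policy ends with every lane having executed the same number of operations. A real scheme would also have to respect lane-specific capabilities and data dependencies, which this sketch ignores.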

Keywords

Delay variations; Negative bias temperature instability (NBTI); Voltage threshold shift; Device aging; Kernel-level tolerance; Aging-aware kernels; Adaptive VLIW; Dynamic binary optimizer; GPU lifetime

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Electrical Engineering and Computer Sciences, University of California Berkeley, Berkeley, USA
  2. Integrated Systems Laboratory, ETH Zurich, Zürich, Switzerland
  3. Department of Computer Science and Engineering, University of California, San Diego, La Jolla, USA