Data Partitioning Techniques for Partially Protected Caches to Reduce Soft Error Induced Failures

  • Kyoungwoo Lee
  • Aviral Shrivastava
  • Nikil Dutt
  • Nalini Venkatasubramanian
Part of the IFIP – The International Federation for Information Processing book series (IFIPAICT, volume 271)

Abstract

Soft errors, which increase exponentially with technology scaling, have become a serious design concern in deep sub-micron embedded systems. The Partially Protected Cache (PPC) is a promising microarchitectural feature for mitigating failures due to soft errors in embedded processors. A processor with a PPC maintains two caches, one protected and the other unprotected, both at the same level of the memory hierarchy. By identifying the data most prone to soft errors and mapping only that data to the protected cache, the failure rate can be significantly reduced at minimal power and performance penalty. While the effectiveness of PPCs has been demonstrated on multimedia applications, where the multimedia data is inherently resilient to soft errors, no such obvious data partitioning exists for applications in general. This paper proposes profile-based data partitioning schemes that are applicable to applications in general and effectively reduce failures due to soft errors at minimal power and performance overheads. Our experimental results demonstrate that our algorithm reduces the failure rate by 47× on benchmarks from MiBench while incurring only 0.5% performance and 15% power overheads.
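
The idea of profile-based partitioning can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper, whose details the abstract does not give. It is a simple greedy stand-in that assumes a hypothetical per-page vulnerability score obtained from profiling and a hypothetical limit on how many pages may be mapped to the protected cache. The names `partition_pages`, `vulnerability`, and `max_protected` are illustrative only.

```python
from typing import Dict, Set, Tuple


def partition_pages(
    vulnerability: Dict[int, float], max_protected: int
) -> Tuple[Set[int], float]:
    """Greedy stand-in for a profile-based partitioning scheme (assumption).

    vulnerability: hypothetical profiled score per page address; a higher
        score means the page's data is more exposed to soft errors.
    max_protected: hypothetical cap on pages mapped to the protected cache,
        standing in for the power and performance budget.

    Returns the set of pages mapped to the protected cache and the fraction
    of total profiled vulnerability left in the unprotected cache.
    """
    # Rank pages from most to least vulnerable according to the profile.
    ranked = sorted(vulnerability, key=lambda page: vulnerability[page], reverse=True)
    protected = set(ranked[:max_protected])

    total = sum(vulnerability.values())
    exposed = sum(v for page, v in vulnerability.items() if page not in protected)
    residual = exposed / total if total else 0.0
    return protected, residual


# Made-up profile: page address -> vulnerability score (illustrative values).
profile = {0x1000: 50.0, 0x2000: 30.0, 0x3000: 15.0, 0x4000: 5.0}
protected, residual = partition_pages(profile, max_protected=2)
print(sorted(hex(p) for p in protected), residual)
```

In this toy profile, the two highest-scoring pages go to the protected cache and one fifth of the profiled vulnerability stays unprotected. A realistic scheme would also need to weigh the miss-rate and energy impact of each mapping, which this sketch reduces to a single page cap.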


Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Kyoungwoo Lee (1)
  • Aviral Shrivastava (2)
  • Nikil Dutt (1)
  • Nalini Venkatasubramanian (1)

  1. Department of Computer Science, School of Information and Computer Sciences, University of California, Irvine, USA
  2. Department of Computer Science and Engineering, School of Computing and Informatics, Arizona State University, Tempe
