Specification and Verification of Soft Error Performance in Reliable Electronic Systems

  • Allan L. Silburt
  • Adrian Evans
  • Ana Burghelea
  • Shi-Jie Wen
  • David Ward
  • Ron Norrish
  • Dean Hogle
  • Ian Perryman
Chapter
Part of the Frontiers in Electronic Testing book series (FRET, volume 41)

Abstract

This chapter describes the modeling, analysis, and verification methods used to achieve a reliability target set for transient outages in equipment used to build the backbone routing infrastructure of the Internet. We focus on the ASIC design and analysis techniques that were applied to achieve the targeted behavior in 65-nm technology. Considerable attention is paid to single event upsets in flip-flops and their potential to produce network-impacting events that are not systematically detected and controlled. Using random fault injection in large-scale RTL simulations and slack time distributions from static timing analysis, estimates of functional and temporal soft error masking effects were applied to a system soft error model to drive decisions on interventions such as the choice of flip-flops, parity protection of register groupings, and designed responses to detected upsets. Central to the design process is a modeling framework that accounts for the upset effects and relates them to the target specification. This enables the final system to be tested using large-area neutron beam radiation to confirm that the specification has been met.
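As a rough illustration of how functional and temporal masking estimates can feed a soft error model, the sketch below derates a raw flip-flop upset rate by two factors. It is a minimal example under stated assumptions, not the chapter's actual framework: the function names, the slack-over-period timing model, and all numeric values are hypothetical. Rates are expressed in FIT (failures per 10^9 device-hours).

```python
def functional_derating(injections: int, observed_failures: int) -> float:
    """Fraction of injected flip-flop upsets that caused an observable failure.

    Assumes results come from a random fault-injection campaign in RTL
    simulation; the remainder of the upsets were functionally masked.
    """
    if injections <= 0:
        raise ValueError("injections must be positive")
    return observed_failures / injections


def temporal_derating(slacks_ns: list[float], clock_period_ns: float) -> float:
    """Average probability that an upset is captured by a downstream flip-flop.

    Assumed simple model: an upset at a uniformly random time in the clock
    cycle reaches the next flip-flop in time only if it occurs within the
    path's slack, so the capture probability per path is slack / period.
    """
    if not slacks_ns or clock_period_ns <= 0:
        raise ValueError("need at least one slack and a positive clock period")
    fractions = [
        min(max(slack, 0.0), clock_period_ns) / clock_period_ns
        for slack in slacks_ns
    ]
    return sum(fractions) / len(fractions)


def effective_fit(raw_fit_per_flop: float, flop_count: int,
                  func_derating: float, temp_derating: float) -> float:
    """Derated failure rate contributed by a group of unprotected flip-flops."""
    return raw_fit_per_flop * flop_count * func_derating * temp_derating


# Hypothetical inputs, chosen only to show the arithmetic.
fd = functional_derating(injections=1000, observed_failures=50)
td = temporal_derating(slacks_ns=[1.0, 2.0, 3.0], clock_period_ns=4.0)
rate = effective_fit(raw_fit_per_flop=0.001, flop_count=1_000_000,
                     func_derating=fd, temp_derating=td)
print(f"functional={fd:.3f} temporal={td:.3f} effective FIT={rate:.1f}")
```

A derated rate of this kind can then be compared against the outage budget to decide whether a register group needs hardened flip-flops or parity protection.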

Acknowledgments

The authors would like to thank D. Mah, W. Obeidi, M. Pugi, M. Kucinska, A. Agrawal, M. Ruthruff, P. Narayan, J. Bolduc, N. Taranath, and P. Therrien of Cisco Systems for their contributions to the neutron beam testing effort. We would also like to thank M. Olmos and D. Gauthier at iROC Inc, as well as E. Blackmore at TRIUMF and A. Prokofiev at TSL, for their critical support at the neutron test facilities. Finally, we would like to thank David Jones at Xtreme EDA for his contributions to the error masking investigations. The TRIUMF facility is supported by funding from the National Research Council of Canada.


Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Allan L. Silburt (1)
  • Adrian Evans (1)
  • Ana Burghelea (1)
  • Shi-Jie Wen (2)
  • David Ward (3)
  • Ron Norrish (4)
  • Dean Hogle (5)
  • Ian Perryman (6)

  1. SSE Silicon Group, Cisco Systems, Kanata, Canada
  2. Advanced Manufacturing Technology Centre, Cisco Systems, San Jose, USA
  3. Juniper Networks, Sunnyvale, USA
  4. Technical Failure Analysis, Cisco Systems, San Jose, USA
  5. SSE CRS Hardware, Cisco Systems, San Jose, USA
  6. Ian Perryman & Associates, Kanata, Canada
