Journal of Computer Science and Technology, Volume 28, Issue 6, pp 1045–1053

RevivePath: Resilient Network-on-Chip Design Through Data Path Salvaging of Router

  • Yin-He Han
  • Cheng Liu
  • Hang Lu
  • Wen-Bo Li
  • Lei Zhang
  • Xiao-Wei Li
Regular Paper

Abstract

Network-on-Chip (NoC), with its excellent scalability and high bandwidth, is considered the most promising communication architecture for complex integrated systems. However, NoC reliability is becoming increasingly challenging as semiconductor feature sizes shrink and integration density increases. Moreover, a single node failure in an NoC might destroy network connectivity and corrupt the entire system. Introducing redundancy is an efficient way to construct a resilient communication path. However, prior redundancy-based work either offers limited reliability because of coarse-grained protection or incurs even larger hardware overhead with fine-grained protection. In this paper, we observe that data path components such as links, buffers and crossbars in an NoC can be divided into multiple identical parallel slices, which can serve as inherent redundancy to enhance reliability. As long as one fault-free slice remains available, the proposed salvaging scheme, named RevivePath, keeps the overall data path functional. Furthermore, RevivePath uses direct redundancy to protect control path components such as the switch arbiter and routing computation, providing a full fault-tolerant scheme for the whole router. Experimental results show that it achieves high reliability with graceful performance degradation even under high fault rates.
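
The abstract does not spell out how a flit crosses a partially faulty data path. One plausible reading, offered here only as an illustrative sketch, is that the slices of a flit are time-multiplexed over whichever physical slice lanes are still fault-free. The function name send_flit, the per-lane health flags, and the cycle-counting model below are assumptions for illustration, not the authors' design.

```python
def send_flit(slices, healthy):
    """Send one flit, given as a list of slice payloads, over a sliced data path.

    healthy[i] is True when physical slice lane i is fault-free.
    Hypothetical model: each cycle, every healthy lane carries one
    not-yet-sent slice. Returns (received_slices, cycles_used).
    """
    lanes = [i for i, ok in enumerate(healthy) if ok]
    if not lanes:
        # With no fault-free slice left, the data path cannot be salvaged.
        raise RuntimeError("no fault-free slice left")

    received = [None] * len(slices)
    cycles = 0
    # Walk through the flit in batches no larger than the number of healthy lanes.
    for start in range(0, len(slices), len(lanes)):
        batch = range(start, min(start + len(lanes), len(slices)))
        for _lane, idx in zip(lanes, batch):
            received[idx] = slices[idx]  # slice idx travels on a healthy lane
        cycles += 1
    return received, cycles


flit = [0xA, 0xB, 0xC, 0xD]
full = send_flit(flit, [True, True, True, True])
degraded = send_flit(flit, [False, True, False, False])
```

Under this model a fault-free four-slice path delivers the flit in one cycle, while a path with a single surviving slice still delivers it intact but needs four cycles, which matches the graceful performance degradation the abstract describes.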

Keywords

Network-on-Chip, fault-tolerant, on-chip router

Copyright information

© Springer Science+Business Media New York & Science Press, China 2013

Authors and Affiliations

  • Yin-He Han (1, 2), corresponding author
  • Cheng Liu (1, 2)
  • Hang Lu (1, 2)
  • Wen-Bo Li (1, 2)
  • Lei Zhang (1, 2)
  • Xiao-Wei Li (1, 2)

  1. State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
  2. University of Chinese Academy of Sciences, Beijing, China
