Bio-Inspired Online Fault Detection in NoC Interconnect

Chapter

Abstract

Technology scaling has enabled the integration of multiple processing cores on a single chip, with network-on-chip (NoC) becoming an interconnect standard for large-scale connectivity between cores. However, NoC components, like any other circuit components, become more susceptible to faults with further scaling. The ability to adapt and perform reliably in the presence of these faults is an emerging design challenge for NoC-based multiprocessor systems. A crucial requirement for such designs is to detect faults effectively at runtime, and in particular to differentiate between temporary and permanent faults. Developing interconnect architectures with online, low-cost fault detection capabilities remains largely unaddressed and is a major design challenge for current and future scalable NoC-based multiprocessor systems. This chapter introduces SMART, a novel “real-time” strategy for detecting faults in NoC interconnect that uses bio-inspired synapses and neurons to detect temporal and spatial faults. An analysis of fault scenarios and results from real-time experiments on an FPGA implementation of SMART, using the EMBRACE NoC as an example, are provided.

Keywords

Fault Detection; Inhibitory Synapse; Flip Flop; Cyclic Redundancy Check; Permanent Fault

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Intelligent Systems Research Centre, University of Ulster, Derry, UK
