Skip to main content
Log in

An efficient fault-tolerant routing algorithm in NoCs to tolerate permanent faults

  • Published:
The Journal of Supercomputing Aims and scope Submit manuscript

Abstract

With the possibility of integrating multiple cores into a single chip, research on the networks-on-chip (NoCs) as a kind of interconnection network has assumed great significance. In such networks, the effort is to provide broadband and extendable infrastructure for multi-core architectures. Communication between processors in an NoC is established using routing algorithms. Meanwhile, NoCs, like any other system, are prone to failure. With the increase in the number of network components on a chip, the probability of failure increases, too. Therefore, considering a fault-tolerant mechanism in NoCs seems to be a necessity. The main challenge of this work is combining performance and fault tolerance while reducing power, complexity and cost. In this paper, a fault-tolerant routing algorithm for tolerating static and dynamic faults in 2D Mesh NoCs and node failure model is presented. It should be taken into consideration that despite many other routing algorithms, the proposed method uses only one Virtual Channel. Results show that this method has lower latency and power consumption than SAVA and segment-based (SB) routing algorithms. It showed 2.91 and 12.74 % less power consumption than SAVA and SB, respectively, under SPLASH-2 traffic in an 8 \(\times \) 8 Mesh network with 8 faulty nodes. Its average latency, under Uniform, Transpose, and SPLASH-2 traffics in a 4 \(\times \) 4 Mesh with 4 faulty nodes and an 8 \(\times \) 8 Mesh network with 8 faulty nodes, was reduced by 4.39 and 14.08 % compared to SAVA and SB, respectively.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17

Similar content being viewed by others

References

  1. Olukotun K, Nafieh BA, Hammond L, Wilson K, Chang K (1996) The case for a single-chip multiprocessor. In: Proceedings of the 7th international conference on architectural support for programming languages and operating systems, pp 2–11

  2. Barroso LA et al (2000) Piranha: a scalable architecture based on single-chip multiprocessing. In: International symposium on computer architecture, pp 282–293

  3. Arabnia HR, Smith JW (1993) A reconfigurable interconnection network for imaging operations and its implementation using a multi-stage switching box. In: Proceedings of the 7th annual international high performance computing conference. The 1993 high performance computing: new horizons supercomputing symposium, Calgary, Alberta, pp 349–357

  4. Bhandarkar SM, Arabnia HR (1995) The Hough transform on a reconfigurable multi-ring network. J Parallel Distrib Comput 24(1):107–114

    Article  Google Scholar 

  5. Bhandarkar SM, Arabnia HR (1995) The REFINE multiprocessor: theoretical properties and algorithms. Parallel Comput J Elsevier 21(11):1783–1806

    Article  Google Scholar 

  6. Arabnia HR, Oliver MA (1987) Arbitrary rotation of raster images with SIMD machine architectures. Int J Eurograph Assoc Comput Graph Forum 6(1):3–12

    Article  Google Scholar 

  7. Arabnia HR (1990) A Parallel Algorithm for the Arbitrary Rotation of Digitized Images using Process-and-Data-Decomposition Approach. J Parallel Distrib Comput 10(2):188–193

    Article  Google Scholar 

  8. Jerger NE, Peh LS (2009) On-chip networks. In: Mark H (ed) Synthesis Lectures on Computer Architecture. Morgan & Claypool Publishers, Madison

  9. Wani MA, Arabnia HR (2003) Parallel edge-region-based segmentation algorithm targeted at reconfigurable multi-ring network. J Supercomput 25(1):43–63

    Article  MATH  Google Scholar 

  10. Arabnia HR, Bhandarkar SM (1996) Parallel stereocorrelation on a reconfigurable multi-ring network. J Supercomput (Springer Publishers) 10(3):243–270

    Article  MATH  Google Scholar 

  11. Dally WJ, Dennison LR, Harris D, Kan K, Xanthopoulos T (1994) The reliable router: a reliable and high-performance communications substrate for parallel computers. In: 1st International workshop on parallel computer routing and communication, pp 241–255

  12. Duato J, Mejia A, Palesi M, Flich J, Kumar S (2009) Region-based routing: a mechanism to support efficient routing algorithms in NoCs. IEEE Trans Very Large Scale Integr (VLSI) Syst 17(3):356–369

    Article  Google Scholar 

  13. Ni LM, McKinley PK (1993) A survey of wormhole routing techniques in direct networks. Computer 26(2):62–76

    Article  Google Scholar 

  14. Akbar R, Safaei F, Modallalkar SM (2015) A novel power efficient adaptive RED-based flow control mechanism for networks-on-chip. J Comput Electr Eng. doi:10.1016/j.compeleceng.2015.09.023

  15. Arabnia HR, Oliver MA (1989) A transputer network for fast operations on digitised images. Int J Eurograph Assoc Comput Graph Forum 8(1):3–12

    Article  Google Scholar 

  16. Bhandarkar SM, Arabnia HR, Smith JW (1995) A reconfigurable architecture for image processing and computer vision. Int J Pattern Recognit Artif Intell (IJPRAI) 9(2):201–229 special issue on VLSI Algorithms and Architectures for Computer Vision, Image Processing, Pattern Recognition and AI

    Article  Google Scholar 

  17. Safaei F, Khonsari A, Gilak M (2010) A new performance measure for characterizing fault rings in interconnection networks. Inf Sci 180(5):664–678

    Article  MathSciNet  MATH  Google Scholar 

  18. Ozturk O, Kandemir M, Irwin MJ, Narayanan SHK (2010) Compiler directed network-on-chip reliability enhancement for chip multiprocessors. In: Proceedings of the ACM SIGPLAN/SIGBED 2010 conference on Languages, compilers, and tools for embedded systems (LCTES ’10). ACM, New York, NY, pp 85–94

  19. Arabnia HR (1995) A distributed stereocorrelation algorithm. In: Proceedings of computer communications and networks (ICCCN’95), IEEE, pp 479–482

  20. Arabnia HR, Oliver MA (1987) A transputer network for the arbitrary rotation of digitised images. Comput J 30(5):425–433

    Article  Google Scholar 

  21. Boppana RV, Chalasani S (1995) Fault-tolerant wormhole routing algorithms for mesh networks. IEEE Trans Comput 44(7):848–864

  22. Sui PH, Wang SD (1997) An improved algorithm for fault-tolerant wormhole routing in meshes. IEEE Trans Comput 46(9):1040–1042

    Article  MathSciNet  Google Scholar 

  23. Kim SP, Han T (1997) Fault-tolerant wormhole routing in mesh with overlapped solid fault regions. Parallel Comput 23:1937–1962

    Article  MathSciNet  MATH  Google Scholar 

  24. Park S, Youn JH, Bose B (2000) Fault-tolerant wormhole routing algorithms in meshes in the presence of concave faults. In: International parallel and distributed processing symposium, pp 633–638

  25. Glass CG, Ni L (1996) Fault-tolerant wormhole routing in meshes without virtual channels. IEEE Trans Parallel Distrib Syst 7(6):620–636

    Article  Google Scholar 

  26. Glass CG, Ni L (1994) The turn model for adaptive routing. J ACM 41(5):874–902

    Article  Google Scholar 

  27. Cunningham CM, Avresky DR (1995) Fault-tolerant adaptive routing for two dimensional meshes. In: Symposium on high-performance computer architecture, pp 122–131

  28. Nordbotten NA, Skeie T (2007) A routing methodology for dynamic fault tolerance in meshes and tori. In: International conference on high performance, pp 514–527

  29. Gomez ME, Duato J, Flich J, Lopez P, Robles A (2006) A routing methodology for achieving fault tolerance in direct networks. IEEE Trans Comput 55(4):400–415

    Article  Google Scholar 

  30. Mejia A, Flich J, Duato J, Reinemo S (2006) Segment-based routing: an efficient fault-tolerant routing algorithm for meshes and tori. In: Parallel and distributed processing symposium, Rhodes Island

  31. Safaei F, ValadBeigi M (2012) An efficient routing methodology to tolerate static and dynamic faults in 2-D Mesh networks-on-chip. Microprocess Microsyst 36(7):531–542

    Article  Google Scholar 

  32. Safaei F, Mortazavi A (2010) A novel routing algorithm for achieving static fault-tolerance in 2-D meshes. In: 10th International conference on computer and information technology (CIT), pp 2621–2627

  33. Lysne EAO (2004) Simple deadlock-free dynamic network reconfiguration. In: Lecture notes in computer science 3296

  34. Lysne O et al (2008) An efficient and deadlock-free network reconfiguration protocol. IEEE Trans Comput 57(6):762–779

    Article  MathSciNet  Google Scholar 

  35. Pinkston TM et al (2003) Deadlock-free dynamic reconfiguration schemes for increased network dependability. IEEE Trans Parallel Distrib Syst 14(8):780–794

    Article  Google Scholar 

  36. Mustafa NU, Ozturk O, Niar S (2016) Adaptive routing framework for network on chip architectures. In: Proceedings of the 2016 workshop on rapid simulation and performance evaluation: methods and tools (RAPIDO ’16), ACM, New York, NY, Article 5, p 5

  37. ValadBeigi M, Safaei F (2012) PDR: a protocol for dynamic network reconfiguration based on deadlock recovery scheme. Simul Model Pract Theory 24:59–70

    Article  Google Scholar 

  38. Lopez P, Duato J (1993) Deadlock-free adaptive routing algorithms for the 3-D torus: limitations and solutions. In: Bode A, Reeve M, Wolf G (eds) Parallel architectures and languages. Lecture Notes in Computer Science, vol 694. Springer, Berlin, pp 684–687

  39. Nayebi A et al (2007) XMulator: an object oriented XML-based simulator. In: Asia international conference on modeling and simulation, pp 128–132

  40. Kahng A, Li B, Peh L, Samadi K (2009) ORION 2.0: a fast and accurate NoC power and area model for early-stage design space exploration. In: Proceedings of design, automation test Europe (DATE), pp 423–428

  41. Dally WJ, Towles B (2004) Principles and practices of interconnection networks. Morgan Kaufmann Publishers, Burlington

    Google Scholar 

  42. SPLASH-2 (2008) http://www-flash.stanford.edu/apps/SPLASH

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Farshad Safaei.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Akbar, R., Etedalpour, A.A. & Safaei, F. An efficient fault-tolerant routing algorithm in NoCs to tolerate permanent faults. J Supercomput 72, 4629–4650 (2016). https://doi.org/10.1007/s11227-016-1749-0

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11227-016-1749-0

Keywords

Navigation