The Journal of Supercomputing

Volume 72, Issue 12, pp 4629–4650

An efficient fault-tolerant routing algorithm in NoCs to tolerate permanent faults

  • Reza Akbar
  • Ali Asghar Etedalpour
  • Farshad Safaei


With the possibility of integrating multiple cores into a single chip, research on networks-on-chip (NoCs) as a kind of interconnection network has assumed great significance. Such networks aim to provide a high-bandwidth, scalable infrastructure for multi-core architectures. Communication between processors in an NoC is established using routing algorithms. NoCs, like any other system, are prone to failure, and the probability of failure increases with the number of network components on a chip. A fault-tolerant mechanism is therefore a necessity in NoCs. The main challenge of this work is combining performance and fault tolerance while reducing power, complexity and cost. This paper presents a fault-tolerant routing algorithm that tolerates static and dynamic faults in 2D Mesh NoCs under the node failure model. Unlike many other routing algorithms, the proposed method uses only one virtual channel. Results show that this method has lower latency and power consumption than the SAVA and segment-based (SB) routing algorithms. It consumed 2.91% and 12.74% less power than SAVA and SB, respectively, under SPLASH-2 traffic in an 8 \(\times\) 8 Mesh network with 8 faulty nodes. Its average latency, under Uniform, Transpose, and SPLASH-2 traffic in a 4 \(\times\) 4 Mesh with 4 faulty nodes and an 8 \(\times\) 8 Mesh network with 8 faulty nodes, was 4.39% and 14.08% lower than that of SAVA and SB, respectively.
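To make the node failure model concrete, the following minimal sketch shows why plain dimension-order (XY) routing breaks when a faulty node lies on its path in a 2D mesh, and how a fault-free detour can exist. It is an illustration only and not the algorithm proposed in this paper: the function names `xy_route` and `detour_route`, the fault positions, and the use of breadth-first search are all assumptions chosen for clarity. In particular, the search does not address deadlock freedom or the single-virtual-channel constraint that the paper targets.

```python
from collections import deque


def xy_route(src, dst):
    """Dimension-order (XY) route: travel along X first, then along Y."""
    (x, y), (dx, dy) = src, dst
    path = [(x, y)]
    while x != dx:
        x += 1 if dx > x else -1
        path.append((x, y))
    while y != dy:
        y += 1 if dy > y else -1
        path.append((x, y))
    return path


def detour_route(src, dst, size, faulty):
    """Shortest route that avoids faulty nodes (breadth-first search), or None."""
    if src in faulty or dst in faulty:
        return None
    parent = {src: None}  # also serves as the visited set
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the parent links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if inside and nxt not in faulty and nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical 4 x 4 mesh with two faulty nodes (positions are made up).
faulty = {(1, 0), (2, 2)}
xy = xy_route((0, 0), (3, 0))
blocked = any(node in faulty for node in xy)   # XY path crosses node (1, 0)
detour = detour_route((0, 0), (3, 0), 4, faulty)
print("XY route blocked:", blocked)
print("Fault-free detour:", detour)
```

In this made-up scenario the XY route needs 3 hops but passes through a faulty node, while the shortest fault-free detour needs 5 hops. Extra hops of this kind are one reason the latency of a fault-tolerant routing algorithm depends on the number and placement of faulty nodes.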


Keywords: Fault-tolerant; Networks-on-chip; Mesh topology; Routing algorithm; Permanent faults; Fault region; Deadlock; Power consumption



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Reza Akbar (1)
  • Ali Asghar Etedalpour (1)
  • Farshad Safaei (1)
  1. Faculty of Computer Science and Engineering, Shahid Beheshti University G.C., Tehran, Iran
