The Journal of Supercomputing, Volume 71, Issue 7, pp 2339–2364

A HoL-blocking aware mechanism for selecting the upward path in fat-tree topologies

  • C. Gómez
  • F. Gilabert
  • M. E. Gómez
  • P. López
  • J. Duato


Large cluster-based machines require efficient, high-performance interconnection networks, and routing is a key design issue in such networks. Adaptive routing usually outperforms deterministic routing, at the expense of introducing out-of-order packet delivery. Many commodity interconnects for clusters are based on fat-trees. The adaptive routing algorithm commonly used in fat-trees consists of a fully adaptive upward subpath followed by a deterministic downward subpath. Because the latter is determined by the former, choosing the most suitable upward path for each packet is critical to achieving good performance in fat-trees. In this paper, we present a mechanism for selecting the upward path in fat-trees that enables optimum use of the available network resources to achieve high network throughput. The proposed path selection is destination based, which reduces the head-of-line (HoL) blocking effect. The mechanism can be used either as a selection function (the provided path is the preferred one) or as a deterministic routing algorithm (the provided path is the only possible one). The results show that the resulting selection function outperforms all other known selection functions. Moreover, the proposed deterministic routing algorithm can achieve a similar, or even higher, level of performance than adaptive routing, while providing in-order packet delivery and a simpler switch implementation.
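To make the idea of a destination-based upward path concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not give the selection formula, so the digit-based rule used here (take the base-k digit of the destination identifier that corresponds to the switch stage) is an assumption, as are the function names `upward_port` and `select_port` and the parameters `k`, `stage`, and `free_ports`. It only illustrates how a single rule might serve both as a preferred choice for adaptive routing and as the only choice for deterministic routing.

```python
def upward_port(dest: int, stage: int, k: int) -> int:
    """Hypothetical destination-based rule (an assumption, not from the paper):
    use the base-k digit of the destination id at position `stage`
    as the index of the upward port."""
    return (dest // k**stage) % k


def select_port(dest: int, stage: int, k: int, free_ports: set,
                deterministic: bool = False) -> int:
    """Return the upward port for a packet at a switch of the given stage.

    deterministic=True : the destination-based port is the only option.
    deterministic=False: the destination-based port is preferred; if it is
                         busy, fall back to another free upward port.
    """
    preferred = upward_port(dest, stage, k)
    if deterministic or preferred in free_ports:
        return preferred
    # Adaptive fallback: take the lowest-numbered free port if there is one,
    # otherwise keep the preferred port and wait for it.
    return min(free_ports) if free_ports else preferred
```

Under this assumed rule, all packets addressed to the same destination request the same upward port at a given stage, so a congested destination tends to occupy one path rather than blocking packets bound for other destinations behind it.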


Keywords

  • Regular indirect topologies
  • Fat-trees
  • Adaptive routing
  • Deterministic routing
  • In-order delivery of packets



This work was supported by the Spanish Ministerio de Ciencia e Innovación (MICINN) and jointly financed with Plan E funds, under Grant TIN2009-14475-C04 as well as by Consolider-Ingenio 2010 under Grant CSD2006-00046.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • C. Gómez (1)
  • F. Gilabert (1)
  • M. E. Gómez (1)
  • P. López (1)
  • J. Duato (1)

  1. Departamento de Informática de Sistemas y Computación, Universitat Politècnica de Valencia, Valencia, Spain
