On-the-fly adaptive routing for dragonfly interconnection networks
Abstract
Adaptive deadlock-free routing mechanisms are required to handle variable traffic patterns in dragonfly networks. However, the distance-based deadlock avoidance mechanisms typically employed in dragonflies increase router cost and complexity as a function of the maximum allowed path length. This paper presents on-the-fly adaptive routing (OFAR), a routing/flow-control scheme that decouples routing from deadlock avoidance. OFAR allows in-transit adaptive routing with local and global misrouting without imposing dependencies between virtual channels, and relies on a deadlock-free escape subnetwork to avoid deadlock. This model lowers latency, increases throughput, and adapts faster to transient traffic than previously proposed mechanisms. Because the escape subnetwork has low capacity, it is prone to congestion; a simple congestion management mechanism based on injection restriction is considered to avoid this problem. Finally, reliability is addressed by introducing mechanisms to find multiple edge-disjoint Hamiltonian rings embedded in the dragonfly, which allows multiple escape subnetworks to be used.
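The requirement that several escape subnetworks be edge-disjoint Hamiltonian rings can be illustrated with a small check. The sketch below is not the paper's ring-finding mechanism: it only verifies two candidate rings, and it uses a toy fully connected set of five routers rather than a real dragonfly. The function names (`ring_edges`, `is_hamiltonian_ring`, `are_edge_disjoint`) and the example rings are assumptions made for illustration.

```python
from itertools import combinations


def ring_edges(ring):
    """Return the undirected links used by a ring given as a router sequence."""
    n = len(ring)
    return {frozenset((ring[i], ring[(i + 1) % n])) for i in range(n)}


def is_hamiltonian_ring(ring, nodes, links):
    """True if the ring visits every router exactly once over existing links."""
    return (
        len(ring) == len(nodes)
        and set(ring) == set(nodes)
        and ring_edges(ring) <= links
    )


def are_edge_disjoint(rings):
    """True if no physical link is shared by any two rings."""
    return all(
        ring_edges(a).isdisjoint(ring_edges(b))
        for a, b in combinations(rings, 2)
    )


# Toy topology (assumption): 5 routers, every pair directly linked.
nodes = range(5)
links = {frozenset(pair) for pair in combinations(nodes, 2)}

# Two hypothetical escape rings over the same routers.
ring_a = [0, 1, 2, 3, 4]
ring_b = [0, 2, 4, 1, 3]

both_valid = all(is_hamiltonian_ring(r, nodes, links) for r in (ring_a, ring_b))
disjoint = are_edge_disjoint([ring_a, ring_b])
```

Because the two rings share no link, the failure of any single link leaves at least one of them intact, which is the property that makes multiple escape subnetworks useful for reliability.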
Keywords
Interconnection network, Dragonfly network, OFAR, Adaptive routing, Deadlock avoidance
Notes
Acknowledgments
This work has been supported by the Spanish Ministry of Education, FPU grant AP2010-4900; the Spanish Science and Technology Commission (CICYT) under contracts TIN2010-21291-C02-02, TIN2012-34557 and TIN2013-46957-C2-2-P; the European Union FP7 under Agreements ICT-288777 (Mont-Blanc) and ERC-321253 (RoMoL); the European HiPEAC Network of Excellence; and the JSA no. 2013-119 as part of the IBM/BSC Technology Center for Supercomputing agreement.