Network unfairness in dragonfly topologies

Fuentes, Pablo; Vallejo, Enrique; Camarero, Cristóbal; Beivide, Ramón; Valero, Mateo

doi:10.1007/s11227-016-1758-z

Published: 25 May 2016

Volume 72, pages 4468–4496, (2016)

The Journal of Supercomputing

Abstract

Dragonfly networks arrange network routers in a two-level hierarchy, providing a competitive cost-performance solution for large systems. Non-minimal adaptive routing (adaptive misrouting) is employed to fully exploit the path diversity and increase performance under adversarial traffic patterns. Network fairness issues arise in the dragonfly for several combinations of traffic pattern, global misrouting policy and traffic prioritization policy. Such unfairness prevents a balanced use of resources across the network nodes and severely degrades the performance of any application running on an affected node. This paper reviews the main causes of network unfairness in dragonflies, including a new adversarial traffic pattern that can easily occur in actual systems and congests all the global output links of a single router. Age-based arbitration is evaluated as a solution for the observed unfairness. Results show that age-based arbitration mitigates fairness issues, especially when using in-transit adaptive routing. However, with source adaptive routing, saturation under the new traffic pattern interferes with the mechanisms employed to detect remote congestion, and the problem grows with the network size. This makes source adaptive routing based on remote notifications prone to reduced performance in dragonflies, even when using age-based arbitration.
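
As a rough illustration of the age-based arbitration idea mentioned in the abstract, the sketch below grants an output port to the competing packet that has spent the longest time in the network. It is not taken from the paper: the `Packet` fields, the `age_based_grant` function and the tie-breaking rule are all hypothetical names and choices made for this example, and a real router would implement the comparison in hardware with bounded-width age counters.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Packet:
    source: int        # id of the injecting node (hypothetical field)
    inject_cycle: int  # cycle at which the packet entered the network


def age_based_grant(requests: Sequence[Optional[Packet]], now: int) -> Optional[int]:
    """Return the input port whose head packet is oldest, or None if all are idle.

    requests[i] is the head packet waiting at input port i (None when idle).
    Ties go to the lowest port index, an arbitrary choice for this sketch.
    """
    best_port: Optional[int] = None
    best_age = -1
    for port, pkt in enumerate(requests):
        if pkt is None:
            continue
        age = now - pkt.inject_cycle  # cycles spent in the network so far
        if age > best_age:
            best_port, best_age = port, age
    return best_port


# Port 0 holds a recently injected packet; ports 2 and 3 hold older ones.
requests = [Packet(0, 90), None, Packet(2, 40), Packet(3, 40)]
winner = age_based_grant(requests, now=100)
```

Because the grant depends on time in the network rather than on the port a packet arrives through, a packet that has been repeatedly delayed eventually becomes the oldest contender and wins, which is the intuition behind using age to counter throughput unfairness.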

Acknowledgments

This work has been supported by the Spanish Ministry of Education, FPU Grant FPU13/00337, the Spanish Science and Technology Commission (CICYT) under contracts TIN2012-34557 and TIN2013-46957-C2-2-P, and the European HiPEAC Network of Excellence.

Author information

Authors and Affiliations

University of Cantabria, Santander, Spain
Pablo Fuentes, Enrique Vallejo, Cristóbal Camarero & Ramón Beivide
Barcelona Supercomputing Center and Universitat Politècnica de Catalunya, Barcelona, Spain
Mateo Valero

Corresponding author

Correspondence to Enrique Vallejo.

About this article

Cite this article

Fuentes, P., Vallejo, E., Camarero, C. et al. Network unfairness in dragonfly topologies. J Supercomput 72, 4468–4496 (2016). https://doi.org/10.1007/s11227-016-1758-z

Published: 25 May 2016
Issue Date: December 2016
DOI: https://doi.org/10.1007/s11227-016-1758-z
