The Journal of Supercomputing, Volume 75, Issue 1, pp 255–271

Mesh-of-Torus: a new topology for server-centric data center networks

  • Peibo Xie
  • Huaxi Gu
  • Kun Wang
  • Xiaoshan Yu
  • Shangqi Ma


Various topologies have been proposed for high-performance computing (HPC), e.g., fat-tree and Torus. Compared with the conventional fat-tree topology, Torus performs much better when applied in HPC. Unfortunately, because of its wraparound links, the Torus topology is naturally prone to deadlock. Researchers address this problem with virtual channels, but this approach also restricts message routing. In this paper, we propose a deadlock-free topology for HPC, called Mesh-of-Torus, which combines the desirable characteristics of the Mesh and Torus topologies. Compared with Mesh, Mesh-of-Torus has a shorter network diameter. Furthermore, we propose corresponding port assignment rules, in consideration of the complicated internal arbitration or scheduling mechanisms incurred by the use of virtual channels. Deadlock is avoided when the dimension-order routing algorithm and our port assignment rules are applied to Mesh-of-Torus. Finally, simulations and mathematical analysis show that Mesh-of-Torus outperforms Mesh in terms of average end-to-end latency and network load distribution.
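
As background for the diameter comparison, the following is a minimal sketch of dimension-order (X-then-Y) routing and of the network diameter of a plain k×k 2D Mesh versus a plain k×k 2D Torus. It does not model Mesh-of-Torus itself, whose construction and port assignment rules are not given in this abstract. The function names (`ring_distance`, `hop_count`, `dor_route`, `diameter`) and the square k×k layout are assumptions made only for illustration.

```python
from itertools import product


def ring_distance(a, b, k, wraparound):
    """Hops between positions a and b along one dimension of size k."""
    d = abs(a - b)
    # With wraparound links (Torus) a packet may take the shorter way round.
    return min(d, k - d) if wraparound else d


def hop_count(src, dst, k, wraparound):
    """Hops used by minimal dimension-order routing between two nodes."""
    return sum(ring_distance(s, t, k, wraparound) for s, t in zip(src, dst))


def dor_route(src, dst):
    """Dimension-order route on a Mesh: correct X fully, then Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def diameter(k, wraparound):
    """Largest minimal hop count over all node pairs (brute force)."""
    nodes = list(product(range(k), repeat=2))
    return max(hop_count(s, t, k, wraparound) for s in nodes for t in nodes)


if __name__ == "__main__":
    print(dor_route((0, 0), (2, 1)))   # X is corrected before Y
    print(diameter(4, wraparound=False))  # 4x4 Mesh
    print(diameter(4, wraparound=True))   # 4x4 Torus
```

For a 4×4 network this gives a diameter of 6 for the Mesh and 4 for the Torus, which matches the closed forms 2(k−1) and 2⌊k/2⌋. The wraparound links that shorten the Torus diameter are the same links that create the cyclic channel dependencies behind the deadlock problem described above.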


Keywords: Data center, Supercomputing, Mesh-of-Torus topology, Deadlock-free



This work was supported by the National Science Foundation of China under Grants 61634004 and 61472300, the Fundamental Research Funds for the Central Universities Grant Nos. JB170107 and JB180309, and the key research and development plan of Shaanxi province No. 2017ZDCXL-GY-05-01.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. State Key Laboratory of ISN, Xidian University, Xi’an, China
  2. School of Computer Science, Xidian University, Xi’an, China
