Abstract
In this paper we address the problem of node allocation for high-performance computer systems based on the Angara interconnect with a torus topology. Most allocation strategies for torus topologies assume redundancy, i.e., a user job may be allocated more nodes than it requires. We propose a new node allocation algorithm for supercomputers with the Angara interconnect. The new algorithm removes the limitations of the previously proposed base algorithm and finds more feasible solutions to the node allocation problem. Using the developed simulator, we evaluate system utilization and the average relative waiting time of a job in the queue for systems of up to 512 nodes with a 3D torus topology and up to 1024 nodes with a 4D torus topology. Averaged over all considered topologies, the new node allocation algorithm improves system utilization over the base algorithm by 9.61%. Similarly, the average job waiting time relative to the requested job time improves by a factor of 2.04 on average. In addition, we implement the ability to disable unused transceivers of the Angara interconnect router for each user job. Using the simulator and the same job workloads, the achieved energy saving is up to 2.78 kW for a system of 512 nodes (3D torus) and 4.41 kW for a system of 1024 nodes (4D torus).
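The abstract names two metrics, utilization and waiting time relative to requested time, without giving their formulas. The following Python sketch shows one common way such metrics could be computed from a simulated job trace. The Job record, its field names, and both formulas are assumptions made for illustration and are not taken from the paper or its simulator.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    """Hypothetical job record; field names are illustrative assumptions."""
    nodes: int        # number of nodes allocated to the job
    submit: float     # time the job entered the queue
    start: float      # time the job started running
    runtime: float    # actual run time
    requested: float  # run time requested by the user


def utilization(jobs: List[Job], total_nodes: int) -> float:
    """Busy node-time divided by available node-time over the trace span.

    Assumed definition: the span runs from the first submission to the
    last job completion.
    """
    t_begin = min(j.submit for j in jobs)
    t_end = max(j.start + j.runtime for j in jobs)
    busy = sum(j.nodes * j.runtime for j in jobs)
    return busy / (total_nodes * (t_end - t_begin))


def mean_relative_wait(jobs: List[Job]) -> float:
    """Mean over jobs of (queue waiting time / requested run time)."""
    return sum((j.start - j.submit) / j.requested for j in jobs) / len(jobs)


# Two 4-node jobs on an 8-node system, run back to back.
trace = [
    Job(nodes=4, submit=0.0, start=0.0, runtime=10.0, requested=10.0),
    Job(nodes=4, submit=0.0, start=10.0, runtime=10.0, requested=20.0),
]
print(utilization(trace, total_nodes=8))   # 0.5
print(mean_relative_wait(trace))           # 0.25
```

Under these assumed definitions, an allocation algorithm that finds a placement for the second job at time 0 would raise utilization and drive the relative waiting time toward zero, which is the kind of effect the reported comparison measures.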
Funding
The study was carried out with a grant from the Russian Science Foundation (project no. 20-71-10127).
Additional information
(Submitted by V. V. Voevodin)
Cite this article
Mukosey, A.V., Semenov, A.S. Simulation of Utilization and Energy Saving of the Angara Interconnect. Lobachevskii J Math 43, 873–881 (2022). https://doi.org/10.1134/S1995080222070186
DOI: https://doi.org/10.1134/S1995080222070186