Abstract
MPI collective communications play an important role in coordinating and exchanging data among parallel processes in high-performance computing. Various algorithms exist for implementing each MPI collective, and they differ in characteristics such as message overhead, latency, and scalability that can significantly impact overall system performance. Choosing a suitable algorithm for each collective operation is therefore crucial to achieving optimal performance. In this paper, we present our experience with MPI collective algorithm selection on a large-scale supercomputer and highlight the impact of network traffic and system workload, in addition to previously investigated parameters such as message size, communicator size, and network topology. Our analysis shows that network traffic and system workload can make the performance of MPI collectives highly variable and, accordingly, should inform the algorithm selection strategy.
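For illustration, offline tuning of the kind discussed here typically produces a static decision table that maps message size and communicator size to an algorithm choice. The sketch below is hypothetical: the algorithm names mirror common Open MPI `coll/tuned` choices, and the thresholds are illustrative, not measured values from the paper.

```python
# Hypothetical selection table for MPI_Allreduce, as produced by
# offline collective tuning. Names and thresholds are illustrative.

def select_allreduce_algorithm(msg_bytes: int, comm_size: int) -> str:
    """Pick an allreduce algorithm from a static decision table."""
    if msg_bytes <= 4096:
        # Small messages are latency-bound: tree/recursive schemes win.
        return "recursive_doubling"
    if comm_size <= 16:
        # Small communicators tolerate the extra latency of rings poorly.
        return "recursive_doubling"
    # Large messages on large communicators are bandwidth-bound:
    # ring-style algorithms amortize per-link usage better.
    return "ring"

if __name__ == "__main__":
    for size, procs in [(1024, 64), (1 << 20, 8), (1 << 20, 256)]:
        print(size, procs, select_allreduce_algorithm(size, procs))
```

A purely static table like this is exactly what performance variability undermines: under heavy network traffic or system load, the crossover points shift, which is why the paper argues that system utilization should be considered in the selection strategy.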
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Salimi Beni, M., Hunold, S., Cosenza, B. (2024). Algorithm Selection of MPI Collectives Considering System Utilization. In: Zeinalipour, D., et al. Euro-Par 2023: Parallel Processing Workshops. Euro-Par 2023. Lecture Notes in Computer Science, vol 14352. Springer, Cham. https://doi.org/10.1007/978-3-031-48803-0_37
Print ISBN: 978-3-031-48802-3
Online ISBN: 978-3-031-48803-0