Abstract
Execution of heterogeneous workflows on high-performance computing (HPC) platforms presents unprecedented resource management and execution coordination challenges for runtime systems. Task heterogeneity increases the complexity of resource and execution management, limiting the scalability and efficiency of workflow execution. Resource partitioning and the distribution of task execution over the partitioned resources promise to address those problems, but we lack an experimental evaluation of their performance at scale. This paper provides a performance evaluation of the Process Management Interface for Exascale (PMIx) and its reference runtime environment PRRTE, integrated into the pilot-based runtime system RADICAL-Pilot, on the leadership-class HPC platform Summit. We partition resources across multiple PRRTE Distributed Virtual Machine (DVM) environments, each responsible for launching tasks via the PMIx interface. We experimentally measure workload execution performance on Summit in terms of task scheduling and launching rates, the distribution of DVM task placement times, and DVM startup and termination overheads. The integrated solution with PMIx/PRRTE enables the use of an abstracted, standardized set of interfaces for orchestrating the launch process, and provides dynamic process management and monitoring capabilities. It extends scaling capabilities, overcoming a limitation of other launching mechanisms (e.g., JSM/LSF). The explored DVM setup configurations provide insights into DVM performance and a layout for leveraging it. Our experimental results show that a heterogeneous workload of 65,500 tasks on 2048 nodes, partitioned across 32 DVMs, runs steadily with resource utilization no lower than \(52\%\). With fewer concurrently executing tasks, resource utilization reaches up to \(85\%\), based on the results of a heterogeneous workload of 8200 tasks on 256 nodes and 2 DVMs.
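To ground the description above, the following minimal sketch, written against RADICAL-Pilot's public Python API (Session, PilotManager, TaskManager, PilotDescription, TaskDescription), shows how a heterogeneous bag of tasks could be described and submitted. The resource label 'ornl.summit', the pilot size, the 42-usable-cores-per-node figure, and the alternating task sizes are illustrative assumptions, and the choice of the PRRTE launch method together with the number of DVM partitions is assumed to live in the platform's resource configuration rather than in user code.

```python
#!/usr/bin/env python3
# Hedged sketch: describing and submitting a heterogeneous many-task workload
# through RADICAL-Pilot's public Python API. The resource label, pilot size,
# runtime, and task mix are illustrative placeholders; selecting the
# PRRTE/DVM launch method and the number of DVM partitions is assumed to be
# part of the platform's resource configuration, not of this script.

import radical.pilot as rp

if __name__ == '__main__':

    session = rp.Session()
    try:
        pmgr = rp.PilotManager(session=session)
        tmgr = rp.TaskManager(session=session)

        # Acquire a pilot on the target platform (placeholder values).
        pd = rp.PilotDescription({'resource': 'ornl.summit',  # assumed label
                                  'cores'   : 256 * 42,       # 256 nodes, assuming
                                                              # 42 usable cores/node
                                  'runtime' : 120})           # minutes
        pilot = pmgr.submit_pilots(pd)
        tmgr.add_pilots(pilot)

        # Heterogeneous task mix: alternate single-rank and multi-rank tasks.
        tds = []
        for i in range(8200):
            td = rp.TaskDescription()
            td.executable = '/bin/sleep'
            td.arguments  = ['60']
            td.ranks      = 1 if i % 2 else 4   # 'cpu_processes' in older RP releases
            tds.append(td)

        tmgr.submit_tasks(tds)
        tmgr.wait_tasks()

    finally:
        session.close()
```

In this layout the runtime system, not the application, maps tasks onto the partitioned DVMs; the measured scheduling/launching rates and placement-time distributions characterize exactly that mapping step.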
Keywords
- High performance computing
- Resource management
- Middleware
- Runtime system
- Runtime environment
Acknowledgments
We would like to thank other members of the PMIx community, and Ralph Castain in particular, for the excellent work that we build upon. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. This work is also supported by the ExaWorks project (part of the Exascale Computing Project (ECP)) under DOE Contract No. DE-SC0012704 and by the DOE HEP Center for Computational Excellence at Brookhaven National Laboratory under B&R KA2401045. We also acknowledge DOE INCITE awards for allocations on Summit.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Titov, M., Turilli, M., Merzky, A., Naughton, T., Elwasif, W., Jha, S. (2023). RADICAL-Pilot and PMIx/PRRTE: Executing Heterogeneous Workloads at Large Scale on Partitioned HPC Resources. In: Klusáček, D., Julita, C., Rodrigo, G.P. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2022. Lecture Notes in Computer Science, vol 13592. Springer, Cham. https://doi.org/10.1007/978-3-031-22698-4_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-22697-7
Online ISBN: 978-3-031-22698-4