RADICAL-Pilot and PMIx/PRRTE: Executing Heterogeneous Workloads at Large Scale on Partitioned HPC Resources

Part of the Lecture Notes in Computer Science book series (LNCS, volume 13592)

Abstract

Execution of heterogeneous workflows on high-performance computing (HPC) platforms presents unprecedented resource management and execution coordination challenges for runtime systems. Task heterogeneity increases the complexity of resource and execution management, limiting the scalability and efficiency of workflow execution. Partitioning resources and distributing task execution over the partitioned resources promises to address these problems, but we lack an experimental evaluation of its performance at scale. This paper provides a performance evaluation of the Process Management Interface for Exascale (PMIx) and its reference runtime implementation PRRTE, integrated into the pilot-based runtime system RADICAL-Pilot, on the leadership-class HPC platform Summit. We partition resources across multiple PRRTE Distributed Virtual Machine (DVM) environments, each responsible for launching tasks via the PMIx interface. We experimentally measure workload execution performance in terms of task scheduling/launching rate, the distribution of DVM task placement times, and DVM startup and termination overheads. The integrated solution with PMIx/PRRTE enables the use of an abstracted, standardized set of interfaces for orchestrating the launch process, along with dynamic process management and monitoring capabilities. It extends scaling capabilities, overcoming limitations of other launching mechanisms (e.g., JSM/LSF). Exploring different DVM setup configurations provides insights into DVM performance and into layouts that leverage it effectively. Our experimental results show that a heterogeneous workload of 65,500 tasks on 2048 nodes, partitioned across 32 DVMs, runs steadily with resource utilization no lower than 52%. With fewer concurrently executing tasks, resource utilization reaches up to 85%, based on results for a heterogeneous workload of 8,200 tasks on 256 nodes and 2 DVMs.
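
To make the described setup concrete, the following minimal sketch outlines how a heterogeneous workload of this kind could be submitted through RADICAL-Pilot's Python API (Session, PilotManager, TaskManager, PilotDescription, TaskDescription). It is illustrative only: the allocation name, task counts, and per-task resource values are placeholders, attribute names may differ across RADICAL-Pilot releases, and the selection of the PRRTE launch method and the number of DVM partitions is handled through RADICAL-Pilot's platform configuration rather than shown in this snippet.

    import radical.pilot as rp

    # Illustrative setup: a pilot on Summit whose resources would be split
    # across multiple PRRTE DVMs by the agent, each DVM placing a share of
    # the tasks via the PMIx interface.
    session = rp.Session()
    try:
        pmgr = rp.PilotManager(session=session)
        tmgr = rp.TaskManager(session=session)

        pd = rp.PilotDescription({
            'resource': 'ornl.summit',   # platform label used by RADICAL-Pilot
            'nodes'   : 256,
            'runtime' : 60,              # minutes
            'project' : 'ABC123'         # placeholder allocation ID
        })
        pilot = pmgr.submit_pilots(pd)
        tmgr.add_pilots(pilot)

        # A heterogeneous bag of tasks: varying ranks, cores and GPUs per rank.
        # Attribute names follow recent RADICAL-Pilot releases and may differ
        # across versions.
        tds = []
        for i in range(8200):
            td = rp.TaskDescription()
            td.executable     = '/bin/sleep'            # placeholder workload
            td.arguments      = ['30']
            td.ranks          = 1 if i % 2 else 4
            td.cores_per_rank = 1
            td.gpus_per_rank  = 1 if i % 3 == 0 else 0
            tds.append(td)

        tmgr.submit_tasks(tds)
        tmgr.wait_tasks()
    finally:
        session.close(download=True)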

Keywords

  • High performance computing
  • Resource management
  • Middleware
  • Runtime system
  • Runtime environment

Acknowledgments

We would like to thank other members of the PMIx community, and Ralph Castain in particular, for the excellent work that we build upon. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. This work is also supported by the ExaWorks project (part of the Exascale Computing Project (ECP)) under DOE Contract No. DE-SC0012704 and by the DOE HEP Center for Computational Excellence at Brookhaven National Laboratory under B&R KA2401045. We also acknowledge DOE INCITE awards for allocations on Summit.

Author information

Corresponding author

Correspondence to Mikhail Titov.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Titov, M., Turilli, M., Merzky, A., Naughton, T., Elwasif, W., Jha, S. (2023). RADICAL-Pilot and PMIx/PRRTE: Executing Heterogeneous Workloads at Large Scale on Partitioned HPC Resources. In: Klusáček, D., Julita, C., Rodrigo, G.P. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2022. Lecture Notes in Computer Science, vol 13592. Springer, Cham. https://doi.org/10.1007/978-3-031-22698-4_5

  • DOI: https://doi.org/10.1007/978-3-031-22698-4_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-22697-7

  • Online ISBN: 978-3-031-22698-4
