Abstract
This chapter discusses strategies and techniques for executing traditional HPC applications in the Cloud. We follow a direction that may appear unusual at first glance: we describe the problems and solutions that arise when the effort is put into the HPC scheduler rather than into the applications. The HPC applications are not rewritten; instead, the HPC scheduler is cloudified, so that it becomes available on demand like any other Cloud service. The chapter then introduces the issues raised by a Cloud orchestrator controller that enables the autoscaling of containerized HPC clusters in the Cloud. The proposed solution triggers the creation or removal of containerized HPC compute nodes according to metrics collected at the level of the containerized HPC scheduler’s job queue. In summary, we highlight approaches specialized in unifying HPC and Cloud, which is the chapter’s core material. The options we have taken open numerous perspectives, and we summarize some of them as future work.
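The queue-driven autoscaling idea can be illustrated with a minimal sketch. It is not the chapter's controller: the `QueueMetrics` fields, the `desired_node_count` function, the one-node-per-pending-job policy, and the `min_nodes`/`max_nodes` bounds are all assumptions chosen for illustration. A real controller would read these metrics from the containerized scheduler and apply the result through the Cloud orchestrator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueMetrics:
    """Hypothetical snapshot of the scheduler state (names are assumptions)."""
    pending_jobs: int  # jobs waiting in the scheduler's queue
    busy_nodes: int    # compute nodes currently running jobs
    idle_nodes: int    # compute nodes with nothing to run


def desired_node_count(m: QueueMetrics, min_nodes: int = 1, max_nodes: int = 8) -> int:
    """Return a target number of containerized compute nodes.

    Assumed policy: one node per pending job; when the queue is empty,
    idle nodes are released. The result is clamped to [min_nodes, max_nodes].
    """
    if m.pending_jobs > 0:
        # Idle nodes absorb part of the backlog before new nodes are created.
        extra = max(0, m.pending_jobs - m.idle_nodes)
        target = m.busy_nodes + m.idle_nodes + extra
    else:
        # Empty queue: keep only the nodes that are still working.
        target = m.busy_nodes
    return max(min_nodes, min(max_nodes, target))


if __name__ == "__main__":
    snapshot = QueueMetrics(pending_jobs=5, busy_nodes=2, idle_nodes=1)
    print(desired_node_count(snapshot))
```

Keeping the decision as a pure function of the observed metrics makes the policy easy to test separately from the code that would actually create or remove nodes.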
Acknowledgements
This work was conducted during Mr. Cérin’s délégation at the Centre National de la Recherche Scientifique (CNRS). We thank the CNRS, the University of Grenoble Alpes, the DATAMOVE INRIA team, and University Sorbonne Paris Nord for their institutional support. Mr. Grenèche also works with the “Pôle de soutien à la recherche” of Sorbonne Paris Nord, Direction des Systèmes d’Information (DSI).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Cérin, C., Grenèche, N., Menouer, T. (2023). Executing Traditional HPC Application Code in Cloud with Containerized Job Schedulers. In: Borin, E., Drummond, L.M.A., Gaudiot, JL., Melo, A., Melo Alves, M., Navaux, P.O.A. (eds) High Performance Computing in Clouds. Springer, Cham. https://doi.org/10.1007/978-3-031-29769-4_5
DOI: https://doi.org/10.1007/978-3-031-29769-4_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-29768-7
Online ISBN: 978-3-031-29769-4
eBook Packages: Computer Science, Computer Science (R0)