An study of the effect of process malleability in the energy efficiency on GPU-based clusters

Iserte, Sergio; Rojek, Krzysztof

doi:10.1007/s11227-019-03034-x

Published: 21 October 2019

Volume 76, pages 255–274, (2020)

The Journal of Supercomputing

Abstract

The adoption of graphics processing units (GPUs) in high-performance computing (HPC) infrastructures determines, in many cases, the energy consumption of those facilities. For this reason, efficient management and administration of GPU-enabled clusters is crucial for their optimal operation. The main aim of this work is to study and design efficient job scheduling mechanisms for GPU-enabled clusters by leveraging process malleability techniques, which can reconfigure running jobs depending on the cluster status. This paper presents a model that improves the energy efficiency when processing a batch of jobs in an HPC cluster. The model is validated with the MPDATA algorithm, a representative example of stencil computation used in numerical weather prediction. The proposed solution applies the efficiency metrics obtained in a new reconfiguration policy aimed at job arrays. This solution reduces the processing time of workloads by up to 4.8 times and the energy consumption of the cluster by up to 2.4 times compared to traditional job management, where jobs are not reconfigured during their execution.
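
The two headline figures can be related through the identity energy = average power × time. The sketch below is only an illustration of that arithmetic and is not the model from the paper: it assumes, hypothetically, that the 4.8x time reduction and the 2.4x energy reduction were observed on the same workload, which the abstract does not state (both are "up to" values). The function name `implied_power_ratio` is invented for this example.

```python
def implied_power_ratio(time_reduction: float, energy_reduction: float) -> float:
    """Return the ratio of average power (malleable / rigid) implied by E = P * T.

    With T_m = T_r / time_reduction and E_m = E_r / energy_reduction,
    P_m / P_r = (E_m / T_m) / (E_r / T_r) = time_reduction / energy_reduction.
    """
    if time_reduction <= 0 or energy_reduction <= 0:
        raise ValueError("reduction factors must be positive")
    return time_reduction / energy_reduction


# Assumption: both best-case factors from the abstract apply to one workload.
ratio = implied_power_ratio(time_reduction=4.8, energy_reduction=2.4)
print(f"implied average power ratio: {ratio:.2f}")
```

Under that assumption, the cluster would draw roughly twice the average power while the batch runs, but for a much shorter time, so total energy still falls. If the two figures come from different workloads, this ratio has no physical meaning.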


Acknowledgements

The researcher from Universitat Jaume I (UJI) was supported by the project TIN2017-82972-R from MINECO and FEDER. The National Polish Science Centre supported the researcher from Czestochowa University of Technology under Grant No. UMO-2015/17/D/ST6/04059 and under Grant No. UMO-2017/26/D/ST6/00687. This work was partially performed during a short-term scientific mission (STSM) of Krzysztof Rojek to UJI, supported by the EU COST IC1305. The authors are also grateful to the BSC for letting them use its HPC facilities. Finally, the authors thank Prof. Enrique S. Quintana-Ortí for his invaluable insights and comments, as well as the anonymous reviewers, whose suggestions significantly improved the manuscript.

Author information

Authors and Affiliations

Universitat Jaume I, Castelló de la Plana, Spain
Sergio Iserte
Universitat de València, Valencia, Spain
Sergio Iserte
Czestochowa University of Technology, Czestochowa, Poland
Krzysztof Rojek
byteLAKE S.C., Wrocław, Poland
Krzysztof Rojek

Corresponding author

Correspondence to Sergio Iserte.

About this article

Cite this article

Iserte, S., Rojek, K. An study of the effect of process malleability in the energy efficiency on GPU-based clusters. J Supercomput 76, 255–274 (2020). https://doi.org/10.1007/s11227-019-03034-x

Published: 21 October 2019
Issue Date: January 2020
DOI: https://doi.org/10.1007/s11227-019-03034-x
