Abstract
Energy consumption has become a limiting factor in the design and operation of high performance computing (HPC) systems, as evidenced by the growing efforts of both academia and industry to reduce it. Unlike hardware solutions, software initiatives that target the energy consumption of HPC systems are often limited in practice, despite their effectiveness, for reasons including: (i) the program-specific nature of the proposed solution; (ii) the need for a deep understanding of the applications at hand; and (iii) the fact that the proposed solutions are often difficult for novices to use and/or are designed for single-task environments. This paper proposes a three-step, blind, system-wide, application-independent, fine-grained, and easy-to-use methodology for improving the energy performance of HPC systems. The methodology breaks down into phase detection, phase characterization, and phase identification and system reconfiguration. It is blind in the sense that it does not require any knowledge from users. It relies on the reconfiguration capabilities offered by the majority of HPC subsystems (including the processor, storage, memory, and communication subsystems) to reduce the overall energy consumption of the system (excluding network equipment) at runtime. We also present an implementation of our methodology, through which we demonstrate its effectiveness via static analyses and experiments using benchmarks representative of HPC workloads.
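The three steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which metrics are sampled, how phase changes are detected, or which reconfiguration decisions are taken. The sketch assumes that the system is sampled periodically into normalized utilization vectors, that a phase change is a large Manhattan distance between consecutive samples, and that a phase is labeled by its dominant metric. The names `METRICS`, `ACTIONS`, `detect_phases`, `characterize`, and `identify` are hypothetical.

```python
# Hypothetical metric names; the abstract does not say what is sampled.
METRICS = ("cpu", "memory", "disk", "network")

# Hypothetical mapping from a phase label to a reconfiguration decision.
ACTIONS = {
    "cpu": "keep CPU at full frequency",
    "memory": "lower CPU frequency",
    "disk": "lower CPU frequency",
    "network": "lower CPU frequency",
}


def manhattan(a, b):
    """Sum of absolute differences between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def detect_phases(samples, threshold):
    """Step 1 (assumed rule): start a new phase whenever two consecutive
    samples differ by more than `threshold`."""
    phases, current = [], [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        if manhattan(prev, cur) > threshold:
            phases.append(current)
            current = []
        current.append(cur)
    phases.append(current)
    return phases


def characterize(phase):
    """Step 2 (assumed rule): summarize a phase by its per-metric means
    and label it with the metric that has the highest mean."""
    means = [sum(column) / len(phase) for column in zip(*phase)]
    label = METRICS[means.index(max(means))]
    return means, label


def identify(sample, known_phases, threshold):
    """Step 3 (assumed rule): match a new sample to the closest known phase
    and return its reconfiguration action, or None if nothing is close."""
    best = min(known_phases, key=lambda k: manhattan(sample, k[0]), default=None)
    if best is not None and manhattan(sample, best[0]) <= threshold:
        return ACTIONS[best[1]]
    return None


if __name__ == "__main__":
    # Made-up samples: two compute-heavy readings, then two memory-heavy ones.
    samples = [
        (0.9, 0.1, 0.0, 0.0),
        (0.8, 0.2, 0.0, 0.0),
        (0.1, 0.9, 0.0, 0.0),
        (0.2, 0.8, 0.0, 0.0),
    ]
    known = [characterize(p) for p in detect_phases(samples, threshold=0.5)]
    print([label for _, label in known])
    print(identify((0.15, 0.8, 0.0, 0.0), known, threshold=0.5))
```

No application knowledge enters the sketch: every decision is derived from system-level samples, which is the sense in which the abstract calls the approach blind.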
Acknowledgments
This work is supported by the INRIA large scale initiative Hemera focused on “developing large scale parallel and distributed experiments”. Some experiments of this article were performed on the Grid5000 platform, an initiative from the French Ministry of Research through the ACI GRID incentive action, INRIA, CNRS and RENATER and other contributing partners (http://www.grid5000.fr). It was also partially supported by the COST (European Cooperation in Science and Technology) framework, under Action IC0804.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tsafack Chetsa, G.L., Lefevre, L., Stolf, P. (2013). A Three Step Blind Approach for Improving HPC Systems’ Energy Performance. In: Pierson, JM., Da Costa, G., Dittmann, L. (eds) Energy Efficiency in Large Scale Distributed Systems. EE-LSDS 2013. Lecture Notes in Computer Science, vol 8046. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40517-4_15
DOI: https://doi.org/10.1007/978-3-642-40517-4_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40516-7
Online ISBN: 978-3-642-40517-4
eBook Packages: Computer Science, Computer Science (R0)