Bayesian Optimization of HPC Systems for Energy Efficiency

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10876)

Abstract

Energy efficiency is a crucial factor in developing large supercomputers and cost-effective datacenters. However, tuning a system for energy efficiency is difficult because power and performance are conflicting demands. We applied Bayesian optimization (BO) to tune a graphics processing unit (GPU) cluster system for the benchmark used in the Green500 list, a popular energy-efficiency ranking of supercomputers. The resulting benchmark score earned our system, named “kukai”, second place in the Green500 list of June 2017, showing that BO is a useful tool. After we determined the search space with minimal prior knowledge and a few preliminary experiments, BO automatically found a sufficiently good configuration. BO thus eliminated laborious manual tuning and reduced the time the system was occupied for benchmarking. Because BO is a general-purpose method, it may also be useful for tuning practical applications beyond the Green500 benchmark.
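To make the approach concrete, the sketch below shows a generic Bayesian-optimization loop of the kind the abstract describes: a Gaussian-process surrogate is fitted to measured energy-efficiency scores, and an expected-improvement acquisition function proposes the next configuration to benchmark. This is a minimal illustration under assumed details, not the authors' implementation: the objective measure_gflops_per_watt is a synthetic stand-in for an HPL run, and the two tuning parameters (a GPU power cap and the HPL block size NB) and their ranges are hypothetical.

    # Minimal sketch of BO for energy-efficiency tuning (illustrative only).
    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import Matern

    def measure_gflops_per_watt(x):
        """Stand-in for one benchmark run with x = (power_cap_W, block_size).
        Synthetic smooth function here; a real run would launch HPL under
        this configuration and return the measured Gflops/W."""
        power_cap, block = x
        return -((power_cap - 180.0) / 60.0) ** 2 - ((block - 192.0) / 96.0) ** 2

    def expected_improvement(candidates, gp, best_y, xi=0.01):
        """Standard EI acquisition: expected gain of each candidate over the
        best observation so far."""
        mu, sigma = gp.predict(candidates, return_std=True)
        sigma = np.maximum(sigma, 1e-9)
        z = (mu - best_y - xi) / sigma
        return (mu - best_y - xi) * norm.cdf(z) + sigma * norm.pdf(z)

    rng = np.random.default_rng(0)
    bounds = np.array([[120.0, 250.0],   # GPU power cap (W); assumed range
                       [64.0, 384.0]])   # HPL block size NB; assumed range

    # Seed the surrogate with a few initial configurations, analogous to the
    # preliminary experiments mentioned in the abstract.
    X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(5, 2))
    y = np.array([measure_gflops_per_watt(x) for x in X])

    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    for _ in range(20):                  # each iteration = one benchmark run
        gp.fit(X, y)
        cand = rng.uniform(bounds[:, 0], bounds[:, 1], size=(2000, 2))
        ei = expected_improvement(cand, gp, y.max())
        x_next = cand[np.argmax(ei)]     # most promising configuration
        y_next = measure_gflops_per_watt(x_next)
        X = np.vstack([X, x_next])
        y = np.append(y, y_next)

    best = X[np.argmax(y)]
    print(f"best config: power cap {best[0]:.0f} W, block size {best[1]:.0f}")

In a real deployment, each loop iteration costs one full benchmark run, which is exactly the regime of expensive, noisy evaluations in which BO is attractive compared with exhaustive or manual search.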

Keywords

Bayesian optimization · Energy efficiency · Automatic parameter tuning

Acknowledgements

We would like to thank Sunao Torii, Kenichi Inaba, Ryo Sakamoto, Yuki Yamaura, and Michiya Hagimoto for their technical contributions, in particular their extensive expertise in liquid immersion cooling, system configuration, and power measurement. Without them, we would not have been able to achieve second place in the Green500 ranking.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Yahoo Japan Corporation, Tokyo, Japan
  2. The University of Tokyo, Tokyo, Japan
