Reducing Energy Costs for IBM Blue Gene/P via Power-Aware Job Scheduling

  • Conference paper
Job Scheduling Strategies for Parallel Processing (JSSPP 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8429)

Abstract

Energy expense is becoming increasingly dominant in the operating costs of high-performance computing (HPC) systems. At the same time, electricity prices vary significantly over the course of the day, and job power profiles differ greatly, especially on HPC systems. In this paper, we propose a power-aware job scheduling approach for HPC systems that exploits variable energy prices and job power profiles. In particular, we propose a 0-1 knapsack model and demonstrate its flexibility and effectiveness for scheduling jobs, with the goal of reducing energy cost without degrading system utilization. We design scheduling strategies for Blue Gene/P, a typical partition-based system. Experiments with both synthetic data and real job traces from production systems show that our power-aware job scheduling approach can reduce the energy cost significantly, by up to 25%, with only a slight impact on system utilization.
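
To make the 0-1 knapsack idea concrete, the following is a minimal sketch and not the formulation used in the paper. It assumes that each waiting job has a known node count and an estimated power draw, that the knapsack capacity is the number of free nodes, and that the scheduler first maximizes the nodes used (to protect utilization) and then prefers low-power jobs during on-peak pricing and high-power jobs during off-peak pricing. The names `Job`, `select_jobs`, `power_kw`, and `on_peak` are hypothetical.

```python
from typing import List, NamedTuple


class Job(NamedTuple):
    name: str
    nodes: int        # nodes requested; acts as the knapsack weight (assumed > 0)
    power_kw: float   # estimated power draw; a hypothetical job power profile


def select_jobs(queue: List[Job], free_nodes: int, on_peak: bool) -> List[str]:
    """Pick a subset of waiting jobs with a 0-1 knapsack dynamic program.

    Assumed objective (illustrative only): maximize nodes used, then break
    ties by minimizing total power when electricity is expensive (on_peak)
    or maximizing it when electricity is cheap.
    """
    sign = -1.0 if on_peak else 1.0
    # best[c] = ((nodes_used, signed_power), chosen job indices) within c nodes
    best = [((0, 0.0), ())] * (free_nodes + 1)
    for i, job in enumerate(queue):
        # Walk capacities downward so each job is selected at most once.
        for c in range(free_nodes, job.nodes - 1, -1):
            (used, power), picked = best[c - job.nodes]
            candidate = (used + job.nodes, power + sign * job.power_kw)
            # Tuples compare lexicographically: nodes first, then signed power.
            if candidate > best[c][0]:
                best[c] = (candidate, picked + (i,))
    return [queue[i].name for i in best[free_nodes][1]]


queue = [Job("A", 512, 30.0), Job("B", 512, 10.0), Job("C", 1024, 25.0)]
on_peak_choice = select_jobs(queue, free_nodes=1024, on_peak=True)
off_peak_choice = select_jobs(queue, free_nodes=1024, on_peak=False)
```

In this toy queue both candidate selections fill all 1024 free nodes, so utilization is the same and only the power tie-break differs between the on-peak and off-peak calls. A scheduler for a real partition-based system would also have to respect partition shapes and fairness, which this sketch ignores.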

Acknowledgment

This work was supported in part by the U.S. National Science Foundation grants CNS-0834514 and CNS-0720549 and in part by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research under contract DE-AC02-06CH1135. We thank Dr. Ioan Raicu for generously providing high-performance servers for our experiments.

Author information

Corresponding author

Correspondence to Zhou Zhou.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhou, Z., Lan, Z., Tang, W., Desai, N. (2014). Reducing Energy Costs for IBM Blue Gene/P via Power-Aware Job Scheduling. In: Desai, N., Cirne, W. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2013. Lecture Notes in Computer Science, vol 8429. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43779-7_6

  • DOI: https://doi.org/10.1007/978-3-662-43779-7_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-43778-0

  • Online ISBN: 978-3-662-43779-7

  • eBook Packages: Computer Science, Computer Science (R0)
