The Journal of Supercomputing, Volume 71, Issue 1, pp 202–216

Blue Gene/Q defragmentation for energy waste minimisation

Abstract

We explore defragmenting allocated compute resources to conserve energy on an IBM Blue Gene/Q. Using a real trace from a new four-rack system, we evaluate in simulation three heuristics for minimising energy waste through defragmentation. We describe heuristics for detecting when defragmenting the computing resource through checkpoint/restart is desirable from an energy standpoint. Using these heuristics, we obtained a simulated saving of 4.36% of total system power. Applied to every Blue Gene/Q on the Top500 list, the energy saved each year would run the average US household for 698.5 years.

Keywords

Defragmentation, Blue Gene, Allocation

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Carlton, Australia
