Power Management Framework for Post-petascale Supercomputers

  • Masaaki Kondo
  • Ikuo Miyoshi
  • Koji Inoue
  • Shinobu Miwa
Chapter

Abstract

Power consumption is a first-class design constraint in the development of future exascale computing systems. Achieving exascale performance within a realistic power provision of 20–30 MW requires a significant improvement in power-performance efficiency over today’s supercomputers. To maximize effective performance under a power constraint, it is necessary to investigate how best to allocate the power budget to each hardware component or to each job submitted to the system. We have been conducting research and development on a software framework for code optimization and system power management for power-constraint adaptive systems. This chapter briefly introduces that work: techniques for maximizing application performance under a given power constraint, a power-aware resource manager, and a power-performance simulation and analysis framework for future supercomputer systems.

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Masaaki Kondo (1)
  • Ikuo Miyoshi (2)
  • Koji Inoue (3)
  • Shinobu Miwa (4)
  1. The University of Tokyo, Tokyo, Japan
  2. Fujitsu Limited, Kawasaki, Japan
  3. Kyushu University, Fukuoka, Japan
  4. The University of Electro-Communications, Chofu, Japan
