A Run-Time System for Power-Constrained HPC Applications

  • Aniruddha Marathe
  • Peter E. Bailey
  • David K. Lowenthal
  • Barry Rountree
  • Martin Schulz
  • Bronis R. de Supinski
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9137)


As the HPC community attempts to reach exascale performance, power will be one of the most critical constrained resources. Achieving practical exascale computing will therefore rely on optimizing performance subject to a power constraint. However, this additional complication should not add to the burden of application developers; optimizing the run-time environment given restricted power will primarily be the job of high-performance system software.

This paper introduces Conductor, a run-time system that intelligently distributes available power to nodes and cores to improve performance. The key techniques used are configuration space exploration and adaptive power balancing. Configuration exploration dynamically selects the optimal thread concurrency level and DVFS state subject to a hardware-enforced power bound. Adaptive power balancing efficiently determines where critical paths are likely to occur so that more power is distributed to those paths. Greater power, in turn, allows higher thread concurrency levels, higher DVFS states, or both. We describe these techniques in detail and show that, compared to the state-of-the-art technique of using statically predetermined, per-node power caps, Conductor leads to a best-case performance improvement of up to 30% and an average improvement of 19.1%.
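The two techniques named above can be illustrated with a minimal sketch. It is not Conductor's implementation: the `Config` record, the `select_config` and `rebalance` helpers, the fixed 5 W step, the 40 W floor, and all measurements are hypothetical and chosen only for illustration. The first function picks the fastest (thread count, frequency) pair that fits under a power cap; the second shifts a small amount of power from the node that finishes earliest to the node that finishes latest, keeping the total budget constant.

```python
from typing import NamedTuple, Optional, Sequence


class Config(NamedTuple):
    threads: int      # thread concurrency level
    freq_ghz: float   # DVFS state, expressed as a clock frequency
    power_w: float    # measured average power (hypothetical numbers)
    time_s: float     # measured time per task (hypothetical numbers)


def select_config(configs: Sequence[Config], power_cap_w: float) -> Optional[Config]:
    """Return the fastest configuration that fits under the power cap, or None."""
    feasible = [c for c in configs if c.power_w <= power_cap_w]
    return min(feasible, key=lambda c: c.time_s) if feasible else None


def rebalance(caps_w: Sequence[float], times_s: Sequence[float],
              step_w: float = 5.0, min_cap_w: float = 40.0) -> list:
    """Move up to step_w watts from the earliest-finishing node to the latest.

    The latest-finishing node stands in for the critical path. The sum of
    the caps is unchanged, and no cap is pushed below min_cap_w.
    """
    caps = list(caps_w)
    slow = max(range(len(caps)), key=lambda i: times_s[i])
    fast = min(range(len(caps)), key=lambda i: times_s[i])
    if slow == fast:
        return caps  # all nodes take equally long; nothing to shift
    delta = min(step_w, caps[fast] - min_cap_w)
    if delta > 0:
        caps[fast] -= delta
        caps[slow] += delta
    return caps


# Hypothetical per-node measurements for four configurations.
measured = [
    Config(8, 2.6, 95.0, 10.0),
    Config(8, 2.0, 70.0, 12.0),
    Config(4, 2.6, 60.0, 15.0),
    Config(4, 2.0, 45.0, 19.0),
]
best = select_config(measured, power_cap_w=75.0)
new_caps = rebalance([60.0, 60.0, 60.0], times_s=[10.0, 14.0, 12.0])
```

Under a 75 W cap, the sketch selects the 8-thread, 2.0 GHz configuration, because the 2.6 GHz variant exceeds the cap. The rebalancing step then raises the slowest node's cap, which may in turn make a faster configuration feasible on that node.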


Keywords: Execution Time, Power Allocation, Critical Path, Power Constraint, Power Limit
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Part of this work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344 (LLNL-CONF-667408).



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Aniruddha Marathe (1, corresponding author)
  • Peter E. Bailey (1)
  • David K. Lowenthal (1)
  • Barry Rountree (2)
  • Martin Schulz (2)
  • Bronis R. de Supinski (2)

  1. Department of Computer Science, The University of Arizona, Tucson, USA
  2. Lawrence Livermore National Laboratory, Livermore, USA
