E-AMOM: an energy-aware modeling and optimization methodology for scientific applications

  • Charles Lively
  • Valerie Taylor
  • Xingfu Wu
  • Hung-Ching Chang
  • Chun-Yi Su
  • Kirk Cameron
  • Shirley Moore
  • Dan Terpstra
Special Issue Paper

Abstract

In this paper, we present the Energy-Aware Modeling and Optimization Methodology (E-AMOM) framework, which develops models of runtime and power consumption based upon performance counters and uses these models to identify energy-based optimizations for scientific applications. E-AMOM employs its predictive models to apply run-time Dynamic Voltage and Frequency Scaling (DVFS) and Dynamic Concurrency Throttling (DCT) to reduce the power consumption of scientific applications, and uses cache optimizations to further reduce their runtime and energy consumption. Modeling and optimization are performed at the level of the individual kernels that comprise an application. Our models achieved an average error rate of at most 6.79 % for hybrid MPI/OpenMP and MPI implementations of six scientific applications. With respect to optimizations, we reduced energy consumption by up to 21 %, runtime by up to 14.15 %, and power consumption by up to 12.50 %.
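
To make the modeling step concrete, the following is a minimal, illustrative sketch, not the authors' implementation, of the general approach the abstract describes: regress per-kernel power and runtime on performance-counter rates, then rank candidate DVFS/DCT configurations by predicted energy (power × runtime). The counter names, training values, linear model form, and candidate configurations below are all hypothetical placeholders.

```python
# Illustrative sketch of counter-based power/runtime modeling and
# configuration selection in the spirit of E-AMOM. NOT the authors'
# implementation; all data and names below are assumptions.
import numpy as np

# Hypothetical training data: per-kernel performance-counter rates
# (events per cycle) observed under several (frequency, threads)
# settings, with measured power (W) and runtime (s) for each run.
# Columns: L2-miss rate, instruction rate, TLB-miss rate.
X = np.array([
    [0.002, 1.10, 0.0001],
    [0.004, 0.95, 0.0002],
    [0.003, 1.30, 0.0001],
    [0.006, 0.80, 0.0003],
])
power = np.array([85.0, 92.0, 88.0, 97.0])    # measured node power (W)
runtime = np.array([12.0, 14.5, 11.0, 16.2])  # measured kernel runtime (s)

def fit_linear_model(X, y):
    """Least-squares fit of y ~ beta0 + X @ beta (one model per metric)."""
    A = np.hstack([np.ones((X.shape[0], 1)), X])  # intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, x):
    return beta[0] + x @ beta[1:]

power_model = fit_linear_model(X, power)
runtime_model = fit_linear_model(X, runtime)

# Hypothetical candidates: counter rates the models expect for this
# kernel under different DVFS frequencies (GHz) and DCT thread counts.
candidates = {
    (2.4, 8): np.array([0.003, 1.20, 0.0001]),
    (2.0, 8): np.array([0.003, 1.05, 0.0001]),
    (2.0, 4): np.array([0.004, 0.90, 0.0002]),
}

# Select the configuration minimizing predicted energy = power * runtime.
(freq, threads), rates = min(
    candidates.items(),
    key=lambda kv: predict(power_model, kv[1]) * predict(runtime_model, kv[1]),
)
energy = predict(power_model, rates) * predict(runtime_model, rates)
print(f"chosen config: {freq} GHz, {threads} threads; "
      f"predicted energy = {energy:.1f} J")
```

The least-squares linear form is only a stand-in for the paper's counter-based predictive models; the key idea it illustrates is that once power and runtime are both predictable per kernel, DVFS/DCT settings can be chosen by minimizing predicted energy rather than by exhaustive measurement.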

Keywords

Performance modeling · Energy consumption · Power consumption · MPI · Hybrid MPI/OpenMP · Power prediction · Performance optimization

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Charles Lively¹
  • Valerie Taylor¹
  • Xingfu Wu¹
  • Hung-Ching Chang²
  • Chun-Yi Su²
  • Kirk Cameron²
  • Shirley Moore³
  • Dan Terpstra⁴

  1. Department of Computer Science and Engineering, Texas A&M University, College Station, USA
  2. Department of Computer Science, Virginia Tech, Blacksburg, USA
  3. Department of Computer Science, University of Texas at El Paso, El Paso, USA
  4. Innovative Computing Laboratory, University of Tennessee, Knoxville, USA