Auto-tuning for Energy Usage in Scientific Applications

  • Ananta Tiwari
  • Michael A. Laurenzano
  • Laura Carrington
  • Allan Snavely
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7156)


The power wall has become a dominant impediment in exascale system design. It is therefore important to understand how to create software that minimizes power usage while maintaining satisfactory levels of performance. This work uses existing software and hardware facilities to tune applications for several metrics that combine power and performance. The tuning is done with respect to software-level performance-related tunables and processor clock frequency. These tunable parameters are explored via an offline search to find the parameter combinations that are optimal with respect to performance (or delay, D), energy (E), energy×delay (E×D) and energy×delay×delay (E×D²). These searches are employed on a parallel application that solves Poisson’s equation using stencils. We show that the parameter configuration that minimizes energy consumption can save, on average, 5.4% energy with a performance loss of 4% when compared to the configuration that minimizes runtime.
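The offline search described above can be illustrated with a minimal sketch. It assumes a hypothetical `measure` function standing in for one instrumented run of the application; the tile-size tunable, the frequency list, and the synthetic delay and power model are illustrative assumptions, not the authors' setup or data. The sketch enumerates every configuration exhaustively and reports the one that minimizes each of D, E, E×D and E×D².

```python
from itertools import product

# Objectives over (energy in joules, delay in seconds), as named in the abstract.
OBJECTIVES = {
    "D": lambda e, d: d,
    "E": lambda e, d: e,
    "ExD": lambda e, d: e * d,
    "ExD^2": lambda e, d: e * d * d,
}


def measure(tile, freq_ghz):
    """Hypothetical stand-in for one measured run; returns (energy, delay).

    The model is synthetic (an assumption for illustration only): a fixed
    memory-bound time plus compute time that scales with 1/frequency, and
    power that grows with the cube of the clock frequency.
    """
    compute_gcycles = 10.0 * (1.0 + 16.0 / tile + tile / 64.0)
    delay_s = 5.0 + compute_gcycles / freq_ghz
    power_w = 60.0 + 5.0 * freq_ghz ** 3
    return power_w * delay_s, delay_s


def offline_search(tiles, freqs):
    """Measure every (tile, frequency) pair and pick the best per objective."""
    results = {cfg: measure(*cfg) for cfg in product(tiles, freqs)}
    return {
        name: min(results, key=lambda cfg: fn(*results[cfg]))
        for name, fn in OBJECTIVES.items()
    }


best = offline_search(tiles=[8, 16, 32, 64], freqs=[1.6, 2.0, 2.4])
for name, cfg in best.items():
    print(f"{name}: tile={cfg[0]}, freq={cfg[1]} GHz")
```

In this toy model the energy-optimal configuration runs at the lowest frequency while the delay-optimal one runs at the highest, which mirrors the kind of trade-off the abstract reports. A real study would replace `measure` with actual runtime and power measurements.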


Clock Frequency, Energy Usage, Relaxation Function, Power Usage, Compiler Optimization
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Ananta Tiwari
    • 1
  • Michael A. Laurenzano
    • 1
  • Laura Carrington
    • 1
  • Allan Snavely
    • 1
  1. Performance Modeling and Characterization Laboratory, San Diego Supercomputer Center, USA
