Integrating performance analysis and energy efficiency optimizations in a unified environment

  • Special Issue Paper
Computer Science - Research and Development

Abstract

Performance analysis tools have been available for decades. They help developers speed up their applications and pinpoint scalability bottlenecks. They are widespread, well understood, and sophisticated. Since the growing power consumption of HPC systems has become a major cost factor, support for energy efficiency evaluation has been added to various performance analysis tools. Furthermore, the beneficial as well as detrimental effects of power-saving strategies on energy efficiency are already well understood. However, appropriate tools to directly exploit the detected potential are not yet available. We therefore present a library that reuses the highly sophisticated instrumentation mechanisms of VampirTrace to dynamically change hardware and software parameters that influence energy efficiency. We also present a library that wraps the OpenMP runtimes of several x86_64 compilers in order to provide low-overhead instrumentation at the parallel-region level. This enhances VampirTrace’s ability to handle OpenMP programs without the typically required recompilation.

Notes

  1. The threshold and reduced frequencies used are estimates based on the authors' experience and are not meant to maximize energy savings. Still, they give a good impression of how effective a first optimization can be.

  2. In this paper we focus on limited scalability due to architectural, not algorithmic, reasons. The latter would include regions with a large number of barriers and locks, which would also benefit from a reduced number of threads.

Acknowledgements

This work has been funded by the Bundesministerium für Bildung und Forschung via the research project CoolSilicon (BMBF 16N10186).

Author information

Corresponding author

Correspondence to Robert Schöne.

About this article

Cite this article

Schöne, R., Molka, D. Integrating performance analysis and energy efficiency optimizations in a unified environment. Comput Sci Res Dev 29, 231–239 (2014). https://doi.org/10.1007/s00450-013-0243-7

  • DOI: https://doi.org/10.1007/s00450-013-0243-7
