Advanced Event-Sampling Support for PAPI

  • Conference paper
Programming and Performance Visualization Tools (ESPT 2017, ESPT 2018, VPA 2017, VPA 2018)

Abstract

The PAPI performance library is a widely used tool for gathering performance data from running applications. Modern processors support advanced sampling interfaces, such as Intel’s Precise Event Based Sampling (PEBS) and AMD’s Instruction Based Sampling (IBS). The current PAPI sampling interface predates these hardware features and provides only simple instruction-pointer-based samples.

We propose a new sampling interface that supports the extended sampling information available on modern hardware. We extend PAPI with a new PAPI_sample_init call that uses the Linux perf_event interface to access the extra sample information. A pointer to these samples is returned to the user, who can either decode them on the fly or write them to disk for later analysis.
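As a rough illustration of what decoding samples "on the fly" can involve, the sketch below parses one record laid out the way the Linux perf_event interface documents PERF_RECORD_SAMPLE. It is not the paper's code and does not call PAPI: the function name decode_sample is hypothetical, the buffer is assumed to hold raw little-endian perf_event records, and only a handful of sample_type fields are handled. The constants are taken from linux/perf_event.h.

```python
import struct

# Constants from linux/perf_event.h
PERF_RECORD_SAMPLE = 9
PERF_SAMPLE_IP = 1 << 0
PERF_SAMPLE_TID = 1 << 1
PERF_SAMPLE_TIME = 1 << 2
PERF_SAMPLE_ADDR = 1 << 3
PERF_SAMPLE_CPU = 1 << 7
PERF_SAMPLE_PERIOD = 1 << 8
PERF_SAMPLE_WEIGHT = 1 << 14
PERF_SAMPLE_DATA_SRC = 1 << 15

# Fields this sketch understands, listed in the order the kernel writes them.
# Assumption: little-endian host, hence the "<" format prefix.
_FIELDS = [
    (PERF_SAMPLE_IP, "<Q", ("ip",)),
    (PERF_SAMPLE_TID, "<II", ("pid", "tid")),
    (PERF_SAMPLE_TIME, "<Q", ("time",)),
    (PERF_SAMPLE_ADDR, "<Q", ("addr",)),
    (PERF_SAMPLE_CPU, "<II", ("cpu", "_reserved")),
    (PERF_SAMPLE_PERIOD, "<Q", ("period",)),
    (PERF_SAMPLE_WEIGHT, "<Q", ("weight",)),
    (PERF_SAMPLE_DATA_SRC, "<Q", ("data_src",)),
]
_SUPPORTED = sum(bit for bit, _, _ in _FIELDS)

# struct perf_event_header { u32 type; u16 misc; u16 size; }
_HEADER = struct.Struct("<IHH")


def decode_sample(buf, sample_type, offset=0):
    """Decode one record at `offset` (hypothetical helper, not a PAPI call).

    Returns (sample_dict_or_None, offset_of_next_record). Records that are
    not PERF_RECORD_SAMPLE are skipped and reported as None.
    """
    if sample_type & ~_SUPPORTED:
        raise ValueError("sample_type has bits this sketch does not handle")
    rec_type, _misc, size = _HEADER.unpack_from(buf, offset)
    next_offset = offset + size  # header.size covers header plus payload
    if rec_type != PERF_RECORD_SAMPLE:
        return None, next_offset
    pos = offset + _HEADER.size
    sample = {}
    for bit, fmt, names in _FIELDS:
        if sample_type & bit:
            sample.update(zip(names, struct.unpack_from(fmt, buf, pos)))
            pos += struct.calcsize(fmt)
    sample.pop("_reserved", None)  # padding word that follows the CPU number
    return sample, next_offset
```

The decoder is table-driven because the set of fields present in a record depends entirely on the sample_type mask requested when the event was opened, so the same mask has to be available again at decode time, whether that happens immediately or after the samples have been written to disk.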

By providing extended sampling information, this new PAPI interface allows advanced performance analysis and optimization that was previously not possible. This will greatly enhance the ability to optimize software in modern extreme-scale programming environments.

Acknowledgment

This work was supported by the National Science Foundation under Grant No. SSI-1450122.

Author information

Corresponding author

Correspondence to Vincent M. Weaver.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Smith, F., Weaver, V.M. (2019). Advanced Event-Sampling Support for PAPI. In: Bhatele, A., Boehme, D., Levine, J., Malony, A., Schulz, M. (eds) Programming and Performance Visualization Tools. ESPT 2017, ESPT 2018, VPA 2017, VPA 2018. Lecture Notes in Computer Science, vol 11027. Springer, Cham. https://doi.org/10.1007/978-3-030-17872-7_9

  • DOI: https://doi.org/10.1007/978-3-030-17872-7_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-17871-0

  • Online ISBN: 978-3-030-17872-7

  • eBook Packages: Computer Science, Computer Science (R0)
