Encyclopedia of Parallel Computing

2011 Edition
Editors: David Padua

Performance Analysis Tools

  • Michael Gerndt
Reference work entry
DOI: https://doi.org/10.1007/978-0-387-09766-4_267



Performance analysis tools support the application developer in tuning the application’s performance for a given architecture. They measure performance data during the execution of the application and provide means to analyze and interpret the collected data and to detect performance bottlenecks.



The development of high-performance applications requires a careful adaptation of the program to the underlying parallel architecture. Due to the manifold interrelations of the parallel program and the architecture, designing an application with optimal performance on parallel systems is almost impossible. Therefore, the application goes through a tuning cycle which consists of measuring performance, detecting performance bottlenecks, and applying program transformations. The assumption for this tuning approach is that the performance will be the same for different runs with the same resources and the...
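The measurement and bottleneck-detection steps of the tuning cycle can be illustrated with a minimal sketch. It is not taken from any of the tools covered in this entry; the class `RegionProfiler`, its methods, and the region names are hypothetical and chosen only for illustration. The sketch accumulates wall-clock time per named, non-nested code region and reports the region with the largest total as the candidate bottleneck.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class RegionProfiler:
    """Hypothetical profiler: accumulates wall-clock time per named region.

    Assumption: regions are not nested, so the per-region totals do not overlap.
    """

    def __init__(self):
        self.total = defaultdict(float)  # region name -> accumulated seconds
        self.calls = defaultdict(int)    # region name -> number of executions

    @contextmanager
    def region(self, name):
        # Measurement step: time one execution of the enclosed code.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.total[name] += time.perf_counter() - start
            self.calls[name] += 1

    def bottleneck(self):
        # Detection step: the region with the largest accumulated time,
        # or None if nothing has been measured yet.
        if not self.total:
            return None
        return max(self.total, key=self.total.get)

    def report(self):
        # Rows of (name, calls, seconds, fraction of measured time),
        # sorted by descending time.
        overall = sum(self.total.values())
        rows = sorted(self.total.items(), key=lambda kv: kv[1], reverse=True)
        return [
            (name, self.calls[name], t, t / overall if overall else 0.0)
            for name, t in rows
        ]


if __name__ == "__main__":
    prof = RegionProfiler()
    for _ in range(3):
        with prof.region("compute"):
            sum(i * i for i in range(200_000))  # stand-in for real work
        with prof.region("exchange"):
            time.sleep(0.01)                    # stand-in for communication
    for name, calls, seconds, share in prof.report():
        print(f"{name:10s} calls={calls} time={seconds:.4f}s share={share:.1%}")
    print("candidate bottleneck:", prof.bottleneck())
```

In a tuning cycle, the developer would transform the reported region, rerun the measurement under the same resources, and compare the new report with the previous one.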




Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Michael Gerndt, Institut für Informatik, Technische Universität München, München, Germany