Scalasca evolved from its predecessor, KOJAK.
Scalasca is an open-source software tool that supports the performance optimization of parallel programs by measuring and analyzing their runtime behavior. The analysis identifies potential performance bottlenecks – in particular those concerning communication and synchronization – and offers guidance in exploring their causes. Scalasca targets mainly scientific and engineering applications based on the programming interfaces MPI and OpenMP, including hybrid applications based on a combination of the two. The tool has been specifically designed for use on large-scale systems including IBM Blue Gene and Cray XT, but is also well suited for small- and medium-scale HPC platforms.
Driven by growing application requirements and accelerated by current trends in microprocessor design, the number of processor cores on modern supercomputers is expanding from...