The Journal of Supercomputing, Volume 5, Issue 2–3, pp 137–162

Tracing application program execution on the CRAY X-MP and CRAY-2

  • Allen D. Malony
  • John L. Larson
  • Daniel A. Reed


Important insights into program operation can be gained by observing dynamic execution behavior. Unfortunately, many high-performance machines provide execution profile summaries as the only tool for performance investigation. We have developed a tracing library for the CRAY X-MP and CRAY-2 supercomputers that supports the low-overhead capture of execution events for sequential and multitasked programs. This library has been extended to use the automatic instrumentation facilities on these machines, allowing trace data from routine entry and exit, and from other program segments, to be captured. To assess the utility of the trace-based tools, three of the Perfect Benchmark codes have been tested in scalar and vector modes with the tracing instrumentation. Besides yielding summary execution statistics, the traces reveal interesting execution dynamics when their histories are studied. It is also possible to model application performance based on properties identified from traces. Our conclusion is that adding tracing support to Cray supercomputers can bring significant returns in improved performance characterization and evaluation.


Instrumentation, measurement, tracing, performance characterization, application execution





Copyright information

© Kluwer Academic Publishers 1991

Authors and Affiliations

  • Allen D. Malony (1)
  • John L. Larson (1)
  • Daniel A. Reed (1)
  1. Center for Supercomputing Research and Development, University of Illinois, Urbana, USA
