Workload analysis of computation intensive tasks: Case study on SPEC CPU95 benchmarks

  • Jens Simon
  • Marco Vieth
  • Reinhold Weicker
Workshop 16: Performance Evaluation and Prediction
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1300)


Several performance analysis tools have been developed, but they suffer either from reliance on dedicated hardware or from the computational cost of simulation. Modern microprocessors, with hardware support for counting system hardware events, now make universal software tools possible for the performance analysis of complex application programs such as the SPEC benchmarks.

In this paper, we present a new method to determine the system resource utilization (cache miss ratios, CPI values, branch mispredictions) of arbitrary programs, based on a sampling technique combined with access to processor-internal event counter registers. We present the sprof tool set, which is based on this method and also enables the detailed analysis of individual subroutines of a program as they are executed over time. The high accuracy and negligible overhead of the tool set are demonstrated. We used the SPEC95 benchmark suite, which consists of 8 integer and 10 floating-point intensive, non-trivial programs that are commonly used to define the performance of workstations and servers. As an example, we present the analysis of a SPEC CPU95 benchmark program on different processor architectures.
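To make the derived metrics concrete, the following is a minimal sketch of how CPI, cache miss ratio, and branch misprediction ratio could be computed from periodically sampled cumulative event counters. It is not the sprof implementation: the `Sample` layout, the field names, and the `interval_metrics` helper are assumptions introduced purely for illustration, and real processors expose different counter sets and counter widths.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Sample:
    """Cumulative counter values read at one sampling point (hypothetical layout)."""
    cycles: int
    instructions: int
    cache_accesses: int
    cache_misses: int
    branches: int
    branch_mispredictions: int


def ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the interval saw no events."""
    return numerator / denominator if denominator > 0 else None


def interval_metrics(samples: List[Sample]) -> List[Dict[str, Optional[float]]]:
    """Derive per-interval metrics from differences of consecutive samples."""
    metrics = []
    for prev, cur in zip(samples, samples[1:]):
        metrics.append({
            # Cycles per instruction over the interval.
            "cpi": ratio(cur.cycles - prev.cycles,
                         cur.instructions - prev.instructions),
            # Fraction of cache accesses that missed.
            "cache_miss_ratio": ratio(cur.cache_misses - prev.cache_misses,
                                      cur.cache_accesses - prev.cache_accesses),
            # Fraction of branches that were mispredicted.
            "branch_mispredict_ratio": ratio(
                cur.branch_mispredictions - prev.branch_mispredictions,
                cur.branches - prev.branches),
        })
    return metrics


if __name__ == "__main__":
    # Made-up counter readings, used only to exercise the helper.
    demo = [
        Sample(0, 0, 0, 0, 0, 0),
        Sample(2000, 1000, 400, 20, 100, 5),
        Sample(5000, 2000, 800, 100, 200, 5),
    ]
    for i, m in enumerate(interval_metrics(demo)):
        print(i, m)
```

Working on differences between consecutive samples, rather than on totals, is what lets the metrics be attributed to the code that was running during each interval. The sketch ignores counter wrap-around, which a real tool would have to handle.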


Keywords: Execution Time, Event Counter, Compress Function, Device Driver, UNIX System



Copyright information

© Springer-Verlag Berlin Heidelberg 1997

Authors and Affiliations

  • Jens Simon (1)
  • Marco Vieth (1)
  • Reinhold Weicker (2)
  1. PC2 - Paderborn Center for Parallel Computing, Germany
  2. SNI - Siemens Nixdorf Informationssysteme AG, Germany
