Abstract

Considerable effort has been devoted to developing accurate trace-driven simulation models of today's computer systems. Unfortunately, many modelers do not carefully inspect the input to their models, even though the output of any model is only as good as its input.

This paper discusses the many issues associated with the input traces used in trace-driven simulation. A description of the different types of traces is provided, followed by a survey and discussion of the following trace issues: trace generation techniques, trace-length reduction techniques, trace selection and representativeness, and common trace misuse.

The aim of this tutorial paper is to give modelers enough information about the different trace types and tracing methodologies that they can be more critical of the quality of the input traces used in their trace-driven simulations.

Keywords

instruction traces, address traces, trace-driven simulation, representativeness

Copyright information

© Springer-Verlag Berlin Heidelberg 1993

Authors and Affiliations

  • David R. Kaeli (1)
  1. IBM T.J. Watson Research Center, Yorktown Heights
