PSnAP: Accurate Synthetic Address Streams through Memory Profiles

  • Catherine Mills Olschanowsky
  • Mustafa M. Tikir
  • Laura Carrington
  • Allan Snavely
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5898)

Abstract

Memory address traces are an important information source; they drive memory simulations for performance modeling, systems design, and application tuning. For long-running applications, the direct use of an address trace is complicated by its size. Previous attempts to reduce trace size incurred a substantial penalty with respect to trace accuracy. We propose a novel method of memory profiling that enables the generation of highly accurate synthetic traces with space requirements typically under 1% of the original traces. We demonstrate the synthetic trace accuracy in terms of cache hit rates, spatial-temporal locality scores, and locality surfaces. Simulated cache hit rates from synthetic traces are within 3.5% of the observed rates and on average are within 1.0% for L1 cache. Our profiles are on average 60 times smaller than compressed traces. The combination of small profile sizes and high similarity to original traces makes our technique uniquely applicable to performance modeling and trace-driven simulation of large-scale parallel scientific applications.
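As a rough illustration of the cache hit rate metric used above to compare a synthetic trace with its original, the following minimal sketch simulates a small fully associative LRU cache over two address streams and reports the difference in hit rate. It is not the PSnAP method or the authors' simulator; the function name `lru_hit_rate`, the cache parameters, and both example traces are assumptions made purely for illustration.

```python
from collections import OrderedDict


def lru_hit_rate(addresses, num_lines=4, line_size=64):
    """Hit rate of a fully associative LRU cache over an address stream.

    Hypothetical helper for illustration only; num_lines and line_size
    are assumed values, not parameters taken from the paper.
    """
    cache = OrderedDict()  # keys are cache-line numbers, ordered by recency
    hits = 0
    total = 0
    for addr in addresses:
        total += 1
        line = addr // line_size  # map the byte address to its cache line
        if line in cache:
            hits += 1
            cache.move_to_end(line)  # mark as most recently used
        else:
            cache[line] = True
            if len(cache) > num_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return hits / total if total else 0.0


# Made-up traces: the "synthetic" one differs in exact addresses but
# touches the same cache lines in the same order.
original = [0, 8, 64, 72, 0, 128, 192, 256, 0, 64]
synthetic = [0, 16, 64, 80, 0, 128, 192, 256, 0, 64]

error = abs(lru_hit_rate(original) - lru_hit_rate(synthetic))
print(f"original: {lru_hit_rate(original):.2f}, "
      f"synthetic: {lru_hit_rate(synthetic):.2f}, error: {error:.2f}")
```

In this toy case the two streams differ in individual addresses yet yield the same hit rate, which is the sense in which a synthetic trace can be "accurate" without reproducing the original address for address.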

Keywords

High Performance Computing, Address Stream, Memory Access Pattern, Reuse Distance, Synthetic Trace
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Catherine Mills Olschanowsky (1)
  • Mustafa M. Tikir (2)
  • Laura Carrington (2)
  • Allan Snavely (1, 2)

  1. Department of Computer Science and Engineering, University of California at San Diego
  2. San Diego Supercomputer Center
