Using Platform-Specific Performance Counters for Dynamic Compilation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4339)

Abstract

Hardware performance counters provide information about events in the hardware platform (e.g., cache misses, pipeline stalls), in contrast to profiles, which capture program properties (e.g., execution frequencies of basic blocks, methods, and function calls). As platform architectures become more complex and more diverse, it is important for a compiler to exploit platform-specific information. A dynamic (JIT) compiler is in the unique position of running on the same platform as the target application, but in practice, exploiting the wealth of information available through performance counters is far from easy. If a JIT compiler is to use performance counter information, this information must be fine-grained (e.g., attributing cache misses to a single load instruction) and must be obtainable without undue overhead. We present a runtime+compiler framework that ties hardware performance counter information to a dynamic compiler, and we argue that the framework delivers fine-grained information at low overhead. As parallel and multi-core architectures proliferate, performance issues will play a crucial role in all compilation engines, and our paper reports on a modular approach to making such counter information available to the compiler.
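
To make "fine-grained" concrete, the following is a minimal sketch of attributing sampled hardware events to individual instructions. It is not the framework described in the paper. It assumes the hardware can report the address of the instruction that caused each sampled event, and that the runtime keeps a map of compiled-code address ranges. The names `CODE_MAP` and `attribute`, the method names, and all addresses are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical map of JIT-compiled code: (start_address, end_address, method_name).
# Ranges are half-open [start, end) and sorted by start address.
CODE_MAP = [
    (0x1000, 0x1040, "List.get"),
    (0x1040, 0x10C0, "Tree.find"),
    (0x2000, 0x2080, "Matrix.mul"),
]
STARTS = [start for start, _, _ in CODE_MAP]


def attribute(sample_addresses):
    """Count sampled events per (method, instruction offset within the method)."""
    counts = Counter()
    for addr in sample_addresses:
        # Find the last code range whose start address is <= addr.
        i = bisect_right(STARTS, addr) - 1
        if i >= 0 and addr < CODE_MAP[i][1]:
            start, _, name = CODE_MAP[i]
            counts[(name, addr - start)] += 1
        else:
            # The sample fell outside any known compiled method.
            counts[("<unknown>", addr)] += 1
    return counts


# Made-up sample addresses, standing in for sampled cache-miss events.
samples = [0x1044, 0x1044, 0x1008, 0x2010, 0x1044, 0x3000]
hot = attribute(samples)
```

Here `hot.most_common(1)` identifies the single instruction (method plus offset) with the most sampled events, which is the granularity a compiler would need in order to act on one specific load.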




Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Schneider, F., Gross, T.R. (2006). Using Platform-Specific Performance Counters for Dynamic Compilation. In: Ayguadé, E., Baumgartner, G., Ramanujam, J., Sadayappan, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2005. Lecture Notes in Computer Science, vol 4339. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69330-7_23

  • DOI: https://doi.org/10.1007/978-3-540-69330-7_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69329-1

  • Online ISBN: 978-3-540-69330-7

  • eBook Packages: Computer Science, Computer Science (R0)
