Using Platform-Specific Performance Counters for Dynamic Compilation

Schneider, Florian; Gross, Thomas R.

doi:10.1007/978-3-540-69330-7_23


Conference paper


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4339)

Abstract

Hardware performance counters provide information about events in the hardware platform (e.g., cache misses, pipeline stalls), in contrast to profiles that capture program properties (e.g., execution frequencies for basic blocks, methods, function calls). As platform architectures become more complex and more diverse, it is important for a compiler to exploit platform-specific information. A dynamic (JIT) compiler is in the unique position of running on the same platform as the target application, but in practice, exploiting the wealth of information available through performance counters is far from easy. If a JIT compiler is to use performance counter information, this information must be fine-grained (e.g., attributing cache misses to a single load instruction) and must be obtainable without undue overhead. We present a runtime+compiler framework that ties hardware performance counter information to a dynamic compiler, and we argue that its overhead is low and that the information it delivers is fine-grained. As parallel and multi-core architectures proliferate, performance issues will play a crucial role in all compilation engines, and our paper reports on a modular approach to making such counter information available to the compiler.
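
To make "fine-grained" concrete, the following is a minimal sketch of how sampled event addresses could be attributed to individual load instructions. It is not the framework described in the paper: the `CompiledMethod` record, the `offset_to_instr` map, the addresses, and the instruction labels are all hypothetical, and the samples are assumed to be program-counter values delivered by some hardware sampling mechanism.

```python
from bisect import bisect_right
from collections import Counter
from dataclasses import dataclass


@dataclass
class CompiledMethod:
    """Hypothetical record a JIT compiler might keep for each compiled method."""
    name: str
    start: int             # first machine-code address of the method
    end: int               # one past the last machine-code address
    offset_to_instr: dict  # machine-code offset -> label of the instruction there


def build_index(methods):
    """Sort methods by start address so a sample can be located by binary search."""
    ordered = sorted(methods, key=lambda m: m.start)
    return [m.start for m in ordered], ordered


def attribute(samples, index):
    """Count sampled events per (method, instruction); unmapped addresses are dropped."""
    starts, ordered = index
    counts = Counter()
    for pc in samples:
        i = bisect_right(starts, pc) - 1
        if i < 0:
            continue  # address lies below every compiled method
        method = ordered[i]
        if pc >= method.end:
            continue  # address lies outside any compiled method
        instr = method.offset_to_instr.get(pc - method.start)
        if instr is not None:
            counts[(method.name, instr)] += 1
    return counts


# Hypothetical compiled code and a hypothetical stream of cache-miss samples.
methods = [
    CompiledMethod("List.sum", 0x1000, 0x1040,
                   {0x08: "load node.value", 0x10: "load node.next"}),
    CompiledMethod("Map.get", 0x2000, 0x2020, {0x04: "load table[i]"}),
]
samples = [0x1010, 0x1010, 0x1008, 0x2004, 0x3000, 0x1010]
hot = attribute(samples, build_index(methods))
```

Here `hot.most_common(1)` would single out the load of `node.next` in `List.sum` as the instruction with the most sampled misses, which is the kind of per-instruction signal an optimizing compiler could act on. The binary search over method start addresses is one simple design choice; a real virtual machine would also have to cope with code that is recompiled or moved while samples are being collected.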



Author information

Authors and Affiliations

Laboratory for Software Technology, Department of Computer Science, ETH Zürich, Zürich, Switzerland
Florian Schneider & Thomas R. Gross


Editor information

Editors and Affiliations

BSC-UPC,
Eduard Ayguadé
Department of Computer Science, Louisiana State University, 70803, Baton Rouge, LA, USA
Gerald Baumgartner
Dept. of Electrical and Computer Engg., Louisiana State University, Baton Rouge, LA, USA
J. Ramanujam
Department of Computer Science and Engineering, The Ohio State University, 2015 Neil Avenue, 43210, Columbus, OH, USA
P. Sadayappan

About this paper

Cite this paper

Schneider, F., Gross, T.R. (2006). Using Platform-Specific Performance Counters for Dynamic Compilation. In: Ayguadé, E., Baumgartner, G., Ramanujam, J., Sadayappan, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2005. Lecture Notes in Computer Science, vol 4339. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69330-7_23

DOI: https://doi.org/10.1007/978-3-540-69330-7_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69329-1
Online ISBN: 978-3-540-69330-7
eBook Packages: Computer Science, Computer Science (R0)
