Abstract
Profiling can effectively analyze program behavior and provide critical information for feedback-directed or dynamic optimizations. Based on memory profiling, reuse distance analysis has shown much promise in predicting data locality for a program using inputs other than the profiled ones. Both whole-program and instruction-based locality can be accurately predicted by reuse distance analysis.
Reuse distance analysis abstracts a cluster of memory references for a particular instruction that have similar reuse distance values into a locality pattern. Prior work has shown that a significant number of memory instructions have multiple locality patterns, which is undesirable for many instruction-based memory optimizations. This paper investigates the relationship between locality patterns and execution paths by analyzing the reuse distance distribution along each dynamic path to an instruction. Here a path is defined as the program execution trace from the previous access of a memory location to the current access. By differentiating locality patterns according to the execution paths on which they occur, the proposed analysis can expose optimization opportunities tailored to a specific subset of the paths leading to an instruction.
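To make the idea of a path/instruction pair concrete, the following is a minimal illustrative sketch and not the profiler described in the paper. It assumes the standard definition of reuse distance (the number of distinct memory locations touched between two consecutive accesses to the same location) and, as a simplification, represents a path as the sequence of instruction identifiers executed from the previous access up to, but not including, the current one. The function name `path_reuse_distances` and the trace format are hypothetical.

```python
from collections import defaultdict


def path_reuse_distances(trace):
    """Group reuse distances by (instruction, path).

    trace: iterable of (pc, addr) pairs in execution order, where `pc`
    identifies a memory instruction and `addr` the location it accesses.
    Returns a dict mapping (pc, path) to a list of reuse distances.
    This naive version rescans the trace for every reuse (quadratic time);
    it is meant only to illustrate the definitions.
    """
    events = list(trace)
    last_index = {}              # addr -> index of its most recent access
    result = defaultdict(list)
    for i, (pc, addr) in enumerate(events):
        if addr in last_index:
            j = last_index[addr]
            between = events[j + 1:i]
            # Reuse distance: distinct locations accessed between the two accesses.
            distance = len({a for _, a in between})
            # Assumed path encoding: pcs from the previous access (inclusive)
            # to the current access (exclusive).
            path = (events[j][0],) + tuple(p for p, _ in between)
            result[(pc, path)].append(distance)
        last_index[addr] = i
    return dict(result)


# Hypothetical trace: instruction "ld3" reuses location "A" along two paths.
trace = [
    ("ld1", "A"), ("ld2", "B"), ("ld3", "A"),
    ("ld1", "A"), ("ld2", "B"), ("ld4", "C"), ("ld3", "A"),
]
result = path_reuse_distances(trace)
```

In this toy trace, `ld3` observes two different reuse distances for `A` (1 and 2). Keyed by instruction alone, it would appear to have two locality patterns. Keyed by path, each distance belongs to exactly one path: `("ld1", "ld2")` and `("ld1", "ld2", "ld4")`.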
In this paper, we present an effective method for path-based reuse distance profiling and analysis. We have observed that a significant percentage of the multiple locality patterns for an instruction can each be uniquely related to a particular execution path in the program. We have also investigated the influence of inputs on the reuse distance distribution for each path/instruction pair. The experimental results show that path-based reuse distance is highly predictable as a function of the data size for a set of SPEC CPU2000 programs.
Keywords
- Integer Program
- Locality Pattern
- Memory Location
- Cache Line
- Execution Path
These keywords were added by machine and not by the authors.
This work was partially supported by U.S. NSF grant CCR-0312892.
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Fang, C., Carr, S., Önder, S., Wang, Z. (2006). Path-Based Reuse Distance Analysis. In: Mycroft, A., Zeller, A. (eds) Compiler Construction. CC 2006. Lecture Notes in Computer Science, vol 3923. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11688839_4
DOI: https://doi.org/10.1007/11688839_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33050-9
Online ISBN: 978-3-540-33051-6
eBook Packages: Computer Science (R0)
