Supporting Cache Locality Optimization with a Toolset

  • Jie Tao
  • Wolfgang Karl
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4128)


Cache performance significantly influences the computational power of modern processors. As microprocessor design for both general-purpose and embedded systems moves towards chip multiprocessors, cache performance becomes even more important, because an off-chip access is far more expensive than an on-chip reference. Cache locality optimization therefore remains an active research area for the next generation of computer architectures.

In this paper we present a tool environment that aims to give programmers sufficient support in optimizing source code for better runtime cache behavior. The environment contains a set of tools, ranging from profiling, analysis, and simulation tools for gathering performance data to visualization tools for graphical presentation and platforms for program development. Together, these tools establish a feedback loop for tuning cache performance on current and emerging uniprocessor and multiprocessor systems.


Keywords: Visualization Tool, Access Pattern, Cache Line, Performance Counter, Cache Performance





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jie Tao (1)
  • Wolfgang Karl (1)
  1. Institut für Technische Informatik, Universität Karlsruhe (TH), Karlsruhe, Germany
