Near-Optimal Padding for Removing Conflict Misses

  • Xavier Vera
  • Josep Llosa
  • Antonio González
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2481)


The effectiveness of the memory hierarchy is critical to the performance of current processors. Memory-hierarchy performance can be improved by program transformations such as padding, a code transformation aimed at reducing conflict misses. This paper presents a novel approach to near-optimal padding for multi-level caches. The approach analyzes programs and detects conflict misses by means of the Cache Miss Equations; a genetic algorithm then computes the parameter values that improve the program. Our results show that the approach removes practically all conflicts among variables in the SPECfp95 benchmarks while targeting all cache levels simultaneously.
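
To make the idea of padding concrete, the following is a minimal sketch and not the method of the paper: it uses neither the Cache Miss Equations nor a genetic algorithm. It simulates a toy direct-mapped cache, and the cache geometry, the array sizes, the function names, and the exhaustive search over pad sizes are all assumptions chosen for illustration. Two arrays whose base addresses differ by a multiple of the cache size evict each other on every access; inserting a small pad between them leaves only the unavoidable cold misses.

```python
def count_misses(base_a, base_b, n, cache_size=1024, line_size=32, elem_size=8):
    """Count misses in a direct-mapped cache for: for i in range(n): a[i], b[i].

    All parameters are hypothetical values chosen for this illustration.
    """
    num_lines = cache_size // line_size
    tags = [None] * num_lines  # memory line currently held by each cache line
    misses = 0
    for i in range(n):
        for base in (base_a, base_b):
            mem_line = (base + i * elem_size) // line_size
            idx = mem_line % num_lines  # direct-mapped placement
            if tags[idx] != mem_line:
                tags[idx] = mem_line
                misses += 1
    return misses


N = 512
ARRAY_BYTES = N * 8  # 4096 bytes, a multiple of the 1024-byte cache


def misses_with_padding(pad):
    """Place array b right after array a, separated by `pad` bytes."""
    return count_misses(0, ARRAY_BYTES + pad, N)


unpadded = misses_with_padding(0)   # a[i] and b[i] always map to the same line
padded = misses_with_padding(32)    # one cache line of padding between arrays

# Exhaustive search over element-aligned pads, standing in for the
# genetic algorithm, which would be needed for realistic search spaces.
best_pad = min(range(0, 1024, 8), key=misses_with_padding)

print(unpadded, padded, best_pad)
```

In this toy setting an exhaustive search is affordable because there is a single pad parameter. With many variables and several cache levels the search space grows quickly, which is the situation in which the abstract proposes a genetic algorithm.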


Genetic Algorithm, Loop Nest, Cache Size, Cache Line, Memory Hierarchy
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Xavier Vera (1)
  • Josep Llosa (2)
  • Antonio González (2)
  1. Institutionen för Datateknik, Mälardalens Högskola, Västerås, Sweden
  2. Computer Architecture Department, Universitat Politècnica de Catalunya-Barcelona, Barcelona, Spain
