Applying Loop Tiling and Unrolling to a Sparse Kernel Code
Code transformations that optimize performance work well when a very precise data dependence analysis can be done at compile time. However, current compilers usually do not optimize irregular codes, because these contain input-dependent and/or dynamic memory access patterns. This paper shows how two representative loop transformations, tiling and unrolling, can be adapted to codes with irregular computations, obtaining a significant performance improvement over the original, non-transformed code. We evaluate our proposals on three different hardware platforms, using a well-known sparse kernel as the example code to show the performance improvements.
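To make the two transformations concrete, the following is a minimal sketch in C. The abstract does not name the kernel or the storage format, so everything here is an assumption for illustration: a sparse matrix in compressed sparse row (CSR) form, a sparse matrix-vector product whose inner loop is unrolled by four, and a sparse-times-dense product tiled over the dense columns. The function names (`spmv_csr`, `spmv_csr_unroll4`, `spmm_csr_tiled`) and the `tile` parameter are hypothetical, and this is not the authors' transformation scheme.

```c
#include <stddef.h>

/* Baseline (assumed kernel): y = A*x with A in CSR form.
   row_ptr has n_rows + 1 entries; col_idx and val hold the nonzeros. */
void spmv_csr(size_t n_rows, const size_t *row_ptr, const size_t *col_idx,
              const double *val, const double *x, double *y)
{
    for (size_t i = 0; i < n_rows; ++i) {
        double sum = 0.0;
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += val[k] * x[col_idx[k]];  /* indirect, input-dependent access */
        y[i] = sum;
    }
}

/* Unrolling: four independent partial sums per row, plus a remainder loop
   because the row length is only known at run time. */
void spmv_csr_unroll4(size_t n_rows, const size_t *row_ptr,
                      const size_t *col_idx, const double *val,
                      const double *x, double *y)
{
    for (size_t i = 0; i < n_rows; ++i) {
        size_t k = row_ptr[i];
        const size_t end = row_ptr[i + 1];
        double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
        for (; k + 4 <= end; k += 4) {
            s0 += val[k]     * x[col_idx[k]];
            s1 += val[k + 1] * x[col_idx[k + 1]];
            s2 += val[k + 2] * x[col_idx[k + 2]];
            s3 += val[k + 3] * x[col_idx[k + 3]];
        }
        for (; k < end; ++k)                /* leftover nonzeros */
            s0 += val[k] * x[col_idx[k]];
        y[i] = (s0 + s1) + (s2 + s3);
    }
}

/* Tiling: Y = A*X, with X (n_cols x n_vec) and Y (n_rows x n_vec) dense and
   row-major. The dense-column loop is split into strips of width `tile`,
   so each pass touches only a narrow strip of X and Y. */
void spmm_csr_tiled(size_t n_rows, size_t n_vec, size_t tile,
                    const size_t *row_ptr, const size_t *col_idx,
                    const double *val, const double *X, double *Y)
{
    if (tile == 0)
        tile = 1;                           /* guard against an empty strip */
    for (size_t jj = 0; jj < n_vec; jj += tile) {
        const size_t j_end = (jj + tile < n_vec) ? jj + tile : n_vec;
        for (size_t i = 0; i < n_rows; ++i) {
            for (size_t j = jj; j < j_end; ++j)
                Y[i * n_vec + j] = 0.0;
            for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
                const double a = val[k];
                const double *xrow = X + col_idx[k] * n_vec;
                for (size_t j = jj; j < j_end; ++j)
                    Y[i * n_vec + j] += a * xrow[j];
            }
        }
    }
}
```

In both variants the loop bounds and the accessed locations come from `row_ptr` and `col_idx`, which are only known at run time; this is the property that prevents a compiler from applying the transformations automatically. Note that the unrolled version reassociates the floating-point sum, so its result can differ from the baseline in the last bits.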