Is Cache-Oblivious DGEMM Viable?
- Cite this paper as:
- Gunnels J.A., Gustavson F.G., Pingali K., Yotov K. (2007) Is Cache-Oblivious DGEMM Viable?. In: Kågström B., Elmroth E., Dongarra J., Waśniewski J. (eds) Applied Parallel Computing. State of the Art in Scientific Computing. PARA 2006. Lecture Notes in Computer Science, vol 4699. Springer, Berlin, Heidelberg
We present a study of implementations of DGEMM using both the cache-oblivious and cache-conscious programming styles. The cache-oblivious programs use recursion and automatically block the DGEMM operands A, B, and C for the memory hierarchy. The cache-conscious programs use iteration and explicitly block A, B, and C for register files, all caches, and memory. Our study shows that the cache-oblivious programs achieve substantially lower performance than the cache-conscious programs. We discuss why this is so and suggest approaches for improving the performance of cache-oblivious programs.
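To make the two styles concrete, the following is a minimal sketch of the control structures the abstract contrasts. It is not the authors' code. The function names, the `leaf` cutoff, and the single block size `bs` are illustrative assumptions, and pure Python is used only for readability. The cache-conscious programs studied in the paper block separately for registers and each cache level, which this sketch reduces to one level.

```python
def matmul_recursive(A, B, C, i0, j0, k0, m, n, k, leaf=2):
    """Cache-oblivious style: C[i0:i0+m, j0:j0+n] += A[i0:i0+m, k0:k0+k] * B[k0:k0+k, j0:j0+n].

    The largest of the three dimensions is halved recursively, so the
    sub-problems shrink to fit every level of the memory hierarchy without
    the code knowing any cache size. `leaf` (assumed >= 1) only stops the
    recursion; it is not a tuned cache parameter.
    """
    if max(m, n, k) <= leaf:
        # Base case: plain triple loop on a small sub-problem.
        for i in range(i0, i0 + m):
            for j in range(j0, j0 + n):
                acc = C[i][j]
                for p in range(k0, k0 + k):
                    acc += A[i][p] * B[p][j]
                C[i][j] = acc
        return
    if m >= n and m >= k:
        h = m // 2  # split the rows of A and C
        matmul_recursive(A, B, C, i0, j0, k0, h, n, k, leaf)
        matmul_recursive(A, B, C, i0 + h, j0, k0, m - h, n, k, leaf)
    elif n >= k:
        h = n // 2  # split the columns of B and C
        matmul_recursive(A, B, C, i0, j0, k0, m, h, k, leaf)
        matmul_recursive(A, B, C, i0, j0 + h, k0, m, n - h, k, leaf)
    else:
        h = k // 2  # split the shared dimension; both halves update the same C
        matmul_recursive(A, B, C, i0, j0, k0, m, n, h, leaf)
        matmul_recursive(A, B, C, i0, j0, k0 + h, m, n, k - h, leaf)


def matmul_blocked(A, B, C, m, n, k, bs=2):
    """Cache-conscious style: C += A * B with explicit, iterative blocking.

    `bs` stands in for a block size chosen by the programmer for one level
    of the hierarchy (an assumption of this sketch).
    """
    for ii in range(0, m, bs):
        for jj in range(0, n, bs):
            for pp in range(0, k, bs):
                # Multiply one bs x bs block of A by one bs x bs block of B.
                for i in range(ii, min(ii + bs, m)):
                    for j in range(jj, min(jj + bs, n)):
                        acc = C[i][j]
                        for p in range(pp, min(pp + bs, k)):
                            acc += A[i][p] * B[p][j]
                        C[i][j] = acc
```

Both functions accumulate into `C` in place, so the caller zero-initializes `C` to obtain a plain product. For an `m` by `k` matrix `A` and a `k` by `n` matrix `B`, the recursive version is called as `matmul_recursive(A, B, C, 0, 0, 0, m, n, k)` and the blocked version as `matmul_blocked(A, B, C, m, n, k)`.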