Chapter

Applied Parallel Computing. State of the Art in Scientific Computing

Lecture Notes in Computer Science, Volume 4699, pp. 919-928

Is Cache-Oblivious DGEMM Viable?

  • John A. Gunnels, IBM T. J. Watson Research Center, Yorktown Heights, NY 10598
  • Fred G. Gustavson, IBM T. J. Watson Research Center, Yorktown Heights, NY 10598
  • Keshav Pingali, Dept. of Computer Science, Cornell University, Ithaca, NY 14853
  • Kamen Yotov, Dept. of Computer Science, Cornell University, Ithaca, NY 14853

Abstract

We present a study of DGEMM implementations written in both the cache-oblivious and cache-conscious programming styles. The cache-oblivious programs use recursion and automatically block the DGEMM operands A, B, and C for the memory hierarchy. The cache-conscious programs use iteration and explicitly block A, B, and C for register files, all caches, and memory. Our study shows that the cache-oblivious programs achieve substantially lower performance than the cache-conscious programs. We discuss why this is so and suggest approaches for improving the performance of cache-oblivious programs.
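
To make the two styles concrete, here is a minimal sketch in plain Python. It is not the authors' code. The function names, the recursion cutoff `base`, and the block size `nb` are all assumptions chosen for illustration. The recursive version halves the largest dimension until the subproblem is small, so blocking emerges without any cache parameters. The iterative version tiles the loops with an explicit block size, and only for one level, whereas the programs described in the abstract block for registers, every cache level, and memory. Both compute C += A·B on row-major lists of lists.

```python
def _kernel(A, B, C, i0, i1, j0, j1, p0, p1):
    """Plain triple loop: C[i0:i1, j0:j1] += A[i0:i1, p0:p1] * B[p0:p1, j0:j1]."""
    for i in range(i0, i1):
        for j in range(j0, j1):
            acc = C[i][j]
            for p in range(p0, p1):
                acc += A[i][p] * B[p][j]
            C[i][j] = acc


def dgemm_recursive(A, B, C, i0, j0, p0, m, n, k, base=4):
    """Cache-oblivious style: split the largest dimension in half and recurse.

    `base` (an assumed cutoff, must be >= 1) only stops the recursion;
    it is not tuned to any particular cache size.
    """
    if max(m, n, k) <= base:
        _kernel(A, B, C, i0, i0 + m, j0, j0 + n, p0, p0 + k)
    elif m >= n and m >= k:
        h = m // 2  # split the rows of A and C
        dgemm_recursive(A, B, C, i0, j0, p0, h, n, k, base)
        dgemm_recursive(A, B, C, i0 + h, j0, p0, m - h, n, k, base)
    elif n >= k:
        h = n // 2  # split the columns of B and C
        dgemm_recursive(A, B, C, i0, j0, p0, m, h, k, base)
        dgemm_recursive(A, B, C, i0, j0 + h, p0, m, n - h, k, base)
    else:
        h = k // 2  # split the shared dimension; both halves update the same C
        dgemm_recursive(A, B, C, i0, j0, p0, m, n, h, base)
        dgemm_recursive(A, B, C, i0, j0, p0 + h, m, n, k - h, base)


def dgemm_blocked(A, B, C, m, n, k, nb=4):
    """Cache-conscious style: iterate over explicit nb x nb tiles.

    `nb` is an assumed block size; in practice it would be chosen per
    level of the memory hierarchy.
    """
    for ii in range(0, m, nb):
        for jj in range(0, n, nb):
            for pp in range(0, k, nb):
                _kernel(A, B, C,
                        ii, min(ii + nb, m),
                        jj, min(jj + nb, n),
                        pp, min(pp + nb, k))
```

Calling `dgemm_recursive(A, B, C, 0, 0, 0, m, n, k)` or `dgemm_blocked(A, B, C, m, n, k)` on an m×k matrix `A`, a k×n matrix `B`, and an m×n accumulator `C` updates `C` in place. Pure Python says nothing about the relative performance the abstract reports; the sketch only shows the structural difference between the two styles.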