Code Generators for Automatic Tuning of Numerical Kernels: Experiences with FFTW Position Paper

  • Richard Vuduc
  • James W. Demmel
Conference paper

DOI: 10.1007/3-540-45350-4_14

Volume 1924 of the book series Lecture Notes in Computer Science (LNCS)
Cite this paper as:
Vuduc R., Demmel J.W. (2000) Code Generators for Automatic Tuning of Numerical Kernels: Experiences with FFTW Position Paper. In: Taha W. (eds) Semantics, Applications, and Implementation of Program Generation. SAIG 2000. Lecture Notes in Computer Science, vol 1924. Springer, Berlin, Heidelberg

Abstract

Achieving peak performance in important numerical kernels such as dense matrix multiply or sparse matrix-vector multiplication usually requires extensive, machine-dependent tuning by hand. In response, a number of automatic tuning systems have been developed which typically operate by (1) generating multiple implementations of a kernel, and (2) empirically selecting an optimal implementation. One such system is FFTW (Fastest Fourier Transform in the West) for the discrete Fourier transform. In this paper, we review FFTW’s inner workings with an emphasis on its code generator, and report on our empirical evaluation of the system on two different hardware and compiler platforms. We then describe a number of our own extensions to the FFTW code generator that compute efficient discrete cosine transforms and show promising speed-ups over a vendor-tuned library. We also comment on current opportunities to develop tuning systems in the spirit of FFTW for other widely-used kernels.
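
The two-step scheme named in the abstract (generate several implementations, then select one empirically) can be illustrated with a minimal sketch. This is not FFTW's code generator or search procedure; the kernel (a blocked summation), the helper names `make_blocked_sum` and `autotune`, and the block sizes are all hypothetical choices made only to show the idea.

```python
import timeit


def make_blocked_sum(block):
    """Generate one variant of a summation kernel that handles `block` items per step."""
    def kernel(xs):
        total = 0.0
        for start in range(0, len(xs), block):
            total += sum(xs[start:start + block])
        return total
    kernel.__name__ = f"blocked_sum_{block}"
    return kernel


def autotune(variants, sample, repeats=3, number=5):
    """Time every variant on `sample`; return (fastest_variant, timings_by_name)."""
    timings = {}
    for variant in variants:
        # Use the minimum over several repeats to reduce the effect of timing noise.
        timings[variant.__name__] = min(
            timeit.repeat(lambda: variant(sample), repeat=repeats, number=number)
        )
    best = min(variants, key=lambda v: timings[v.__name__])
    return best, timings


# Step 1: generate multiple implementations (here, one per hypothetical block size).
variants = [make_blocked_sum(b) for b in (1, 8, 64, 512)]

# Step 2: empirically select the fastest implementation on this machine.
sample = [float(i) for i in range(10_000)]
best, timings = autotune(variants, sample)
print("selected:", best.__name__)
```

Which variant wins depends on the machine and interpreter, which is the point of selecting by measurement rather than by a fixed rule.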

Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Richard Vuduc (1)
  • James W. Demmel (2)
  1. Computer Science Division, University of California at Berkeley, Berkeley, CA, USA
  2. Computer Science Division and Dept. of Mathematics, University of California at Berkeley, Berkeley, CA, USA