
An Efficient and Flexible Parallel FFT Implementation Based on FFTW

  • Michael Pippig
Conference paper

Abstract

In this paper we describe a new open source software library called PFFT [12], which was developed for calculating parallel complex-to-complex FFTs on massively parallel architectures. It combines the flexible user interface and hardware adaptiveness of FFTW [7] with a highly scalable two-dimensional data decomposition. We use a transpose FFT algorithm that consists of one-dimensional FFTs and global data transpositions. Because the implementation builds on the FFTW software library, the algorithm generalizes in a straightforward way to d-dimensional FFTs, d≥3, real-to-complex FFTs, and even completely in-place transformations. Further retained FFTW features, such as the selection of planning effort via flags and a separate communicator handle, distinguish PFFT from other publicly available parallel FFT implementations. Automatic ghost cell creation and support for oversampled FFTs complete the flexibility of PFFT. Our runtime tests on up to 262144 cores of the BlueGene/P supercomputer show PFFT to be as fast as the well-known P3DFFT [11] software package, while the flexibility of FFTW is still preserved.
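The transpose idea mentioned in the abstract can be illustrated with a small serial sketch. It is not PFFT's code or API: the function name `transpose_fft3d` is hypothetical, NumPy stands in for FFTW, and an in-memory axis rotation stands in for the global data transpositions that a parallel implementation would perform through communication between processes. The sketch only shows that a three-dimensional FFT can be assembled from one-dimensional FFTs along a single axis, interleaved with transpositions.

```python
import numpy as np

def transpose_fft3d(x):
    """Serial illustration: 3D FFT from 1D FFTs along the last axis plus
    axis rotations. Hypothetical helper, not part of PFFT or FFTW."""
    y = np.asarray(x, dtype=complex)
    for _ in range(3):
        # 1D FFTs along the last axis only.
        y = np.fft.fft(y, axis=-1)
        # Rotate the axes so the next untransformed axis becomes the last one.
        # In a distributed setting this step would be a global transposition.
        y = np.transpose(y, (2, 0, 1))
    # After three rotations the axes are back in their original order.
    return y
```

Under these assumptions, the result can be compared against `np.fft.fftn` on a small array with unequal axis lengths, for example with `np.allclose(transpose_fft3d(x), np.fft.fftn(x))`.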

Keywords

Fast Fourier Transform, Software Library, Wall Clock Time, Fast Fourier Transform Algorithm, Input Array
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgements

This work was supported by the BMBF grant 01IH08001B. We are grateful to the Jülich Supercomputing Center for providing the computational resources on Jülich BlueGene/P (JuGene) and Jülich Research on Petaflop Architectures (JuRoPA).

References

  1. Eleftheriou, M., Fitch, B.G., Rayshubskiy, A., Ward, T.J.C., Germain, R.S.: Performance measurements of the 3d FFT on the Blue Gene/L supercomputer. In: Cunha, J.C., Medeiros, P.D. (eds.) Euro-Par 2005 Parallel Processing, Lecture Notes in Computer Science, vol. 3648, pp. 795–803. Springer (2005)
  2. Eleftheriou, M., Fitch, B.G., Rayshubskiy, A., Ward, T.J.C., Germain, R.S.: Scalable framework for 3d FFTs on the Blue Gene/L supercomputer: Implementation and early performance measurements. IBM Journal of Research and Development 49, 457–464 (2005)
  3. Eleftheriou, M., Moreira, J.E., Fitch, B.G., Germain, R.S.: Parallel FFT subroutine library. URL http://www.alphaworks.ibm.com/tech/bgl3dfft
  4. Eleftheriou, M., Moreira, J.E., Fitch, B.G., Germain, R.S.: A volumetric FFT for BlueGene/L. In: Pinkston, T.M., Prasanna, V.K. (eds.) HiPC, Lecture Notes in Computer Science, vol. 2913, pp. 194–203. Springer (2003)
  5. Filippone, S.: The IBM parallel engineering and scientific subroutine library. In: Dongarra, J., Madsen, K., Wasniewski, J. (eds.) PARA, Lecture Notes in Computer Science, vol. 1041, pp. 199–206. Springer (1995)
  6. Frigo, M., Johnson, S.G.: FFTW, C subroutine library. URL http://www.fftw.org
  7. Frigo, M., Johnson, S.G.: The design and implementation of FFTW3. Proceedings of the IEEE 93, 216–231 (2005)
  8. Frigo, M., Leiserson, C.E., Prokop, H., Ramachandran, S.: Cache-oblivious algorithms. In: Proceedings of 40th Ann. Symp. on Foundations of Comp. Sci. (FOCS), pp. 285–297. IEEE Comput. Soc. (1999)
  9. Gupta, A., Kumar, V.: The scalability of FFT on parallel computers. IEEE Transactions on Parallel and Distributed Systems 4, 922–932 (1993)
  10. Intel Corporation: Intel math kernel library. URL http://software.intel.com/en-us/intel-mkl/
  11. Pekurovsky, D.: P3DFFT, Parallel FFT subroutine library. URL http://www.sdsc.edu/us/resources/p3dfft
  12. Pippig, M.: PFFT, Parallel FFT subroutine library. URL http://www.tu-chemnitz.de/~mpip
  13. Plimpton, S.: Parallel FFT subroutine library. URL http://www.sandia.gov/~sjplimp/docs/fft/README.html

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  1. Department of Mathematics, Chemnitz University of Technology, Chemnitz, Germany
