On the Use of Small 2D Convolutions on GPUs

  • Shams A. H. Al Umairy
  • Alexander S. van Amesfoort
  • Irwan D. Setija
  • Martijn C. van Beurden
  • Henk J. Sips
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6161)

Abstract

Computing many small 2D convolutions using FFTs is a basis for a large number of applications in many domains in science and engineering, among them electromagnetic diffraction modeling in physics. The GPU architecture seems to be a suitable architecture to accelerate these convolutions, but reaching high application performance requires substantial development time and non-portable optimizations. In this work, we present the techniques, performance results and considerations to accelerate small 2D convolutions using CUDA, and compare performance to a multi-threaded CPU implementation. To improve programmability and performance of applications that make heavy use of small convolutions, we argue that two improvements to software and hardware are needed: FFT libraries must be extended with a single convolution function and communication bandwidth between CPU and GPU needs to be drastically improved.
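The FFT-based convolution the abstract refers to rests on the convolution theorem: transform both inputs, multiply them element-wise, and transform the product back. The sketch below illustrates that idea only and is not the authors' CUDA implementation. A naive pure-Python DFT stands in for an FFT library so that the example is self-contained, and all function names are hypothetical. It computes a circular convolution, so a linear convolution would additionally need zero-padded inputs.

```python
import cmath


def dft2(x, inverse=False):
    """Naive 2D DFT of a list-of-lists (O(N^4); a stand-in for a real FFT)."""
    rows, cols = len(x), len(x[0])
    sign = 1 if inverse else -1
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            acc = 0j
            for r in range(rows):
                for c in range(cols):
                    angle = sign * 2 * cmath.pi * (u * r / rows + v * c / cols)
                    acc += x[r][c] * cmath.exp(1j * angle)
            # The inverse transform carries the 1/(rows*cols) normalisation.
            out[u][v] = acc / (rows * cols) if inverse else acc
    return out


def circular_conv2d_fft(a, b):
    """Circular 2D convolution via forward DFT, pointwise product, inverse DFT."""
    fa, fb = dft2(a), dft2(b)
    prod = [[fa[u][v] * fb[u][v] for v in range(len(a[0]))]
            for u in range(len(a))]
    return [[val.real for val in row] for row in dft2(prod, inverse=True)]


def circular_conv2d_direct(a, b):
    """Reference circular 2D convolution computed straight from the definition."""
    rows, cols = len(a), len(a[0])
    return [[sum(a[i][j] * b[(r - i) % rows][(c - j) % cols]
                 for i in range(rows) for j in range(cols))
             for c in range(cols)]
            for r in range(rows)]
```

Both functions expect two real-valued inputs of identical shape. For small sizes the transform route should agree with the direct sum up to floating-point rounding.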

Keywords

  • Fast Fourier Transform
  • Graphic Processing Unit
  • Graphic Processing Unit Memory
  • Graphic Processing Unit Architecture
  • Fast Fourier Transform Size
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Shams A. H. Al Umairy (1)
  • Alexander S. van Amesfoort (1)
  • Irwan D. Setija (2)
  • Martijn C. van Beurden (3)
  • Henk J. Sips (1)

  1. Delft University of Technology, Delft, The Netherlands
  2. ASML, Eindhoven, The Netherlands
  3. Eindhoven University of Technology, Eindhoven, The Netherlands
