Random Fields Generation on the GPU with the Spectral Turning Bands Method

  • Lars Hunger
  • Biagio Cosenza
  • Stefan Kimeswenger
  • Thomas Fahringer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8632)

Abstract

Random field (RF) generation algorithms are of paramount importance in many scientific domains, including astrophysics, geostatistics, and computer graphics. Examples include generating initial conditions for cosmological simulations and driving hydrodynamical turbulence; the latter requires a new random field at every time step. Current approaches commonly rely on a 3D FFT (Fast Fourier Transform) and require the whole generated field to be stored in memory. Moreover, they are limited to regular rectilinear meshes and need an extra processing step to support non-regular meshes.

In this paper, we introduce TBARF (Turning BAnd Random Fields), an RF generation algorithm based on the turning band method and optimized for massively parallel hardware such as GPUs. Our algorithm replaces the 3D FFT with a lower-order, one-dimensional FFT followed by a projection step, and is further optimized with loop unrolling and blocking. We show that TBARF can easily generate RFs on non-regular (non-uniform) meshes and can handle meshes larger than the available GPU memory by using a streaming, out-of-core approach. TBARF is 2 to 5 times faster than the traditional methods when generating RFs with more than 16M cells, and it has been successfully applied to two real-world scenarios: planetary nebulae and cosmological simulations.
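
To make the "one-dimensional transform followed by a projection step" idea concrete, the following is a minimal CPU sketch of a generic turning band scheme, not the TBARF implementation. All names (`line_process`, `random_direction`, `make_field`, `power_law`) are hypothetical, a naive inverse DFT stands in for the 1D FFT, and the line spectrum is an arbitrary placeholder: deriving the correct 1D spectrum from a target 3D spectrum is not shown here.

```python
import cmath
import math
import random


def line_process(n_bins, spectrum, rng):
    """Real 1D Gaussian process sampled on n_bins points of a line.

    A naive O(n^2) inverse DFT is used as a stand-in for a 1D FFT.
    """
    half = n_bins // 2
    coeffs = [0j] * n_bins  # zero mean: the k = 0 coefficient stays 0
    for k in range(1, half):
        amp = math.sqrt(spectrum(k))
        c = amp * complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        coeffs[k] = c
        coeffs[n_bins - k] = c.conjugate()  # Hermitian symmetry gives real output
    return [
        sum(coeffs[k] * cmath.exp(2j * math.pi * k * j / n_bins)
            for k in range(n_bins)).real
        for j in range(n_bins)
    ]


def random_direction(rng):
    """Unit vector drawn uniformly on the sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)


def make_field(n_lines, n_bins, extent, spectrum, seed=0):
    """Build the line processes once; return a function evaluating the field.

    Points are assumed to lie within distance `extent` of the origin.
    """
    rng = random.Random(seed)
    lines = [(random_direction(rng), line_process(n_bins, spectrum, rng))
             for _ in range(n_lines)]
    scale = 1.0 / math.sqrt(n_lines)

    def evaluate(points):
        values = []
        for x, y, z in points:
            total = 0.0
            for (ux, uy, uz), samples in lines:
                t = x * ux + y * uy + z * uz  # projection onto the line
                idx = int((t + extent) / (2.0 * extent) * n_bins) % n_bins
                total += samples[idx]
            values.append(total * scale)
        return values

    return evaluate


def power_law(k):
    """Placeholder line spectrum (an assumption, not taken from the paper)."""
    return k ** -2.0


field = make_field(n_lines=32, n_bins=64, extent=2.0, spectrum=power_law, seed=1)

# Scattered points stand in for the cell centres of a non-regular mesh.
point_rng = random.Random(2)
points = [(point_rng.random(), point_rng.random(), point_rng.random())
          for _ in range(1000)]

# Evaluate in chunks: each point only needs the small set of line samples,
# so the full field never has to be resident at once.
values = []
for start in range(0, len(points), 256):
    values.extend(field(points[start:start + 256]))
```

Because each point depends only on the precomputed line samples, the evaluation is independent per point. That is the property which, in a sketch like this one, allows arbitrary point locations and chunked processing; how TBARF organises this work on the GPU is described in the paper itself.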

Keywords

gpu, random field, turning band, fft, astrophysics, non uniform mesh, non-regular mesh, gpgpu, spectral methods

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Lars Hunger (1, 4)
  • Biagio Cosenza (2)
  • Stefan Kimeswenger (1, 3)
  • Thomas Fahringer (2)

  1. Institute for Astro- and Particle Physics, University of Innsbruck, Austria
  2. Institute of Computer Science, University of Innsbruck, Austria
  3. Instituto de Astronomía, Universidad Católica del Norte, Antofagasta, Chile
  4. BrainLinks-BrainTools, University of Freiburg, Germany
