Shallow Water Simulations on Multiple GPUs

  • Martin Lilleeng Sætra
  • André Rigland Brodtkorb
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7134)


We present a state-of-the-art shallow water simulator running on multiple GPUs. Our implementation is based on an explicit high-resolution finite volume scheme suitable for modeling dam breaks and flooding. We use row domain decomposition to enable multi-GPU computations, and perform traditional CUDA block decomposition within each GPU for further parallelism. Our implementation shows near-perfect weak and strong scaling, and enables simulation of domains of up to 235 million cells at a rate of over 1.2 gigacells per second using four Fermi-generation GPUs. The code is thoroughly benchmarked on three different systems, covering both high-performance and commodity-level hardware.
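The row domain decomposition mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the ghost-row width, and the use of plain Python lists in place of GPU buffers are all assumptions made for illustration. The global domain is split into contiguous row ranges, one per device, each range is padded with ghost rows that overlap its neighbours, and neighbouring subdomains refresh those ghost rows from each other's interior rows.

```python
def split_rows(n_rows, n_devices):
    """Return (start, stop) row ranges, one per device, as even as possible."""
    base, extra = divmod(n_rows, n_devices)
    ranges = []
    start = 0
    for d in range(n_devices):
        # The first `extra` devices each take one additional row.
        stop = start + base + (1 if d < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def with_ghost_rows(ranges, n_rows, ghost):
    """Extend each range by `ghost` rows, clamped to the global domain."""
    return [(max(0, a - ghost), min(n_rows, b + ghost)) for a, b in ranges]


def exchange_ghost_rows(upper, lower, ghost):
    """Refresh the ghost rows shared by two neighbouring subdomains.

    `upper` ends with `ghost` ghost rows and `lower` starts with `ghost`
    ghost rows. Each subdomain needs at least `ghost` interior rows.
    """
    # Upper's ghost rows mirror the first interior rows of lower.
    upper[-ghost:] = lower[ghost:2 * ghost]
    # Lower's ghost rows mirror the last interior rows of upper.
    lower[:ghost] = upper[-2 * ghost:-ghost]


if __name__ == "__main__":
    # Hypothetical sizes: 10 rows, 4 devices, 2 ghost rows per interface.
    ranges = split_rows(10, 4)
    print(ranges)
    print(with_ghost_rows(ranges, 10, 2))
```

In a real multi-GPU setting the two slice assignments in `exchange_ghost_rows` would correspond to device-to-device or device-to-host-to-device copies; the sketch only shows which rows move where.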


Graphic Processing Unit; Shallow Water Equation; Ghost Cell; Global Domain; Multiple GPUs
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Martin Lilleeng Sætra (1)
  • André Rigland Brodtkorb (2)
  1. Center of Mathematics for Applications, University of Oslo, Oslo, Norway
  2. Dept. Appl. Math., SINTEF, Oslo, Norway
