Shallow Water DG Simulations on FPGAs: Design and Comparison of a Novel Code Generation Pipeline

Part of the Lecture Notes in Computer Science book series (LNCS, volume 13948)

Abstract

FPGAs are receiving increased attention as a promising architecture for accelerators in HPC systems. Evolving and maturing development tools based on high-level synthesis promise productivity improvements for this technology. However, up to now, FPGA designs for complex simulation workloads, like shallow water simulations based on discontinuous Galerkin discretizations, rely to a large degree on manual application-specific optimizations. In this work, we present a new approach to port shallow water simulations to FPGAs, based on a code-generation framework for high-level abstractions in combination with a template-based stencil processing library that provides FPGA-specific optimizations for a streaming execution model. The new implementation uses a structured grid representation suitable for stencil computations and is compared to an adaptation from an existing hand-optimized FPGA dataflow design supporting unstructured meshes. While there are many differences, for example in the numerical details and problem scalability to be discussed, we demonstrate that overall both approaches can yield meaningful results at competitive performance for the same target FPGA, thus demonstrating a new level of maturity for FPGA-accelerated scientific simulations.

Notes

  1. https://i10git.cs.fau.de/ocean/ghoddess-release

  2. https://i10git.cs.fau.de/pycodegen/pystencils

  3. https://github.com/pc2/StencilStream

Acknowledgments

The authors gratefully acknowledge the funding of this project by computing time provided by the Paderborn Center for Parallel Computing (PC2). The authors gratefully acknowledge the scientific support and HPC resources provided by the Erlangen National High Performance Computing Center (NHR@FAU) of the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU). The hardware is funded by the German Research Foundation (DFG). The work in this paper was supported in part by the DFG through grant AI 117/6-1 ‘Performance optimized software strategies for unstructured-mesh applications in ocean modeling’.

Author information

Corresponding author

Correspondence to Christoph Alt.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Alt, C. et al. (2023). Shallow Water DG Simulations on FPGAs: Design and Comparison of a Novel Code Generation Pipeline. In: Bhatele, A., Hammond, J., Baboulin, M., Kruse, C. (eds) High Performance Computing. ISC High Performance 2023. Lecture Notes in Computer Science, vol 13948. Springer, Cham. https://doi.org/10.1007/978-3-031-32041-5_5

  • DOI: https://doi.org/10.1007/978-3-031-32041-5_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-32040-8

  • Online ISBN: 978-3-031-32041-5

  • eBook Packages: Computer Science, Computer Science (R0)
