Performance and Scalability Improvements for Discontinuous Galerkin Solutions to Conservation Laws on Unstructured Grids
This paper presents a computational framework developed to improve both the serial and parallel performance of two-dimensional, unstructured, discontinuous Galerkin (DG) solutions to hyperbolic conservation laws. The coding techniques employed account for current trends in HPC hardware: they are designed to maximize loop vectorization, use cache efficiently, facilitate straightforward shared memory parallelization, reduce message passing volume, and increase the overlap between computation and communication. With today's CPU technology and HPC networks rapidly evolving, it is important to quantitatively assess and compare these methodologies against standard paradigms in order to make the most of current computational resources. In our benchmark studies, we specifically investigate the shallow water equations to show that the refactored algorithm implementation provides a significant performance increase over the conventional elemental DG code structure in terms of both CPU time and parallel scalability. Our results show that the serial optimizations yield a 28–38% performance increase. For parallel computations, our improvements yield a speedup factor of 1.5–2.0 for local problem sizes between 10 and 2000 elements per core, regardless of the overall problem size. The computational benchmarks were performed on the Lonestar and Stampede supercomputers at the Texas Advanced Computing Center.
Keywords: Parallel computing · Conservation laws · Shallow water equations · Discontinuous Galerkin · Finite element method
This work was supported by the National Science Foundation Grants DMS-1228212, ACI-1339738, and ACI-1339801. J.J. Westerink was also partly supported by the Henry J. Massman and the Joseph and Nona Ahearn endowments at the University of Notre Dame. The authors acknowledge the Texas Advanced Computing Center (TACC) at The University of Texas at Austin for providing HPC resources that have contributed to the research results reported within this paper. We would also like to thank TACC research scientist John McCalpin for assisting with the hardware counter results. URL: http://www.tacc.utexas.edu. The benchmark studies were performed using the XSEDE Allocation TG-DMS080016N.