Evaluation of Two Parallel Finite Element Implementations of the Time-Dependent Advection Diffusion Problem: GPU versus Cluster Considering Time and Energy Consumption

De Souza, Alberto F.; Veronese, Lucas; Lima, Leonardo M.; Badue, Claudine; Catabriga, Lucia

doi:10.1007/978-3-642-38718-0_17

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7851)

Included in the following conference series:

International Conference on High Performance Computing for Computational Science

Abstract

We analyze two parallel finite element implementations of the 2D time-dependent advection diffusion problem, one for multi-core clusters and one for CUDA-enabled GPUs, and compare their performance in terms of time and energy consumption. The CUDA-enabled GPU implementation was derived from the multi-core cluster version. Our experimental results show that, for this class of finite element problems, a desktop machine with a single CUDA-enabled GPU can achieve higher performance than a 24-machine (96-core) cluster. In addition, the CUDA-enabled GPU implementation consumes less than one twentieth of the energy (in joules) consumed by the multi-core cluster implementation when solving a whole instance of the finite element problem.

Author information

Authors and Affiliations

Departamento de Informática, Universidade Federal do Espírito Santo, Vitória, Brazil
Alberto F. De Souza, Lucas Veronese, Claudine Badue & Lucia Catabriga
Instituto Federal de Educação, Ciência e Tecnologia do Espírito Santo, Vitória, Brazil
Leonardo M. Lima

Editor information

Editors and Affiliations

INPT (ENSEEIHT) - IRIT, University of Toulouse, 31062, Toulouse, France
Michel Daydé
Lawrence Berkeley National Laboratory, 94720-8139, Berkeley, CA, USA
Osni Marques
Information Technology Center, The University of Tokyo, 113-8658, Tokyo, Japan
Kengo Nakajima

About this paper

Cite this paper

De Souza, A.F., Veronese, L., Lima, L.M., Badue, C., Catabriga, L. (2013). Evaluation of Two Parallel Finite Element Implementations of the Time-Dependent Advection Diffusion Problem: GPU versus Cluster Considering Time and Energy Consumption. In: Daydé, M., Marques, O., Nakajima, K. (eds) High Performance Computing for Computational Science - VECPAR 2012. VECPAR 2012. Lecture Notes in Computer Science, vol 7851. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38718-0_17

DOI: https://doi.org/10.1007/978-3-642-38718-0_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38717-3
Online ISBN: 978-3-642-38718-0
eBook Packages: Computer Science, Computer Science (R0)
