Abstract
We analyze two parallel finite element implementations of the 2D time-dependent advection-diffusion problem, one for multi-core clusters and one for CUDA-enabled GPUs, and compare their performance in terms of time and energy consumption. The CUDA-enabled GPU implementation was derived from the multi-core cluster version. Our experimental results show that a desktop machine with a single CUDA-enabled GPU can outperform a 24-machine (96-core) cluster on this class of finite element problems. In addition, the CUDA-enabled GPU implementation consumes less than one twentieth of the energy (in joules) consumed by the multi-core cluster implementation when solving a complete instance of the finite element problem.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
De Souza, A.F., Veronese, L., Lima, L.M., Badue, C., Catabriga, L. (2013). Evaluation of Two Parallel Finite Element Implementations of the Time-Dependent Advection Diffusion Problem: GPU versus Cluster Considering Time and Energy Consumption. In: Daydé, M., Marques, O., Nakajima, K. (eds) High Performance Computing for Computational Science - VECPAR 2012. VECPAR 2012. Lecture Notes in Computer Science, vol 7851. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38718-0_17
DOI: https://doi.org/10.1007/978-3-642-38718-0_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38717-3
Online ISBN: 978-3-642-38718-0
eBook Packages: Computer Science (R0)