Comparing CUDA, OpenCL and OpenGL Implementations of the Cardiac Monodomain Equations
Computer simulations of cardiac electrophysiology are a helpful tool in the study of the bioelectric activity of the heart. The cardiac monodomain model comprises a nonlinear system of partial differential equations, and its numerical solution is computationally very intensive because of the fine spatial and temporal resolution required. Recent studies have shown that using GPUs as general-purpose processors can greatly improve simulation performance. The aim of this work is to study the performance of different GPU programming interfaces for the solution of the cardiac monodomain equations. Three GPU implementations, based on OpenGL, NVIDIA CUDA and OpenCL, are compared to a multicore CPU implementation that uses OpenMP. The OpenGL approach proved to be the fastest for the solution of the nonlinear system of ordinary differential equations (ODEs) associated with the cardiac model, with a speedup of 446 over the multicore implementation, whereas CUDA was the fastest for the numerical solution of the parabolic partial differential equation, with a speedup of 8. Although OpenCL provides code portability between different accelerators, the OpenCL version was slower than CUDA for the solution of the parabolic equation and as fast as CUDA for the solution of the system of ODEs. OpenCL is therefore a portable way of programming scientific applications, but not as efficient as CUDA when running on NVIDIA GPUs.
Keywords: Thread Block, CUDA Implementation, OpenCL Kernel, Monodomain Model, Fragment Processor