Abstract
A new Graphics Processing Unit (GPU) parallelization strategy is proposed to accelerate sparse finite element computation for three-dimensional electromagnetic analysis. The strategy is built on a new compression format called sliced ELL Four (sliced ELL-F). It is designed to speed up the many addition, dot product, and Sparse Matrix Vector Product (SMVP) operations in the Conjugate Gradient Norm (CGN) solution of finite element equations. The new GPU implementation of SMVP is evaluated. Executed on a GPU, the proposed strategy solves sparse finite element equations efficiently, especially when the equations are highly sparse (most rows of the coefficient matrix contain fewer than 8 nonzero entries). Numerical results show that the sliced ELL-F format-based parallelization strategy achieves significant speedups over the Compressed Sparse Row (CSR) format.
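To make the storage idea concrete, the following is a minimal sketch of a generic sliced ELLPACK layout and the corresponding SMVP, written in plain Python rather than GPU code. It is not the authors' sliced ELL-F format, whose details the abstract does not give. The function names, the `slice_height` parameter, and the padding convention (column 0, value 0.0) are all assumptions made for illustration.

```python
def to_sliced_ell(dense, slice_height=4):
    """Convert a dense matrix (list of rows) to a generic sliced ELL layout.

    Hypothetical helper: rows are grouped into slices of `slice_height`,
    and each slice is padded only to the length of its own longest row.
    """
    slices = []
    for start in range(0, len(dense), slice_height):
        rows = dense[start:start + slice_height]
        # Nonzero (column, value) pairs for each row in this slice.
        entries = [[(j, v) for j, v in enumerate(row) if v != 0] for row in rows]
        width = max((len(e) for e in entries), default=0)
        # Pad short rows with column 0 and value 0.0 so they add nothing.
        cols = [[c for c, _ in e] + [0] * (width - len(e)) for e in entries]
        vals = [[v for _, v in e] + [0.0] * (width - len(e)) for e in entries]
        slices.append((start, width, cols, vals))
    return slices


def sliced_ell_spmv(slices, x, n_rows):
    """Compute y = A x from the sliced layout.

    The loop over r is sequential here; on a GPU each row would
    typically be handled by its own thread.
    """
    y = [0.0] * n_rows
    for start, width, cols, vals in slices:
        for r in range(len(cols)):
            acc = 0.0
            for k in range(width):
                acc += vals[r][k] * x[cols[r][k]]
            y[start + r] = acc
    return y


# Small tridiagonal example: every row has at most 3 nonzeros.
A = [
    [4.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 4.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 4.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 4.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 4.0],
]
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = sliced_ell_spmv(to_sliced_ell(A, slice_height=2), x, len(A))
```

Because each slice is padded only to its own widest row, a few long rows do not inflate the storage of the whole matrix, which is the usual motivation for slicing when most rows are short.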
Additional information
Supported by the National Natural Science Foundation of China (No. 60801039).
Corresponding author: Tian Jin, born in 1982, female, Ph.D. candidate.
About this article
Cite this article
Tian, J., Gong, L., Shi, X. et al. GPU-accelerated fem solver for three dimensional electromagnetic analysis. J. Electron.(China) 28, 615–622 (2011). https://doi.org/10.1007/s11767-012-0766-2
DOI: https://doi.org/10.1007/s11767-012-0766-2
Key words
- Finite Element Method (FEM)
- Graphics Processing Unit (GPU)
- Parallelization strategy
- Conjugate Gradient Norm (CGN)
- Sliced ELL Four (sliced ELL-F)