GPGPU-based parallel computing applied in the FEM using the conjugate gradient algorithm: a review

Pikle, Nileshchandra K; Sathe, Shailesh R; Vyavhare, Arvind Y

doi:10.1007/s12046-018-0892-0

Published: 22 June 2018

Sādhanā, volume 43, article number 111 (2018)

Nileshchandra K Pikle¹,
Shailesh R Sathe¹ &
Arvind Y Vyavhare²

Abstract

Parallelization of the finite-element method (FEM) has been studied by the scientific and high-performance computing communities for over a decade. Most FEM computations are linear algebra operations on matrices and vectors. These operations follow the single-instruction multiple-data (SIMD) computation pattern, which is well suited to shared-memory parallel architectures. General-purpose graphics processing units (GPGPUs) have been used effectively to parallelize FEM computations since 2007. The solver step of the FEM is often carried out using conjugate gradient (CG)-type iterative methods because of their faster convergence and greater opportunities for parallelization. Although the SIMD computation patterns in the FEM are a natural fit for GPU computing, there are some pitfalls, such as underutilization of threads, uncoalesced memory access, low arithmetic intensity, limited fast memory on GPUs and synchronization. Nevertheless, FEM applications have been successfully deployed on GPUs over the last 10 years, with significant performance improvements. This paper presents a comprehensive review of the parallel optimization strategies applied in each step of the FEM, and discusses the pitfalls and trade-offs linked to each step. Furthermore, some notable methods that exploit the large computing power of a GPU are also discussed. The review is not limited to a single field of engineering; it is applicable to all fields of engineering and science in which FEM-based simulations are necessary.
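To make the solver step mentioned in the abstract concrete, the following is a minimal, unpreconditioned CG sketch in plain Python. It is not taken from the paper: the function names (`spmv_csr`, `dot`, `conjugate_gradient`), the CSR storage layout and the tolerance defaults are assumptions chosen for illustration. It shows the operations a GPU implementation would typically map to kernels: one sparse matrix-vector product, two dot products (reductions) and three vector updates per iteration.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A x for a matrix A stored in CSR form (illustrative layout).

    Each row is an independent dot product, which is why this step is
    commonly parallelized with one thread (or one warp) per row.
    """
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


def dot(a, b):
    """Dot product; on a GPU this is a reduction that needs synchronization."""
    return sum(ai * bi for ai, bi in zip(a, b))


def conjugate_gradient(row_ptr, col_idx, values, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A given in CSR form."""
    x = [0.0] * len(b)
    r = list(b)            # residual r = b - A x, with the initial guess x = 0
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:   # stop once the residual norm is small enough
            break
        Ap = spmv_csr(row_ptr, col_idx, values, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Small usage example: the 3x3 tridiagonal matrix [[2,-1,0],[-1,2,-1],[0,-1,2]],
# used here only as a stand-in for an assembled stiffness matrix.
row_ptr = [0, 2, 5, 7]
col_idx = [0, 1, 0, 1, 2, 1, 2]
values = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
b = [1.0, 0.0, 1.0]
x = conjugate_gradient(row_ptr, col_idx, values, b)
```

In this sketch the sparse matrix-vector product performs only two floating-point operations per stored nonzero while reading a value, a column index and a gathered entry of `x`, which is one way to see the low arithmetic intensity and irregular memory access listed among the pitfalls above.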

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology, Nagpur, India
Nileshchandra K Pikle & Shailesh R Sathe
Department of Applied Mechanics, Visvesvaraya National Institute of Technology, Nagpur, India
Arvind Y Vyavhare

Corresponding author

Correspondence to Nileshchandra K Pikle.

About this article

Cite this article

Pikle, N.K., Sathe, S.R. & Vyavhare, A.Y. GPGPU-based parallel computing applied in the FEM using the conjugate gradient algorithm: a review. Sādhanā 43, 111 (2018). https://doi.org/10.1007/s12046-018-0892-0

Received: 07 August 2017
Revised: 01 December 2017
Accepted: 08 December 2017
Published: 22 June 2018
DOI: https://doi.org/10.1007/s12046-018-0892-0
