GPGPU-based parallel computing applied in the FEM using the conjugate gradient algorithm: a review

Sādhanā

Abstract

Parallelization of the finite-element method (FEM) has been studied by the scientific and high-performance computing community for over a decade. Most FEM computations are linear algebra operations on matrices and vectors. These operations follow the single-instruction multiple-data (SIMD) computation pattern, which suits shared-memory parallel architectures. General-purpose graphics processing units (GPGPUs) have been used effectively to parallelize FEM computations since 2007. The solver step of the FEM is often carried out using conjugate gradient (CG)-type iterative methods because of their fast convergence and ample opportunities for parallelization. Although the SIMD computation patterns in the FEM are a natural fit for GPU computing, there are pitfalls, such as underutilization of threads, uncoalesced memory access, low arithmetic intensity, the limited capacity of the faster memories on GPUs and synchronization overhead. Nevertheless, FEM applications have been successfully deployed on GPUs over the last 10 years, with significant performance improvements. This paper presents a comprehensive review of the parallel optimization strategies applied in each step of the FEM and discusses the pitfalls and trade-offs linked to each step. It also discusses some unconventional methods that exploit the large computing power of a GPU. The review is not limited to a single field of engineering; it is applicable to all fields of engineering and science in which FEM-based simulations are necessary.
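
As a rough illustration of why the CG solver step is dominated by matrix and vector operations, the following is a minimal sketch of the unpreconditioned CG iteration for a symmetric positive-definite system. It is not taken from the paper: the function names (`dot`, `matvec`, `conjugate_gradient`), the dense list-of-lists matrix, and the tolerance are assumptions made for brevity. GPU implementations would typically store the matrix in a sparse format and run each of these operations as a parallel kernel.

```python
def dot(u, v):
    """Inner product of two equal-length vectors (a reduction on a GPU)."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Dense matrix-vector product; stands in for a sparse SpMV kernel."""
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A (illustrative only)."""
    x = [0.0] * len(b)
    r = list(b)              # residual r = b - A x, with x = 0 initially
    p = list(r)              # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break            # residual norm is small enough
        Ap = matvec(A, p)    # the single matrix-vector product per iteration
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Small hypothetical 2x2 system, used only to exercise the sketch.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(A, b)
```

Each iteration consists of one matrix-vector product, two inner products and three vector updates. All of these are data-parallel, which is the SIMD pattern the abstract refers to, while the inner products require a reduction and therefore a synchronization point.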

Author information

Correspondence to Nileshchandra K Pikle.

Cite this article

Pikle, N.K., Sathe, S.R. & Vyavhare, A.Y. GPGPU-based parallel computing applied in the FEM using the conjugate gradient algorithm: a review. Sādhanā 43, 111 (2018). https://doi.org/10.1007/s12046-018-0892-0

  • DOI: https://doi.org/10.1007/s12046-018-0892-0
