Accelerating the Conjugate Gradient Algorithm with GPUs in CFD Simulations

  • Hartwig Anzt
  • Marc Baboulin
  • Jack Dongarra
  • Yvan Fournier
  • Frank Hulsemann
  • Amal Khabou
  • Yushan Wang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10150)

Abstract

This paper illustrates how GPU computing can be used to accelerate computational fluid dynamics (CFD) simulations. For sparse linear systems arising from finite volume discretization, we evaluate and optimize the performance of Conjugate Gradient (CG) routines designed for manycore accelerators and compare them against an industrial CPU-based implementation. We also investigate how recent advances in preconditioning, such as iterative Incomplete Cholesky (IC, the symmetric case of ILU) preconditioning, match the requirements for solving real-world problems.
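For readers unfamiliar with the method named in the abstract, the following is a minimal sketch of the textbook, unpreconditioned CG iteration on a sparse matrix in CSR format. It is not the GPU code evaluated in the paper, nor the API of ViennaCL or any other library. All function names (`csr_matvec`, `conjugate_gradient`, `laplacian_1d_csr`) and the 1D Laplacian test matrix are assumptions chosen for illustration.

```python
from math import sqrt


def csr_matvec(values, col_idx, row_ptr, x):
    """Return y = A x for a matrix A stored in CSR format."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


def dot(u, v):
    """Inner product of two equally sized vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A (CSR), starting at x = 0.

    Returns the approximate solution and the number of iterations performed.
    """
    x = [0.0] * len(b)
    r = list(b)                      # residual r = b - A x, with x = 0
    p = list(r)                      # first search direction
    rs_old = dot(r, r)
    b_norm = sqrt(dot(b, b)) or 1.0  # avoid dividing by zero when b = 0
    for iteration in range(max_iter):
        if sqrt(rs_old) <= tol * b_norm:
            return x, iteration
        Ap = csr_matvec(values, col_idx, row_ptr, p)  # the sparse kernel
        alpha = rs_old / dot(p, Ap)                   # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                        # direction update weight
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x, max_iter


def laplacian_1d_csr(n):
    """Hypothetical test matrix: tridiag(-1, 2, -1) of size n in CSR format."""
    values, col_idx, row_ptr = [], [], [0]
    for i in range(n):
        for j, v in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr
```

Each iteration costs one sparse matrix-vector product, two inner products, and three vector updates. A preconditioned variant would additionally apply the preconditioner (for example an IC factorization) to the residual in every iteration.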

Notes

Acknowledgements

This work was funded by the contract P02220 between Université Paris-Sud and EDF. We are grateful to Karl Rupp (TU Wien) for his support in using the ViennaCL library.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Hartwig Anzt (1)
  • Marc Baboulin (2)
  • Jack Dongarra (1)
  • Yvan Fournier (3)
  • Frank Hulsemann (3)
  • Amal Khabou (2), corresponding author
  • Yushan Wang (2)

  1. Innovative Computing Laboratory, University of Tennessee, Knoxville, USA
  2. Laboratoire de Recherche en Informatique, Université Paris-Sud, Orsay, France
  3. EDF R&D, Clamart, France
