
Recent progress and challenges in exploiting graphics processors in computational fluid dynamics


This review surveys the progress made in accelerating fluid-flow simulations using GPUs, and the challenges that remain. It first introduces GPU computing and programming and discusses considerations for improving performance. Case studies comparing CPU- and GPU-based solvers for the Laplace and incompressible Navier–Stokes equations demonstrate the potential improvement even with simple codes. Recent efforts to accelerate CFD simulations using GPUs are then reviewed for laminar, turbulent, and reactive flow solvers, along with GPU implementations of the lattice Boltzmann method. Finally, recommendations for implementing CFD codes on GPUs are given and remaining challenges are discussed, such as the need to develop new strategies and redesign algorithms to enable GPU acceleration.
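The Laplace case study itself is not reproduced on this page. As a rough illustration of why such a solver maps well onto a GPU, the following is a minimal CPU-only sketch of red-black successive over-relaxation (SOR) for Laplace's equation on a square grid. The method, the function name `solve_laplace_red_black`, the grid size, the boundary values, and the relaxation factor `omega` are all assumptions chosen for illustration and are not taken from the article. The point to notice is that every point of one color depends only on points of the other color, so each half-sweep could be assigned to independent threads.

```python
def solve_laplace_red_black(n=17, omega=1.8, tol=1e-8, max_iters=10_000):
    """Red-black SOR for Laplace's equation on an n x n grid (illustrative sketch).

    Assumed boundary conditions: u = 1 on the top edge, u = 0 on the other edges.
    Returns the solution grid and the number of iterations performed.
    """
    u = [[0.0] * n for _ in range(n)]
    for j in range(n):
        u[0][j] = 1.0  # top edge held at 1 (assumption for this example)

    for iteration in range(1, max_iters + 1):
        max_change = 0.0
        # Two half-sweeps: points with (i + j) even, then points with (i + j) odd.
        # Within a half-sweep no updated point reads another updated point,
        # so the inner loops could run in any order, or in parallel.
        for color in (0, 1):
            for i in range(1, n - 1):
                # First interior column j >= 1 with (i + j) % 2 == color.
                start = 1 + ((i + 1 + color) % 2)
                for j in range(start, n - 1, 2):
                    neighbor_avg = 0.25 * (
                        u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1]
                    )
                    change = omega * (neighbor_avg - u[i][j])
                    u[i][j] += change
                    max_change = max(max_change, abs(change))
        if max_change < tol:
            return u, iteration
    return u, max_iters
```

Calling `solve_laplace_red_black(n=17)` returns the converged grid and the iteration count. With one edge held at 1 and the other three at 0, symmetry implies that the center value of an odd-sized grid should approach 0.25, which gives a simple sanity check. On a GPU, each half-sweep would typically become one kernel launch with one thread per grid point of that color.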



  1. Recent GPU hardware allows a three-dimensional grid.

  2. The full source code is available:

  3. The full source code is available online:




This work was supported by the National Science Foundation under grant number 0932559, the US Department of Defense through the National Defense Science and Engineering Graduate Fellowship program, the National Science Foundation Graduate Research Fellowship under grant number DGE-0951783, and the Combustion Energy Frontier Research Center—an Energy Frontier Research Center funded by the US Department of Energy, Office of Science, Office of Basic Energy Sciences under award number DE-SC0001198.

Author information

Correspondence to Kyle E. Niemeyer.


About this article

Cite this article

Niemeyer, K.E., Sung, C. Recent progress and challenges in exploiting graphics processors in computational fluid dynamics. J Supercomput 67, 528–564 (2014).



Keywords

  • Graphics processing unit (GPU)
  • Computational fluid dynamics (CFD)
  • Laminar flows
  • Turbulent flow
  • Reactive flow
  • CUDA