The Journal of Supercomputing, Volume 67, Issue 2, pp 528–564

Recent progress and challenges in exploiting graphics processors in computational fluid dynamics



This review surveys the progress made in accelerating simulations of fluid flow using GPUs and the challenges that remain. It first introduces GPU computing and programming and discusses various considerations for improved performance. Case studies comparing the performance of CPU- and GPU-based solvers for the Laplace and incompressible Navier–Stokes equations demonstrate the potential improvement even with simple codes. Recent efforts to accelerate CFD simulations using GPUs are then reviewed for laminar, turbulent, and reactive flow solvers, along with GPU implementations of the lattice Boltzmann method. Finally, recommendations for implementing CFD codes on GPUs are given and remaining challenges are discussed, such as the need to develop new strategies and redesign algorithms to enable GPU acceleration.


Keywords: Graphics processing unit (GPU); Computational fluid dynamics (CFD); Laminar flows; Turbulent flow; Reactive flow; CUDA



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Department of Mechanical and Aerospace Engineering, Case Western Reserve University, Cleveland, USA
  2. Department of Mechanical Engineering, University of Connecticut, Storrs, USA
  3. School of Mechanical, Industrial, and Manufacturing Engineering, Oregon State University, Corvallis, USA
