GPU-Based Parallel Integration of Large Numbers of Independent ODE Systems



Many scientific and engineering applications require integrating a large number of independent ODE systems. For nonstiff systems, common explicit integration algorithms can be used on GPUs, with each GPU thread integrating one independent ODE system that has its own initial conditions or parameters. One example is the fifth-order adaptive Runge–Kutta–Cash–Karp (RKCK) algorithm. For stiff ODEs, standard explicit algorithms require impractically small time steps to remain stable, so implicit algorithms are commonly used instead to allow larger time steps and reduce computational expense. However, typical high-order implicit algorithms based on backward differentiation formulae (e.g., VODE, LSODE) involve complex logical flow that causes severe thread divergence on GPUs and limits performance, so alternative algorithms are needed. Results reported in the literature show that a GPU-based Runge–Kutta–Chebyshev (RKC) algorithm can handle moderate levels of stiffness and runs significantly faster than both an equivalent CPU version and a CPU-based implicit algorithm (VODE). In this chapter, we present the mathematical background, implementation details, and source code for the RKCK and RKC algorithms for integrating large numbers of independent ODE systems on GPUs. We also show brief performance comparisons for each algorithm, demonstrating the potential benefit of moving to GPU-based ODE integrators.
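The one-system-per-thread idea can be illustrated with a minimal CPU sketch in Python. This is not the chapter's source code: the function names (`rkck_step`, `integrate_one`, `integrate_batch`), the tolerance, the initial step size, and the step-size controller constants are all assumptions chosen for illustration. Only the Cash–Karp coefficients are standard. The loop in `integrate_batch` stands in for what would be independent GPU threads.

```python
import math

# Cash-Karp tableau: stage offsets C, stage weights A, and the
# fifth-order (B5) and embedded fourth-order (B4) solution weights.
C = [0, 1/5, 3/10, 3/5, 1, 7/8]
A = [
    [],
    [1/5],
    [3/40, 9/40],
    [3/10, -9/10, 6/5],
    [-11/54, 5/2, -70/27, 35/27],
    [1631/55296, 175/512, 575/13824, 44275/110592, 253/4096],
]
B5 = [37/378, 0, 250/621, 125/594, 0, 512/1771]
B4 = [2825/27648, 0, 18575/48384, 13525/55296, 277/14336, 1/4]


def rkck_step(f, t, y, h, p):
    """Take one embedded step; return the fifth-order result and an error estimate."""
    n_eq = len(y)
    k = []
    for i in range(6):
        y_stage = [y[n] + h * sum(A[i][j] * k[j][n] for j in range(i))
                   for n in range(n_eq)]
        k.append(f(t + C[i] * h, y_stage, p))
    y5 = [y[n] + h * sum(B5[i] * k[i][n] for i in range(6)) for n in range(n_eq)]
    # Error estimate: difference between the fifth- and fourth-order solutions.
    err = max(abs(h * sum((B5[i] - B4[i]) * k[i][n] for i in range(6)))
              for n in range(n_eq))
    return y5, err


def integrate_one(f, y0, p, t0, t_end, tol=1e-8, h=1e-2):
    """Adaptively integrate a single system (the work of one hypothetical GPU thread)."""
    t, y = t0, list(y0)
    while t < t_end:
        h = min(h, t_end - t)
        y_new, err = rkck_step(f, t, y, h, p)
        if err <= tol:  # accept the step; otherwise retry with a smaller h
            t, y = t + h, y_new
        # Assumed controller: safety factor 0.9, step change limited to [0.1, 5].
        factor = 0.9 * (tol / err) ** 0.2 if err > 0 else 5.0
        h *= min(5.0, max(0.1, factor))
    return y


def integrate_batch(f, y0s, params, t0, t_end):
    """Integrate many independent systems; each iteration maps to one GPU thread."""
    return [integrate_one(f, y0, p, t0, t_end) for y0, p in zip(y0s, params)]


def decay(t, y, k):
    """Test problem y' = -k*y with exact solution y0*exp(-k*t)."""
    return [-k * y[0]]


rates = [0.5, 1.0, 2.0, 4.0]
results = integrate_batch(decay, [[1.0]] * len(rates), rates, 0.0, 1.0)
for k, y in zip(rates, results):
    print(k, y[0], math.exp(-k))
```

Because each system chooses its own step sizes, neighboring threads in a real GPU kernel may take different numbers of steps. For an explicit scheme such as this one, the per-step control flow is simple, which is the property the abstract contrasts with the more complex logic of implicit BDF-based solvers.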





This work was supported by the US Department of Defense through the National Defense Science and Engineering Graduate Fellowship program, the National Science Foundation Graduate Research Fellowship under grant number DGE-0951783, and the Combustion Energy Frontier Research Center—an Energy Frontier Research Center funded by the US Department of Energy, Office of Science, Office of Basic Energy Sciences under award number DE-SC0001198.



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. School of Mechanical, Industrial, & Manufacturing Engineering, Oregon State University, Corvallis, USA
  2. Department of Mechanical Engineering, University of Connecticut, Storrs, USA
