GPU-Based Parallel Integration of Large Numbers of Independent ODE Systems

Numerical Computations with GPUs

Abstract

The task of integrating a large number of independent ODE systems arises in many scientific and engineering areas. For nonstiff systems, common explicit integration algorithms can be used on GPUs, with each GPU thread concurrently integrating an independent ODE system that has its own initial conditions or parameters. One example is the fifth-order adaptive Runge–Kutta–Cash–Karp (RKCK) algorithm. For stiff ODEs, standard explicit algorithms require impractically small time-step sizes to remain stable, so implicit algorithms are commonly used instead to allow larger time steps and reduce the computational expense. However, typical high-order implicit algorithms based on backward differentiation formulae (e.g., VODE, LSODE) involve complex logical flow that causes severe thread divergence on GPUs and limits performance. Alternative algorithms are therefore needed. According to results in the literature, a GPU-based Runge–Kutta–Chebyshev (RKC) algorithm can handle moderate levels of stiffness and runs significantly faster than both an equivalent CPU version and a CPU-based implicit algorithm (VODE). In this chapter, we present the mathematical background, implementation details, and source code for the RKCK and RKC algorithms for integrating large numbers of independent systems of ODEs on GPUs. In addition, brief performance comparisons are shown for each algorithm, demonstrating the potential benefit of moving to GPU-based ODE integrators.
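To make the thread-per-system idea concrete, the following is a minimal sketch of a single Cash–Karp step with its embedded error estimate, using the standard published Cash–Karp coefficients. It is not the chapter's implementation: the names `rkckStep` and `rkckStepAll`, the two-argument `dydt` signature, the test problem y' = -y, and the contiguous per-system memory layout are all assumptions made for illustration. The macro guard lets the same source build with either nvcc or a plain C++ compiler.

```cpp
#include <math.h>

#ifndef __CUDACC__   // allow a host-only build without nvcc
#define __host__
#define __device__
#endif

#define NEQN 1       // hypothetical system size

// Hypothetical right-hand side: the decay equation y' = -y.
__host__ __device__ inline void dydt(const double t, const double* y, double* dy) {
    (void)t;
    for (int i = 0; i < NEQN; ++i) dy[i] = -y[i];
}

// Takes one Cash-Karp step of size h from (t, y). Writes the fifth-order
// solution to yNew and returns the max-norm difference between the fifth-
// and embedded fourth-order solutions, which serves as the error estimate.
__host__ __device__ inline double rkckStep(const double t, const double h,
                                           const double* y, double* yNew) {
    double k1[NEQN], k2[NEQN], k3[NEQN], k4[NEQN], k5[NEQN], k6[NEQN], yTmp[NEQN];

    dydt(t, y, k1);
    for (int i = 0; i < NEQN; ++i)
        yTmp[i] = y[i] + h * (1.0 / 5.0) * k1[i];
    dydt(t + h / 5.0, yTmp, k2);
    for (int i = 0; i < NEQN; ++i)
        yTmp[i] = y[i] + h * (3.0 / 40.0 * k1[i] + 9.0 / 40.0 * k2[i]);
    dydt(t + 3.0 * h / 10.0, yTmp, k3);
    for (int i = 0; i < NEQN; ++i)
        yTmp[i] = y[i] + h * (3.0 / 10.0 * k1[i] - 9.0 / 10.0 * k2[i] + 6.0 / 5.0 * k3[i]);
    dydt(t + 3.0 * h / 5.0, yTmp, k4);
    for (int i = 0; i < NEQN; ++i)
        yTmp[i] = y[i] + h * (-11.0 / 54.0 * k1[i] + 5.0 / 2.0 * k2[i]
                              - 70.0 / 27.0 * k3[i] + 35.0 / 27.0 * k4[i]);
    dydt(t + h, yTmp, k5);
    for (int i = 0; i < NEQN; ++i)
        yTmp[i] = y[i] + h * (1631.0 / 55296.0 * k1[i] + 175.0 / 512.0 * k2[i]
                              + 575.0 / 13824.0 * k3[i] + 44275.0 / 110592.0 * k4[i]
                              + 253.0 / 4096.0 * k5[i]);
    dydt(t + 7.0 * h / 8.0, yTmp, k6);

    double err = 0.0;
    for (int i = 0; i < NEQN; ++i) {
        const double y5 = y[i] + h * (37.0 / 378.0 * k1[i] + 250.0 / 621.0 * k3[i]
                                      + 125.0 / 594.0 * k4[i] + 512.0 / 1771.0 * k6[i]);
        const double y4 = y[i] + h * (2825.0 / 27648.0 * k1[i] + 18575.0 / 48384.0 * k3[i]
                                      + 13525.0 / 55296.0 * k4[i] + 277.0 / 14336.0 * k5[i]
                                      + 0.25 * k6[i]);
        yNew[i] = y5;
        err = fmax(err, fabs(y5 - y4));
    }
    return err;
}

#ifdef __CUDACC__
// One thread per independent system. System `tid` occupies NEQN consecutive
// entries of yGlobal (an assumed layout, chosen here only for simplicity).
__global__ void rkckStepAll(const double t, const double h, const int numSys,
                            double* yGlobal, double* errGlobal) {
    const int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < numSys) {
        double yNew[NEQN];
        errGlobal[tid] = rkckStep(t, h, &yGlobal[tid * NEQN], yNew);
        for (int i = 0; i < NEQN; ++i) yGlobal[tid * NEQN + i] = yNew[i];
    }
}
#endif
```

An adaptive driver would compare the returned error estimate against a tolerance, accept or reject the step, and rescale h accordingly; because every thread follows the same short sequence of stages, this kind of explicit step incurs little thread divergence.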

Notes

  1. An equivalence ratio of one indicates that the fuel and oxidizer are mixed in the stoichiometric proportion, i.e., the ratio required for complete combustion.

References

  1. Alexandrov, V., Sameh, A., Siddique, Y., Zlatev, Z.: Numerical integration of chemical ODE problems arising in air pollution models. Environ. Monit. Assess. 2(4), 365–377 (1997). doi:10.1023/A:1019086016734

  2. Barry, D., Miller, C., Culligan, P., Bajracharya, K.: Analysis of split operator methods for nonlinear and multispecies groundwater chemical transport models. Math. Comput. Simul. 43(3–6), 331–341 (1997). doi:10.1016/S0378-4754(97)00017-7

  3. Barry, D., Bajracharya, K., Crapper, M., Prommer, H., Cunningham, C.: Comparison of split-operator methods for solving coupled chemical non-equilibrium reaction/groundwater transport models. Math. Comput. Simul. 53(1–2), 113–127 (2000). doi:10.1016/S0378-4754(00)00182-8

  4. Cash, J.R., Karp, A.H.: A variable order Runge–Kutta method for initial value problems with rapidly varying right-hand sides. ACM Trans. Math. Softw. 16(3), 201–222 (1990)

  5. Day, M.S., Bell, J.B.: Numerical simulation of laminar reacting flows with complex chemistry. Combust. Theory Model. 4(4), 535–556 (2000). doi:10.1088/1364-7830/4/4/309

  6. Dematte, L., Prandi, D.: GPU computing for systems biology. Brief. Bioinform. 11(3), 323–333 (2010). doi:10.1093/bib/bbq006

  7. Geršgorin, S.: Über die Abgrenzung der Eigenwerte einer Matrix. Bulletin de l’Académie des Sciences de l’URSS. Classe des sciences mathématiques et na (6), 749–754 (1931)

  8. Hairer, E., Wanner, G.: Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems, 2nd edn. Springer Series in Computational Mathematics, vol. 14. Springer, Berlin/Heidelberg (1996)

  9. Hairer, E., Wanner, G., Nørsett, S.P.: Solving Ordinary Differential Equations I: Nonstiff Problems, 2nd edn. Springer Series in Computational Mathematics, vol. 8. Springer, Berlin/Heidelberg (1993). doi:10.1007/978-3-540-78862-1

  10. Helton, J., Davis, F.: Latin hypercube sampling and the propagation of uncertainty in analyses of complex systems. Reliab. Eng. Syst. Saf. 81(1), 23–69 (2003). doi:10.1016/S0951-8320(03)00058-9

  11. Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, Cambridge (1990)

  12. Jang, B., Schaa, D., Mistry, P., Kaeli, D.: Exploiting memory access patterns to improve memory performance in data-parallel architectures. IEEE Trans. Parallel Distrib. Syst. 22(1), 105–118 (2011). doi:10.1109/TPDS.2010.107

  13. Kim, J., Cho, S.Y.: Computation accuracy and efficiency of the time-splitting method in solving atmospheric transport/chemistry equations. Atmos. Environ. 31(15), 2215–2224 (1997)

  14. Kirk, D.B., Hwu, W.W.: Programming Massively Parallel Processors: A Hands-on Approach. Morgan Kaufmann, Burlington (2010)

  15. Knio, O.M., Najm, H.N., Wyckoff, P.S.: A semi-implicit numerical scheme for reacting flow II. Stiff, operator-split formulation. J. Comput. Phys. 154, 428–467 (1999). doi:10.1006/jcph.1999.6322

  16. Kühn, C., Wierling, C., Kühn, A., Klipp, E., Panopoulou, G., Lehrach, H., Poustka, A.: Monte Carlo analysis of an ODE model of the sea urchin endomesoderm network. BMC Syst. Biol. 3, 83 (2009). doi:10.1186/1752-0509-3-83

  17. Law, C.K.: Combustion Physics. Cambridge University Press, New York (2006)

  18. Marino, S., Hogue, I.B., Ray, C.J., Kirschner, D.E.: A methodology for performing global uncertainty and sensitivity analysis in systems biology. J. Theor. Biol. 254(1), 178–196 (2008). doi:10.1016/j.jtbi.2008.04.011

  19. Marinov, N.M.: A detailed chemical kinetic model for high temperature ethanol oxidation. Int. J. Chem. Kinet. 31(3), 183–220 (1999)

  20. Mazzia, F., Magherini, C.: Test Set for Initial Value Problem Solvers, Release 2.4. Department of Mathematics, University of Bari and INdAM, Research Unit of Bari (2008). Available at http://www.dm.uniba.it/~testset

  21. Niemeyer, K.E., Sung, C.J.: Accelerating moderately stiff chemical kinetics in reactive-flow simulations using GPUs. J. Comput. Phys. 256, 854–871 (2014). doi:10.1016/j.jcp.2013.09.025

  22. Niemeyer, K.E., Sung, C.J., Fotache, C.G., Lee, J.C.: Turbulence-chemistry closure method using graphics processing units: a preliminary test. In: 7th Fall Technical Meeting of the Eastern States Section of the Combustion Institute, Storrs (2011)

  23. Nimmagadda, V.K., Akoglu, A., Hariri, S., Moukabary, T.: Cardiac simulation on multi-GPU platform. J. Supercomput. 59(3), 1360–1378 (2011). doi:10.1007/s11227-010-0540-x

  24. OpenMP Architecture Review Board: OpenMP Application Program Interface Version 3.0. http://www.openmp.org/mp-documents/spec30.pdf (2008)

  25. Oran, E.S., Boris, J.P.: Numerical Simulation of Reactive Flow, 2nd edn. Cambridge University Press, Cambridge (2001)

  26. Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical Recipes in Fortran 77: The Art of Scientific Computing, 2nd edn. Cambridge University Press, Cambridge (1992)

  27. Ren, Z., Pope, S.B.: Second-order splitting schemes for a class of reactive systems. J. Comput. Phys. 227(17), 8165–8176 (2008). doi:10.1016/j.jcp.2008.05.019

  28. Schwer, D., Lu, P., Green, W.H., Semiao, V.: A consistent-splitting approach to computing stiff steady-state reacting flows with adaptive chemistry. Combust. Theory Model. 7(2), 383–399 (2003). doi:10.1088/1364-7830/7/2/310

  29. Shi, Y., Green, W.H., Wong, H., Oluwole, O.O.: Accelerating multi-dimensional combustion simulations using hybrid CPU-based implicit/GPU-based explicit ODE integration. Combust. Flame 159(7), 2388–2397 (2012). doi:10.1016/j.combustflame.2012.02.016

  30. Sommeijer, B.P., Shampine, L.F., Verwer, J.G.: RKC: an explicit solver for parabolic PDEs. J. Comput. Appl. Math. 88(2), 315–326 (1997)

  31. Sportisse, B.: An analysis of operator splitting techniques in the stiff case. J. Comput. Phys. 161(1), 140–168 (2000)

  32. Stone, C.P., Davis, R.L.: Techniques for solving stiff chemical kinetics on graphical processing units. J. Propulsion Power 29(4), 764–773 (2013). doi:10.2514/1.B34874

  33. Strang, G.: On the construction and comparison of difference schemes. SIAM J. Numer. Anal. 5(3), 506–517 (1968)

  34. Sundnes, J., Nielsen, B.F., Mardal, K., Cai, X., Lines, G., Tveito, A.: On the computational complexity of the bidomain and the monodomain models of electrophysiology. Ann. Biomed. Eng. 34(7), 1088–1097 (2006). doi:10.1007/s10439-006-9082-z

  35. van der Houwen, P.J.: The development of Runge–Kutta methods for partial differential equations. Appl. Numer. Math. 20, 261–272 (1996)

  36. van der Houwen, P.J., Sommeijer, B.P.: On the internal stability of explicit, m-stage Runge-Kutta methods for large m-values. Z. Angew. Math. Mech. 60(10), 479–485 (1980)

  37. Verwer, J.G.: Explicit Runge–Kutta methods for parabolic partial differential equations. Appl. Numer. Math. 22, 359–379 (1996)

  38. Verwer, J.G., Hundsdorfer, W., Sommeijer, B.P.: Convergence properties of the Runge–Kutta–Chebyshev method. Numer. Math. 57, 157–178 (1990)

  39. Verwer, J.G., Sommeijer, B.P., Hundsdorfer, W.: RKC time-stepping for advection–diffusion–reaction problems. J. Comput. Phys. 201(1), 61–79 (2004). doi:10.1016/j.jcp.2004.05.002

  40. Zhou, Y., Liepe, J., Sheng, X., Stumpf, M.P.H., Barnes, C.: GPU accelerated biochemical network simulation. Bioinformatics 27(6), 874–876 (2011). doi:10.1093/bioinformatics/btr015

Acknowledgements

This work was supported by the US Department of Defense through the National Defense Science and Engineering Graduate Fellowship program, the National Science Foundation Graduate Research Fellowship under grant number DGE-0951783, and the Combustion Energy Frontier Research Center—an Energy Frontier Research Center funded by the US Department of Energy, Office of Science, Office of Basic Energy Sciences under award number DE-SC0001198.

Author information

Corresponding author

Correspondence to Kyle E. Niemeyer.

Appendix

Various methods may be used to calculate the spectral radius, including the Gershgorin circle theorem [7, 11], which provides an upper-bound estimate. Here, we provide a function based on a nonlinear power method [30].
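For comparison, the Gershgorin alternative mentioned above takes only a few lines when a dense Jacobian is available. The sketch below is an illustration rather than part of the chapter's source: the function name `gershgorinBound` and the row-major storage of the Jacobian are assumptions. It relies on the fact that every eigenvalue lies in some disc centered at a diagonal entry with radius equal to the sum of the off-diagonal magnitudes in that row, so the largest absolute row sum bounds the spectral radius.

```cpp
#include <math.h>

#ifndef __CUDACC__   // allow a host-only build without nvcc
#define __host__
#define __device__
#endif

// Upper bound on the spectral radius of the n x n matrix `jac`
// (row-major, hypothetical layout): max over rows of sum_j |a_ij|.
__host__ __device__ inline double gershgorinBound(const double* jac, const int n) {
    double bound = 0.0;
    for (int i = 0; i < n; ++i) {
        double rowSum = 0.0;
        for (int j = 0; j < n; ++j) {
            rowSum += fabs(jac[i * n + j]);
        }
        bound = fmax(bound, rowSum);
    }
    return bound;
}
```

This bound needs the Jacobian entries and can overestimate the spectral radius considerably, whereas the power-method routine that follows requires only evaluations of the right-hand side.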

// Estimates the spectral radius of the Jacobian of dydt at (t, y) using a
// nonlinear power method [30]. NEQN (number of equations), UROUND (unit
// roundoff), and dydt() are defined elsewhere in the chapter's source.
//   t    : current time
//   y    : current solution (length NEQN)
//   g    : problem parameter passed through to dydt
//   F    : dydt evaluated at (t, y)
//   hMax : maximum step size
//   v    : on entry, a starting guess for the dominant eigenvector (may be
//          zero); on exit, the updated estimate
//   Fv   : work array (length NEQN)
__device__ double
rkcSpecRad (const double t, const double* y, const double g, const double* F,
            const double hMax, double* v, double* Fv) {
    // maximum number of iterations
    const int itmax = 50;

    // floor used in the convergence test when sigma is very small
    const double small = 1.0 / hMax;

    double nrm1 = 0.0;  // norm of y
    double nrm2 = 0.0;  // norm of v
    for (int i = 0; i < NEQN; ++i) {
        nrm1 += (y[i] * y[i]);
        nrm2 += (v[i] * v[i]);
    }
    nrm1 = sqrt(nrm1);
    nrm2 = sqrt(nrm2);

    // perturb y along v; dynrm is the norm of the perturbation (v - y)
    double dynrm;
    if ((nrm1 != 0.0) && (nrm2 != 0.0)) {
        dynrm = nrm1 * sqrt(UROUND);
        for (int i = 0; i < NEQN; ++i) {
            v[i] = y[i] + v[i] * (dynrm / nrm2);
        }
    } else if (nrm1 != 0.0) {
        dynrm = nrm1 * sqrt(UROUND);
        for (int i = 0; i < NEQN; ++i) {
            v[i] = y[i] * (1.0 + sqrt(UROUND));
        }
    } else if (nrm2 != 0.0) {
        dynrm = UROUND;
        for (int i = 0; i < NEQN; ++i) {
            v[i] *= (dynrm / nrm2);
        }
    } else {
        dynrm = UROUND;
        for (int i = 0; i < NEQN; ++i) {
            v[i] = UROUND;
        }
    }

    // now iterate using nonlinear power method
    double sigma = 0.0;
    for (int iter = 1; iter <= itmax; ++iter) {

        dydt (t, g, v, Fv);

        nrm1 = 0.0;
        for (int i = 0; i < NEQN; ++i) {
            nrm1 += ((Fv[i] - F[i]) * (Fv[i] - F[i]));
        }
        nrm1 = sqrt(nrm1);
        const double sigmaPrev = sigma;
        sigma = nrm1 / dynrm;

        // converged once successive estimates agree to within 1%
        if ((iter >= 2) && (fabs(sigma - sigmaPrev) <= (fmax(sigma, small) * 0.01))) {
            // store the perturbation direction for reuse on the next call
            for (int i = 0; i < NEQN; ++i) {
                v[i] -= y[i];
            }
            return (1.2 * sigma);
        }

        if (nrm1 != 0.0) {
            // next perturbation along the change in the right-hand side
            for (int i = 0; i < NEQN; ++i) {
                v[i] = y[i] + ((Fv[i] - F[i]) * (dynrm / nrm1));
            }
        } else {
            // right-hand side did not change: flip one perturbation component
            int ind = (iter % NEQN);
            v[ind] = y[ind] - (v[ind] - y[ind]);
        }
    }
    // not converged within itmax iterations; return the last estimate
    return (1.2 * sigma);
}
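As a hedged illustration of how such an estimate is typically consumed, the sketch below picks the number of RKC stages from the step size and the spectral radius. The formula follows the stage-selection rule used in the RKC solver of Sommeijer et al. [30] and is not taken from this appendix; the function name `rkcNumStages` is hypothetical. The rule reflects that the real stability interval of the method grows roughly quadratically with the stage count, so the product of step size and spectral radius determines how many stages are needed.

```cpp
#include <math.h>

#ifndef __CUDACC__   // allow a host-only build without nvcc
#define __host__
#define __device__
#endif

// Number of RKC stages for step size h and spectral-radius estimate specRad,
// following the rule m = 1 + floor(sqrt(1.54 * |h| * specRad + 1)) from the
// RKC solver of reference [30]. The result is never smaller than 2.
__host__ __device__ inline int rkcNumStages(const double h, const double specRad) {
    return 1 + (int)sqrt(1.54 * fabs(h) * specRad + 1.0);
}
```

For example, a step size of 0.1 with an estimated spectral radius of 1000 gives 13 stages under this rule, while a vanishing spectral radius gives the minimum of 2. A production integrator would usually also cap the stage count and shrink the step when the cap is reached.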

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Niemeyer, K.E., Sung, C.J. (2014). GPU-Based Parallel Integration of Large Numbers of Independent ODE Systems. In: Kindratenko, V. (ed.) Numerical Computations with GPUs. Springer, Cham. https://doi.org/10.1007/978-3-319-06548-9_8

  • DOI: https://doi.org/10.1007/978-3-319-06548-9_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-06547-2

  • Online ISBN: 978-3-319-06548-9

  • eBook Packages: Computer Science, Computer Science (R0)
