GPU-Based Parallel Integration of Large Numbers of Independent ODE Systems

Niemeyer, Kyle E.; Sung, Chih-Jen

doi:10.1007/978-3-319-06548-9_8

Abstract

The task of integrating a large number of independent ODE systems arises in various scientific and engineering areas. For nonstiff systems, common explicit integration algorithms can be used on GPUs, where individual GPU threads concurrently integrate independent ODEs with different initial conditions or parameters. One example is the fifth-order adaptive Runge–Kutta–Cash–Karp (RKCK) algorithm. In the case of stiff ODEs, standard explicit algorithms require impractically small time-step sizes for stability reasons, so implicit algorithms are commonly used instead to allow larger time steps and reduce the computational expense. However, typical high-order implicit algorithms based on backward differentiation formulae (e.g., VODE, LSODE) involve complex logical flow that causes severe thread divergence when implemented on GPUs, limiting performance. Therefore, alternative algorithms are needed. According to results in the literature, a GPU-based Runge–Kutta–Chebyshev (RKC) algorithm can handle moderate levels of stiffness and performs significantly faster than both an equivalent CPU version and a CPU-based implicit algorithm (VODE). In this chapter, we present the mathematical background, implementation details, and source code for the RKCK and RKC algorithms for integrating large numbers of independent systems of ODEs on GPUs. In addition, brief performance comparisons are shown for each algorithm, demonstrating the potential benefit of moving to GPU-based ODE integrators.
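The one-thread-per-system approach described in the abstract can be illustrated with a minimal sketch. It is not the chapter's source code: the test problem in `dydt`, the size `NEQN`, the tolerances, the step-size controller constants, the array layout, and all function names are assumptions made for illustration. The coefficients are the standard Cash–Karp tableau [4]. The qualifiers are guarded so that the same file also compiles with a host C++ compiler.

```cpp
#include <math.h>

// Let the same source build with nvcc (device code) or a plain host compiler.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

#define NEQN 1  // hypothetical system size for this sketch

// Hypothetical right-hand side: dy/dt = -g * y, with g a per-system parameter.
__host__ __device__ inline void dydt(double t, double g, const double* y, double* dy) {
    (void)t;  // this example is autonomous
    for (int i = 0; i < NEQN; ++i) dy[i] = -g * y[i];
}

// One Cash-Karp step of size h. yOut receives the fifth-order solution and
// yErr the difference between the fifth- and embedded fourth-order solutions.
__host__ __device__ inline void rkckStep(double t, double g, double h,
                                         const double* y, double* yOut, double* yErr) {
    double k1[NEQN], k2[NEQN], k3[NEQN], k4[NEQN], k5[NEQN], k6[NEQN], yt[NEQN];
    dydt(t, g, y, k1);
    for (int i = 0; i < NEQN; ++i) yt[i] = y[i] + h * (1.0 / 5.0) * k1[i];
    dydt(t + h / 5.0, g, yt, k2);
    for (int i = 0; i < NEQN; ++i)
        yt[i] = y[i] + h * (3.0 / 40.0 * k1[i] + 9.0 / 40.0 * k2[i]);
    dydt(t + 3.0 * h / 10.0, g, yt, k3);
    for (int i = 0; i < NEQN; ++i)
        yt[i] = y[i] + h * (3.0 / 10.0 * k1[i] - 9.0 / 10.0 * k2[i] + 6.0 / 5.0 * k3[i]);
    dydt(t + 3.0 * h / 5.0, g, yt, k4);
    for (int i = 0; i < NEQN; ++i)
        yt[i] = y[i] + h * (-11.0 / 54.0 * k1[i] + 5.0 / 2.0 * k2[i]
                            - 70.0 / 27.0 * k3[i] + 35.0 / 27.0 * k4[i]);
    dydt(t + h, g, yt, k5);
    for (int i = 0; i < NEQN; ++i)
        yt[i] = y[i] + h * (1631.0 / 55296.0 * k1[i] + 175.0 / 512.0 * k2[i]
                            + 575.0 / 13824.0 * k3[i] + 44275.0 / 110592.0 * k4[i]
                            + 253.0 / 4096.0 * k5[i]);
    dydt(t + 7.0 * h / 8.0, g, yt, k6);
    for (int i = 0; i < NEQN; ++i) {
        yOut[i] = y[i] + h * (37.0 / 378.0 * k1[i] + 250.0 / 621.0 * k3[i]
                              + 125.0 / 594.0 * k4[i] + 512.0 / 1771.0 * k6[i]);
        yErr[i] = h * ((37.0 / 378.0 - 2825.0 / 27648.0) * k1[i]
                       + (250.0 / 621.0 - 18575.0 / 48384.0) * k3[i]
                       + (125.0 / 594.0 - 13525.0 / 55296.0) * k4[i]
                       - 277.0 / 14336.0 * k5[i]
                       + (512.0 / 1771.0 - 1.0 / 4.0) * k6[i]);
    }
}

// Adaptive driver: advances y from t to tEnd, accepting a step only when the
// scaled error estimate is at most one. Tolerances and factors are illustrative.
__host__ __device__ inline void rkckIntegrate(double t, double tEnd, double g, double* y) {
    const double rtol = 1.0e-8, atol = 1.0e-12;
    double h = 0.5 * (tEnd - t);  // arbitrary initial guess
    while (t < tEnd) {
        const bool last = (t + h >= tEnd);
        if (last) h = tEnd - t;  // do not step past the end point
        double yNew[NEQN], yErr[NEQN];
        rkckStep(t, g, h, y, yNew, yErr);
        double err = 0.0;
        for (int i = 0; i < NEQN; ++i)
            err = fmax(err, fabs(yErr[i]) / (atol + rtol * fabs(y[i])));
        if (err <= 1.0) {  // accept the step; let h grow by at most 5x
            t = last ? tEnd : t + h;
            for (int i = 0; i < NEQN; ++i) y[i] = yNew[i];
            h *= fmin(5.0, 0.9 * pow(err, -0.2));
        } else {  // reject the step; shrink h by at most 10x
            h *= fmax(0.1, 0.9 * pow(err, -0.25));
        }
    }
}

#ifdef __CUDACC__
// One thread per independent system: thread tid integrates system tid with its
// own parameter g[tid] and its own state stored at y[tid * NEQN].
__global__ void rkckKernel(int numODE, double t, double tEnd, const double* g, double* y) {
    const int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < numODE) rkckIntegrate(t, tEnd, g[tid], &y[tid * NEQN]);
}
#endif
```

Because each thread chooses its own step sizes, threads in a warp may take different numbers of steps. For larger `NEQN`, a layout that places the same component of neighboring systems in adjacent memory locations would likely give better memory coalescing than the simple layout assumed here.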

Notes

1. An equivalence ratio of one indicates that the fuel and oxidizer are mixed in the stoichiometric proportion, that is, the ratio required for complete combustion.
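In symbols, using the standard definition of the equivalence ratio (the notation here is an assumption and is not taken from the chapter), with n denoting molar amounts and the subscript "st" denoting stoichiometric conditions:

```latex
\phi = \frac{\left( n_{\text{fuel}} / n_{\text{ox}} \right)}
            {\left( n_{\text{fuel}} / n_{\text{ox}} \right)_{\text{st}}},
\qquad
\phi = 1 \ \text{(stoichiometric)}, \quad
\phi < 1 \ \text{(fuel-lean)}, \quad
\phi > 1 \ \text{(fuel-rich)}
```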

References

1. Alexandrov, V., Sameh, A., Siddique, Y., Zlatev, Z.: Numerical integration of chemical ODE problems arising in air pollution models. Environ. Monit. Assess. 2(4), 365–377 (1997). doi:10.1023/A:1019086016734
2. Barry, D., Miller, C., Culligan, P., Bajracharya, K.: Analysis of split operator methods for nonlinear and multispecies groundwater chemical transport models. Math. Comput. Simul. 43(3–6), 331–341 (1997). doi:10.1016/S0378-4754(97)00017-7
3. Barry, D., Bajracharya, K., Crapper, M., Prommer, H., Cunningham, C.: Comparison of split-operator methods for solving coupled chemical non-equilibrium reaction/groundwater transport models. Math. Comput. Simul. 53(1–2), 113–127 (2000). doi:10.1016/S0378-4754(00)00182-8
4. Cash, J.R., Karp, A.H.: A variable order Runge–Kutta method for initial value problems with rapidly varying right-hand sides. ACM Trans. Math. Softw. 16(3), 201–222 (1990)
5. Day, M.S., Bell, J.B.: Numerical simulation of laminar reacting flows with complex chemistry. Combust. Theory Model. 4(4), 535–556 (2000). doi:10.1088/1364-7830/4/4/309
6. Dematte, L., Prandi, D.: GPU computing for systems biology. Brief. Bioinform. 11(3), 323–333 (2010). doi:10.1093/bib/bbq006
7. Geršgorin, S.: Über die abgrenzung der eigenwerte einer matrix. Bulletin de l’Académie des Sciences de l’URSS. Classe des sciences mathématiques et na (6), 749–754 (1931)
8. Hairer, E., Wanner, G.: Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems, 2nd edn. Springer Series in Computational Mathematics, vol. 14. Springer, Berlin/Heidelberg (1996)
9. Hairer, E., Wanner, G., Nørsett, S.P.: Solving Ordinary Differential Equations I: Nonstiff Problems, 2nd edn. Springer Series in Computational Mathematics, vol. 8. Springer, Berlin/Heidelberg (1993). doi:10.1007/978-3-540-78862-1
10. Helton, J., Davis, F.: Latin hypercube sampling and the propagation of uncertainty in analyses of complex systems. Reliab. Eng. Syst. Saf. 81(1), 23–69 (2003). doi:10.1016/S0951-8320(03)00058-9
11. Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, Cambridge (1990)
12. Jang, B., Schaa, D., Mistry, P., Kaeli, D.: Exploiting memory access patterns to improve memory performance in data-parallel architectures. IEEE Trans. Parallel Distrib. Syst. 22(1), 105–118 (2011). doi:10.1109/TPDS.2010.107
13. Kim, J., Cho, S.Y.: Computation accuracy and efficiency of the time-splitting method in solving atmospheric transport/chemistry equations. Atmos. Environ. 31(15), 2215–2224 (1997)
14. Kirk, D.B., Hwu, W.W.: Programming Massively Parallel Processors: A Hands-on Approach. Morgan Kaufmann, Burlington (2010)
15. Knio, O.M., Najm, H.N., Wyckoff, P.S.: A semi-implicit numerical scheme for reacting flow II. Stiff, operator-split formulation. J. Comput. Phys. 154, 428–467 (1999). doi:10.1006/jcph.1999.6322
16. Kühn, C., Wierling, C., Kühn, A., Klipp, E., Panopoulou, G., Lehrach, H., Poustka, A.: Monte Carlo analysis of an ODE model of the sea urchin endomesoderm network. BMC Syst. Biol. 3, 83 (2009). doi:10.1186/1752-0509-3-83
17. Law, C.K.: Combustion Physics. Cambridge University Press, New York (2006)
18. Marino, S., Hogue, I.B., Ray, C.J., Kirschner, D.E.: A methodology for performing global uncertainty and sensitivity analysis in systems biology. J. Theor. Biol. 254(1), 178–19 (2008). doi:10.1016/j.jtbi.2008.04.011
19. Marinov, N.M.: A detailed chemical kinetic model for high temperature ethanol oxidation. Int. J. Chem. Kinet. 31(3), 183–220 (1999)
20. Mazzia, F., Magherini, C.: Test Set for Initial Value Problem Solvers, Release 2.4. Department of Mathematics, University of Bari and INdAM, Research Unit of Bari (2008). Available at http://www.dm.uniba.it/~testset
21. Niemeyer, K.E., Sung, C.J.: Accelerating moderately stiff chemical kinetics in reactive-flow simulations using GPUs. J. Comput. Phys. 256, 854–871 (2014). doi:10.1016/j.jcp.2013.09.025
22. Niemeyer, K.E., Sung, C.J., Fotache, C.G., Lee, J.C.: Turbulence-chemistry closure method using graphics processing units: a preliminary test. In: 7th Fall Technical Meeting of the Eastern States Section of the Combustion Institute, Storrs (2011)
23. Nimmagadda, V.K., Akoglu, A., Hariri, S., Moukabary, T.: Cardiac simulation on multi-GPU platform. J. Supercomput. 59(3), 1360–1378 (2011). doi:10.1007/s11227-010-0540-x
24. OpenMP Architecture Review Board: OpenMP Application Program Interface Version 3.0. http://www.openmp.org/mp-documents/spec30.pdf (2008)
25. Oran, E.S., Boris, J.P.: Numerical Simulation of Reactive Flow, 2nd edn. Cambridge University Press, Cambridge (2001)
26. Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical Recipes in Fortran 77: The Art of Scientific Computing, 2nd edn. Cambridge University Press, Cambridge (1992)
27. Ren, Z., Pope, S.B.: Second-order splitting schemes for a class of reactive systems. J. Comput. Phys. 227(17), 8165–8176 (2008). doi:10.1016/j.jcp.2008.05.019
28. Schwer, D., Lu, P., Green, W.H., Semiao, V.: A consistent-splitting approach to computing stiff steady-state reacting flows with adaptive chemistry. Combust. Theory Model. 7(2), 383–399 (2003). doi:10.1088/1364-7830/7/2/310
29. Shi, Y., Green, W.H., Wong, H., Oluwole, O.O.: Accelerating multi-dimensional combustion simulations using hybrid CPU-based implicit/GPU-based explicit ODE integration. Combust. Flame 159(7), 2388–2397 (2012). doi:10.1016/j.combustflame.2012.02.016
30. Sommeijer, B.P., Shampine, L.F., Verwer, J.G.: RKC: an explicit solver for parabolic PDEs. J. Comput. Appl. Math. 88(2), 315–326 (1997)
31. Sportisse, B.: An analysis of operator splitting techniques in the stiff case. J. Comput. Phys. 161(1), 140–168 (2000)
32. Stone, C.P., Davis, R.L.: Techniques for solving stiff chemical kinetics on graphical processing units. J. Propulsion Power 29(4), 764–773 (2013). doi:10.2514/1.B34874
33. Strang, G.: On the construction and comparison of difference schemes. SIAM J. Numer. Anal. 5(3), 506–517 (1968)
34. Sundnes, J., Nielsen, B.F., Mardal, K., Cai, X., Lines, G., Tveito, A.: On the computational complexity of the bidomain and the monodomain models of electrophysiology. Ann. Biomed. Eng. 34(7), 1088–1097 (2006). doi:10.1007/s10439-006-9082-z
35. van der Houwen, P.J.: The development of Runge–Kutta methods for partial differential equations. Appl. Numer. Math. 20, 261–272 (1996)
36. van der Houwen, P.J., Sommeijer, B.P.: On the internal stability of explicit, m-stage Runge-Kutta methods for large m-values. Z. Angew. Math. Mech. 60(10), 479–485 (1980)
37. Verwer, J.G.: Explicit Runge–Kutta methods for parabolic partial differential equations. Appl. Numer. Math. 22, 359–379 (1996)
38. Verwer, J.G., Hundsdorfer, W., Sommeijer, B.P.: Convergence properties of the Runge–Kutta–Chebyshev method. Numer. Math. 57, 57–178 (1990)
39. Verwer, J.G., Sommeijer, B.P., Hundsdorfer, W.: RKC time-stepping for advection–diffusion–reaction problems. J. Comput. Phys. 201(1), 61–79 (2004). doi:10.1016/j.jcp.2004.05.002
40. Zhou, Y., Liepe, J., Sheng, X., Stumpf, M.P.H., Barnes, C.: GPU accelerated biochemical network simulation. Bioinformatics 27(6), 874–876 (2011). doi:10.1093/bioinformatics/btr015

Acknowledgements

This work was supported by the US Department of Defense through the National Defense Science and Engineering Graduate Fellowship program, the National Science Foundation Graduate Research Fellowship under grant number DGE-0951783, and the Combustion Energy Frontier Research Center—an Energy Frontier Research Center funded by the US Department of Energy, Office of Science, Office of Basic Energy Sciences under award number DE-SC0001198.

Author information

Authors and Affiliations

School of Mechanical, Industrial, & Manufacturing Engineering, Oregon State University, Corvallis, OR, USA
Kyle E. Niemeyer
Department of Mechanical Engineering, University of Connecticut, Storrs, CT, USA
Chih-Jen Sung

Corresponding author

Correspondence to Kyle E. Niemeyer.

Editor information

Editors and Affiliations

National Center for Supercomputing Applications, University of Illinois, Urbana, Illinois, USA
Volodymyr Kindratenko

Appendix

Various methods may be used to calculate the spectral radius, including the Gershgorin circle theorem [7, 11], which provides an upper-bound estimate. Here, we provide a function based on a nonlinear power method [30].
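For comparison, the Gershgorin upper bound mentioned above can be sketched in a few lines. This is an illustrative sketch rather than code from the chapter: it assumes the Jacobian is available as a dense, row-major array, and the function name `gershgorinBound` is hypothetical. Every eigenvalue lies in a disc centered on a diagonal entry with radius equal to the sum of the absolute off-diagonal entries in that row, so the largest absolute row sum bounds the spectral radius.

```cpp
#include <math.h>

// Let the same source build with nvcc (device code) or a plain host compiler.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Upper bound on the spectral radius of a dense n-by-n matrix J (row-major),
// from the Gershgorin circle theorem: |lambda| <= max_i sum_j |J[i][j]|.
__host__ __device__ inline double gershgorinBound(const double* J, int n) {
    double bound = 0.0;
    for (int i = 0; i < n; ++i) {
        double rowSum = 0.0;
        for (int j = 0; j < n; ++j) rowSum += fabs(J[i * n + j]);
        bound = fmax(bound, rowSum);
    }
    return bound;
}
```

For the matrix with rows (-1000, 1) and (0, -1), whose eigenvalues are -1000 and -1, the bound is 1001. The chapter's own routine, which follows, avoids forming the Jacobian by using only right-hand-side evaluations.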

// Estimate the spectral radius of the Jacobian of dydt at (t, y) using a
// nonlinear power method, which needs only right-hand-side evaluations.
//   g    : problem parameter passed through to dydt
//   F    : dydt evaluated at (t, y)
//   hMax : maximum step size; 1/hMax is the floor used in the convergence test
//   v    : starting vector on entry; on convergence, the eigenvector estimate
//   Fv   : work array of length NEQN
// NEQN, UROUND (unit roundoff), and dydt are defined elsewhere in the source.
__device__ double
rkcSpecRad (const double t, const double* y, const double g, const double* F,
            const double hMax, double* v, double* Fv) {
    // maximum number of iterations
    const int itmax = 50;

    const double small = 1.0 / hMax;

    // Euclidean norms of y and of the starting vector v
    double nrm1 = 0.0;
    double nrm2 = 0.0;
    for (int i = 0; i < NEQN; ++i) {
        nrm1 += (y[i] * y[i]);
        nrm2 += (v[i] * v[i]);
    }
    nrm1 = sqrt(nrm1);
    nrm2 = sqrt(nrm2);

    // set v to a small perturbation of y; dynrm is the perturbation norm
    double dynrm;
    if ((nrm1 != 0.0) && (nrm2 != 0.0)) {
        dynrm = nrm1 * sqrt(UROUND);
        for (int i = 0; i < NEQN; ++i) {
            v[i] = y[i] + v[i] * (dynrm / nrm2);
        }
    } else if (nrm1 != 0.0) {
        dynrm = nrm1 * sqrt(UROUND);
        for (int i = 0; i < NEQN; ++i) {
            v[i] = y[i] * (1.0 + sqrt(UROUND));
        }
    } else if (nrm2 != 0.0) {
        dynrm = UROUND;
        for (int i = 0; i < NEQN; ++i) {
            v[i] *= (dynrm / nrm2);
        }
    } else {
        dynrm = UROUND;
        for (int i = 0; i < NEQN; ++i) {
            v[i] = UROUND;
        }
    }

    // now iterate using nonlinear power method
    double sigma = 0.0;
    for (int iter = 1; iter <= itmax; ++iter) {

        dydt (t, g, v, Fv);

        nrm1 = 0.0;
        for (int i = 0; i < NEQN; ++i) {
            nrm1 += ((Fv[i] - F[i]) * (Fv[i] - F[i]));
        }
        nrm1 = sqrt(nrm1);
        nrm2 = sigma;  // estimate from the previous iteration
        sigma = nrm1 / dynrm;

        // converged when successive estimates agree to within 1%
        if ((iter >= 2) && (fabs(sigma - nrm2) <= (fmax(sigma, small) * 0.01))) {
            for (int i = 0; i < NEQN; ++i) {
                v[i] -= y[i];
            }
            return (1.2 * sigma);
        }

        if (nrm1 != 0.0) {
            // next perturbation along the current difference direction
            for (int i = 0; i < NEQN; ++i) {
                v[i] = y[i] + ((Fv[i] - F[i]) * (dynrm / nrm1));
            }
        } else {
            // zero difference: flip the sign of one perturbation component
            int ind = (iter % NEQN);
            v[ind] = y[ind] - (v[ind] - y[ind]);
        }
    }
    // not converged within itmax iterations; return the latest estimate
    return (1.2 * sigma);
}
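As a hedged illustration of how such an estimate is typically consumed, the sketch below computes the number of RKC stages for a given step size. It assumes the stage-selection rule of the reference RKC solver [30], m = 1 + floor(sqrt(1.54 |h| σ + 1)), where σ is the spectral-radius estimate. The function name `rkcNumStages` is hypothetical, and the chapter's own implementation may differ.

```cpp
#include <math.h>

// Let the same source build with nvcc (device code) or a plain host compiler.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Number of RKC stages for step size h and spectral-radius estimate specRad,
// assuming the rule m = 1 + floor(sqrt(1.54 * |h| * specRad + 1)).
__host__ __device__ inline int rkcNumStages(double h, double specRad) {
    return 1 + (int)sqrt(1.54 * fabs(h) * specRad + 1.0);
}
```

Under this rule the stage count grows with the square root of the product of step size and spectral radius, and it never falls below two. Since every stage costs one right-hand-side evaluation, threads handling stiffer systems do more work per step.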

About this chapter

Cite this chapter

Niemeyer, K.E., Sung, CJ. (2014). GPU-Based Parallel Integration of Large Numbers of Independent ODE Systems. In: Kindratenko, V. (eds) Numerical Computations with GPUs. Springer, Cham. https://doi.org/10.1007/978-3-319-06548-9_8

DOI: https://doi.org/10.1007/978-3-319-06548-9_8
Published: 09 June 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06547-2
Online ISBN: 978-3-319-06548-9
eBook Packages: Computer Science, Computer Science (R0)
