Locality Optimized Shared-Memory Implementations of Iterated Runge-Kutta Methods

  • Matthias Korch
  • Thomas Rauber
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4641)


Iterated Runge-Kutta (IRK) methods are a class of explicit solution methods for initial value problems of ordinary differential equations (ODEs) that offer considerable potential for parallelism, both across the method and across the ODE system. In this paper, we consider the sequential and parallel implementation of IRK methods, focusing mainly on optimizing their locality behavior. We introduce different implementation variants for sequential and shared-memory computer systems and analyze their runtime and cache performance on two modern supercomputer systems.
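
As a rough illustration of the kind of method the abstract refers to, the following minimal sketch performs IRK time steps by applying a fixed number of fixed-point iterations to the stage equations of an implicit base method. It is not one of the implementation variants from the paper: the 2-stage Gauss base method, the function names `irk_step` and `integrate`, and the default iteration count `m=4` are assumptions chosen for this example, and the code is sequential, with no locality optimization.

```python
import math

# Implicit base method: 2-stage Gauss method (order 4).
# The choice of base method is an assumption made for this sketch.
SQRT3 = math.sqrt(3.0)
A = [[0.25, 0.25 - SQRT3 / 6.0],
     [0.25 + SQRT3 / 6.0, 0.25]]
B = [0.5, 0.5]
C = [0.5 - SQRT3 / 6.0, 0.5 + SQRT3 / 6.0]


def irk_step(f, t, y, h, m):
    """One IRK step: m fixed-point iterations on the stage vectors."""
    s, n = len(B), len(y)
    # Start every stage vector from f(t, y).
    f0 = f(t, y)
    mu = [list(f0) for _ in range(s)]
    for _ in range(m):
        new_mu = []
        # The s stages of one iteration depend only on the previous
        # iteration, so this loop could run in parallel ("across the
        # method"). The loop over the n components inside each stage is
        # the source of parallelism "across the system".
        for l in range(s):
            arg = [y[k] + h * sum(A[l][i] * mu[i][k] for i in range(s))
                   for k in range(n)]
            new_mu.append(f(t + C[l] * h, arg))
        mu = new_mu
    # Combine the final stage vectors into the new approximation.
    return [y[k] + h * sum(B[l] * mu[l][k] for l in range(s))
            for k in range(n)]


def integrate(f, t0, y0, t_end, steps, m=4):
    """Integrate y' = f(t, y) from t0 to t_end with a fixed step size."""
    h = (t_end - t0) / steps
    t, y = t0, list(y0)
    for _ in range(steps):
        y = irk_step(f, t, y, h, m)
        t += h
    return y
```

For example, `integrate(lambda t, y: [-y[0]], 0.0, [1.0], 1.0, 10)` approximates the solution of y' = -y, y(0) = 1 at t = 1, so the result should be close to `math.exp(-1.0)`. A practical solver would add step size control, typically by comparing the results of the last two iterations.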





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Matthias Korch (1)
  • Thomas Rauber (1)
  1. University of Bayreuth, Department of Computer Science
