Performance Optimization of RK Methods Using Block-Based Pipelining

  • Chapter
Performance Analysis and Grid Computing

Abstract

The efficiency of modern microprocessors is highly sensitive to the structure and memory access pattern of the programs they execute. This sensitivity is caused by memory hierarchies, which were introduced to reduce average memory access times. In this paper, we consider embedded Runge-Kutta (RK) methods for the solution of ordinary differential equations arising from the spatial discretization of partial differential equations and study their efficient implementation on modern microprocessors. We investigate program variants that differ in execution order and storage scheme. In particular, we explore how the potential parallelism in the stage vector computation can be exploited in a pipelining approach to improve the locality behavior of the RK implementations. Experiments show that this approach improves efficiency on several recent processors.
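
The pipelining idea can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name in it is hypothetical. It assumes a right-hand side with a three-point stencil (a 1D diffusion problem with zero boundary values), so that each component of a stage vector depends only on neighboring components of the earlier stages, and it uses the Bogacki-Shampine 3(2) pair as an example of an embedded method. The stage-wise variant sweeps over the whole system once per stage; the pipelined variant computes block J of stage l as soon as blocks J-1, J, and J+1 of all earlier stages are available.

```python
# Illustrative sketch only: all names are hypothetical, not taken from the chapter.

# Bogacki-Shampine 3(2) embedded pair (assumed example tableau).
A = [[0.0, 0.0, 0.0, 0.0],
     [1/2, 0.0, 0.0, 0.0],
     [0.0, 3/4, 0.0, 0.0],
     [2/9, 1/3, 4/9, 0.0]]
B = [2/9, 1/3, 4/9, 0.0]        # weights of the order-3 solution
B_HAT = [7/24, 1/4, 1/3, 1/8]   # weights of the embedded order-2 solution


def stage_arg(y, k, h, l, j):
    """Component j of the argument vector of stage l."""
    return y[j] + h * sum(A[l][i] * k[i][j] for i in range(l))


def stage_component(y, k, h, l, j):
    """Component j of stage l for the assumed stencil w[j-1] - 2*w[j] + w[j+1]."""
    n = len(y)
    left = stage_arg(y, k, h, l, j - 1) if j > 0 else 0.0
    right = stage_arg(y, k, h, l, j + 1) if j < n - 1 else 0.0
    return left - 2.0 * stage_arg(y, k, h, l, j) + right


def combine(y, k, h):
    """Return the new approximation and the embedded error estimate."""
    s, n = len(B), len(y)
    y_new = [y[j] + h * sum(B[l] * k[l][j] for l in range(s)) for j in range(n)]
    err = [h * sum((B[l] - B_HAT[l]) * k[l][j] for l in range(s)) for j in range(n)]
    return y_new, err


def step_stagewise(y, h):
    """Baseline order: finish each stage vector before starting the next."""
    n, s = len(y), len(B)
    k = [[0.0] * n for _ in range(s)]
    for l in range(s):
        for j in range(n):
            k[l][j] = stage_component(y, k, h, l, j)
    return combine(y, k, h)


def step_pipelined(y, h, block):
    """Pipelined order: at pipeline step p, stage l works on block p - l."""
    n, s = len(y), len(B)
    k = [[0.0] * n for _ in range(s)]
    n_blocks = (n + block - 1) // block
    for p in range(n_blocks + s - 1):
        for l in range(s):
            J = p - l  # later stages lag one block behind their predecessor
            if 0 <= J < n_blocks:
                for j in range(J * block, min((J + 1) * block, n)):
                    k[l][j] = stage_component(y, k, h, l, j)
    return combine(y, k, h)
```

Both variants perform the same arithmetic per component, so they return identical results; only the traversal order differs. In a compiled implementation, the pipelined order keeps the blocks touched in one pipeline step small enough to stay in cache, which is the locality effect the abstract refers to. This Python version shows the ordering only and says nothing about actual performance.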

Copyright information

© 2004 Springer Science+Business Media New York

About this chapter

Cite this chapter

Korch, M., Rauber, T., Rünger, G. (2004). Performance Optimization of RK Methods Using Block-Based Pipelining. In: Getov, V., Gerndt, M., Hoisie, A., Malony, A., Miller, B. (eds) Performance Analysis and Grid Computing. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0361-3_3

  • DOI: https://doi.org/10.1007/978-1-4615-0361-3_3

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-5038-5

  • Online ISBN: 978-1-4615-0361-3

  • eBook Packages: Springer Book Archive
