A Study on the Influence of Caching: Sequences of Dense Linear Algebra Kernels

  • Conference paper
High Performance Computing for Computational Science -- VECPAR 2014 (VECPAR 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8969)

Abstract

It is universally known that caching is critical to attain high-performance implementations: in many situations, data locality (in space and time) plays a bigger role than optimizing the number of floating-point operations. In this paper, we show evidence that, at least for linear algebra algorithms, caching is also a crucial factor for accurate performance modeling and performance prediction.

Notes

  1. With \(n = 1{,}568 = 2^5 \cdot 7^2\), we choose a matrix size that is not a power of \(2\) to avoid performance artifacts due to the specific problem size.

  2. The subscripts R through U are the values of the flag arguments side, uplo, trans, and diag; they distinguish the form of the operation performed by the kernel.

  3. Read from the CPU’s time stamp counter through the assembly instruction rdtsc.

  4. The system fluctuations cause variations of the dgeqrf timings of 0.057 % on average. With the exception of the tiny dcopys, these fluctuations are not significant.

  5. By “touching”, we mean a simple read+write access to the data, e.g. .

  6. The length of the list can be safely restricted to the number of kernel calls per iteration of the blocked algorithm.

  7. For \(n = 2400\), the upper triangular portion of the matrix is about twice as large as the cache size.
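
The example that originally accompanied the note on “touching” did not survive on this page. As a rough illustration only, and not the authors' code, the sketch below shows one way a read+write access over a buffer of doubles could look, together with the size arithmetic behind the note on \(n = 2400\). The names `touch` and `upper_triangle_bytes` are hypothetical, and counting the diagonal as part of the upper triangle is an assumption.

```python
from array import array

BYTES_PER_DOUBLE = 8  # size of an IEEE 754 double-precision value


def touch(buf):
    """Read every element and write it back, leaving the values unchanged.

    Hypothetical helper: it only illustrates the read+write access pattern.
    """
    for i in range(len(buf)):
        buf[i] = buf[i]  # one read followed by one write of the same value


def upper_triangle_bytes(n):
    """Bytes held by the upper triangle of an n x n matrix of doubles.

    Assumption: the diagonal is counted, giving n * (n + 1) / 2 entries.
    """
    return n * (n + 1) // 2 * BYTES_PER_DOUBLE


if __name__ == "__main__":
    data = array("d", (float(i) for i in range(1000)))
    touch(data)  # contents are unchanged; every element was read and written
    print(upper_triangle_bytes(2400))  # 23049600 bytes, roughly 23 MB
```

In a compiled setting the same access would be applied directly to a kernel's operand buffers before timing it; the Python version is meant only to make the access pattern and the size estimate concrete.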

Acknowledgments

Financial support from the Deutsche Forschungsgemeinschaft (DFG) through grant GSC 111 and the Deutsche Telekom Stiftung is gratefully acknowledged.

Author information

Corresponding author

Correspondence to Elmar Peise.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Peise, E., Bientinesi, P. (2015). A Study on the Influence of Caching: Sequences of Dense Linear Algebra Kernels. In: Daydé, M., Marques, O., Nakajima, K. (eds) High Performance Computing for Computational Science -- VECPAR 2014. VECPAR 2014. Lecture Notes in Computer Science, vol 8969. Springer, Cham. https://doi.org/10.1007/978-3-319-17353-5_21

  • DOI: https://doi.org/10.1007/978-3-319-17353-5_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-17352-8

  • Online ISBN: 978-3-319-17353-5

  • eBook Packages: Computer Science, Computer Science (R0)
