An Overview of Cache Optimization Techniques and Cache-Aware Numerical Algorithms

  • Markus Kowarschik
  • Christian Weiß
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2625)


To mitigate the impact of the growing gap between CPU speed and main memory performance, today’s computer architectures implement hierarchical memory structures. The idea behind this approach is to hide both the low bandwidth of main memory and the high latency of main memory accesses, which are slow in contrast to the floating-point performance of the CPUs. At the top of the hierarchy sits a small and expensive high-speed memory, usually integrated within the processor chip, that provides data with low latency and high bandwidth; i.e., the CPU registers. Moving further away from the CPU, the layers of memory successively become larger and slower. The memory components located between the processor core and main memory are called cache memories or caches. They are intended to contain copies of main memory blocks to speed up accesses to frequently needed data [378], [392]. The next lower level of the memory hierarchy is the main memory, which is large but also comparatively slow. While external memory, such as hard disk drives or remote memory components in a distributed computing environment, represents the lower end of any common hierarchical memory design, this paper focuses on optimization techniques for enhancing cache performance.
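To illustrate how caches hold copies of main memory blocks, the following minimal sketch simulates a direct-mapped cache and counts misses for two traversal orders of the same array. It is not taken from the paper: the function names (`count_misses`, `row_major`, `column_major`), the cache geometry (64 lines of 8 elements each), and the array size are all assumptions chosen for illustration.

```python
def count_misses(addresses, num_lines=64, line_size=8):
    """Simulate a direct-mapped cache and count misses.

    addresses: iterable of element indices (one array element = one word).
    num_lines: number of cache lines (hypothetical size).
    line_size: elements per cache line (hypothetical size).
    """
    tags = [None] * num_lines          # block currently held by each line
    misses = 0
    for addr in addresses:
        block = addr // line_size      # main memory block containing addr
        line = block % num_lines       # the single line this block may occupy
        if tags[line] != block:        # block not cached: fetch it whole
            tags[line] = block
            misses += 1
    return misses


def row_major(n):
    """Indices of an n x n row-major array, visited row by row (stride 1)."""
    return (i * n + j for i in range(n) for j in range(n))


def column_major(n):
    """The same array, visited column by column (stride n)."""
    return (i * n + j for j in range(n) for i in range(n))


n = 256                                # assumed array dimension
row_misses = count_misses(row_major(n))
col_misses = count_misses(column_major(n))
print(row_misses, col_misses)
```

Under these assumed parameters, the row-wise traversal misses only once per memory block, whereas the column-wise traversal evicts each block before its neighbouring elements are reused and therefore misses on every access. Real caches are typically set-associative and much larger, so the numbers are only indicative of the effect.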


Keywords: Loop Nest, Cache Line, Memory Block, Memory Hierarchy, Cache Performance



Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Markus Kowarschik (1)
  • Christian Weiß (2)
  1. Lehrstuhl für Informatik 10, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
  2. Lehrstuhl für Rechnertechnik und Rechnerorganisation, Technische Universität München, München, Germany
