
Layout, Array

  • Reference work entry

Definitions

A high-performance architecture needs a fast processor, but a fast processor is useless unless the memory subsystem can deliver data at a rate of several words per clock cycle. Run-of-the-mill memory chips in today's technology have a latency on the order of ten to a hundred processor cycles, far too slow to sustain the required performance. The usual method for increasing the memory bandwidth as seen by the processor is to implement a cache, i.e., a small but fast memory designed to hold frequently used data. Caches work best for programs with almost random but nonuniform addressing patterns. However, high-performance applications, such as linear algebra or signal processing, tend to use very regular addressing patterns, which can degrade cache performance. In linear algebra codes, and also in stream processing, one finds long sequences of accesses to regularly increasing addresses. In image processing, a template moves regularly across a pixel array.

To take...



Copyright information

© 2011 Springer Science+Business Media, LLC

Cite this entry

Feautrier, P. (2011). Layout, Array. In: Padua, D. (eds) Encyclopedia of Parallel Computing. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-09766-4_171
