
Abstract

A high-performance pipelined machine requires a memory system of equally high performance if the processor is to be fully utilized. This chapter deals with the issue of matching processor and memory performance. In a simple machine in which a single main memory module is connected directly to a processor (with no intermediate storage), the effective rate at which operands and instructions can be processed is limited by the rate at which the memory unit can deliver them. In a high-performance machine, however, the access time of a single main memory unit typically exceeds the cycle time of the processor by a large margin, and obtaining the highest possible performance requires changes in the basic design of the memory system. Simply developing faster memory is not a viable solution to this problem, given the limitations of the technology available within any given period: the performance of the technology used for main memories typically improves at a lower rate than that of the technology used for processors, and building main memory from the fastest available logic is never cost-effective.
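The cost of this processor-memory gap can be made concrete with a back-of-the-envelope calculation. The sketch below (not from the chapter; all numbers are assumed for illustration) computes the effective cycles per instruction (CPI) when every memory reference stalls a simple pipeline for the full main-memory access time:

```python
# Illustrative sketch: how a slow main memory limits a pipelined
# processor's effective throughput. The figures used here
# (cycle time, access time, references per instruction) are
# assumptions chosen for illustration, not values from the text.

def effective_cpi(base_cpi, mem_refs_per_instr, stall_cycles):
    """Effective cycles per instruction when each memory reference
    stalls the pipeline for the given number of processor cycles."""
    return base_cpi + mem_refs_per_instr * stall_cycles

# Suppose the processor cycle time is 2 ns and the main-memory
# access time is 60 ns: each reference then costs 30 processor cycles.
stall = 60 // 2                        # 30 stall cycles per memory access
cpi = effective_cpi(1.0, 1.3, stall)   # assume 1.3 memory refs/instruction
print(cpi)                             # 40.0: the pipeline runs at 1/40
                                       # of its ideal one-instruction-per-
                                       # cycle rate
```

Even with these modest assumed figures, the processor spends almost all of its time waiting on memory, which is why the chapter turns to structural remedies (interleaving, buffering, caches) rather than faster memory parts.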




Copyright information

© 1999 Springer Science+Business Media New York


Cite this chapter

Omondi, A.R. (1999). High-Performance Memory Systems. In: The Microarchitecture of Pipelined and Superscalar Computers. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-2989-4_3


  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4419-5081-9

  • Online ISBN: 978-1-4757-2989-4

  • eBook Packages: Springer Book Archive
