
Abstract

A high-performance pipelined machine requires a memory system of equally high performance if the processor is to be fully utilized. This chapter deals with the issue of matching processor and memory performance. In a simple machine in which a single main memory module is connected directly to a processor (with no intermediate storage), the effective rate at which operands and instructions can be processed is limited by the rate at which the memory unit can deliver them. In a high-performance machine, however, the access time of a single main memory unit typically exceeds the cycle time of the processor by a large margin, and obtaining the highest possible performance requires changes in the basic design of the memory system. Simply developing faster memory is not a viable solution to this problem, given the limitations of the technology available within any given period: the performance of the technology used for main memories typically improves at a lower rate than that of the technology used for processors, and building main memory from the fastest available logic is never cost-effective.
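The cost of this processor-memory gap can be made concrete with a back-of-the-envelope calculation. The sketch below (not from the chapter; all numbers are assumed for illustration) computes the effective cycles per instruction (CPI) when every memory reference stalls a simple pipeline for the full main-memory access time:

```python
# Illustrative sketch: how a slow main memory limits a pipelined
# processor's effective throughput. The figures used here
# (cycle time, access time, references per instruction) are
# assumptions chosen for illustration, not values from the text.

def effective_cpi(base_cpi, mem_refs_per_instr, stall_cycles):
    """Effective cycles per instruction when each memory reference
    stalls the pipeline for the given number of processor cycles."""
    return base_cpi + mem_refs_per_instr * stall_cycles

# Suppose the processor cycle time is 2 ns and the main-memory
# access time is 60 ns: each reference then costs 30 processor cycles.
stall = 60 // 2                        # 30 stall cycles per memory access
cpi = effective_cpi(1.0, 1.3, stall)   # assume 1.3 memory refs/instruction
print(cpi)                             # 40.0: the pipeline runs at 1/40
                                       # of its ideal one-instruction-per-
                                       # cycle rate
```

Even with these modest assumed figures, the processor spends almost all of its time waiting on memory, which is why the chapter turns to structural remedies (interleaving, buffering, caches) rather than faster memory parts.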




Copyright information

© 1999 Springer Science+Business Media New York


Cite this chapter

Omondi, A.R. (1999). High-Performance Memory Systems. In: The Microarchitecture of Pipelined and Superscalar Computers. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-2989-4_3


  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4419-5081-9

  • Online ISBN: 978-1-4757-2989-4

  • eBook Packages: Springer Book Archive
