Abstract
A high-performance pipelined machine requires a memory system of equally high performance if the processor is to be fully utilized. This chapter deals with the issue of matching processor and memory performance. In a simple machine in which a single main memory module is connected directly to a processor (with no intermediate storage), the effective rate at which operands and instructions can be processed is limited by the rate at which the memory unit can deliver them. But in a high-performance machine, the access time of a single main memory unit typically exceeds the cycle time of the processor by a large margin, and obtaining the highest performance possible requires some changes in the basic design of the memory system. Developing faster memory is not a viable solution to this problem, because of limitations in the technology available within a given period: the performance of the technology used for main memories typically improves at a lower rate than that of the technology used for processors, and using the fastest available logic for main memory is never cost-effective.
Copyright information
© 1999 Springer Science+Business Media New York
About this chapter
Cite this chapter
Omondi, A.R. (1999). High-Performance Memory Systems. In: The Microarchitecture of Pipelined and Superscalar Computers. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-2989-4_3
DOI: https://doi.org/10.1007/978-1-4757-2989-4_3
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-5081-9
Online ISBN: 978-1-4757-2989-4
eBook Packages: Springer Book Archive