The Journal of Supercomputing

Volume 8, Issue 4, pp 345–369

Scalable cache consistency for hierarchically structured multiprocessors

  • Keith Farkas
  • Zvonko Vranesic
  • Michael Stumm


This paper presents a new cache consistency scheme for hierarchically structured shared-memory multiprocessors. The scheme is simple, fast and efficient, and it does not require a large amount of state information to be maintained. The scheme exploits the broadcast capability of these systems, but limits the extent of the broadcasts by means of a novel filtering mechanism. As a specific example, it is shown how the proposed cache consistency scheme can be implemented on the Hector multiprocessor architecture. Using trace-driven simulations, we demonstrate that the scheme is scalable and performs well for common applications.


Cache consistency, shared-memory multiprocessors, hierarchically structured multiprocessors, limited broadcast





Copyright information

© Kluwer Academic Publishers 1995

Authors and Affiliations

  • Keith Farkas (1)
  • Zvonko Vranesic (1)
  • Michael Stumm (1)
  1. Department of Electrical and Computer Engineering, University of Toronto, Toronto, Canada
