Memory assignment for multiprocessor caches through grey coloring

  • Anant Agarwal
  • John V. Guttag
  • Christoforos N. Hadjicostis
  • Marios C. Papaefthymiou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 817)


The achieved performance of multiprocessors is heavily dependent on the performance of their caches. Cache performance is severely degraded when data tiles used by a program conflict in the caches. This paper explores techniques for improving multiprocessor performance by improving cache utilization. Specifically, we investigate the problem of statically assigning data tiles to memory in a way that minimizes the impact of collisions in multiprocessor caches. We define the problem precisely and present an efficient procedure for finding solutions to it. The procedure incorporates a new technique, grey coloring, that reduces the maximum number of conflicts in any cache in the system by distributing cache misses evenly among processors.
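As an illustration of the objective stated above, the following is a minimal sketch, not the paper's procedure. It assumes a direct-mapped cache per processor, and it assumes that tiles used by the same processor collide when they map to the same cache line. The names conflicts_per_cache, max_conflicts, accesses, and line_of are hypothetical and do not come from the paper. The sketch only evaluates a given assignment; it does not implement grey coloring.

```python
from collections import Counter


def conflicts_per_cache(accesses, line_of):
    """Count collisions in each processor's cache for one assignment.

    accesses: dict mapping a processor name to the tiles it uses (assumed input).
    line_of:  dict mapping a tile name to the cache line it is assigned to.
    For every cache line, each tile beyond the first counts as one conflict.
    """
    result = {}
    for proc, tiles in accesses.items():
        # Number of distinct tiles of this processor that land on each line.
        per_line = Counter(line_of[t] for t in set(tiles))
        result[proc] = sum(n - 1 for n in per_line.values() if n > 1)
    return result


def max_conflicts(accesses, line_of):
    """Largest conflict count over all caches: the quantity to keep small."""
    return max(conflicts_per_cache(accesses, line_of).values(), default=0)


if __name__ == "__main__":
    accesses = {"P0": ["A", "B", "C"], "P1": ["C", "D"]}
    balanced = {"A": 0, "B": 0, "C": 1, "D": 1}
    skewed = {"A": 0, "B": 0, "C": 0, "D": 1}
    print(conflicts_per_cache(accesses, balanced))
    print(max_conflicts(accesses, skewed))
```

Under these assumptions, comparing max_conflicts for two candidate assignments shows which one leaves the worst-off cache with fewer collisions.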


Keywords: Grey Coloring, Access Pattern, Critical Number, Cache Line, Conflict Graph
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 1994

Authors and Affiliations

  • Anant Agarwal (1)
  • John V. Guttag (1)
  • Christoforos N. Hadjicostis (1)
  • Marios C. Papaefthymiou (2)
  1. Massachusetts Institute of Technology, Cambridge
  2. Yale University, New Haven
