On the Correctness of Program Execution When Cache Coherence Is Maintained Locally at Data-Sharing Boundaries in Distributed Shared Memory Multiprocessors

International Journal of Parallel Programming

Abstract

Emerging multiprocessor architectures, such as chip multiprocessors, embedded architectures, and massively parallel architectures, demand faster, more efficient, and more scalable cache coherence schemes. Formal insight into the system model is useful in devising more cost-efficient schemes. In this paper, we build formalisms for execution in cache-based distributed shared-memory (DSM) multiprocessors that obey the Release Consistency model, and derive conditions for cache coherence. We then design a cost-efficient cache coherence scheme that needs no directories. Our approach relies on early, processor-directed coherence actions. The scheme exploits sharing information provided by a programmer-centric framework. Per-processor coherence buffers (CBs) impose coherence on live shared variables between consecutive release points in the execution. In simulation, a system with 8-entry, 4-way associative CBs achieves a speedup of 1.07–4.31 over a full-map, 3-hop directory scheme for six of the SPLASH-2 benchmarks.

Cite this article

Sarojadevi, H., Nandy, S.K. & Balakrishnan, S. On the Correctness of Program Execution When Cache Coherence Is Maintained Locally at Data-Sharing Boundaries in Distributed Shared Memory Multiprocessors. International Journal of Parallel Programming 32, 415–446 (2004). https://doi.org/10.1023/B:IJPP.0000038070.79088.0b

  • DOI: https://doi.org/10.1023/B:IJPP.0000038070.79088.0b
