Evaluation of Cache Coherence Mechanisms for Multicore Processors

  • Chapter
Computational Intelligence and Efficiency in Engineering Systems

Part of the book series: Studies in Computational Intelligence (SCI, volume 595)

Abstract

Multicore designs have become commonplace in the processor marketplace and are therefore a major focus of modern computer architecture research. For both product development and research, evaluating multicore processor performance is thus a mandatory step. Multicore computing presents many challenges for system designers, one of which is keeping data consistent between a shared cache or memory and the local caches on the chip, a problem known as cache coherence. Cache coherence mechanisms are a key component in sustaining exponential performance growth through widespread thread-level parallelism. In this research, we studied the efficient methods and protocols available for achieving cache coherence in multicore architectures. These protocols were then modeled and evaluated using the Simics simulator for multicore architectures. We also explored the strengths and weaknesses of the different protocols and discussed ways of improving them.
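To make the coherence problem concrete, the following is a minimal sketch of an invalidation-based snooping scheme with three states (Modified, Shared, Invalid) for a single memory word. The abstract does not name the protocols that were evaluated, so the choice of MSI here is an assumption made purely for illustration, and the names `State`, `Line`, and `Bus` are hypothetical rather than taken from the chapter or from Simics.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    SHARED = "S"
    INVALID = "I"


class Line:
    """One core's private copy of the single tracked memory word."""

    def __init__(self):
        self.state = State.INVALID
        self.value = None


class Bus:
    """Hypothetical shared bus linking per-core caches to one memory word."""

    def __init__(self, n_cores, initial=0):
        self.memory = initial
        self.lines = [Line() for _ in range(n_cores)]

    def read(self, core):
        line = self.lines[core]
        if line.state is State.INVALID:
            # Read miss: a Modified copy elsewhere is written back
            # to memory and downgraded to Shared before we load.
            for other in self.lines:
                if other.state is State.MODIFIED:
                    self.memory = other.value
                    other.state = State.SHARED
            line.value = self.memory
            line.state = State.SHARED
        return line.value

    def write(self, core, value):
        line = self.lines[core]
        # Invalidate every other valid copy so that no core can
        # later read a stale value from its local cache.
        for i, other in enumerate(self.lines):
            if i != core and other.state is not State.INVALID:
                if other.state is State.MODIFIED:
                    self.memory = other.value
                other.state = State.INVALID
        line.value = value
        line.state = State.MODIFIED

    def states(self):
        """Return the per-core states as a compact string such as 'MI'."""
        return "".join(l.state.value for l in self.lines)
```

With two cores, both reading the word leaves the states as `SS`; a write by core 0 moves them to `MI`; and a subsequent read by core 1 returns the new value and restores `SS`. Real protocols add further states and transient conditions that this sketch deliberately omits.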

Author information

Corresponding author

Correspondence to Malik Al-Manasia.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Al-Manasia, M., Chaczko, Z. (2015). Evaluation of Cache Coherence Mechanisms for Multicore Processors. In: Borowik, G., Chaczko, Z., Jacak, W., Łuba, T. (eds) Computational Intelligence and Efficiency in Engineering Systems. Studies in Computational Intelligence, vol 595. Springer, Cham. https://doi.org/10.1007/978-3-319-15720-7_22

  • DOI: https://doi.org/10.1007/978-3-319-15720-7_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-15719-1

  • Online ISBN: 978-3-319-15720-7

  • eBook Packages: Engineering, Engineering (R0)
