ACM: An Efficient Approach for Managing Shared Caches in Chip Multiprocessors

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5409)

Abstract

This paper proposes and studies a hardware-based adaptive controlled migration strategy for managing distributed L2 caches in chip multiprocessors. Building on an area-efficient shared cache design, the proposed scheme dynamically migrates cache blocks to cache banks that best minimize the average L2 access latency. Cache blocks are continuously monitored and the locations of the optimal corresponding cache banks are predicted to effectively alleviate the impact of non-uniform cache access latency. By adopting migration alone without replication, the exclusiveness of cache blocks is maintained, thus further optimizing the cache miss rate. Simulation results using a full system simulator demonstrate that the proposed controlled migration scheme outperforms the shared caching strategy and compares favorably with previously proposed replication schemes.
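The idea of moving a block to the bank that minimizes average L2 access latency can be illustrated with a small sketch. It is not the paper's mechanism: the abstract does not describe how blocks are monitored or how the optimal bank is predicted. The sketch assumes a tiled mesh with one core and one L2 bank per tile, per-core access counts for a single block, and made-up latency constants; every function name and parameter below is hypothetical.

```python
from typing import Dict, Tuple


def tile_coords(tile: int, mesh_width: int) -> Tuple[int, int]:
    """Map a tile index to (row, col) on a mesh that is mesh_width tiles wide."""
    return divmod(tile, mesh_width)


def hop_distance(a: int, b: int, mesh_width: int) -> int:
    """Manhattan distance in network hops between two tiles."""
    a_row, a_col = tile_coords(a, mesh_width)
    b_row, b_col = tile_coords(b, mesh_width)
    return abs(a_row - b_row) + abs(a_col - b_col)


def average_latency(bank: int, access_counts: Dict[int, int], mesh_width: int,
                    bank_latency: int = 10, hop_latency: int = 3) -> float:
    """Access-weighted average latency if the block lived in `bank`.

    bank_latency and hop_latency are illustrative cycle counts (assumptions),
    not values taken from the paper.
    """
    total = sum(access_counts.values())
    if total == 0:
        return float(bank_latency)
    weighted = sum(
        count * (bank_latency + hop_latency * hop_distance(core, bank, mesh_width))
        for core, count in access_counts.items()
    )
    return weighted / total


def best_bank(access_counts: Dict[int, int], mesh_width: int, num_tiles: int) -> int:
    """Pick the bank with the lowest access-weighted average latency.

    Ties go to the lowest-numbered bank because min() keeps the first minimum.
    """
    return min(range(num_tiles),
               key=lambda bank: average_latency(bank, access_counts, mesh_width))


if __name__ == "__main__":
    # One block on a 4x4 mesh: core 0 accesses it 90 times, core 15 ten times.
    counts = {0: 90, 15: 10}
    print(best_bank(counts, mesh_width=4, num_tiles=16))
```

Because only one copy of the block exists at any time, a choice like this trades latency between sharers rather than spending extra capacity on replicas, which is the exclusiveness property the abstract refers to.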

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hammoud, M., Cho, S., Melhem, R. (2009). ACM: An Efficient Approach for Managing Shared Caches in Chip Multiprocessors. In: Seznec, A., Emer, J., O’Boyle, M., Martonosi, M., Ungerer, T. (eds) High Performance Embedded Architectures and Compilers. HiPEAC 2009. Lecture Notes in Computer Science, vol 5409. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92990-1_26

  • DOI: https://doi.org/10.1007/978-3-540-92990-1_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-92989-5

  • Online ISBN: 978-3-540-92990-1

  • eBook Packages: Computer Science, Computer Science (R0)
