Scalable Shared-Cache Management by Containing Thrashing Workloads

  • Conference paper
High Performance Embedded Architectures and Compilers (HiPEAC 2010)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5952)

Abstract

Multi-core processors with shared last-level caches are vulnerable to performance inefficiencies and fairness issues when the cache is not carefully managed between the multiple cores. Cache partitioning is an effective method for isolating poorly-interacting threads from each other, but designing a mechanism with simple logic and low area overhead will be important for incorporating such schemes in future embedded multi-core processors. In this work, we identify that major performance problems only arise when one or more “thrashing” applications exist. We propose a simple yet effective Thrasher Caging (TC) cache management scheme that specifically targets these thrashing applications.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xie, Y., Loh, G.H. (2010). Scalable Shared-Cache Management by Containing Thrashing Workloads. In: Patt, Y.N., Foglia, P., Duesterwald, E., Faraboschi, P., Martorell, X. (eds) High Performance Embedded Architectures and Compilers. HiPEAC 2010. Lecture Notes in Computer Science, vol 5952. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11515-8_20

  • DOI: https://doi.org/10.1007/978-3-642-11515-8_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-11514-1

  • Online ISBN: 978-3-642-11515-8

  • eBook Packages: Computer Science, Computer Science (R0)
