Abstract
Multi-core processors with shared last-level caches are vulnerable to performance inefficiencies and fairness issues when the cache is not carefully managed among the cores. Cache partitioning is an effective method for isolating poorly-interacting threads from each other, but incorporating such schemes in future embedded multi-core processors will require a mechanism with simple logic and low area overhead. In this work, we observe that major performance problems arise only when one or more “thrashing” applications exist. We propose a simple yet effective Thrasher Caging (TC) cache management scheme that specifically targets these thrashing applications.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xie, Y., Loh, G.H. (2010). Scalable Shared-Cache Management by Containing Thrashing Workloads. In: Patt, Y.N., Foglia, P., Duesterwald, E., Faraboschi, P., Martorell, X. (eds) High Performance Embedded Architectures and Compilers. HiPEAC 2010. Lecture Notes in Computer Science, vol 5952. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11515-8_20
DOI: https://doi.org/10.1007/978-3-642-11515-8_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11514-1
Online ISBN: 978-3-642-11515-8
eBook Packages: Computer Science (R0)