Revisiting Cache Resizing

Keramidas, Georgios; Datsios, Chrysovalantis

doi:10.1007/s10766-013-0291-4

Abstract

We present a novel framework that dynamically reconfigures on-chip memory resources according to the changing behavior of the executing applications. The framework enables smooth scaling (i.e., resizing) of the on-chip caches, targeting both performance and power efficiency. In contrast to previous approaches, its resizing decisions are not tainted by transient events (e.g., misses) that are caused by downsizing itself, which avoids swinging the cache size back and forth through trial-and-error resizing decisions. This minimizes both the execution-time penalty induced by downsizing decisions and the effective cache size. In addition, an inherent property of our approach is that the actual invalidation of cache blocks and the corresponding write-backs of dirty blocks are asynchronous to the resizing decisions, ensuring a smooth transition from one size to another. This makes it possible to apply the framework even to write-back caches, a major limitation of previous proposals. The proposed framework is simple to implement and requires minimal hardware overhead (<1% of the target cache). Using cycle-accurate simulations and a wide range of applications, we evaluate our approach against previously proposed cache resizing schemes for various cache sizes and types. In all cases, our experimental findings show significant benefits in both power and performance.

Notes

When a structure is too small (below 64 bytes) for CACTI to directly model, we model larger structures with the same associativity and number of ports and approximate the power and timing using linear regression.
Note that incorrect lookups cannot happen because we maintain all the tag bits required by the smallest cache.
This is also the effect of selecting only one performance indicator (i.e., the number of misses) to guide the hybrid reconfiguration policy.
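
The second note, on keeping all the tag bits required by the smallest cache, can be illustrated with a little address arithmetic. The sketch below is not taken from the article. It assumes a set-associative cache with power-of-two geometry that is resized by changing the number of sets, and the function names, the 32-bit address width, and the example sizes are all hypothetical. Under those assumptions, shrinking the cache removes index bits, so the smallest configuration needs the widest tag. Storing that widest tag in every configuration means a lookup at any size compares enough address bits to identify a block unambiguously.

```python
def log2_exact(n: int) -> int:
    """Return log2(n) for a positive power of two, else raise ValueError."""
    if n <= 0 or n & (n - 1):
        raise ValueError(f"{n} is not a positive power of two")
    return n.bit_length() - 1


def tag_bits(address_bits: int, cache_bytes: int, block_bytes: int, ways: int) -> int:
    """Tag width needed by one cache configuration (hypothetical helper)."""
    sets = cache_bytes // (block_bytes * ways)
    offset_bits = log2_exact(block_bytes)   # selects a byte within a block
    index_bits = log2_exact(sets)           # selects a set
    return address_bits - index_bits - offset_bits


def stored_tag_bits(address_bits, sizes_bytes, block_bytes, ways):
    """Tag width to store so that every listed size can be looked up safely.

    Assumption: resizing changes only the number of sets, so the widest
    tag belongs to the smallest cache size in the list.
    """
    return max(tag_bits(address_bits, s, block_bytes, ways) for s in sizes_bytes)


if __name__ == "__main__":
    sizes = [64 * 1024, 32 * 1024, 16 * 1024, 8 * 1024]  # illustrative sizes
    for size in sizes:
        print(size // 1024, "KB ->", tag_bits(32, size, 64, 4), "tag bits")
    print("stored tag width:", stored_tag_bits(32, sizes, 64, 4))
```

With these illustrative parameters (64-byte blocks, 4 ways), the 64 KB configuration has 256 sets and needs an 18-bit tag, while the 8 KB configuration has 32 sets and needs 21 bits, so 21 bits would be stored per block. If a design resized by disabling ways instead of sets, the index width would not change and this extra tag storage would not arise.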

Author information

Authors and Affiliations

Industrial Systems Institute, Patras, Greece
Georgios Keramidas & Chrysovalantis Datsios

Corresponding author

Correspondence to Georgios Keramidas.

About this article

Cite this article

Keramidas, G., Datsios, C. Revisiting Cache Resizing. Int J Parallel Prog 43, 59–85 (2015). https://doi.org/10.1007/s10766-013-0291-4

Received: 19 January 2013
Accepted: 11 October 2013
Published: 25 October 2013
Issue Date: February 2015
DOI: https://doi.org/10.1007/s10766-013-0291-4
