A Machine Learning Approach for a Scalable, Energy-Efficient Utility-Based Cache Partitioning

  • Conference paper
High Performance Computing (ISC High Performance 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9137)

Abstract

In multi- and many-core processors, a shared Last Level Cache (LLC) is used to alleviate the performance problems caused by long-latency memory instructions. However, an unmanaged LLC can become largely ineffective when the running threads have conflicting interests: at one extreme, a thread may benefit from every portion of the cache, while at the other, a thread may simply thrash the whole LLC. Recently, a variety of way-partitioning mechanisms have been introduced to improve cache performance, and almost all of these studies adopt the Utility-based Cache Partitioning (UCP) algorithm as their allocation policy. Although the UCP look-ahead algorithm provides a better utility measure than its greedy counterpart, it requires very complex hardware circuitry and dissipates a considerable amount of energy at the end of each decision period. In this study, we propose an offline supervised machine learning algorithm that replaces the UCP look-ahead circuitry with one of almost negligible hardware and energy cost. Depending on the cache and processor configuration, our thorough analysis and simulation results show that the proposed mechanism reduces the overall transistor count by up to 5 % and the overall processor energy by up to 5 % without introducing any performance penalty.
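
For background, the UCP look-ahead pass whose circuitry the paper replaces can be summarized as a small search over per-core utility (hit-count) curves. The Python sketch below is an illustrative rendering of that published allocation rule, not code from this paper; all names (lookahead_partition, hit_curves) are hypothetical.

# Illustrative sketch (not from the paper) of the UCP look-ahead way
# allocation that the proposed machine learning model replaces.
# All names (lookahead_partition, hit_curves, ...) are hypothetical.

def lookahead_partition(hit_curves, total_ways):
    """Distribute `total_ways` LLC ways among cores.

    hit_curves[i][w] is the number of LLC hits core i would obtain with
    w ways (w = 0 .. total_ways), as sampled by UMON-style shadow tags.
    The look-ahead rule repeatedly grants the block of ways with the
    highest marginal utility (extra hits per extra way).
    """
    n_cores = len(hit_curves)
    alloc = [0] * n_cores      # ways granted so far to each core
    balance = total_ways       # ways still unassigned

    while balance > 0:
        best_core, best_blk, best_mu = 0, 1, -1.0
        for i in range(n_cores):
            for k in range(1, balance + 1):
                gain = hit_curves[i][alloc[i] + k] - hit_curves[i][alloc[i]]
                mu = gain / k  # marginal utility of k more ways for core i
                if mu > best_mu:
                    best_core, best_blk, best_mu = i, k, mu
        alloc[best_core] += best_blk
        balance -= best_blk
    return alloc


# Example: two cores sharing an 8-way LLC.  Core 0 saturates after a few
# ways, core 1 keeps benefiting, so it receives the larger share.
curves = [
    [0, 90, 95, 96, 96, 96, 96, 96, 96],
    [0, 10, 25, 40, 55, 70, 80, 88, 95],
]
print(lookahead_partition(curves, 8))  # -> [1, 7]

The cost the paper targets is exactly this per-period search and the arithmetic it implies; the proposed offline-trained model produces an allocation directly from the same counters instead.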

References

  1. Qureshi, M.K., Patt, Y.N.: Utility-based cache partitioning: a low-overhead, high-performance, runtime mechanism to partition shared caches. In: Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 423–432. IEEE Computer Society, Washington, DC (2006)

  2. Xie, Y., Loh, G.H.: PIPP: promotion/insertion pseudo-partitioning of multi-core shared caches. In: SIGARCH Computer Architecture News, pp. 174–183. ACM, New York (2009)

  3. Qureshi, M.K., Jaleel, A., Patt, Y.N., Steely, S.C., Emer, J.: Adaptive insertion policies for high performance caching. In: Proceedings of the 34th Annual International Symposium on Computer Architecture, pp. 381–391. ACM, New York (2007)

  4. Jaleel, A., Hasenplaugh, W., Qureshi, M., Sebot, J., Steely, Jr., S., Emer, J.: Adaptive insertion policies for managing shared caches. In: Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, pp. 208–219. ACM, New York (2008)

  5. Sanchez, D., Kozyrakis, C.: Vantage: scalable and efficient fine-grain cache partitioning. In: SIGARCH Computer Architecture News, pp. 57–68. ACM, New York (2011)

  6. Wang, R., Chen, L.: Futility scaling: high-associativity cache partitioning. In: 47th IEEE/ACM International Symposium on Microarchitecture (MICRO) (2014)

  7. Choi, S., Yeung, D.: Learning-based SMT processor resource distribution via hill-climbing. In: SIGARCH Computer Architecture News, pp. 239–251. ACM, New York (2006)

  8. Bitirgen, R., Ipek, E., Martinez, J.F.: Coordinated management of multiple interacting resources in chip multiprocessors: a machine learning approach. In: Proceedings of the 41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 41), pp. 318–329. IEEE, Computer Society, Washington DC (2008)

  9. Macsim simulator. http://code.google.com/p/macsim/

  10. Henning, J.: SPEC CPU2006 benchmark descriptions. ACM SIGARCH Comput. Archit. News 34(4), 1–17 (2006)

  11. Hamerly, G., Perelman, E., Lau, J., Calder, B.: SimPoint 3.0: faster and more flexible program phase analysis. J. Instr. Level Parallelism 7, 1–28 (2005)

  12. Muralimanohar, N., Balasubramonian, R., Jouppi, N.: Optimizing NUCA organizations and wiring alternatives for large caches with CACTI 6.0. In: Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 40), pp. 3–14. IEEE Computer Society, Washington, DC (2007)

  13. Tran, A.T., Baas, B.M.: Design of an energy-efficient 32-bit adder operating at subthreshold voltages in 45-nm CMOS. In: Third International Conference on Communications and Electronics (ICCE), pp. 87–91 (2010)

  14. Mehmood, N., Hansson, M., Alvandpour, A.: An energy-efficient 32-bit multiplier architecture in 90-nm CMOS. In: IEEE 24th Norchip Conference, pp. 35–38 (2006)

  15. Pham, T.N., Swartzlander, E.E.: Design of Radix 4 SRT dividers for single precision DSP in deep submicron CMOS technology. In: IEEE International Symposium on Signal Processing and Information Technology, pp. 236–241 (2006)

  16. Folegnani, D., Gonzalez, A.: Energy-effective issue logic. In: IEEE International Symposium on Computer Architecture, pp. 230–239 (2001)

Acknowledgement

This work is supported by the Scientific and Technical Research Council of Turkey (TUBITAK) under the Wise-Cache Project, Grant No. 114E119.

Author information

Corresponding author

Correspondence to Gurhan Kucuk.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Guney, I.A., Yildiz, A., Bayindir, I.U., Serdaroglu, K.C., Bayik, U., Kucuk, G. (2015). A Machine Learning Approach for a Scalable, Energy-Efficient Utility-Based Cache Partitioning. In: Kunkel, J., Ludwig, T. (eds) High Performance Computing. ISC High Performance 2015. Lecture Notes in Computer Science(), vol 9137. Springer, Cham. https://doi.org/10.1007/978-3-319-20119-1_29

  • DOI: https://doi.org/10.1007/978-3-319-20119-1_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-20118-4

  • Online ISBN: 978-3-319-20119-1

  • eBook Packages: Computer Science, Computer Science (R0)
