Approximate Cache Architectures

Approximate Circuits

Abstract

In this chapter, we explore the application of approximate computing techniques to caches and to the memory access portion of the processor pipeline. Because memory accesses contribute significantly to the latency and energy consumption of applications, they have long been the target of various optimizations. Large cache hierarchies are a mainstay of modern designs because they avoid the long latency and high energy of accessing DRAM on every load or store request. As data set sizes grow, however, building ever larger caches is not necessarily an effective use of silicon real estate. We present recent work that improves the effectiveness of cache storage and reduces the cost of memory accesses by exploiting the inherently noisy or imprecise data that many applications operate on. First, we consider work that selectively forgoes loading data from the caches and memory when the processor can make a reasonable estimate of the value that is needed. Next, we explore work that selectively determines which values to store in the cache through approximate deduplication of data; reducing how much data must be stored in the cache increases the effective cache capacity.
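
The two ideas named in the abstract can be illustrated with a minimal software sketch. It is not the chapter's hardware design: the class names, the averaging estimator, and the quantization-based `signature` function are all assumptions chosen for illustration, and eviction is not modeled.

```python
import math
from collections import deque
from typing import Deque, Dict, Optional, Tuple

Block = Tuple[float, ...]


class LoadValueApproximator:
    """Hypothetical estimator: averages recent values seen by each load (keyed by PC)."""

    def __init__(self, history: int = 4) -> None:
        self.history = history
        self.recent: Dict[int, Deque[float]] = {}

    def train(self, pc: int, value: float) -> None:
        # Record a value that was actually fetched for this load instruction.
        self.recent.setdefault(pc, deque(maxlen=self.history)).append(value)

    def estimate(self, pc: int) -> Optional[float]:
        # Return an estimate to use instead of fetching, or None if untrained.
        values = self.recent.get(pc)
        if not values:
            return None
        return sum(values) / len(values)


def signature(block: Block, step: float) -> Tuple[int, ...]:
    """Assumed similarity key: quantize each value to a bucket of width `step`."""
    return tuple(math.floor(v / step) for v in block)


class ApproxDedupCache:
    """Hypothetical cache in which blocks with equal signatures share one data entry."""

    def __init__(self, step: float) -> None:
        self.step = step
        self.tags: Dict[int, Tuple[int, ...]] = {}    # address -> signature
        self.data: Dict[Tuple[int, ...], Block] = {}  # signature -> stored block

    def store(self, addr: int, block: Block) -> None:
        sig = signature(block, self.step)
        self.tags[addr] = sig
        # Keep only the first block seen for a signature; later similar
        # blocks are deduplicated against it and their exact values are lost.
        self.data.setdefault(sig, block)

    def load(self, addr: int) -> Block:
        # May return a similar block rather than the one originally stored.
        return self.data[self.tags[addr]]

    def effective_capacity_gain(self) -> float:
        # Addresses tracked per data entry actually stored.
        return len(self.tags) / len(self.data)
```

In this sketch, storing `(1.0, 2.0)` and `(1.1, 2.2)` with `step=0.5` yields the same signature, so a later load of the second address returns the first block. That is the accuracy-for-capacity trade the abstract describes, and it is only acceptable for data that the application can tolerate being imprecise.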

Author information

Correspondence to Natalie Enright Jerger.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Jerger, N.E., Miguel, J.S. (2019). Approximate Cache Architectures. In: Reda, S., Shafique, M. (eds) Approximate Circuits. Springer, Cham. https://doi.org/10.1007/978-3-319-99322-5_20

  • DOI: https://doi.org/10.1007/978-3-319-99322-5_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-99321-8

  • Online ISBN: 978-3-319-99322-5

  • eBook Packages: Engineering (R0)
