An Efficient Simulation Algorithm for Cache of Random Replacement Policy

  • Shuchang Zhou
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6289)


Caches are employed to exploit the phenomenon of locality in many modern computer systems. One way of evaluating the impact of a cache is to run a simulator on traces collected from realistic workloads. However, for an important category of caches, namely those with a random replacement policy, each round of naïve simulation yields only one of many possible outcomes, so many rounds are required to capture the cache behavior, for example to determine the hit probability of a particular cache reference. In this paper, we present an algorithm that efficiently approximates the hit probability in linear time with moderate space in a single round. Our algorithm is applicable to realistic processor cache parameters, where the associativity is typically low, and extends to caches of large associativity. Experiments show that in one round, our algorithm collects information that would previously require up to dozens of rounds of simulation.
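To make the cost of the naïve approach concrete, the following is a minimal sketch of the multi-round baseline that the abstract contrasts against. It is not the algorithm presented in the paper. The function names, the use of block addresses as trace entries, and the modulo set-index mapping are all assumptions made for illustration.

```python
import random


def simulate_once(trace, num_sets, ways, rng):
    """Run one round of naive simulation of a random-replacement cache.

    `trace` is a sequence of block addresses (an assumption: one entry per
    cache-line-sized block). Returns one True/False hit flag per reference.
    """
    sets = [[] for _ in range(num_sets)]
    hits = []
    for addr in trace:
        lines = sets[addr % num_sets]  # assumed set-index mapping
        if addr in lines:
            hits.append(True)
        else:
            hits.append(False)
            if len(lines) < ways:
                lines.append(addr)  # fill an empty way first
            else:
                lines[rng.randrange(ways)] = addr  # evict a random victim
    return hits


def estimate_hit_probability(trace, num_sets, ways, rounds, seed=0):
    """Average the hit flags of many independent rounds, per reference."""
    rng = random.Random(seed)
    counts = [0] * len(trace)
    for _ in range(rounds):
        for i, hit in enumerate(simulate_once(trace, num_sets, ways, rng)):
            counts[i] += hit
    return [c / rounds for c in counts]


if __name__ == "__main__":
    # Fully associative 2-way cache: the final reference to block 0 hits
    # only if block 1, not block 0, was evicted when block 2 arrived.
    print(estimate_hit_probability([0, 1, 2, 0], num_sets=1, ways=2, rounds=10000))
```

In this small trace the last reference has a true hit probability of 1/2, but any single round reports either a hit or a miss. Only the average over many rounds approaches the probability, which is the overhead a single-round approximation aims to avoid.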


Simulation, Cache memories, Stochastic approximation



Copyright information

© IFIP International Federation for Information Processing 2010

Authors and Affiliations

  • Shuchang Zhou
    • Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
