Exploring the Prefetcher/Memory Controller Design Space: An Opportunistic Prefetch Scheduling Strategy

  • Marius Grannaes
  • Magnus Jahre
  • Lasse Natvig
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6566)

Abstract

Prefetching is a well-known technique for bridging the memory gap. By predicting future memory references, the prefetcher can fetch data from main memory and insert it into the cache such that overall performance is increased. Modern memory controllers reorder memory requests to exploit the 3D structure of modern DRAM interfaces. In particular, prioritizing memory requests that use open pages increases throughput significantly. In this work, we investigate the prefetcher/memory controller design space along three dimensions: prefetching heuristic, prefetch scheduling strategy, and available memory bandwidth. Specifically, we evaluate 5 different prefetchers and 6 prefetch scheduling strategies. Through this extensive investigation, we observed that prior prefetch scheduling strategies often cause memory bus contention in bandwidth-constrained CMPs, which in turn causes performance regressions. To avoid this problem, we propose a novel prefetch scheduling heuristic called Opportunistic Prefetch Scheduling that selectively prioritizes prefetches to open DRAM pages such that performance regressions are minimized. Opportunistic prefetch scheduling reduces performance regressions by 6.7X and 5.2X, while improving performance by 17% and 20% for sequential and scheduled region prefetching, compared to the direct scheduling strategy.
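The abstract gives only a high-level description of the scheduling idea. As a rough illustration of "prioritize prefetches to open DRAM pages", here is a minimal, hypothetical Python sketch of a single-bank scheduler that services row-hit demand requests first, issues prefetches only when they hit the currently open row, and otherwise falls back to the oldest demand request. The class and method names, the single-bank model, and the exact priority ordering between row-hit prefetches and row-miss demands are assumptions made for illustration, not the paper's actual mechanism.

```python
from collections import deque


class Request:
    """A memory request mapped to a DRAM row (page)."""
    def __init__(self, addr, row, is_prefetch=False):
        self.addr = addr
        self.row = row
        self.is_prefetch = is_prefetch


class OpportunisticScheduler:
    """Toy single-bank open-page scheduler: prefetches are issued only when
    they hit the currently open row, so speculative traffic never forces an
    extra precharge/activate ahead of demand work (illustrative sketch)."""

    def __init__(self):
        self.open_row = None
        self.demand_q = deque()
        self.prefetch_q = deque()

    def enqueue(self, req):
        (self.prefetch_q if req.is_prefetch else self.demand_q).append(req)

    def select(self):
        # 1. Row-hit demand requests first: cheapest and latency-critical.
        for req in list(self.demand_q):
            if req.row == self.open_row:
                self.demand_q.remove(req)
                return req
        # 2. Row-hit prefetches next: they ride on the already-open page.
        #    (Ordering them ahead of row-miss demands is an assumption.)
        for req in list(self.prefetch_q):
            if req.row == self.open_row:
                self.prefetch_q.remove(req)
                return req
        # 3. Otherwise service the oldest demand request and pay the row
        #    activation it needs.
        if self.demand_q:
            return self.demand_q.popleft()
        # 4. Row-miss prefetches are simply held back; a real controller
        #    might drop or defer them under bandwidth pressure.
        return None

    def issue(self, req):
        self.open_row = req.row  # open-page policy: the row stays open


if __name__ == "__main__":
    sched = OpportunisticScheduler()
    sched.enqueue(Request(0x100, row=1))                    # demand, row 1
    sched.enqueue(Request(0x140, row=1, is_prefetch=True))  # prefetch, row 1
    sched.enqueue(Request(0x900, row=7, is_prefetch=True))  # prefetch, row 7
    while (req := sched.select()) is not None:
        sched.issue(req)
        print(hex(req.addr), "prefetch" if req.is_prefetch else "demand")
```

Holding back row-miss prefetches is what keeps speculative traffic from consuming bandwidth and row activations needed by demand requests, which is the contention effect the abstract identifies as the source of performance regressions under prior scheduling strategies.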



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Marius Grannaes (1)
  • Magnus Jahre (1)
  • Lasse Natvig (1)

  1. Norwegian University of Science and Technology, Norway
