Storage-Efficient Data Prefetching for High Performance Computing

Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 164)

Abstract

Data prefetching is widely adopted in modern high performance processors to bridge the ever-increasing performance gap between processor and memory. Many prefetching techniques have been proposed that exploit patterns in the data access history stored in an on-chip hardware table. We demonstrate that the table size has a considerable impact on the performance of data prefetching. While a small table limits the effectiveness of prediction because it holds inadequate history, a large table is expensive to implement on-chip and has longer latency. It is therefore critical to find a storage-efficient data prefetching mechanism. We propose a novel Dynamic Signature Method (DSM) that stores addresses efficiently to reduce the storage demand of prefetching. We have carried out extensive simulation testing with a trace-driven simulator, CMP$im, and the SPEC CPU2006 benchmarks. Experimental results show that, with the same storage consumption, the new DSM-based prefetcher achieved a better performance improvement than existing prefetching approaches on more than half of the benchmarks.
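The effect of table size on a history-based prefetcher can be illustrated with a small sketch. This is not the DSM, whose details the abstract does not give; it is a generic PC-indexed stride table with LRU replacement, and the class name, the function `covered_accesses`, the synthetic trace, and all parameters are assumptions made purely for illustration.

```python
from collections import OrderedDict


class StridePrefetcher:
    """Illustrative PC-indexed stride table with LRU replacement (not DSM)."""

    def __init__(self, table_entries):
        self.table_entries = table_entries
        self.table = OrderedDict()  # pc -> (last_addr, last_stride)

    def access(self, pc, addr):
        """Record one access; return a predicted next address or None."""
        prediction = None
        if pc in self.table:
            last_addr, last_stride = self.table.pop(pc)
            stride = addr - last_addr
            # Predict only when the same non-zero stride repeats.
            if stride != 0 and stride == last_stride:
                prediction = addr + stride
            self.table[pc] = (addr, stride)  # re-insert as most recent
        else:
            if len(self.table) >= self.table_entries:
                self.table.popitem(last=False)  # evict least recently used
            self.table[pc] = (addr, 0)
        return prediction


def covered_accesses(table_entries, num_streams=8, length=50, stride=64):
    """Count accesses that the previous prediction for that stream got right.

    The trace is synthetic (an assumption): num_streams strided streams,
    interleaved round-robin, each issued by its own hypothetical PC.
    """
    prefetcher = StridePrefetcher(table_entries)
    pending = {}  # pc -> address predicted on that pc's previous access
    hits = 0
    for i in range(length):
        for pc in range(num_streams):
            addr = pc * 1_000_000 + i * stride
            if pending.get(pc) == addr:
                hits += 1
            pending[pc] = prefetcher.access(pc, addr)
    return hits


if __name__ == "__main__":
    for entries in (4, 8):
        print(entries, covered_accesses(entries))
```

On this synthetic trace, a table with 8 entries holds every stream and covers 376 of the 400 accesses, while a table with 4 entries evicts each stream's history before it is reused and covers none. Real workloads are far less regular, so the sketch only shows the direction of the effect, not its magnitude.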

Copyright information

© Springer Science+Business Media Dordrecht 2012

Authors and Affiliations

  1. Department of Computer Science, Texas Tech University, Lubbock, USA
  2. Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, USA
  3. Department of Computer Science, Illinois Institute of Technology, Chicago, USA