Abstract
Data prefetching has been studied intensively with a focus on performance improvement; however, its energy aspect is relatively unknown. Our experiments show that although software prefetching tends to be more energy efficient, hardware prefetching outperforms software prefetching in terms of performance on most applications. This paper proposes several techniques to make hardware-based data prefetching power-aware. The proposed techniques include three compiler-based approaches that make the prefetch predictor more power efficient. The compiler identifies memory access patterns in order to selectively apply different prefetching schemes depending on the predicted patterns and to filter out unnecessary prefetches. We also propose a hardware-based filtering technique to further reduce the energy overhead due to prefetching in the L1 cache. Our experiments show that the proposed techniques reduce the prefetching-related energy overhead by close to 40% without reducing its performance benefits.
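The general idea of compiler-guided selection plus hardware filtering can be sketched as a toy model. This is only an illustration under stated assumptions, not the mechanism evaluated in the paper: the `Hint` categories, the 32-byte line size, the 8-entry LRU `FilterBuffer`, and the `Prefetcher` class are all hypothetical names and parameters introduced here. Each load carries a compiler-supplied hint, only the matching predictor is consulted, and a small buffer of recently prefetched lines suppresses repeat requests before they would reach the L1 cache.

```python
from collections import OrderedDict
from enum import Enum


class Hint(Enum):
    """Hypothetical compiler-provided access-pattern hints."""
    NONE = 0     # scalar or unpredictable access: do not prefetch
    STRIDE = 1   # array-style access: use a stride predictor
    POINTER = 2  # linked-structure access: prefetch the loaded pointer


LINE = 32  # assumed cache line size in bytes


class FilterBuffer:
    """Small LRU buffer of recently prefetched line addresses (assumed design)."""

    def __init__(self, entries=8):
        self.entries = entries
        self.lines = OrderedDict()

    def should_issue(self, line_addr):
        if line_addr in self.lines:
            # Recently prefetched: refresh its LRU position and suppress it.
            self.lines.move_to_end(line_addr)
            return False
        self.lines[line_addr] = True
        if len(self.lines) > self.entries:
            self.lines.popitem(last=False)  # evict the least recently used line
        return True


class Prefetcher:
    def __init__(self):
        self.last_addr = {}  # last address seen per load PC (stride detection)
        self.filter = FilterBuffer()
        self.issued = []     # line addresses actually sent to the cache
        self.filtered = 0    # prefetches suppressed by the filter buffer

    def on_load(self, pc, addr, hint, loaded_value=None):
        if hint is Hint.NONE:
            return  # the compiler hint says prefetching is not useful here
        if hint is Hint.STRIDE:
            prev = self.last_addr.get(pc)
            self.last_addr[pc] = addr
            if prev is None or addr == prev:
                return  # no stride established yet
            target = addr + (addr - prev)
        else:  # Hint.POINTER
            if not loaded_value:
                return  # null pointer: nothing to prefetch
            target = loaded_value
        line = target // LINE
        if self.filter.should_issue(line):
            self.issued.append(line)
        else:
            self.filtered += 1


p = Prefetcher()
p.on_load(0x10, 1000, Hint.NONE)
for a in (0, 8, 16, 24):
    p.on_load(0x20, a, Hint.STRIDE)
p.on_load(0x30, 500, Hint.POINTER, loaded_value=4096)
```

In this toy trace the unhinted load triggers nothing, and the second stride prediction falls in a line that was just prefetched, so the filter drops it. Energy is not modeled; the sketch only counts how many prefetch requests are issued versus suppressed.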
An erratum to this chapter can be found at http://dx.doi.org/10.1007/11574859_13 .
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Guo, Y., Chheda, S., Koren, I., Krishna, C.M., Moritz, C.A. (2005). Energy-Aware Data Prefetching for General-Purpose Programs. In: Falsafi, B., VijayKumar, T.N. (eds) Power-Aware Computer Systems. PACS 2004. Lecture Notes in Computer Science, vol 3471. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11574859_6
DOI: https://doi.org/10.1007/11574859_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29790-1
Online ISBN: 978-3-540-31485-1
eBook Packages: Computer Science, Computer Science (R0)