PARROT: Power Awareness Through Selective Dynamically Optimized Traces

Rosner, Roni; Almog, Yoav; Moffie, Micha; Schwartz, Naftali; Mendelson, Avi

doi:10.1007/978-3-540-28641-7_14

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3164)

Included in the following conference series:

International Workshop on Power-Aware Computer Systems

Abstract

We present the PARROT concept, which targets both higher performance and power awareness. The PARROT microarchitectural framework integrates trace caching, dynamic optimizations and pipeline decoupling. We take a gradual and selective approach, applying complex mechanisms only to the most frequently used traces, so as to maximize the performance gain under any given power constraint and thereby attain finer control over the tradeoff between performance and power awareness.
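The selective idea described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not the mechanism from the paper: the two thresholds, the stage names ("cold", "cached", "optimized") and the class `SelectiveTraceFilter` are all hypothetical, chosen to show how execution counts could gate progressively more expensive treatment of a trace.

```python
from collections import Counter

# Hypothetical thresholds (not taken from the paper): how many times a trace
# must execute before it is promoted to the next, more costly stage.
HOT_THRESHOLD = 8       # promote to the trace cache
BLAZING_THRESHOLD = 64  # promote to dynamic optimization


class SelectiveTraceFilter:
    """Toy model of gradual, frequency-based promotion of traces."""

    def __init__(self, hot=HOT_THRESHOLD, blazing=BLAZING_THRESHOLD):
        self.hot = hot
        self.blazing = blazing
        self.counts = Counter()  # executions observed per trace id

    def stage(self, trace_id):
        """Return the stage a trace currently qualifies for."""
        n = self.counts[trace_id]
        if n >= self.blazing:
            return "optimized"
        if n >= self.hot:
            return "cached"
        return "cold"

    def record(self, trace_id):
        """Count one execution of a trace and return its resulting stage."""
        self.counts[trace_id] += 1
        return self.stage(trace_id)


if __name__ == "__main__":
    f = SelectiveTraceFilter(hot=2, blazing=4)
    for i in range(5):
        print(i + 1, f.record("trace_A"))
```

In this toy model, rarely executed traces never pay for caching or optimization, which is the intuition behind spending complex mechanisms only where they are used most.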

We show that the PARROT microarchitecture delivers performance increases comparable to those available through conventional doubling of execution resources (average 16% IPC improvement). This improvement comes from better utilization of all available resources through the combination of a trace cache and selective trace optimization. By contrast, the performance advantage of a trace cache alone is limited to wide-machine configurations. No less critical, however, is power awareness. The PARROT microarchitecture delivers the performance increase at a comparable energy level, whereas the conventional path to higher performance consumes an average of 70% more energy. Meanwhile, for those designs which can tolerate a higher power budget, PARROT gracefully scales up to use additional execution resources in a uniformly efficient manner. In particular, a PARROT-style doubly-wide machine delivers an average 45% IPC improvement while actually improving the Cubic-MIPS-per-Watt power awareness metric by over 50%.
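To make the last figure concrete, the following sketch works through the arithmetic of a Cubic-MIPS-per-Watt style metric, taken here to be MIPS³ / Watt. It assumes, and this is an assumption rather than something stated in the abstract, that both machines run at the same clock frequency, so that MIPS scales in proportion to IPC. The function names are illustrative only.

```python
def cmpw_ratio(perf_ratio, power_ratio):
    """Relative change in MIPS^3 / Watt given relative performance and power.

    perf_ratio  -- new MIPS divided by baseline MIPS (assumed equal to the
                   IPC ratio, i.e. equal clock frequency on both machines)
    power_ratio -- new power divided by baseline power
    """
    return perf_ratio ** 3 / power_ratio


def max_power_ratio(perf_ratio, target_cmpw_ratio):
    """Largest power increase that still meets a target MIPS^3 / Watt gain."""
    return perf_ratio ** 3 / target_cmpw_ratio


if __name__ == "__main__":
    ipc_gain = 1.45   # 45% IPC improvement quoted in the abstract
    cmpw_gain = 1.50  # "over 50%" improvement, taken at its lower bound
    print(f"MIPS^3 grows by a factor of {ipc_gain ** 3:.3f}")
    print(f"power may grow by up to {max_power_ratio(ipc_gain, cmpw_gain):.3f}x")
```

Under these assumptions, a 45% performance gain multiplies MIPS³ by roughly 3.05, so a 50% gain in the metric is consistent with power roughly doubling. This is only a consistency check on the quoted figures, not a power number reported by the authors.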

Author information

Authors and Affiliations

Microprocessor Research, Intel Labs, Haifa, Israel
Roni Rosner, Yoav Almog, Micha Moffie, Naftali Schwartz & Avi Mendelson

Editor information

Editors and Affiliations

Electrical and Computer Engineering, Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, 15213, Pittsburgh, PA, USA
Babak Falsafi
ECE, Purdue University, P.O. Box, 47907, IN, USA
T. N. VijayKumar

Cite this paper

Rosner, R., Almog, Y., Moffie, M., Schwartz, N., Mendelson, A. (2005). PARROT: Power Awareness Through Selective Dynamically Optimized Traces. In: Falsafi, B., VijayKumar, T.N. (eds) Power-Aware Computer Systems. PACS 2003. Lecture Notes in Computer Science, vol 3164. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28641-7_14

DOI: https://doi.org/10.1007/978-3-540-28641-7_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24031-0
Online ISBN: 978-3-540-28641-7
eBook Packages: Computer Science, Computer Science (R0)
