International Journal of Parallel Programming, Volume 31, Issue 6, pp 451–467

Power-Aware Compilation for Register File Energy Reduction

  • José L. Ayala
  • Alexander Veidenbaum
  • Marisa López-Vallejo


Most power reduction techniques have focused on gating the clock to unused functional units to minimize static power consumption, while system-level optimizations have been used to deal with dynamic power consumption. Once these techniques are applied, register file power consumption becomes a dominant factor in the processor. This paper proposes a compiler-driven, power-aware reconfiguration mechanism for the register file. The mechanism achieves optimal usage of the register file in terms of size and puts unused registers into a low-power state. Total energy consumption in the register file is reduced by 65% with no appreciable performance penalty for MiBench benchmarks on an embedded processor. The effect of reconfiguration granularity on energy savings is also analyzed, and the compiler approach used to optimize energy results is presented.
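The interplay between register usage and reconfiguration granularity can be illustrated with a toy model. This is only a sketch under stated assumptions, not the paper's compiler pass or energy model: the `Region` type, the functions `active_banks` and `relative_energy`, the 16-register file, and the per-register costs (1 unit per cycle when active, 0.1 when in the low-power state) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A code region with the registers it uses and its run time (hypothetical)."""
    name: str
    used_registers: frozenset
    cycles: int


def active_banks(used_registers, num_registers, granularity):
    """Return the indices of banks that must stay powered for a region.

    The register file is split into banks of `granularity` registers;
    a bank stays active if the region uses any register inside it.
    """
    if granularity <= 0 or num_registers % granularity != 0:
        raise ValueError("granularity must be positive and divide the register count")
    for r in used_registers:
        if not 0 <= r < num_registers:
            raise ValueError(f"register {r} out of range")
    return {r // granularity for r in used_registers}


def relative_energy(regions, num_registers, granularity, low_power_cost=0.1):
    """Energy relative to an always-on register file (1.0 means no saving).

    Assumed cost model: an active register costs 1 unit per cycle and a
    register in the low-power state costs `low_power_cost` units per cycle.
    """
    total = 0.0
    baseline = 0.0
    for region in regions:
        banks = active_banks(region.used_registers, num_registers, granularity)
        active = len(banks) * granularity
        idle = num_registers - active
        total += region.cycles * (active + idle * low_power_cost)
        baseline += region.cycles * num_registers
    return total / baseline


if __name__ == "__main__":
    # Illustrative program: a short setup phase and a long loop using few registers.
    program = [
        Region("init", frozenset({0, 1, 2, 3, 4, 5}), cycles=100),
        Region("loop", frozenset({0, 1}), cycles=900),
    ]
    for g in (1, 4, 16):
        print(f"granularity={g:2d}  relative energy={relative_energy(program, 16, g):.3f}")
```

In this toy model, finer granularity lets more unused registers enter the low-power state, while a single 16-register bank saves nothing. The numbers it produces depend entirely on the assumed costs and say nothing about the 65% figure reported above.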

Keywords: register file management, compiler support, energy aware





Copyright information

© Plenum Publishing Corporation 2003

Authors and Affiliations

  • José L. Ayala (1)
  • Alexander Veidenbaum (2)
  • Marisa López-Vallejo (1)

  1. Departamento de Ingeniería Electrónica, Universidad Politécnica de Madrid, E.T.S.I. Telecomunicación, Ciudad Universitaria s/n, Madrid, Spain
  2. Center for Embedded Computer Systems, University of California, Irvine
