Wasted dynamic power and correlation to instruction set architecture for CPU throttling

  • Abdullah A. Owahid
  • Eugene B. John


Reducing dynamic power consumption is a major goal in modern high-performance processor design. Throttling is a mechanism that reduces dynamic power at the cost of throughput. Instruction profiling can identify a set of instructions suitable for fine-grained throttling without significant performance degradation. In this paper, an Electronic Design Automation (EDA) flow is developed to process pipeline traces at an early design stage and identify bottlenecks. Using this flow, the work identifies a set of instructions suitable for fine-grained CPU throttling to reduce wasted dynamic power in the RISC-V architecture. A weight-based system is introduced to rank the instructions in the profile that cause the most stalls. The instructions causing the most stalls were observed to repeat across all benchmark programs, regardless of workload and type. The top 10 instruction profiles for each test suite identify probable throttling clock cycles for each pipeline stage, allowing wasted dynamic power to be reduced at minimal performance loss. These results are expected to help researchers reduce wasted dynamic power by modifying existing architectures and to apply throttling mechanisms effectively without significant performance degradation.
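The abstract does not define the weight-based ranking, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a hypothetical trace format of (mnemonic, pipeline stage, stall cycles) records and assumes each instruction's weight is its share of all stall cycles in the trace; the function name and the sample records are invented for illustration.

```python
from collections import defaultdict


def rank_stalling_instructions(trace, top_n=10):
    """Rank instructions by their share of total stall cycles.

    `trace` is an iterable of (mnemonic, stage, stall_cycles) tuples.
    This record layout and the weight definition are assumptions made
    for illustration; the paper's actual weighting is not given here.
    """
    stall_totals = defaultdict(int)
    for mnemonic, _stage, stall_cycles in trace:
        stall_totals[mnemonic] += stall_cycles

    all_stalls = sum(stall_totals.values())
    if all_stalls == 0:
        return []  # nothing stalled, so there is nothing to rank

    # Assumed weight: fraction of all stall cycles caused by the instruction.
    weights = {m: cycles / all_stalls for m, cycles in stall_totals.items()}

    # Highest weight first; ties are broken alphabetically for determinism.
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical trace records, not measured data.
sample_trace = [
    ("ld", "execute", 4),
    ("add", "decode", 0),
    ("ld", "execute", 6),
    ("div", "execute", 10),
]
top_two = rank_stalling_instructions(sample_trace, top_n=2)
```

Normalizing by total stall cycles keeps the weights comparable between benchmarks of different lengths, which is one plausible way to check whether the same instructions rank highly across programs.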


Instruction profiling; CPU throttling; Wasted dynamic power; Power and performance



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Electrical and Computer Engineering, The University of Texas at San Antonio, San Antonio, USA
  2. San Antonio, USA
