Power-efficient Processor Architecture

  • Preeti Ranjan Panda
  • Aviral Shrivastava
  • B. V. N. Silpa
  • Krishnaiah Gummidipudi
Chapter

Abstract

Since the creation of the first processor/CPU in 1971, silicon technology has consistently allowed twice as many transistors to be packed onto the same die every 18 to 24 months [33]. Technology scaling has allowed faster and larger circuits to be implemented on silicon, permitting a sophisticated and powerful set of features to be integrated into the CPU. Figure 3.1 shows the evolution of processors from a 4-bit scalar datapath to a 64-bit superscalar datapath, along with their respective transistor counts. Processors have evolved not only in datapath width, but also through a wide variety of architectural features such as pipelining, floating-point support, on-chip memories, superscalar processing, out-of-order processing, speculative execution, multi-threading, and multicore CPUs.

Keywords

Function Unit, Register File, Program Counter, Execution Unit, Branch Prediction
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1.
  2. Alper, B., Stanley, S., David, B., Pradip, B., Peter, C., David, A.: An adaptive issue queue for reduced power at high performance. In: Power-Aware Computer Systems, pp. 25–39 (2001)
  3. Balasubramonian, R., Dwarkadas, S., Albonesi, D.H.: Reducing the complexity of the register file in dynamic superscalar processors. In: MICRO 34: Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture, pp. 237–248. IEEE Computer Society, Washington, DC, USA (2001)
  4. Baniasadi, A., Moshovos, A.: Instruction flow-based front-end throttling for power-aware high-performance processors. In: Low Power Electronics and Design, International Symposium on, 2001., pp. 16–21 (2001). DOI 10.1109/LPE.2001.945365
  5. Baniasadi, A., Moshovos, A.: Branch predictor prediction: A power-aware branch predictor for high-performance processors. Computer Design, International Conference on 0, 458 (2002). DOI http://doi.ieeecomputersociety.org/10.1109/ICCD.2002.1106813
  6. Brooks, D., Martonosi, M.: Dynamically exploiting narrow width operands to improve processor power and performance. In: High-Performance Computer Architecture, 1999. Proceedings. Fifth International Symposium On, pp. 13–22 (1999). DOI 10.1109/HPCA.1999.744314
  7. Brooks, D., Martonosi, M.: Value-based clock gating and operation packing: dynamic strategies for improving processor power and performance. ACM Trans. Comput. Syst. 18(2), 89–126 (2000). DOI http://doi.acm.org/10.1145/350853.350856
  8. Buyuktosunoglu, A., Y, T.K., Albonesi, D.H., Z, P.B.: Energy efficient co-adaptive instruction fetch and issue. In: ISCA ’03: Proceedings of the 30th Annual International Symposium on Computer Architecture, pp. 147–156. ACM Press (2003)
  9. Chatterjee, A., Nandakumar, M., Chen, I.: An investigation of the impact of technology scaling on power wasted as short-circuit current in low voltage static CMOS circuits. In: ISLPED ’96: Proceedings of the 1996 international symposium on Low power electronics and design, pp. 145–150. IEEE Press, Piscataway, NJ, USA (1996)
  10. Chaver, D., Piñuel, L., Prieto, M., Tirado, F., Huang, M.C.: Branch prediction on demand: an energy-efficient solution. In: ISLPED ’03: Proceedings of the 2003 international symposium on Low power electronics and design, pp. 390–395. ACM, New York, NY, USA (2003). DOI http://doi.acm.org/10.1145/871506.871603
  11. Correale Jr., A.: Overview of the power minimization techniques employed in the IBM PowerPC 4xx embedded controllers. In: ISLPED ’95: Proceedings of the 1995 international symposium on Low power design, pp. 75–80. ACM, New York, NY, USA (1995). DOI http://doi.acm.org/10.1145/224081.224095
  12. Cruz, J.L., González, A., Valero, M., Topham, N.P.: Multiple-banked register file architectures. SIGARCH Comput. Archit. News 28(2), 316–325 (2000). DOI http://doi.acm.org/10.1145/342001.339708
  13. Dropsho, S., Kursun, V., Albonesi, D.H., Dwarkadas, S., Friedman, E.G.: Managing static leakage energy in microprocessor functional units. In: MICRO 35: Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture, pp. 321–332. IEEE Computer Society Press, Los Alamitos, CA, USA (2002)
  14. Farkas, K.I., Chow, P., Jouppi, N.P.: Register file design considerations in dynamically scheduled processors. In: HPCA ’96: Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture, p. 40. IEEE Computer Society, Washington, DC, USA (1996)
  15. Farkas, K.I., Chow, P., Jouppi, N.P., Vranesic, Z.: The multicluster architecture: reducing cycle time through partitioning. In: MICRO 30: Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, pp. 149–159. IEEE Computer Society, Washington, DC, USA (1997)
  16. Folegnani, D., Gonzalez, A.: Energy-effective issue logic. In: Computer Architecture, 2001. Proceedings. 28th Annual International Symposium on, pp. 230–239 (2001). DOI 10.1109/ISCA.2001.937452
  17. Ghose, K., Kamble, M.: Reducing power in superscalar processor caches using subbanking, multiple line buffers and bit-line segmentation. In: Low Power Electronics and Design, 1999. Proceedings. 1999 International Symposium on, pp. 70–75 (1999)
  18. Ghose, K., Ponomarev, D., Kucuk, G., Flinders, A., Kogge, P.M.: Exploiting bit-slice inactivities for reducing energy requirements of superscalar processors. In: Kool Chips Workshop, MICRO-33 (2000)
  19. Gonzalez, R.E.: Low-power processor design. Tech. rep., Stanford University, Stanford, CA, USA (1997)
  20. Hennessy, J.L., Patterson, D.A.: Computer architecture: a quantitative approach. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (2002)
  21. Hiraki, M., Bajwa, R., Kojima, H., Gorny, D., Nitta, K., Shri, A.: Stage-skip pipeline: a low power processor architecture using a decoded instruction buffer. In: Low Power Electronics and Design, 1996., International Symposium on, pp. 353–358 (1996). DOI 10.1109/LPE.1996.547538
  22. Hu, Z., Buyuktosunoglu, A., Srinivasan, V., Zyuban, V., Jacobson, H., Bose, P.: Microarchitectural techniques for power gating of execution units. In: ISLPED ’04: Proceedings of the 2004 international symposium on Low power electronics and design, pp. 32–37. ACM, New York, NY, USA (2004). DOI http://doi.acm.org/10.1145/1013235.1013249
  23. Jouppi, N.P., Wall, D.W.: Available instruction-level parallelism for superscalar and superpipelined machines. In: ASPLOS-III: Proceedings of the third international conference on Architectural support for programming languages and operating systems, pp. 272–282. ACM, New York, NY, USA (1989). DOI http://doi.acm.org/10.1145/70082.68207
  24. Kim, N.S., Mudge, T.: The microarchitecture of a low power register file. In: ISLPED ’03: Proceedings of the 2003 international symposium on Low power electronics and design, pp. 384–389. ACM, New York, NY, USA (2003). DOI http://doi.acm.org/10.1145/871506.871602
  25. Kondo, M., Nakamura, H.: A small, fast and low-power register file by bit-partitioning. In: HPCA ’05: Proceedings of the 11th International Symposium on High-Performance Computer Architecture, pp. 40–49. IEEE Computer Society, Washington, DC, USA (2005). DOI http://dx.doi.org/10.1109/HPCA.2005.3
  26. Kucuk, G., Ergin, O., Ponomarev, D., Ghose, K.: Distributed reorder buffer schemes for low power. In: Computer Design, 2003. Proceedings. 21st International Conference on, pp. 364–370 (2003). DOI 10.1109/ICCD.2003.1240920
  27. Kucuk, G., Ghose, K., Ponomarev, D.V., Kogge, P.M.: Energy-efficient instruction dispatch buffer design for superscalar processors. In: IEEE/ACM International Symposium on Low Power Electronics and Design, pp. 237–242 (2001)
  28. Kucuk, G., Ponomarev, D., Ghose, K.: Low-complexity reorder buffer architecture. In: ICS ’02: Proceedings of the 16th international conference on Supercomputing, pp. 57–66. ACM, New York, NY, USA (2002). DOI http://doi.acm.org/10.1145/514191.514202
  29. Kursun, V., Friedman, E.G.: Low swing dual threshold voltage domino logic. In: GLSVLSI ’02: Proceedings of the 12th ACM Great Lakes symposium on VLSI, pp. 47–52. ACM, New York, NY, USA (2002). DOI http://doi.acm.org/10.1145/505306.505317
  30. Li, H., Bhunia, S., Chen, Y., Vijaykumar, T., Roy, K.: Deterministic clock gating for microprocessor power reduction. In: High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. The Ninth International Symposium on, pp. 113–122 (2003). DOI 10.1109/HPCA.2003.1183529
  31. Magklis, G., Scott, M.L., Semeraro, G., Albonesi, D.H., Dropsho, S.: Profile-based dynamic voltage and frequency scaling for a multiple clock domain microprocessor. In: ISCA ’03: Proceedings of the 30th annual international symposium on Computer architecture, pp. 14–27. ACM, New York, NY, USA (2003). DOI http://doi.acm.org/10.1145/859618.859621
  32. Manne, S., Klauser, A., Grunwald, D.: Pipeline gating: speculation control for energy reduction. In: Proceedings of the 25th Annual International Symposium on Computer Architecture, pp. 132–141 (1998)
  33. Moore, G.: Cramming more components onto integrated circuits. Electronics Magazine 38(8) (1965)
  34. Munch, M., Wurth, B., Mehra, R., Sproch, J., Wehn, N.: Automating RT-level operand isolation to minimize power consumption in datapaths. In: Design, Automation and Test in Europe Conference and Exhibition 2000. Proceedings, pp. 624–631 (2000). DOI 10.1109/DATE.2000.840850
  35. Palacharla, S., Jouppi, N.P., Smith, J.E.: Complexity-effective superscalar processors. SIGARCH Comput. Archit. News 25(2), 206–218 (1997). DOI http://doi.acm.org/10.1145/384286.264201
  36. Parikh, D., Skadron, K., Zhang, Y., Stan, M.: Power-aware branch prediction: Characterization and design. IEEE Transactions on Computers 53, 168–186 (2004). DOI http://doi.ieeecomputersociety.org/10.1109/TC.2004.1261827
  37. Ponomarev, D., Kucuk, G., Ghose, K.: Reducing power requirements of instruction scheduling through dynamic allocation of multiple datapath resources. In: Microarchitecture, 2001. MICRO-34. Proceedings. 34th ACM/IEEE International Symposium on, pp. 90–101 (2001). DOI 10.1109/MICRO.2001.991108
  38. Ponomarev, D., Kucuk, G., Ghose, K.: Energy-efficient design of the reorder buffer. In: PATMOS ’02: Proceedings of the 12th International Workshop on Integrated Circuit Design. Power and Timing Modeling, Optimization and Simulation, pp. 289–299. Springer-Verlag, London, UK (2002)
  39. Rele, S., Pande, S., Önder, S., Gupta, R.: Optimizing static power dissipation by functional units in superscalar processors. In: CC ’02: Proceedings of the 11th International Conference on Compiler Construction, pp. 261–275. Springer-Verlag, London, UK (2002)
  40. Ross, P.: Why CPU frequency stalled. Spectrum, IEEE 45(4), 72–72 (2008). DOI 10.1109/MSPEC.2008.4476447
  41. Tiwari, V., Malik, S., Ashar, P.: Guarded evaluation: pushing power management to logic synthesis/design. In: ISLPED ’95: Proceedings of the 1995 international symposium on Low power design, pp. 221–226. ACM, New York, NY, USA (1995). DOI http://doi.acm.org/10.1145/224081.224120
  42. Tiwari, V., Singh, D., Rajgopal, S., Mehta, G., Patel, R., Baez, F.: Reducing power in high-performance microprocessors. In: Design Automation Conference, 1998. Proceedings, pp. 732–737 (1998)
  43. Yang, C., Orailoglu, A.: Power efficient branch prediction through early identification of branch addresses. In: CASES ’06: Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems, pp. 169–178. ACM, New York, NY, USA (2006). DOI http://doi.acm.org/10.1145/1176760.1176782
  44. Zalamea, J., Llosa, J., Ayguadé, E., Valero, M.: Two-level hierarchical register file organization for VLIW processors. In: MICRO 33: Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture, pp. 137–146. ACM, New York, NY, USA (2000). DOI http://doi.acm.org/10.1145/360128.360143
  45. Zyuban, V., Kogge, P.: Optimization of high-performance superscalar architectures for energy efficiency. In: Low Power Electronics and Design, 2000. ISLPED ’00. Proceedings of the 2000 International Symposium on, pp. 84–89 (2000)

Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Preeti Ranjan Panda (1)
  • Aviral Shrivastava (2)
  • B. V. N. Silpa (1)
  • Krishnaiah Gummidipudi (1)

  1. Department of Computer Science and Engineering, Indian Institute of Technology, New Delhi, India
  2. Department of Computer Science and Engineering, Arizona State University, Tempe, USA