Performance Evaluation of Compiler Controlled Power Saving Scheme

  • Jun Shirako
  • Munehiro Yoshida
  • Naoto Oshiyama
  • Yasutaka Wada
  • Hirofumi Nakano
  • Hiroaki Shikano
  • Keiji Kimura
  • Hironori Kasahara
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4759)


Multicore processors, or chip multiprocessors, are attracting much attention because they make it possible to realize low power consumption, high effective performance, good cost performance, and a short hardware/software development period. To achieve the full potential of multicore processors, cooperation with a parallelizing compiler is very important. The latest compiler extracts multilevel parallelism, such as coarse grain task parallelism, loop parallelism, and near fine grain parallelism, to keep parallel execution efficiency high. It also carefully controls the voltage and clock frequency of processors to reduce energy consumption during execution of an application program. This paper evaluates the performance of a compiler-controlled power saving scheme that has been implemented in the OSCAR multigrain parallelizing compiler. The scheme applies voltage/frequency control and power shutdown to each processor core during coarse grain task parallel processing. In the performance evaluation, with static power assumed to be one-tenth of dynamic power, the OSCAR compiler with the power saving scheme achieved a 61.2 percent energy reduction for SPEC CFP95 applu on 4 processors without performance degradation. Under a real-time deadline constraint on 4 processors, it achieved energy reductions of 87.4 percent for mpeg2encode, 88.1 percent for SPEC CFP95 tomcatv, and 84.6 percent for applu.
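To make the trade-off behind voltage/frequency control and power shutdown concrete, the following is a minimal sketch of a toy energy model, not the scheme implemented in the OSCAR compiler. It assumes dynamic power scales as V² · f, treats static power as a constant one-tenth of full-speed dynamic power while a core is on (echoing the abstract's assumption), and assumes a core can be shut down to zero power once its task finishes. The operating points `FULL`, `MID`, and `LOW`, their voltages, and all function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerState:
    name: str
    freq: float     # relative clock frequency (1.0 = full speed)
    voltage: float  # relative supply voltage (1.0 = nominal)


# Hypothetical operating points for illustration; not taken from the paper.
STATES = (
    PowerState("FULL", 1.0, 1.0),
    PowerState("MID", 0.5, 0.8),
    PowerState("LOW", 0.25, 0.6),
)

# Static power as a fraction of full-speed dynamic power
# (the one-tenth assumption stated in the abstract).
STATIC_RATIO = 0.1


def dynamic_power(state):
    """Relative dynamic power under the common V^2 * f approximation."""
    return state.voltage ** 2 * state.freq


def task_energy(work, state):
    """Energy to run `work` (time units at full speed) in `state`.

    Assumes constant static power while the core is on, and that the core
    is shut down (zero power) as soon as the task finishes.
    """
    run_time = work / state.freq
    return (dynamic_power(state) + STATIC_RATIO) * run_time


def best_state(work, deadline):
    """Pick the lowest-energy operating point that still meets the deadline."""
    feasible = [s for s in STATES if work / s.freq <= deadline]
    if not feasible:
        raise ValueError("no operating point meets the deadline")
    return min(feasible, key=lambda s: task_energy(work, s))


def baseline_energy(work, deadline):
    """Full speed with no shutdown: dynamic power while busy,
    static power for the whole deadline window."""
    return dynamic_power(STATES[0]) * work + STATIC_RATIO * deadline


if __name__ == "__main__":
    work, deadline = 1.0, 4.0
    state = best_state(work, deadline)
    saved = 1.0 - task_energy(work, state) / baseline_energy(work, deadline)
    print(f"{state.name}: {saved:.1%} energy saved versus the baseline")
```

The sketch only shows why slack before a deadline can be traded for energy: a slower, lower-voltage state runs longer but draws much less dynamic power. Its numbers depend entirely on the assumed operating points and are not comparable to the reductions reported above.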


Keywords (added by machine, not by the authors): Dynamic Power; Static Schedule; Multicore Processor; Schedule Length; Task Parallelism





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Jun Shirako (1)
  • Munehiro Yoshida (1)
  • Naoto Oshiyama (1)
  • Yasutaka Wada (1)
  • Hirofumi Nakano (1)
  • Hiroaki Shikano (1, 2)
  • Keiji Kimura (1)
  • Hironori Kasahara (1)

  1. Dept. of Computer Science, Waseda University, Tokyo, Japan
  2. Central Research Laboratory, Hitachi, Ltd., Kokubunji-shi, Japan
