Computer Science - Research and Development

Volume 27, Issue 4, pp 319–327

Global optimization model on power efficiency of GPU and multicore processing element for SIMD computing with CUDA

  • Da-Qi Ren
  • Reiji Suda
Special Issue Paper


Estimating and analyzing the power-consumption features of a program on a hardware platform is important for energy-aware High Performance Computing (HPC) optimization: it helps to handle critical design constraints at the software level and to choose a preferable algorithm in order to reach the best energy performance. Optimizing the power efficiency of a CUDA program on a GPU and multicore processing element is a combinatorial optimization problem because of the complexity of the power factors and criteria. A four-tuple global optimization model has been created to describe the procedure for finding an optimal energy solution. In addition, an experimental method for examining SIMD computing and capturing power parameters is illustrated, and five individual energy optimization methods are provided and implemented. The optimization results have been validated by comparative analysis on real systems.
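As a rough illustration of why this is a combinatorial search, the sketch below enumerates a small space of candidate configurations and keeps the one with the lowest energy, computed as average power multiplied by runtime. It is not the four-tuple model of the paper, whose components are not given in this abstract. The parameter names (`threads_per_block`, `cpu_threads`), the helper functions, and all numbers in `toy_table` are hypothetical placeholders, not measurements.

```python
from itertools import product


def energy_joules(avg_power_watts, runtime_seconds):
    """Energy in joules: average power (W) times elapsed time (s)."""
    return avg_power_watts * runtime_seconds


def best_configuration(measure, search_space):
    """Exhaustively evaluate every combination in search_space.

    measure(config) must return (avg_power_watts, runtime_seconds).
    Returns (config, energy) for the lowest-energy candidate.
    """
    names = list(search_space.keys())
    best = None
    for values in product(*(search_space[n] for n in names)):
        config = dict(zip(names, values))
        power, runtime = measure(config)
        energy = energy_joules(power, runtime)
        if best is None or energy < best[1]:
            best = (config, energy)
    return best


# Hypothetical placeholder data: (threads_per_block, cpu_threads) -> (W, s).
toy_table = {
    (128, 1): (150.0, 4.0),
    (128, 2): (170.0, 3.0),
    (256, 1): (160.0, 3.5),
    (256, 2): (190.0, 2.5),
}


def toy_measure(config):
    """Stand-in for a real power/time measurement on hardware."""
    return toy_table[(config["threads_per_block"], config["cpu_threads"])]


search_space = {"threads_per_block": [128, 256], "cpu_threads": [1, 2]}
best_config, best_energy = best_configuration(toy_measure, search_space)
print(best_config, best_energy)
```

In this toy data the configuration that draws the most power also finishes fastest and ends up using the least energy, which is why power and runtime have to be considered together. Exhaustive enumeration only works for small spaces; larger ones would need a heuristic search.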


Energy aware HPC GPGPU computing 





Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  1. Department of Computer Science, The University of Tokyo, Tokyo, Japan
  2. CREST, JST, Tokyo, Japan
