POIGEM: A Programming-Oriented Instruction Level GPU Energy Model for CUDA Program

  • Qi Zhao
  • Hailong Yang
  • Zhongzhi Luan
  • Depei Qian
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8285)


GPU architectures have become increasingly important in the multi-core era because of their formidable computational horsepower. With the assistance of effective programming paradigms such as CUDA, GPUs are widely adopted to accelerate scientific applications. Meanwhile, the surging energy consumption of GPUs has become a major challenge for both GPU architects and programmers. In addition to efforts to design energy-efficient GPU architectures, a comprehensive understanding of how programming affects the energy consumption of GPU applications is also indispensable from the programmer's perspective.

In this paper, we present a programming-oriented, PTX instruction level energy model that lets programmers predict the energy consumption of their programs. Unlike previous models, which require hardware performance counters or architectural simulation, our model relies only on the PTX instructions of a CUDA program, which makes it both portable and accurate. Using PTX instructions selected through an empirical study, we apply linear regression to build the GPU energy model. One appealing advantage of our model is that it does not require any instrumentation or profiling of the GPU application during execution. Moreover, our model can advise programmers step by step, illustrating how their programming choices affect the final energy consumption, especially while the code is still being written. Our model is evaluated on an NVIDIA GeForce GTX 470 with the Rodinia benchmark suite. The results show that the accuracy of our model is promising, with an average prediction error below 3.7%. With the help of our GPU energy model, programmers gain valuable insights for improving the energy efficiency of their applications.
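
The modeling idea described above (count selected PTX instructions, then fit a linear regression from those counts to measured energy) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the opcode list `SELECTED_OPCODES`, the helper names `count_ptx_opcodes`, `fit_linear_model`, and `predict_energy`, and the use of static counts taken from PTX text (which ignore how often each instruction actually executes) are all hypothetical choices made for the example.

```python
import re
from collections import Counter

# Hypothetical feature set; the paper selects its PTX instructions empirically.
SELECTED_OPCODES = ["ld", "st", "add", "mul"]

# Captures the leading opcode of a PTX statement, e.g. "ld" in
# "ld.global.f32 %f1, [%rd1];", skipping an optional predicate such as "@%p1".
_OPCODE_RE = re.compile(r"^\s*(?:@!?%\w+\s+)?([a-z]+)[.\s]")


def count_ptx_opcodes(ptx_text):
    """Return static counts of the selected opcodes, in SELECTED_OPCODES order."""
    counts = Counter()
    for line in ptx_text.splitlines():
        match = _OPCODE_RE.match(line)
        if match and match.group(1) in SELECTED_OPCODES:
            counts[match.group(1)] += 1
    return [counts[op] for op in SELECTED_OPCODES]


def fit_linear_model(features, energies):
    """Ordinary least squares for energy = bias + sum(coef_i * count_i).

    Returns [bias, coef_1, ..., coef_k], solved via the normal equations.
    """
    rows = [[1.0] + [float(v) for v in f] for f in features]  # intercept column
    n = len(rows[0])
    # Augmented matrix for (X^T X) w = X^T y.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, energies))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: need more varied training samples")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def predict_energy(weights, counts):
    """Apply the fitted model to one vector of opcode counts."""
    return weights[0] + sum(w * c for w, c in zip(weights[1:], counts))
```

In such a sketch, each training sample would pair the opcode counts of one kernel with an energy value measured for it; once `fit_linear_model` has produced the weights, `predict_energy` needs only the PTX text of a new kernel, which matches the claim that no runtime instrumentation or profiling is needed at prediction time.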


Programming-oriented, Instruction level, Energy prediction, CUDA





Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  • Qi Zhao (1)
  • Hailong Yang (1)
  • Zhongzhi Luan (1)
  • Depei Qian (1)
  1. Sino-German Joint Software Institute, Department of Computer Science and Engineering, Beihang University, Beijing, China
