GPGPU-Perf: efficient, interval-based DVFS algorithm for mobile GPGPU applications

Abstract

Although general-purpose computation on graphics processing units (GPGPU) is now available even on mobile GPUs, its performance is seriously affected by the GPU's underlying dynamic voltage and frequency scaling (DVFS) mechanism. To save energy and thereby prolong battery life, DVFS adjusts the GPU's frequency according to past utilization. When the GPU processes only graphics tasks, it is enough to finish them within a fixed time budget (typically 30–60 frames per second), so the DVFS parameters can be set conservatively. GPGPU tasks, however, must be processed at much higher rates that depend on the application. Modifying the DVFS parameters may improve GPGPU performance, but it sacrifices energy efficiency and affects the performance of graphics tasks, because the same parameters are shared by graphics and GPGPU tasks. To improve GPGPU performance without influencing graphics performance, we devise a new algorithm, GPGPU-Perf, that adjusts DVFS parameters such as the thresholds and the interval. The new algorithm controls the frequency more intelligently for mobile GPGPU applications, so the ratio of performance to energy increases by a factor of 1.44, with no influence on graphics tasks and no modification of the GPGPU algorithms. To the best of our knowledge, this paper is the first to propose a GPU-DVFS algorithm for GPGPU applications.
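
To make the roles of the thresholds and the interval concrete, here is a minimal Python sketch of a generic utilization-driven DVFS governor. It is not the GPGPU-Perf algorithm itself, whose details the abstract does not give. The frequency table, the threshold values, the interval lengths, and the names `GovernorParams`, `next_level`, and `run_governor` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class GovernorParams:
    up_threshold: float    # average utilization above which the level is raised
    down_threshold: float  # average utilization below which the level is lowered
    interval_ms: int       # length of one sampling interval, in milliseconds


# Hypothetical frequency table in MHz; real devices expose their own levels.
FREQS_MHZ = [177, 266, 350, 480, 533]


def next_level(level, utilization, params, n_levels):
    """Move one frequency level up or down based on the last interval's utilization."""
    if utilization > params.up_threshold and level < n_levels - 1:
        return level + 1
    if utilization < params.down_threshold and level > 0:
        return level - 1
    return level


def run_governor(load_per_ms, params, freqs=FREQS_MHZ, start_level=0):
    """Return the frequency in effect during each millisecond of a load trace."""
    level = start_level
    chosen = []
    window = []
    for load in load_per_ms:
        chosen.append(freqs[level])
        window.append(load)
        if len(window) == params.interval_ms:
            # One decision per interval, based on past (averaged) utilization.
            utilization = sum(window) / len(window)
            level = next_level(level, utilization, params, len(freqs))
            window.clear()
    return chosen


# Assumed parameter sets: a conservative one for graphics, a faster one for GPGPU.
graphics_params = GovernorParams(up_threshold=0.9, down_threshold=0.3, interval_ms=100)
gpgpu_params = GovernorParams(up_threshold=0.7, down_threshold=0.2, interval_ms=20)

burst = [1.0] * 100  # 100 ms of a fully loaded GPU
print(run_governor(burst, graphics_params)[::20])  # [177, 177, 177, 177, 177]
print(run_governor(burst, gpgpu_params)[::20])     # [177, 266, 350, 480, 533]
```

With the long interval, a 100 ms compute burst ends before the governor makes its first decision, so the GPU stays at its lowest frequency throughout. With the shorter interval, the same burst reaches the highest level after 80 ms. This illustrates why parameters tuned for frame-rate-bound graphics work can penalize GPGPU workloads.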

Acknowledgments

This work was supported in part by NRF in Korea (2012R1A2A2A01046246, 2012R1A2A2A06047007, 2014K1A3A1A17073365) and MCST/KOCCA in the CT R&D program 2014 (R2014060011). Young J. Kim is the corresponding author.

Author information

Correspondence to Young J. Kim.

About this article

Cite this article

Kim, S., Kim, Y.J. GPGPU-Perf: efficient, interval-based DVFS algorithm for mobile GPGPU applications. Vis Comput 31, 1045–1054 (2015). https://doi.org/10.1007/s00371-015-1111-1

Keywords

  • DVFS
  • GPGPU
  • Mobile device
  • OpenCL
  • OpenGL ES