Event driven power consumption optimization control model of GPU clusters

Cluster Computing

Abstract

Reducing the power consumption of GPU clusters in large-scale stream computing brings benefits such as lower operating costs and reduced environmental impact. We formulate power consumption as a constrained optimization problem: lower the power states of cluster nodes to reduce power consumption while guaranteeing system performance and reliability. The proposed control model, based on Model Predictive Control, is designed so that a comprehensive metric of the GPU cluster achieves the expected performance, energy efficiency, and reliability. It differs from previous models, which consider power consumption as the sole control objective. An event-triggering mechanism is introduced to reduce control overhead. It separates the sampling of cluster status signals from the control model, so the controller does not need to periodically interrupt the computing process to solve for optimal solutions. Finally, we evaluate this control model against a previous control model using artificial and real-world workloads. The experimental results show that the proposed control model outperforms existing techniques.
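The event-triggering idea can be illustrated with a minimal sketch. It is not the model from the paper: the power states, the trigger rule, and all names (`POWER_STATES`, `solve_power_state`, `EventTriggeredController`, `threshold`) are hypothetical, and a simple table lookup stands in for the Model Predictive Control optimizer. The sketch only shows how status sampling can run at every step while the costly solve runs only when an event fires.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical power states: (power in watts, capacity in tasks per second).
POWER_STATES: List[Tuple[float, float]] = [(100.0, 10.0), (150.0, 20.0), (220.0, 30.0)]


def solve_power_state(load: float) -> int:
    """Stand-in for the optimizer: pick the lowest-power state that covers the load."""
    for index, (_, capacity) in enumerate(POWER_STATES):
        if capacity >= load:
            return index
    return len(POWER_STATES) - 1  # saturate at the highest state


@dataclass
class EventTriggeredController:
    threshold: float                       # assumed trigger threshold on load drift
    last_solved_load: Optional[float] = None
    state: int = 0
    solves: int = 0                        # how many times the optimizer was invoked

    def step(self, load: float) -> int:
        """Sample the load every step, but re-solve only when an event fires."""
        triggered = (
            self.last_solved_load is None
            or abs(load - self.last_solved_load) > self.threshold
            or load > POWER_STATES[self.state][1]  # performance constraint violated
        )
        if triggered:
            self.state = solve_power_state(load)
            self.last_solved_load = load
            self.solves += 1
        return self.state


def run(loads: List[float], threshold: float) -> Tuple[List[int], int]:
    """Feed a load trace to the controller; return chosen states and solve count."""
    controller = EventTriggeredController(threshold=threshold)
    states = [controller.step(load) for load in loads]
    return states, controller.solves


if __name__ == "__main__":
    print(run([8.0, 9.0, 8.5, 18.0, 19.0, 18.5, 9.0], threshold=3.0))
```

In this toy trace the load is sampled seven times but the optimizer runs only three times, whereas a purely periodic controller would solve at every sample. The trade-off is that a larger threshold saves more solves but lets the chosen power state drift further from the optimum between events.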

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 60970012), the Shandong Provincial Natural Science Foundation, China (No. ZR2017MF050), the Project of Shandong Province Higher Educational Science and Technology Program (No. J17KA049), and the Shandong Province Key Research and Development Program of China (Nos. 2018GGX101005, 2017CXGC0701, 2016GGX109001).

Author information

Corresponding author

Correspondence to Haifeng Wang.

Cite this article

Wang, H., Cao, Y. Event driven power consumption optimization control model of GPU clusters. Cluster Comput 22, 965–979 (2019). https://doi.org/10.1007/s10586-018-02886-x

  • DOI: https://doi.org/10.1007/s10586-018-02886-x
