Event driven power consumption optimization control model of GPU clusters

  • Haifeng Wang
  • Yunpeng Cao

Abstract

Reducing the power consumption of GPU clusters in large-scale stream computing brings benefits such as lower operating costs and reduced environmental impact. We formulate power consumption control as a constrained optimization problem: the power states of cluster nodes are minimized to reduce power consumption while system performance and reliability are guaranteed. The proposed control model, based on model predictive control, is designed to drive a comprehensive metric of the GPU cluster toward the expected performance, energy efficiency, and reliability. This distinguishes it from previous models, which treat power consumption as the sole control objective. An event-triggering mechanism is introduced to reduce control overhead. It decouples the sampling of cluster status signals from the control model, so the controller need not periodically interrupt the computing process to solve for an optimal solution. Finally, we evaluate this control model against the previous control model using artificial and real-world workloads. The experimental results show that the proposed control model outperforms existing techniques.
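
The event-triggering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the one-step plant model in `predict`, the three normalized power levels in `POWER_STATES`, the cost weights, and the trigger threshold are all hypothetical values chosen for illustration. The sketch samples a cluster metric at every step but re-solves the predictive optimization only when the metric has drifted past a threshold since the last solve.

```python
from itertools import product

# Hypothetical normalized power levels for a node (assumption, not from the paper).
POWER_STATES = (0.0, 0.5, 1.0)


def predict(metric, power, gain=0.5, decay=0.8):
    """Toy one-step model (assumed): the metric decays and rises with power."""
    return decay * metric + gain * power


def solve_mpc(metric, target, horizon=3, power_weight=0.1):
    """Brute-force search over power-state sequences; return only the first move."""
    best_cost, best_first = float("inf"), POWER_STATES[0]
    for seq in product(POWER_STATES, repeat=horizon):
        m, cost = metric, 0.0
        for p in seq:
            m = predict(m, p)
            # Penalize tracking error plus the power spent at this step.
            cost += (m - target) ** 2 + power_weight * p
        if cost < best_cost:
            best_cost, best_first = cost, seq[0]
    return best_first


def run(samples, target, threshold=0.2):
    """Sample at every step, but re-solve only when an event fires."""
    power, last_solved, solves = POWER_STATES[0], None, 0
    trace = []
    for metric in samples:
        # Event: first sample, or the metric moved more than `threshold`
        # away from the value seen at the last solve.
        event = last_solved is None or abs(metric - last_solved) > threshold
        if event:
            power = solve_mpc(metric, target)
            last_solved, solves = metric, solves + 1
        trace.append(power)  # between events the previous setting is held
    return trace, solves
```

Calling `run([1.0, 1.05, 0.95, 1.0, 2.0, 2.05], target=1.0)` invokes the solver only at the first sample and at the jump to 2.0. The remaining samples reuse the held power setting, which is the overhead reduction that an event-triggered scheme aims for.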

Keywords

Energy conservation; GPU cluster; Power consumption control; Model predictive control; Stream computing

Notes

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 60970012), the Shandong Provincial Natural Science Foundation, China (No. ZR2017MF050), the Project of Shandong Province Higher Educational Science and Technology Program (No. J17KA049), and the Shandong Province Key Research and Development Program of China (Nos. 2018GGX101005, 2017CXGC0701, 2016GGX109001).

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Information Science and Engineer School, Lin Yi University, Linyi, China