Joint energy optimization on the server and network sides for geo-distributed data centers

The Journal of Supercomputing

Abstract

Energy optimization has become a growing concern for cloud service providers. Existing methods focus on reducing the energy consumption of either the servers inside a data center or the data transmission between data centers. Moreover, most existing work assumes that servers and workloads are homogeneous, which is at odds with the fact that modern data centers are built from various classes of servers. In this paper, we consider the joint energy optimization of intra- and inter-data center operations in both the homogeneous and heterogeneous cases. We first propose an optimization model that minimizes the joint energy cost of the server and network sides. To handle the time-coupled carbon emission constraint, we apply the Lyapunov optimization framework to transform the original problem into a well-studied queue stability problem. To improve scalability in terms of time complexity, we derive a distributed solution using generalized Benders decomposition. We then extend the model to the case where requests and data centers are heterogeneous, since data centers are typically built from servers with different specifications. To cope with network dynamics (e.g., the occurrence of faults), we leverage a deep Q-network (DQN) and propose a fault-tolerant DQN-based solution. Finally, simulation results show that our proposal is effective at saving cost and improving performance.

References

  1. Zhou Z, Liu F, Li Z (2016) Bilateral electricity trade between smart grids and green datacenters: pricing models and performance evaluation. IEEE J Select Areas Commun 34:3993–4007. https://doi.org/10.1109/JSAC.2016.2611898

  2. Lu X, Kong F, Liu X, Yin J, Xiang Q, Yu H (2020) Bulk savings for bulk transfers: minimizing the energy-cost for geo-distributed data centers. IEEE Trans Cloud Comput 8:73–85. https://doi.org/10.1109/TCC.2017.2739160

  3. Abts D, Marty MR, Wells PM, Klausler P, Liu H (2010) Energy proportional datacenter networks. In: ISCA '10: Proceedings of the 37th Annual International Symposium on Computer Architecture. pp. 338–347

  4. Chen Q, Chen J, Zheng B, Cui J, Qian Y (2015) Utilization-based VM consolidation scheme for power efficiency in cloud data centers. In: 2015 IEEE International Conference on Communication Workshop (ICCW). pp. 1928–1933. IEEE, London, United Kingdom. https://doi.org/10.1109/ICCW.2015.7247462

  5. Qin Y, Han W, Yang Y, Yang W, Liu B (2019) Joint energy optimization on the server and network sides for geo-distributed datacenters. In: ICC 2019 - 2019 IEEE International Conference on Communications (ICC). pp. 1–6. https://doi.org/10.1109/ICC.2019.8761333

  6. Liu Z, Lin M, Wierman A, Low S, Andrew LLH (2015) Greening geographical load balancing. IEEE/ACM Trans Netw 23:657–671. https://doi.org/10.1109/TNET.2014.2308295

  7. Xu H, Feng C, Li B (2015) Temperature aware workload management in geo-distributed data centers. IEEE Trans Parallel Distrib Syst 26:1743–1753. https://doi.org/10.1109/TPDS.2014.2325836

  8. Lin M, Wierman A, Andrew LLH, Thereska E (2013) Dynamic right-sizing for power-proportional data centers. IEEE/ACM Trans Netw 21:1378–1391. https://doi.org/10.1109/TNET.2012.2226216

  9. Wierman A, Andrew LLH, Tang A (2009) Power-aware speed scaling in processor sharing systems. In: IEEE INFOCOM 2009 - The 28th Conference on Computer Communications. pp. 2007–2015. IEEE, Rio De Janeiro, Brazil. https://doi.org/10.1109/INFCOM.2009.5062123

  10. Zhou Z, Liu F, Zou R, Liu J, Xu H, Jin H (2016) Carbon-aware online control of geo-distributed cloud services. IEEE Trans Parallel Distrib Syst 27:2506–2519. https://doi.org/10.1109/TPDS.2015.2504978

  11. Guo Y, Fang Y (2013) Electricity cost saving strategy in data centers by using energy storage. IEEE Trans Parallel Distrib Syst 24:1149–1160. https://doi.org/10.1109/TPDS.2012.201

  12. Deng X, Wu D, Shen J, He J (2016) Eco-aware online power management and load scheduling for green cloud datacenters. IEEE Syst J 10:78–87. https://doi.org/10.1109/JSYST.2014.2344028

  13. Gao Y, Wei H (2017) Profit-aware workload management for geo-distributed data centers. In: 2017 18th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT). pp. 60–66. IEEE, Taipei. https://doi.org/10.1109/PDCAT.2017.00019

  14. Kulkarni AK, Annappa B (2017) Cost aware service broker algorithm for load balancing geo-distributed data centers in cloud. In: 2017 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems (SPICES). pp. 1–5. IEEE, Kollam. https://doi.org/10.1109/SPICES.2017.8091337

  15. Zhang B, Hwang J (2017) Task assignment optimization in geographically distributed data centers. In: 2017 IFIP/IEEE Symposium on Integrated Network and Service Management (IM). pp. 497–502. IEEE, Lisbon, Portugal. https://doi.org/10.23919/INM.2017.7987318

  16. Gu L, Zeng D, Guo S, Xiang Y, Hu J (2016) A general communication cost optimization framework for big data stream processing in geo-distributed data centers. IEEE Trans Comput 65:19–29. https://doi.org/10.1109/TC.2015.2417566

  17. Li W, Li K (2019) Cost-minimizing bandwidth guarantee for inter-datacenter traffic. IEEE Trans Cloud Comput 7:12

  18. Xiao W, Bao W, Zhu X, Liu L (2017) Cost-aware big data processing across geo-distributed datacenters. IEEE Trans Parallel Distrib Syst 28:3114–3127. https://doi.org/10.1109/TPDS.2017.2708120

  19. Tripathi R, Vignesh S, Tamarapalli V, Medhi D (2017) Cost efficient design of fault tolerant geo-distributed data centers. IEEE Trans Netw Serv Manage 14:13

  20. Yi D, Zhou X, Wen Y, Tan R (2019) Toward efficient compute-intensive job allocation for green data centers: a deep reinforcement learning approach. In: 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS). pp. 634–644. IEEE, Dallas, TX, USA. https://doi.org/10.1109/ICDCS.2019.00069

  21. Yi D, Zhou X, Wen Y, Tan R (2020) Efficient compute-intensive job allocation in data centers via deep reinforcement learning. IEEE Trans Parallel Distrib Syst 31:1474–1485. https://doi.org/10.1109/TPDS.2020.2968427

  22. Li Y, Wen Y, Tao D, Guan K (2020) Transforming cooling optimization for green data center via deep reinforcement learning. IEEE Trans Cybern 50:2002–2013. https://doi.org/10.1109/TCYB.2019.2927410

  23. Ran Y, Hu H, Zhou X, Wen Y (2019) DeepEE: Joint optimization of job scheduling and cooling control for data center energy efficiency using deep reinforcement learning. In: 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS). pp. 645–655. IEEE, Dallas, TX, USA. https://doi.org/10.1109/ICDCS.2019.00070

  24. Chi C, Ji K, Marahatta A, Song P, Zhang F, Liu Z (2020) Jointly optimizing the IT and cooling systems for data center energy efficiency based on multi-agent deep reinforcement learning. In: Proceedings of the Eleventh ACM International Conference on Future Energy Systems. pp. 489–495. ACM, virtual event Australia. https://doi.org/10.1145/3396851.3402658

  25. Sun P, Guo Z, Liu S, Lan J, Wang J, Hu Y (2020) SmartFCT: improving power-efficiency for data center networks with deep reinforcement learning. Comput Netw 179:107255. https://doi.org/10.1016/j.comnet.2020.107255

  26. Yang X, Wang Y, He H, Sun C, Zhang Y (2019) Deep reinforcement learning for economic energy scheduling in data center microgrids. In: 2019 IEEE Power & Energy Society General Meeting (PESGM). pp. 1–5. IEEE, Atlanta, GA, USA. https://doi.org/10.1109/PESGM40551.2019.8974083

  27. Cheng M, Li J, Nazarian S (2018) DRL-cloud: deep reinforcement learning-based resource provisioning and task scheduling for cloud service providers. In: 2018 23rd Asia and South Pacific Design Automation Conference (ASP-DAC). pp. 129–134. IEEE, Jeju. https://doi.org/10.1109/ASPDAC.2018.8297294

  28. Gao J, Wang H, Shen H (2020) Smartly handling renewable energy instability in supporting a cloud datacenter. In: 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS). pp. 769–778. IEEE, New Orleans, LA, USA. https://doi.org/10.1109/IPDPS47924.2020.00084

  29. Xu C, Wang K, Li P, Xia R, Guo S, Guo M (2020) Renewable energy-aware big data analytics in geo-distributed data centers with reinforcement learning. IEEE Trans Netw Sci Eng 7:205–215. https://doi.org/10.1109/TNSE.2018.2813333

  30. Zhou X, Wang K, Jia W, Guo M (2017) Reinforcement learning-based adaptive resource management of differentiated services in geo-distributed data centers. In: 2017 IEEE/ACM 25th International Symposium on Quality of Service (IWQoS). pp. 1–6. IEEE, Vilanova i la Geltrú, Spain. https://doi.org/10.1109/IWQoS.2017.7969161

  31. Wei T, Ren S, Zhu Q (2020) Deep reinforcement learning for joint datacenter and HVAC load control in distributed mixed-use buildings. IEEE Trans Sustain Comput. https://doi.org/10.1109/TSUSC.2019.2910533

  32. Kang D-K, Yang E-J, Youn C-H (2019) Deep learning-based sustainable data center energy cost minimization with temporal MACRO/MICRO scale management. IEEE Access 7:5477–5491. https://doi.org/10.1109/ACCESS.2018.2888839

  33. Jing C, Zhu Y, Li M (2013) Customer satisfaction-aware scheduling for utility maximization on geo-distributed cloud data centers. In: IEEE 10th International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing. pp. 218–225. Zhangjiajie. https://doi.org/10.1109/HPCC.and.EUC.2013.40

  34. Yao Y, Huang L, Sharma AB, Golubchik L, Neely MJ (2014) Power cost reduction in distributed data centers: a two-time-scale approach for delay tolerant workloads. IEEE Trans Parallel Distrib Syst 25:200–211. https://doi.org/10.1109/TPDS.2012.341

  35. Gao PX, Curtis AR, Wong B, Keshav S (2012) It’s not easy being green. In: SIGCOMM ’12. pp. 211–222

  36. Xu H, Li B (2013) Joint request mapping and response routing for geo-distributed cloud services. In: 2013 Proceedings IEEE INFOCOM. pp. 854–862. IEEE, Turin, Italy. https://doi.org/10.1109/INFCOM.2013.6566873

  37. Szymaniak M, Presotto D, Pierre G, van Steen M (2008) Practical large-scale latency estimation. Comput Netw 52:1343–1364. https://doi.org/10.1016/j.comnet.2007.11.022

  38. Qureshi A (2010) Power-demand routing in massive geo-distributed systems. Thesis (Ph. D.) Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science

  39. Neely M (2010) Stochastic network optimization with application to communication and queueing systems. Morgan Kaufmann, San Mateo

  40. Geoffrion AM (1972) Generalized Benders decomposition. J Optim Theory Appl 10:24

  41. Rao L, Liu X, Ilic MD, Liu J (2012) Distributed coordination of internet data centers under multiregional electricity markets. Proc IEEE 100:269–282. https://doi.org/10.1109/JPROC.2011.2161236

  42. Lu X, Kong F, Yin J, Liu X, Yu H, Fan G (2015) Geographical job scheduling in data centers with heterogeneous demands and servers. In: IEEE 8th International Conference on Cloud Computing. pp. 413–420. IEEE, New York City, NY, USA. https://doi.org/10.1109/CLOUD.2015.62

  43. Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, Graves A, Riedmiller M, Fidjeland AK, Ostrovski G, Petersen S, Beattie C, Sadik A, Antonoglou I, King H, Kumaran D, Wierstra D, Legg S, Hassabis D (2015) Human-level control through deep reinforcement learning. Nature 518:529–533. https://doi.org/10.1038/nature14236

  44. Zhang MY, Deng DL, Chen M, Wang DP (2017) Joint bidding and geographical load balancing for datacenters: is uncertainty a blessing or a curse? IEEE INFOCOM. 9:49

Acknowledgements

This work was supported by the Science and Technology Fundamental Research Fund of Shenzhen under grants JCYJ20160318095218091 and JCYJ20170307151807788.

Author information

Corresponding author

Correspondence to Yang Qin.

Appendices

Appendix 1: Description of Equivalency between Virtual Queue and Constraint (11)

Stability of the virtual queues is equivalent to \(\lim_{T \to \infty } {\mathbb{E}}\{ X_{i} (T)/T\} = 0\). Moreover, the queue update (15) implies that

$$ X_{i} (t + 1) \ge X_{i} (t) - N_{i}^{\max } + (E_{i} (t) + e_{i} (t))N_{i} (t). $$
(41)

Summing inequality (41) over the time slots \(t = 0, \ldots, T - 1\) and dividing by \(T\) gives:

$$ \frac{{X_{i} (T) - X_{i} (0)}}{T} + N_{i}^{\max } \ge \frac{1}{T}\sum\limits_{t = 0}^{T - 1} {(E_{i} (t) + e_{i} (t))N_{i} (t)} . $$
(42)

Setting \(X_{i} (0) = 0\) and taking the limit \(T \to \infty\) on both sides of inequality (42), with the queue term taken in expectation, we obtain the following inequality:

$$ \mathop {\lim }\limits_{T \to \infty } \frac{{{\mathbb{E}}\{ X_{i} (T)\} }}{T} + N_{i}^{\max } \ge \mathop {\lim }\limits_{T \to \infty } \frac{1}{T}\sum\limits_{t = 0}^{T - 1} {(E_{i} (t) + e_{i} (t))N_{i} (t)} . $$
(43)

If the virtual queues \(X_{i} (t)\) are stable, then \(\lim_{T \to \infty } {\mathbb{E}}\{ X_{i} (T)\} /T = 0\), and inequality (43) reduces to the carbon emission constraint (11) of problem (14). Therefore, satisfying the carbon emission constraint (11) is equivalent to keeping the virtual queues stable, and the constraint can be enforced through queue stability.
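The argument above can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes the queue update (15) has the usual form \(X_{i}(t+1) = \max[X_{i}(t) - N_{i}^{\max} + (E_{i}(t) + e_{i}(t))N_{i}(t), 0]\), which is consistent with inequality (41), and it replaces the emission term \((E_{i}(t) + e_{i}(t))N_{i}(t)\) with a hypothetical uniformly distributed `load`. The names `simulate_virtual_queue`, `n_max`, and `load` are illustrative only.

```python
import random


def simulate_virtual_queue(T, n_max, seed=0):
    """Run X(t+1) = max(X(t) - n_max + load(t), 0) for T slots from X(0) = 0.

    Returns the final backlog X(T) and the time-averaged load.
    """
    rng = random.Random(seed)
    x = 0.0
    total_load = 0.0
    for _ in range(T):
        # Placeholder for (E_i(t) + e_i(t)) * N_i(t); the uniform
        # distribution is an assumption made only for this sketch.
        load = rng.uniform(0.0, 1.6 * n_max)
        total_load += load
        x = max(x - n_max + load, 0.0)
    return x, total_load / T


T, n_max = 10_000, 5.0
x_T, avg_load = simulate_virtual_queue(T, n_max)

# Inequality (42) with X(0) = 0: X(T)/T + n_max >= time-averaged load.
print(x_T / T + n_max >= avg_load)
# When the queue is stable, X(T)/T is close to zero, so the
# time-averaged load stays below the budget n_max.
print(x_T / T, avg_load)
```

Because the mean of the placeholder load is below `n_max`, the backlog stays small, `x_T / T` shrinks as `T` grows, and the time-averaged load respects the budget, which mirrors the step from (43) to constraint (11).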

Appendix 2

Proof of Lemma 1

Similar to [11], we use the fact that \((\max [x - y + z,0])^{2} \le x^{2} + y^{2} + z^{2} - 2x(y - z),\forall x,y,z \ge 0.\) Applying it to the virtual queue \(X_{i} (t)\) with \(x = X_{i} (t)\), \(y = N_{i}^{\max }\), and \(z = (E_{i} (t) + e_{i} (t))N_{i} (t)\), we have

$$ \begin{gathered} X_{i}^{2} (t + 1) - X_{i}^{2} (t) \le (N_{i}^{\max } )^{2} + (E_{i} (t) + e_{i} (t))^{2} N_{i}^{2} (t) \hfill \\ - 2X_{i} (t)[N_{i}^{\max } - (E_{i} (t) + e_{i} (t))N_{i} (t)]. \hfill \\ \end{gathered} $$
(44)
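The elementary inequality behind (44) can be spot-checked numerically. The following sketch samples random non-negative triples and counts violations; the function names, sampling range, and tolerance are assumptions chosen for illustration, and a numerical check of this kind supports the bound without replacing the algebraic proof.

```python
import random


def squared_update(x, y, z):
    """Left-hand side: (max[x - y + z, 0])^2."""
    return max(x - y + z, 0.0) ** 2


def quadratic_bound(x, y, z):
    """Right-hand side: x^2 + y^2 + z^2 - 2x(y - z)."""
    return x * x + y * y + z * z - 2.0 * x * (y - z)


def count_violations(samples=100_000, seed=1, tol=1e-9):
    """Count random non-negative triples for which the inequality fails."""
    rng = random.Random(seed)
    violations = 0
    for _ in range(samples):
        x = rng.uniform(0.0, 10.0)
        y = rng.uniform(0.0, 10.0)
        z = rng.uniform(0.0, 10.0)
        if squared_update(x, y, z) > quadratic_bound(x, y, z) + tol:
            violations += 1
    return violations


print(count_violations())
```

The check reflects the underlying algebra: expanding \((x - y + z)^{2}\) gives \(x^{2} + y^{2} + z^{2} - 2x(y - z) - 2yz\), so the right-hand side exceeds it by \(2yz \ge 0\), and taking \(\max[\cdot, 0]\) before squaring can only make the left-hand side smaller.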

Summing inequality (44) over all \(i\), multiplying by \(1/2\), and taking the expectation conditioned on \({\mathbf{X}}(t)\), we obtain:

$$ \begin{gathered} \Delta ({\mathbf{X}}(t)) \le \sum\nolimits_{i = 1}^{I} {\frac{1}{2}{\mathbb{E}}\{ (N_{i}^{\max } )^{2} + (E_{i} (t) + e_{i} (t))^{2} N_{i}^{2} (t)|{\mathbf{X}}(t)\} } \hfill \\ - \sum\nolimits_{i = 1}^{I} {X_{i} (t){\mathbb{E}}\{ N_{i}^{\max } - (E_{i} (t) + e_{i} (t))N_{i} (t)|{\mathbf{X}}(t)\} } . \hfill \\ \end{gathered} $$
(45)

Since \((E_{i} (t) + e_{i} (t))N_{i} (t)\) is bounded above by \(\max_{i,t} (E_{i} (t) + e_{i} (t))N_{i} (t)\), the expression \(\frac{1}{2}\sum\nolimits_{i = 1}^{I} {{\mathbb{E}}\{ (N_{i}^{\max } )^{2} + (E_{i} (t) + e_{i} (t))^{2} N_{i}^{2} (t)|{\mathbf{X}}(t)\} }\) is bounded by the constant \(B = \frac{1}{2}\left( \sum\nolimits_{i = 1}^{I} (N_{i}^{\max } )^{2} + I\left( \max_{i,t} (E_{i} (t) + e_{i} (t))N_{i} (t) \right)^{2} \right)\).

Adding the expression \(H{\mathbb{E}}\{ \sum\nolimits_{i = 1}^{I} {(E_{i} (t) + e_{i} (t)) \cdot M_{i} (t)} |{\mathbf{X}}(t)\}\) to both sides of inequality (45) gives:

$$ \begin{gathered} \Delta ({\mathbf{X}}(t)) + H{\mathbb{E}}\{ \sum\limits_{i = 1}^{I} {(E_{i} (t) + e_{i} (t))M_{i} (t)} |{\mathbf{X}}(t)\} \le B - \sum\limits_{i = 1}^{I} {X_{i} (t)N_{i}^{\max } } \hfill \\ + \sum\limits_{i = 1}^{I} {{\mathbb{E}}\{ (E_{i} (t) + e_{i} (t))(HM_{i} (t) + X_{i} (t)N_{i} (t))|{\mathbf{X}}(t)\} } . \hfill \\ \end{gathered} $$
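As a reading aid, the intermediate step between (45) and the inequality above can be written out. This is a sketch that assumes \(X_{i} (t)\) is known once we condition on \({\mathbf{X}}(t)\), so it can be moved inside the conditional expectation. Bounding the first sum in (45) by \(B\) and splitting the second sum gives

$$ \Delta ({\mathbf{X}}(t)) \le B - \sum\limits_{i = 1}^{I} {X_{i} (t)N_{i}^{\max } } + \sum\limits_{i = 1}^{I} {{\mathbb{E}}\{ (E_{i} (t) + e_{i} (t))X_{i} (t)N_{i} (t)|{\mathbf{X}}(t)\} } . $$

Adding \(H{\mathbb{E}}\{ \sum\nolimits_{i = 1}^{I} {(E_{i} (t) + e_{i} (t))M_{i} (t)} |{\mathbf{X}}(t)\}\) to both sides and merging the two conditional expectations on the right, which share the factor \((E_{i} (t) + e_{i} (t))\), produces the term \({\mathbb{E}}\{ (E_{i} (t) + e_{i} (t))(HM_{i} (t) + X_{i} (t)N_{i} (t))|{\mathbf{X}}(t)\}\) in the stated bound.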

About this article

Cite this article

Qin, Y., Han, W., Yang, Y. et al. Joint energy optimization on the server and network sides for geo-distributed data centers. J Supercomput 77, 7757–7790 (2021). https://doi.org/10.1007/s11227-020-03523-4

  • DOI: https://doi.org/10.1007/s11227-020-03523-4
