Towards Energy Efficient Orchestration of Cloud Computing Infrastructure
The emergence of new Cloud services and applications demanding ever more performance (on the one hand, the rapid growth of applications using deep learning (DL); on the other hand, HPC-oriented workflows executed in the Cloud) continuously puts pressure on Cloud providers to increase the capabilities of their large data centers by embracing more advanced and heterogeneous devices [2, 3, 11]. Hardware heterogeneity also helps Cloud providers improve the energy efficiency of their infrastructures by using architectures dedicated to specific workloads. However, heterogeneity represents a challenge from the infrastructure management perspective. In this highly dynamic context, workload orchestration requires advanced algorithms so as not to defeat the efficiency provided by the hardware layer. Although past works have partially addressed the problem, a comprehensive solution is still missing.
This paper presents the solution studied within the European H2020 project OPERA. Our approach is intended for managing the workload in large infrastructures running heterogeneous systems, and it works in two steps. Whenever new jobs are submitted, an energy-aware allocation policy selects the most efficient nodes on which to execute the incoming jobs. In a second step, the whole workload is consolidated by optimizing a cost model. This paper focuses on an allocation algorithm aimed at reducing the overall energy consumption; it also presents the results of simulations on a state-of-the-art framework. When compared with well-known and broadly adopted allocation strategies, the proposed approach yields a tangible energy saving (up to 30% compared to the First Fit allocation policy, and up to 45.2% compared to Best Fit), thus demonstrating superior energy efficiency.
This work is supported by the OPERA project, which has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under the grant agreement No. 688386.
- 1. European H2020 OPERA project. http://www.operaproject.eu
- 2. Amazon Web Services (AWS) – Accelerated computing instances. https://aws.amazon.com/ec2/instance-types/
- 3. Microsoft Azure – GPU based instances. https://azure.microsoft.com/en-us/pricing/details/virtual-machines/series/
- 5. Markidis, S., Chien, S.W.D., Laure, E., Peng, I.B., Vetter, J.S.: NVIDIA Tensor Core Programmability, Performance & Precision. arXiv:1803.04014v1 [cs.DC] (2018)
- 7. de Dinechin, B.D., et al.: A clustered manycore processor architecture for embedded and accelerated applications. In: High Performance Extreme Computing Conference (HPEC). IEEE (2013)
- 8. Ramey, C.: TILE-Gx100 ManyCore processor: acceleration interfaces and architecture. In: Hot Chips 23 Symposium (HCS). IEEE (2011)
- 9. Intel Stratix10 website. https://www.altera.com/products/fpga/stratix-series/stratix-10/overview.html
- 10. Intel Arria10 website. https://www.altera.com/products/fpga/arria-series/arria-10/overview.html
- 11. Microsoft project Catapult. https://www.microsoft.com/en-us/research/project/project-catapult/
- 12. Putnam, A., et al.: A cloud-scale acceleration architecture. In: 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). IEEE/ACM (2016)
- 14. Silva Filho, M.C., et al.: CloudSim Plus: a cloud computing simulation framework pursuing software engineering principles for improved modularity, extensibility and correctness. In: IFIP/IEEE International Symposium on Integrated Network Management (2017)
- 15. Silva Filho, M.C., et al.: CloudSim Plus: A Modern Java 8 Framework for Modeling and Simulation of Cloud Computing Infrastructures and Services. https://github.com/manoelcampos/cloudsim-plus/blob/master/docs/cloudsim-plus-white-paper.pdf
- 18. El Bakely, A.H., Hefny, H.A.: Using shortest job first scheduling in greencloud computing. Int. J. Adv. Res. Comput. Commun. Eng. 4, 348–354 (2015)
- 20. Bahwaireth, K., Tawalbeh, L., Benkhelifa, E., et al.: EURASIP J. Info. Secur. 2016, 15 (2016). https://doi.org/10.1186/s13635-016-0039-y
- 21. Pinheiro, E., Bianchini, R., et al.: Load balancing and unbalancing for power and performance in cluster-based systems. In: Proceedings of the Workshop on Compilers and Operating Systems for Low Power, pp. 182–195 (2001)
- 22. Filelis-Papadopoulos, C.K., Giannoutakis, K.M., Gravvanis, G.A., et al.: J. Supercomput. 74, 530 (2018). https://doi.org/10.1007/s11227-017-2143-2