Abstract
One of the challenges in managing cloud computing clusters is assigning resources according to customers’ needs. For this to work efficiently, enough resources must be reserved to maintain continuous operation, but not so many that they incur unnecessary costs. In addition, because acquiring resources takes time, they must be reserved sufficiently far in advance. This paper presents a novel, reliable, general-purpose mechanism for prediction-based resource usage reservation. The proposed solution is designed to operate for long periods of time without drift-related problems and to adapt dynamically to changes in system usage. To achieve this, a novel signature-based ensemble prediction method is presented. It combines multiple distinct prediction algorithms, each suited to different use cases, with an anomaly detection mechanism that improves prediction accuracy, so that the mechanism can operate efficiently in different real-life scenarios. A novel signature-based selection algorithm makes it possible to use the best available prediction algorithm for each use case, even over long periods of operation, during which drift would typically occur. The proposed approach has been evaluated using real-life historical data from various production servers, including traces from more than 1,500 machines collected over more than a year. Experimental results demonstrate an increase in prediction accuracy of up to 21.4 percent over the neural network approach. The evaluation highlights the importance of choosing an appropriate prediction method, especially in diverse scenarios where the load changes frequently.
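The idea of signature-based predictor selection can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how signatures are built or compared, so the histogram signature, the use of Jensen-Shannon divergence as the distance, and every name below (`signature`, `js_divergence`, `select_predictor`, the two toy predictors) are assumptions made purely for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

Predictor = Callable[[Sequence[float]], float]


def signature(window: Sequence[float], bins: int = 8,
              lo: float = 0.0, hi: float = 100.0) -> List[float]:
    """Hypothetical signature: normalized histogram of usage values in [lo, hi]."""
    if not window:
        raise ValueError("window must not be empty")
    counts = [0] * bins
    for v in window:
        idx = int((v - lo) / (hi - lo) * bins)
        counts[min(bins - 1, max(0, idx))] += 1
    return [c / len(window) for c in counts]


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x: Sequence[float], y: Sequence[float]) -> float:
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Two toy predictors standing in for the ensemble members (assumed, not from the paper).
def mean_value(window: Sequence[float]) -> float:
    return sum(window) / len(window)


def last_value(window: Sequence[float]) -> float:
    return window[-1]


def select_predictor(window: Sequence[float],
                     library: List[Tuple[str, List[float], Predictor]]
                     ) -> Tuple[str, Predictor]:
    """Pick the predictor whose stored signature is closest to the window's signature."""
    sig = signature(window)
    name, _, predictor = min(library, key=lambda e: js_divergence(sig, e[1]))
    return name, predictor


# Library: each entry pairs a signature seen in the past with the predictor
# that is assumed to have worked best on data with that signature.
library = [
    ("mean", signature([50, 52, 51, 53]), mean_value),   # steady load
    ("last", signature([5, 95, 10, 90]), last_value),    # bursty load
]

steady_name, steady_predictor = select_predictor([51, 54, 52, 55], library)
bursty_name, _ = select_predictor([8, 92, 6, 97], library)
steady_forecast = steady_predictor([51, 54, 52, 55])
```

In this sketch the steady window is matched to the averaging predictor and the bursty window to the last-value predictor. Re-running the selection on each new window is one simple way a system could keep following usage changes over time, which is the behaviour the abstract attributes to the signature-based selection.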
Data Availability
The data that supports the findings of this study is available from Ringier Axel Springer Poland and Nokia, but restrictions apply to the availability of this data, which was used under license for the current study, and thus is not publicly available. The data is, however, available from the authors upon reasonable request and with permission of Ringier Axel Springer Poland and Nokia.
Acknowledgements
The research presented in this paper was supported by funds from the Polish Ministry of Science and Higher Education allocated to the AGH University of Krakow. The authors would like to thank Ringier Axel Springer Poland and Nokia for providing the data used in the tests.
Funding
The research presented in this paper was supported by funds from the Polish Ministry of Science and Higher Education allocated to the AGH University of Krakow.
Author information
Contributions
W.S.: Methodology, Software, Investigation, Writing - original draft. P.N.: Supervision, Conceptualization, Methodology, Writing - review & editing.
Ethics declarations
Ethics Approval and Consent to Participate
Not applicable.
Consent for Publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Sus, W., Nawrocki, P. Signature-based Adaptive Cloud Resource Usage Prediction Using Machine Learning and Anomaly Detection. J Grid Computing 22, 46 (2024). https://doi.org/10.1007/s10723-024-09764-4
DOI: https://doi.org/10.1007/s10723-024-09764-4