Abstract
Host load prediction is essential in computing for improving resource utilization and meeting service level agreements. However, because of variations in load and inefficient feature extraction, predicting the load on hosts in fog computing is an immense challenge. A predictive model that accounts for variable load patterns can better estimate future resource needs, which is crucial for capacity planning, service-level goals and energy efficiency. To improve workload prediction accuracy, the proposed framework uses a time-series-based multivariate ensemble model together with anomaly detection techniques. In the proposed work, virtual machines are deployed on different platforms and numerous parameters are extracted, including CPU utilization, number of cores, RAM, allocated memory, available memory, disk I/O and network I/O. The enormous volume of data may introduce inconsistencies in load prediction. To reduce redundancy in the data, various anomaly detection techniques are applied. The performance of the proposed ensemble model is compared with that of various time series models using Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE) and accuracy. The effectiveness of the proposed ensemble model is demonstrated on the generated dataset. The ensemble model predicts workload more accurately than the current state-of-the-art models: it achieves the lowest MAPE and an accuracy of about 88%.
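As an illustration of the error metrics named above, the following minimal sketch computes MAE, MSE, RMSE and MAPE from a series of actual and predicted values using their standard textbook definitions. The function name `regression_errors` and the sample readings are hypothetical and are not taken from the study. The abstract does not state how accuracy is computed, so it is left out of the sketch.

```python
import math


def regression_errors(actual, predicted):
    """Return MAE, MSE, RMSE and MAPE (in percent) for two equal-length lists."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # MAPE divides by the actual values, so it is undefined if any of them is zero.
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")

    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}


# Hypothetical CPU-utilization readings (percent), for illustration only.
actual = [50.0, 40.0, 80.0, 100.0]
predicted = [45.0, 44.0, 72.0, 110.0]
metrics = regression_errors(actual, predicted)
print(metrics)
```

In this toy series every prediction is off by 10% of the actual value, so MAPE is 10% even though the absolute errors differ. That scale independence is one reason MAPE is often reported next to MAE and RMSE when series of different magnitudes are compared.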
Data availability
No datasets were generated or analysed during the current study.
Funding
The authors have not disclosed any funding.
Author information
Contributions
1. Shabnam Bawa: Conceptualization, Methodology, Programming, Formal Analysis and Validation, Writing - Original Draft, Writing - Review and Editing.
2. Prashant Singh Rana: Supervision, Methodology, Review and Editing.
3. Rajkumar Tekchandani: Supervision, Methodology, Review and Editing.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
About this article
Cite this article
Bawa, S., Rana, P.S. & Tekchandani, R. Multivariate time series ensemble model for load prediction on hosts using anomaly detection techniques. Cluster Comput (2024). https://doi.org/10.1007/s10586-024-04517-0
DOI: https://doi.org/10.1007/s10586-024-04517-0