Analyzing the impact of various parameters on job scheduling in the Google cluster dataset

Shahmirzadi, Danyal; Khaledian, Navid; Rahmani, Amir Masoud

doi:10.1007/s10586-024-04377-8

Published: 29 March 2024

Cluster Computing (2024)

Abstract

Cloud architecture and its operation interest both general consumers and researchers, and Google, as a technology giant, offers cloud services globally. This paper analyzes the Google cluster usage trace, focusing on three aspects: task execution times, rescheduling frequency, and the relationship between task priority and rescheduling. First, we examine how memory and processor performance affect task execution times across different machines. Next, we investigate how the number of task constraints influences rescheduling frequency and the overall efficiency of the environment. We then analyze how task priority affects rescheduling and explore its correlation with task constraints. The results reveal that doubling the memory size can accelerate tasks by a factor of nine and that 90% of rescheduling is associated with tasks having fewer than seven constraints. By identifying bottlenecks in the Google cluster dataset, we aim to improve data center performance and provide recommendations for all cloud service providers. Our key findings indicate that memory plays a more significant role than the processor, and that tasks with more constraints have a less pronounced impact on rescheduling than anticipated.

Author information

Authors and Affiliations

Graduate School of Engineering Science and Technology, National Yunlin University of Science and Technology, 123 University Road, Douliou, 64002, Yunlin, Taiwan
Danyal Shahmirzadi
Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, Esch-sur-Alzette, Luxembourg
Navid Khaledian
Future Technology Research Center, National Yunlin University of Science and Technology, Douliou, Yunlin, Taiwan
Amir Masoud Rahmani

Contributions

D.S.: writing original draft, methodology, software; N.K.: preparation, visualization; A.M.R.: conceptualization, validation, supervision, and investigation. All authors reviewed the manuscript.

Corresponding author

Correspondence to Amir Masoud Rahmani.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shahmirzadi, D., Khaledian, N. & Rahmani, A.M. Analyzing the impact of various parameters on job scheduling in the Google cluster dataset. Cluster Comput (2024). https://doi.org/10.1007/s10586-024-04377-8

Received: 28 November 2023
Revised: 20 February 2024
Accepted: 22 February 2024
Published: 29 March 2024
DOI: https://doi.org/10.1007/s10586-024-04377-8
