Abstract
Many real-world applications executed on distributed computing systems are organized as directed acyclic graphs (DAGs) of tasks. The algorithm used to schedule these tasks across the system has a substantial impact on the achieved performance. Although researchers have proposed numerous DAG scheduling algorithms, there is a lack of benchmarks that evaluate such algorithms on a set of real application instances under realistic conditions. As a result, developers of runtime systems often resort to simple but inefficient algorithms. In this work we aim to fill this gap by proposing a benchmark for evaluating DAG scheduling algorithms, based on a set of 150 real-world workflow instances with up to 1695 tasks and 10 realistic cluster configurations with multi-core machines. We apply this benchmark to evaluate 16 scheduling algorithms, including well-known static algorithms and dynamic algorithms commonly used in practice. The results demonstrate that the proposed benchmark makes it possible to clearly separate the algorithms, compare their performance from different angles, and make important observations.
Acknowledgments
This work is supported by the Russian Science Foundation (project 22-21-00812).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Sukhoroslov, O., Gorokhovskii, M. (2023). Benchmarking DAG Scheduling Algorithms on Scientific Workflow Instances. In: Voevodin, V., Sobolev, S., Yakobovskiy, M., Shagaliev, R. (eds) Supercomputing. RuSCDays 2023. Lecture Notes in Computer Science, vol 14389. Springer, Cham. https://doi.org/10.1007/978-3-031-49435-2_1
DOI: https://doi.org/10.1007/978-3-031-49435-2_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-49434-5
Online ISBN: 978-3-031-49435-2
eBook Packages: Computer Science, Computer Science (R0)