Abstract
Workflow scheduling has been widely studied in cloud and edge computing. The goal is to exploit cloud-edge resources to execute workflow tasks subject to several constraints and optimization objectives. In the era of the Internet of Things, however, the load of each computing task and the amount of data transferred between tasks fluctuate, which changes the original workflow and calls for a new scheduling plan. Existing methods struggle to respond quickly to these dynamic changes, and few studies apply neural networks to workflow scheduling. To bridge this gap, we propose a supervised learning method that uses the function-fitting capability of neural networks to map a workflow and its environment to an optimal scheduling plan. The approach has two steps. The first step generates a dataset and trains a seq2seq-based prediction model: we develop an algorithm that generates a large number of workflow instances while ensuring dataset diversity through complexity estimation; we then apply three GA-based optimization methods (GA, NSGA, and NSGA-NN) to search for optimal solutions; finally, we construct a dataset of pairs {workflow, environment configuration \(\rightarrow\) obtained optimal solution} and train the seq2seq-based model on it. The second step generates scheduling plans in real time using the trained model. Simulation experiments confirm that the method is both effective and efficient: it adapts to changes in the execution environment, workflow task load, and task data transmission, and schedules tasks in real time. The simulation results show that the seq2seq-based prediction method can approach 90% of the optimal scheme.
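As a rough illustration of the dataset-construction step, the following minimal Python sketch generates random workflows, searches for a good task-to-resource assignment with a plain genetic algorithm, and stores each {workflow, environment} input with the plan found. Everything in it is an assumption made for illustration and is not taken from the paper: the function names (`random_workflow`, `makespan`, `ga_search`, `build_dataset`), the makespan-only cost model, the value ranges, and the flat input encoding. It does not implement the complexity-based diversity control, NSGA, NSGA-NN, or the seq2seq model itself, and the GA labels are near-optimal at best.

```python
import random

def random_workflow(n_tasks, edge_prob, rng):
    """Random DAG: edges only run from a lower to a higher task index, so no cycles."""
    loads = [rng.uniform(1.0, 10.0) for _ in range(n_tasks)]  # assumed load range
    edges = {}  # (u, v) -> amount of data sent from task u to task v
    for u in range(n_tasks):
        for v in range(u + 1, n_tasks):
            if rng.random() < edge_prob:
                edges[(u, v)] = rng.uniform(0.5, 5.0)  # assumed data range
    return loads, edges

def makespan(assign, loads, edges, speeds, bandwidth):
    """Finish time when tasks run in index order and each resource runs one task at a time."""
    preds = {}
    for (u, v), data in edges.items():
        preds.setdefault(v, []).append((u, data))
    finish = [0.0] * len(loads)
    free_at = [0.0] * len(speeds)
    for t in range(len(loads)):  # index order is a valid topological order
        r = assign[t]
        ready = 0.0
        for u, data in preds.get(t, []):
            # Data transfer costs time only when the two tasks sit on different resources.
            transfer = 0.0 if assign[u] == r else data / bandwidth
            ready = max(ready, finish[u] + transfer)
        start = max(ready, free_at[r])
        finish[t] = start + loads[t] / speeds[r]
        free_at[r] = finish[t]
    return max(finish)

def ga_search(loads, edges, speeds, bandwidth, rng,
              pop_size=30, generations=40, mut_rate=0.1):
    """Single-objective GA with elitism, one-point crossover, and per-gene mutation."""
    n, m = len(loads), len(speeds)

    def cost(plan):
        return makespan(plan, loads, edges, speeds, bandwidth)

    pop = [[rng.randrange(m) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        elite = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)  # requires n >= 2
            child = a[:cut] + b[cut:]
            child = [rng.randrange(m) if rng.random() < mut_rate else g for g in child]
            children.append(child)
        pop = elite + children
    best = min(pop, key=cost)
    return best, cost(best)

def build_dataset(n_samples, n_tasks, speeds, bandwidth, seed=0):
    """Return (source, plan) pairs: flattened workflow + environment, and the GA plan."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(n_samples):
        loads, edges = random_workflow(n_tasks, 0.3, rng)
        plan, _ = ga_search(loads, edges, speeds, bandwidth, rng)
        # Source sequence: task loads, upper-triangular data matrix, resource speeds.
        dense = [edges.get((u, v), 0.0)
                 for u in range(n_tasks) for v in range(u + 1, n_tasks)]
        dataset.append((loads + dense + list(speeds), plan))
    return dataset

if __name__ == "__main__":
    for source, plan in build_dataset(3, 5, [1.0, 2.0, 4.0], bandwidth=2.0):
        print(len(source), plan)
```

In a pipeline of this shape, the resulting (source, plan) pairs would serve as the training data for a sequence-to-sequence model, so that at run time a new plan comes from a single forward pass rather than a fresh evolutionary search.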
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Funding
This work is supported by the International Cooperation and Exchange Program of the National Natural Science Foundation of China (Grant no. 62061136006) and the National Natural Science Foundation of China (Grant no. 61832004).
Author information
Contributions
ZY: Conceptualization, Methodology; MZ: Data curation, Software, Writing—Original draft preparation; HL: Validation, Visualization; WD: Validation, Writing—Reviewing and Editing; All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Yang, Z., Zhang, M., Li, H. et al. A novel seq2seq-based prediction approach for workflow scheduling. Cluster Comput 27, 1897–1910 (2024). https://doi.org/10.1007/s10586-023-04061-3
DOI: https://doi.org/10.1007/s10586-023-04061-3