Abstract
Scientific workflows are becoming increasingly important for complex scientific applications. Conducting real experiments for large-scale workflows is challenging because they are very expensive and time-consuming. Simulation is an alternative to real experiments that can help evaluate the performance of workflow management systems (WMSs) and optimise workflow management techniques. Although several workflow simulators are available today, they are often user-oriented and treat the cloud as a black box. This prevents the evaluation of the infrastructure-level impact of the various decisions made by WMSs. To address these issues, we have developed a WMS simulator, called DISSECT-CF-WMS, built on DISSECT-CF, which exposes the internal details of cloud infrastructures. DISSECT-CF-WMS enables better energy awareness by allowing the study of schedulers for physical machines. It also enables dynamic provisioning to meet the resource needs of the workflow application while considering the provisioning delay of a VM in the cloud. We evaluated our simulation extension by running several workflow applications on a given infrastructure. The experimental results show that we can investigate different schedulers for physical machines with different numbers of virtual machines to reduce energy consumption. The experiments also show that DISSECT-CF-WMS is up to 295× faster than WorkflowSim while still providing equivalent results. The auto-scaling experiments show that it can optimise makespan, energy consumption and VM utilisation compared with static VM provisioning.
Data Availability
The application developed in this manuscript is publicly available under an open source license at https://github.com/kecskemeti/dissect-cf-examples.
Funding
Open access funding provided by University of Miskolc. This work was supported in part by the Hungarian Scientific Research Fund under Grant agreement OTKA FK 131793.
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Al-Haboobi, A., Kecskemeti, G. Developing a Workflow Management System Simulation for Capturing Internal IaaS Behavioural Knowledge. J Grid Computing 21, 2 (2023). https://doi.org/10.1007/s10723-022-09638-7
DOI: https://doi.org/10.1007/s10723-022-09638-7