Performability Evaluation and Optimization of Workflow Applications in Cloud Environments

  • Danilo Oliveira
  • André Brinkmann
  • Nelson Rosa
  • Paulo Maciel


Given their dynamic provisioning and the illusion of unlimited resources, clouds are becoming a popular alternative for running scientific workflows. In a cloud system that processes workflow applications, performance is heavily influenced by two factors: the scheduling strategy and component failures. Failures in a cloud system can affect several users simultaneously and reduce the number of available computing resources. A poor scheduling strategy can increase the expected makespan and the idle time of physical machines. In this paper, we propose an optimization method for scheduling scientific workflows on cloud systems. The method couples a meta-heuristic algorithm with a performability model that provides the fitness values of the explored solutions. To represent the combined effect of scheduling and component failures, we adopted discrete event simulation for the performability model. Experimental results show the effectiveness of the hybrid simulation-optimization approach for optimizing the number of allocated virtual machines and the scheduling of tasks with respect to performability.
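The coupling described above, in which a search procedure asks a stochastic simulation for the fitness of each candidate, can be illustrated with a minimal sketch. This is not the authors' model: it replaces the discrete event simulation with a simple Monte Carlo run over independent tasks, and the meta-heuristic with plain hill climbing. The task durations, `MTTF`, `MTTR`, `VM_COST`, and all function names are hypothetical values and names chosen for illustration only.

```python
import random

# Hypothetical toy workload: durations (hours) of independent tasks.
TASKS = [4.0, 3.0, 2.0, 2.0, 1.0, 1.0]
MTTF, MTTR = 50.0, 2.0  # assumed mean time to failure / repair of a VM (hours)
VM_COST = 0.5           # assumed penalty per allocated VM, in makespan units


def simulate_makespan(assignment, n_vms, rng):
    """One simulation run. Each VM executes its tasks sequentially; a failure
    during a task adds a repair delay and the task restarts from scratch."""
    finish = [0.0] * n_vms
    for task, vm in enumerate(assignment):
        duration = TASKS[task]
        while True:
            time_to_failure = rng.expovariate(1.0 / MTTF)
            if time_to_failure >= duration:
                finish[vm] += duration  # task completed without failure
                break
            # Work is lost: pay the elapsed time plus an exponential repair time.
            finish[vm] += time_to_failure + rng.expovariate(1.0 / MTTR)
    return max(finish)


def fitness(solution, runs=200, seed=0):
    """Estimated mean makespan plus a per-VM penalty (lower is better).
    The fixed seed gives every candidate the same random stream."""
    n_vms, assignment = solution
    rng = random.Random(seed)
    total = sum(simulate_makespan(assignment, n_vms, rng) for _ in range(runs))
    return total / runs + VM_COST * n_vms


def neighbour(solution, rng, max_vms=4):
    """Perturb a solution: either change the VM count or move one task."""
    n_vms, assignment = solution
    assignment = list(assignment)
    if rng.random() < 0.3:
        n_vms = min(max_vms, max(1, n_vms + rng.choice([-1, 1])))
        assignment = [vm % n_vms for vm in assignment]  # keep indices valid
    else:
        assignment[rng.randrange(len(assignment))] = rng.randrange(n_vms)
    return n_vms, assignment


def hill_climb(iterations=300, seed=1):
    """Accept a neighbour only if its simulated fitness strictly improves."""
    rng = random.Random(seed)
    best = (1, [0] * len(TASKS))  # start with every task on a single VM
    best_fit = fitness(best)
    for _ in range(iterations):
        candidate = neighbour(best, rng)
        candidate_fit = fitness(candidate)
        if candidate_fit < best_fit:
            best, best_fit = candidate, candidate_fit
    return best, best_fit


if __name__ == "__main__":
    (n_vms, assignment), value = hill_climb()
    print(f"VMs: {n_vms}, assignment: {assignment}, fitness: {value:.2f}")
```

The sketch encodes a solution as a VM count plus a task-to-VM assignment, so the search trades off the number of allocated machines against the expected makespan under failures. Task precedence constraints and shared physical hosts, which a workflow model would need, are deliberately left out.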


Scientific workflows · Performability · Stochastic Petri nets · Optimization






Copyright information

© Springer Nature B.V. 2019

Authors and Affiliations

  1. Informatics Center, Federal University of Pernambuco, Recife, Brazil
  2. Data Processing Center (ZDV), Johannes Gutenberg University, Mainz, Germany
