On the efficiency of several VM provisioning strategies for workflows with multi-threaded tasks on clouds

Abstract

Cloud computing promises on-demand, pay-per-use access to unlimited resources. Using these resources requires more than simple access to them, since most clients must satisfy constraints on cost and time. Scheduling heuristics have therefore been devised to optimize the placement of client tasks on allocated virtual machines. Applications can be roughly divided into two categories: independent bag-of-tasks and workflows. In this paper we focus on the latter and investigate a less studied problem, namely the effect of the virtual machine allocation policy on the scheduling outcome. To this end we look at how workflow structure, execution time, and virtual machine instance type affect the efficiency of the provisioning method when cost and makespan are considered. To aid our study we devised a mathematical model for cost and makespan when single or multiple instance types are used. While the model allows us to determine the boundaries for two of our extreme methods, the complexity of workflow applications calls for a more experimental approach to determine the general relation. For this purpose we considered synthetically generated workflows that cover a wide range of possible cases. The results show the need for probabilistic selection methods when execution times are small and heterogeneous, whereas for large homogeneous execution times the best algorithm is clearly identifiable. We also draw several other conclusions regarding the efficiency of powerful instance types compared to weaker ones, and of dynamic methods compared to static ones.

Acknowledgments

Work partially supported by the French ANR project SONGS 11-INFRA-13.

Author information

Corresponding author

Correspondence to Marc E. Frincu.

About this article

Cite this article

Frincu, M.E., Genaud, S. & Gossa, J. On the efficiency of several VM provisioning strategies for workflows with multi-threaded tasks on clouds. Computing 96, 1059–1086 (2014). https://doi.org/10.1007/s00607-014-0410-0

Keywords

  • Workflow scheduling
  • Virtual machine provisioning
  • Cloud computing
  • Cost and makespan modeling

Mathematics Subject Classification (2010)

  • 68M14
  • 68M20