Multiple Workflow Scheduling Strategies with User Run Time Estimates on a Grid

Journal of Grid Computing

Abstract

In this paper, we present an experimental study of deterministic non-preemptive multiple workflow scheduling strategies on a Grid. We distinguish twenty-five strategies depending on the type and amount of information they require. We analyze scheduling strategies that consist of two and four stages: labeling, adaptive allocation, prioritization, and parallel machine scheduling. We apply these strategies to the execution of the Cybershake, Epigenomics, Genome, Inspiral, LIGO, Montage, and SIPHT workflow applications. To compare performance, we perform a joint analysis of three metrics. A case study is given, and the corresponding results indicate that well-known DAG scheduling algorithms designed for single-DAG and single-machine settings are not well suited to Grid scheduling scenarios where user run time estimates are available. We show that the proposed new strategies outperform other strategies in terms of approximation factor, mean critical path waiting time, and critical path slowdown. The robustness of these strategies is also discussed.
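The abstract names critical path waiting time and critical path slowdown without defining them on this page. As a rough illustration only, the Python sketch below assumes one common reading: the critical path is the longest run-time-weighted path through a workflow DAG, the waiting time is the workflow's time in the system beyond that path length, and the slowdown is one plus the ratio of the two. These definitions, the function names, and the dictionary layout are assumptions made for the example, not the article's formulation.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path_length(runtimes, preds):
    """Longest run-time-weighted path through a workflow DAG.

    runtimes: task -> estimated run time (hypothetical input layout)
    preds:    task -> iterable of predecessor tasks
    """
    graph = {task: set(preds.get(task, ())) for task in runtimes}
    finish = {}
    # static_order() yields every task after all of its predecessors.
    for task in TopologicalSorter(graph).static_order():
        start = max((finish[p] for p in graph[task]), default=0.0)
        finish[task] = start + runtimes[task]
    return max(finish.values())


def critical_path_waiting_time(release, completion, cp):
    """Assumed definition: time in the system beyond the critical path."""
    return completion - release - cp


def critical_path_slowdown(release, completion, cp):
    """Assumed definition: 1 + waiting time relative to the critical path."""
    return 1.0 + critical_path_waiting_time(release, completion, cp) / cp


# Diamond-shaped workflow: a -> {b, c} -> d
runtimes = {"a": 2, "b": 3, "c": 5, "d": 1}
preds = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}

cp = critical_path_length(runtimes, preds)                  # 8.0 (a, c, d)
cpw = critical_path_waiting_time(0.0, 12.0, cp)             # 4.0
cps = critical_path_slowdown(0.0, 12.0, cp)                 # 1.5
```

Under these assumed definitions, a slowdown of 1.0 would mean the workflow finished as fast as its critical path allows; larger values reflect time spent waiting on resources or on other workflows.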



Author information

Corresponding author

Correspondence to Andrei Tchernykh.


About this article

Cite this article

Hirales-Carbajal, A., Tchernykh, A., Yahyapour, R. et al. Multiple Workflow Scheduling Strategies with User Run Time Estimates on a Grid. J Grid Computing 10, 325–346 (2012). https://doi.org/10.1007/s10723-012-9215-6


  • DOI: https://doi.org/10.1007/s10723-012-9215-6
