A multi-level scheduler for batch jobs on grids

The Journal of Supercomputing

Abstract

This paper proposes a two-level scheduler for dynamically scheduling a continuous stream of sequential and multi-threaded batch jobs on grids made up of interconnected clusters of heterogeneous single-processor and/or symmetric multiprocessor machines. The scheduler aims to schedule arriving jobs so that their computational and deadline requirements are respected while the usage of hardware and software resources is optimized. At the top of the hierarchy, a lightweight meta-scheduler (MS) classifies incoming jobs according to their requirements and distributes them among the underlying resources, balancing the workload. At cluster level, a Flexible Backfilling algorithm carries out the job-to-machine associations by exploiting dynamic information about the environment. Scheduling decisions at both levels are based on job priorities computed using different sets of heuristics. The different proposals have been compared through simulations. Performance figures show the feasibility of our approach.
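
To make the two-level structure concrete, the following is a minimal Python sketch written under stated assumptions; it is not the authors' implementation. The names `Job`, `Cluster`, `priority`, `meta_schedule`, and `select_jobs` are hypothetical, the slack-based priority is only a stand-in for the paper's heuristics (which the abstract does not specify), and the cluster-level step is a plain priority-ordered greedy fill, a simplification rather than the Flexible Backfilling algorithm itself (it keeps no reservation for the highest-priority job).

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    cpus: int        # processors requested
    runtime: float   # estimated execution time
    deadline: float  # absolute deadline


@dataclass
class Cluster:
    name: str
    total_cpus: int
    queue: list = field(default_factory=list)

    def load(self) -> float:
        # Pending work (cpu-time) normalized by cluster size.
        return sum(j.cpus * j.runtime for j in self.queue) / self.total_cpus


def priority(job: Job, now: float) -> float:
    # Assumed heuristic: the less slack before the deadline, the higher the priority.
    slack = job.deadline - now - job.runtime
    return -slack


def meta_schedule(job: Job, clusters: list) -> Cluster:
    # Top level: send the job to the least-loaded cluster large enough to run it.
    eligible = [c for c in clusters if c.total_cpus >= job.cpus]
    if not eligible:
        raise ValueError(f"no cluster can host job {job.name}")
    target = min(eligible, key=Cluster.load)
    target.queue.append(job)
    return target


def select_jobs(cluster: Cluster, free_cpus: int, now: float) -> list:
    # Cluster level: scan the queue in priority order and start every job
    # that fits in the processors that are currently free.
    started = []
    for job in sorted(cluster.queue, key=lambda j: priority(j, now), reverse=True):
        if job.cpus <= free_cpus:
            started.append(job)
            free_cpus -= job.cpus
    for job in started:
        cluster.queue.remove(job)
    return started
```

In this sketch the meta-scheduler only needs each cluster's size and aggregate load, which keeps it lightweight, while the per-cluster step is where dynamic information (here, the number of free processors and the current time) enters the decision.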


Corresponding author

Correspondence to Ranieri Baraglia.

About this article

Cite this article

Pasquali, M., Baraglia, R., Capannini, G. et al. A multi-level scheduler for batch jobs on grids. J Supercomput 57, 81–98 (2011). https://doi.org/10.1007/s11227-011-0571-y

  • DOI: https://doi.org/10.1007/s11227-011-0571-y
