A probabilistic and adaptive scheduling algorithm using system-generated predictions for inter-grid resource sharing

The Journal of Supercomputing

Abstract

The rapid advancement and ready availability of Grid technologies have encouraged many businesses and researchers to establish Virtual Organizations (VOs) and to use their available desktop resources to solve compute-intensive problems. These VOs, however, operate as disjoint, independent communities with no resource sharing between them. In previous work, we proposed a fully decentralized and reconfigurable Inter-Grid framework for resource sharing among such distributed and autonomous Grid systems (Rao et al. in ICCSA, 2006). The central problem in such a system of collaborating Grids is resource scheduling, because the distributed and autonomous nature of the underlying Grid entities leaves very little knowledge about resource availability. In this paper, we propose a probabilistic and adaptive scheduling algorithm that uses system-generated predictions for Inter-Grid resource sharing while keeping the collaborating Grid systems autonomous and independent. We first obtain system-generated job runtime estimates without actually submitting jobs to the target Grid system. These estimates are then used to predict the feasibility of scheduling a job on the target system. Furthermore, the proposed algorithm adapts itself to the actual behavior and performance of the resources. Simulation results are presented to discuss the correctness and accuracy of the proposed algorithm.

References

  1. Rao I, Huh EN, Lee S, Chung T (2006) Distributed, scalable and reconfigurable inter-grid resource sharing framework. In: International conference on computational science and its applications, ICCSA 2006

  2. Foster I, Kesselman C, Tuecke S (2001) The anatomy of the grid: enabling scalable virtual organizations

  3. de Assuncao MD, Buyya R (2007) A resource exchange mechanism for peak load management in intergrid environments. Technical report, The University of Melbourne

  4. GN-CG (2006) Grid interoperability now community group. http://forge.ogf.org/sf/projects/gin

  5. AuYoung A, Chun B, Snoeren A, Vahdat A (2004) Resource allocation in federated distributed computing infrastructures. In: Proceedings of the 1st workshop on operating system and architectural support for the on-demand IT infrastructure

  6. Wu Y, Wu S, Yu H, Hu C (2005) Cgsp: an extensible and reconfigurable grid framework. In: APPT ’05: proceedings of 6th international workshop on advanced parallel processing technologies. IEEE Computer Society, Los Alamitos, pp 292–300

  7. Ranjan R, Buyya R, Harwood A (2004) A model for cooperative federation of distributed clusters. In: HPDC ’04: the 14th IEEE international symposium on high performance distributed computing. IEEE Computer Society, Los Alamitos

  8. Litzkow M, Livny M, Mutka M (1988) Condor—a hunter of idle workstations. In: Proceedings of the 8th international conference of distributed computing systems

  9. Henderson RL (1995) Job scheduling under the portable batch system. In: IPPS ’95: proceedings of the workshop on job scheduling strategies for parallel processing. Springer, London, pp 279–294

  10. Gentzsch W (2001) Sun grid engine: towards creating a compute power grid. In: CCGRID ’01: proceedings of the 1st international symposium on cluster computing and the grid. IEEE Computer Society, Washington, p 35

  11. Schopf J (2001) Ten actions when superscheduling. Technical Report GFD-I. 4, The Global Grid Forum

  12. Abramson D, Buyya R, Giddy J (2002) A computational economy for grid computing and its implementation in the nimrod-g resource broker. Future Gener Comput Syst 18:1061–1074

  13. Frey J, Tannenbaum T, Livny M, Foster I, Tuecke S (2002) Condor-g: a computation management agent for multi-institutional grids. Cluster Comput 5:237–246

  14. Venugopal S, Buyya R, Winton L (2004) A grid service broker for scheduling distributed data-oriented applications on global grids. In: MGC ’04: proceedings of the 2nd workshop on middleware for grid computing. ACM Press, New York, pp 75–80

  15. Chapin SJ, Katramatos D, Karpovich JF, Grimshaw AS (1999) The legion resource management system. In: IPPS/SPDP ’99/JSSPP ’99: proceedings of the job scheduling strategies for parallel processing. Springer, London, pp 162–178

  16. Butt AR, Zhang R, Hu YC (2006) A self-organizing flock of condors. J Parallel Distrib Comput 66:145–161

  17. Casanova H, Obertelli G, Berman F, Wolski R (2000) The apples parameter sweep template: user-level middleware for the grid. In: Supercomputing ’00: proceedings of the 2000 ACM/IEEE conference on supercomputing (CDROM). IEEE Computer Society, Washington, p 60

  18. da Silva D, Cirne W, Brasileiro FV (2004) Trading cycles for information: using replication to schedule bag-of-tasks applications on computational grids. Springer, Berlin

  19. Subramani V, Kettimuthu R, Srinivasan S, Sadayappan P (2002) Distributed job scheduling on computational grids using multiple simultaneous requests. In: HPDC ’02: proceedings of the 11th IEEE international symposium on high performance distributed computing. IEEE Computer Society, Washington, pp 359–368

  20. Elizeu SN, Cirne W, Brasileiro F, Lima A (2005) Exploiting replication and data reuse to efficiently schedule data-intensive applications on grids, vol 3277. Springer, Berlin

  21. Maheswaran M, Ali S, Siegel HJ, Hensgen D, Freund RF (1999) Dynamic mapping of a class of independent tasks onto heterogeneous computing systems. J Parallel Distrib Comput 59:107–131

  22. Yeo CS, Buyya R (2006) Managing risk of inaccurate runtime estimates for deadline constrained job admission control in clusters. In: Proceedings of the 35th international conference on parallel processing (ICPP 2006), Columbus, OH. IEEE Computer Society, Los Alamitos, pp 451–458

  23. Feitelson DG, Nitzberg B (1995) Job characteristics of a production parallel scientific workload on the NASA Ames iPSC/860. In: IPPS ’95 workshop on job scheduling strategies for parallel processing, vol 949. Springer, Berlin, pp 337–360

  24. Nguyen TD, Vaswani R, Zahorjan J (1996) Using runtime measured workload characteristics in parallel processor scheduling. In: IPPS ’96: proceedings of the workshop on job scheduling strategies for parallel processing. Springer, London, pp 155–174

  25. Tsafrir D, Etsion Y, Feitelson DG (2007) Backfilling using system-generated predictions rather than user runtime estimates. IEEE Trans Parallel Distrib Syst 18:789–803

  26. Zhou S, Zheng X, Wang J, Delisle P (1993) Utopia: a load sharing facility for large, heterogeneous distributed computer systems. Softw Pract Exp 23:1305–1336

  27. Fitzgerald S (2001) Grid information services for distributed resource sharing. In: HPDC ’01: proceedings of the 10th IEEE international symposium on high performance distributed computing. IEEE Computer Society, Washington, p 181

  28. Aggarwal AK, Kent RD (2005) An adaptive generalized scheduler for grid applications. In: HPCS ’05: proceedings of the 19th international symposium on high performance computing systems and applications. IEEE Computer Society, Washington, pp 188–194

  29. Mateescu G (2003) Quality of service on the grid via metascheduling with resource co-scheduling and co-reservation. Int J High Perform Comput Appl 17:209–218

  30. Jang SH, Wu X, Taylor V (2004) Using performance prediction to allocate grid resources. Technical report, GriPhyN

  31. Sun XH, Wu M (2003) Grid harvest service: a system for long-term, application-level task scheduling. In: IPDPS ’03: proceedings of the 17th international symposium on parallel and distributed processing. IEEE Computer Society, Washington, p 25.1

  32. Derbal Y (2006) A probabilistic scheduling heuristic for computational grids. Multiagent Grid Syst 2:45–59

  33. Yang L, Foster I, Schopf JM (2003) Homeostatic and tendency-based cpu load predictions. In: IPDPS ’03: proceedings of the 17th international symposium on parallel and distributed processing. IEEE Computer Society, Washington

  34. Yang L, Schopf JM, Foster I (2003) Conservative scheduling: using predicted variance to improve scheduling decisions in dynamic environments

  35. Gibbons R (1997) A historical application profiler for use by parallel schedulers. In: IPPS ’97: proceedings of the job scheduling strategies for parallel processing. Springer, London, pp 58–77

  36. Downey AB (1997) Predicting queue times on space-sharing parallel computers. In: IPPS ’97: proceedings of the 11th international symposium on parallel processing. IEEE Computer Society, Washington, pp 209–218

  37. Iverson MA, Özgüner F, Potter L (1999) Statistical prediction of task execution times through analytic benchmarking for scheduling in a heterogeneous environment. IEEE Trans Comput 48:1374–1379

  38. Kapadia NH, Fortes JAB, Brodley CE (1999) Predictive application-performance modeling in a computational grid environment. In: HPDC ’99: proceedings of the 8th IEEE international symposium on high performance distributed computing. IEEE Computer Society, Washington, p 6

  39. Wolski R, Spring NT, Hayes J (1999) The network weather service: a distributed resource performance forecasting service for metacomputing. Future Gener Comput Syst 15:757–768

  40. Legrand IC, Newman HB et al (2004) Monalisa: an agent based, dynamic service system to monitor, control and optimize grid based applications. In: CHEP 2004

  41. Dinda PA (2001) Online prediction of the running time of tasks. In: SIGMETRICS/performance, pp 336–337

  42. Conrad M, Hof HJ (2007) A generic, self-organizing, and distributed bootstrap service for peer-to-peer networks. In: Self-organizing systems, second international workshop. Lecture notes in computer science, vol 4725. Springer, Berlin, pp 59–72

  43. Messing F (1996) Predicting scheduling success. In: SpaceOps symposium, Germany

  44. Chapin SJ, Cirne W, Feitelson DG et al. (1999) Benchmarks and standards for the evaluation of parallel job schedulers. In: Job scheduling strategies for parallel processing. Lecture notes in computer science, vol 1659. Springer, Berlin, pp 66–89

Author information

Corresponding author

Correspondence to Eui-Nam Huh.

About this article

Cite this article

Rao, I., Huh, EN. A probabilistic and adaptive scheduling algorithm using system-generated predictions for inter-grid resource sharing. J Supercomput 45, 185–204 (2008). https://doi.org/10.1007/s11227-007-0169-6

  • DOI: https://doi.org/10.1007/s11227-007-0169-6
