Abstract
Rapid advances in Grid technologies and their increasing availability have encouraged many businesses and researchers to establish Virtual Organizations (VOs) and use their available desktop resources to solve compute-intensive problems. These VOs, however, work as disjoint and independent communities with no resource sharing between them. In previous work, we proposed a fully decentralized and reconfigurable Inter-Grid framework for resource sharing among such distributed and autonomous Grid systems (Rao et al. in ICCSA, [2006]). The central problem in such a system of collaborating Grids is resource scheduling, because very little is known about resource availability owing to the distributed and autonomous nature of the underlying Grid entities. In this paper, we propose a probabilistic and adaptive scheduling algorithm that uses system-generated predictions for Inter-Grid resource sharing while keeping the collaborating Grid systems autonomous and independent. We first generate job runtime estimates within the system, without actually submitting jobs to the target Grid system. These runtime estimates are then used to predict the feasibility of scheduling the job on the target system. Furthermore, the proposed algorithm adapts itself to actual resource behavior and performance. Simulation results are presented to discuss the correctness and accuracy of the proposed algorithm.
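The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea it describes: estimate runtimes locally, turn the estimate into a feasibility score for a target Grid, and adapt to observed behavior. The class `TargetGridModel`, the function `pick_target`, the averaging window, the smoothing weight `alpha`, and the scoring formula are all illustrative assumptions, not the authors' method.

```python
from collections import deque


class TargetGridModel:
    """Illustrative local model of one remote Grid (hypothetical names throughout)."""

    def __init__(self, window=2, alpha=0.3, default_runtime=60.0):
        self.runtimes = deque(maxlen=window)  # most recent observed runtimes, in seconds
        self.reliability = 0.5                # assumed prior that a job finishes in time
        self.alpha = alpha                    # weight given to the newest observation
        self.default_runtime = default_runtime

    def predict_runtime(self):
        """Estimate runtime from local history only; no job is submitted remotely."""
        if not self.runtimes:
            return self.default_runtime
        return sum(self.runtimes) / len(self.runtimes)

    def feasibility(self, deadline):
        """Heuristic score in [0, 1] that a job meets `deadline` on this Grid."""
        predicted = max(self.predict_runtime(), 1e-9)  # guard against division by zero
        slack = min(1.0, deadline / predicted)
        return self.reliability * slack

    def observe(self, actual_runtime, deadline):
        """Adapt the model after a job completes, using its actual runtime."""
        met = 1.0 if actual_runtime <= deadline else 0.0
        # Exponentially weighted moving average of deadline successes.
        self.reliability = (1 - self.alpha) * self.reliability + self.alpha * met
        self.runtimes.append(actual_runtime)


def pick_target(models, deadline):
    """Return the name of the Grid with the highest feasibility score."""
    return max(models, key=lambda name: models[name].feasibility(deadline))
```

In this sketch each Grid keeps only its own local model of its peers, which is one way to stay consistent with the autonomy requirement stated above. A real scheduler would likely need a richer predictor and a calibrated probability rather than this heuristic score.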
References
Rao I, Huh EN, Lee S, Chung T (2006) Distributed, scalable and reconfigurable inter-grid resource sharing framework. In: International conference on computational science and its applications, ICCSA 2006
Foster I, Kesselman C, Tuecke S (2001) The anatomy of the grid: enabling scalable virtual organizations
de Assuncao MD, Buyya R (2007) A resource exchange mechanism for peak load management in intergrid environments. Technical report, The University of Melbourne
GIN-CG (2006) Grid interoperability now community group. http://forge.ogf.org/sf/projects/gin
AuYoung A, Chun B, Snoeren A, Vahdat A (2004) Resource allocation in federated distributed computing infrastructures. In: Proceedings of the 1st workshop on operating system and architectural support for the on-demand IT infrastructure
Wu Y, Wu S, Yu H, Hu C (2005) CGSP: an extensible and reconfigurable grid framework. In: APPT ’05: proceedings of 6th international workshop on advanced parallel processing technologies. IEEE Computer Society, Los Alamitos, pp 292–300
Ranjan R, Buyya R, Harwood A (2004) A model for cooperative federation of distributed clusters. In: HPDC ’04: the 14th IEEE international symposium on high performance distributed computing. IEEE Computer Society, Los Alamitos
Litzkow M, Livny M, Mutka M (1988) Condor—a hunter of idle workstations. In: Proceedings of the 8th international conference of distributed computing systems
Henderson RL (1995) Job scheduling under the portable batch system. In: IPPS ’95: proceedings of the workshop on job scheduling strategies for parallel processing. Springer, London, pp 279–294
Gentzsch W (2001) Sun grid engine: towards creating a compute power grid. In: CCGRID ’01: proceedings of the 1st international symposium on cluster computing and the grid. IEEE Computer Society, Washington, p 35
Schopf J (2001) Ten actions when superscheduling. Technical Report GFD-I. 4, The Global Grid Forum
Abramson D, Buyya R, Giddy J (2002) A computational economy for grid computing and its implementation in the Nimrod-G resource broker. Future Gener Comput Syst 18:1061–1074
Frey J, Tannenbaum T, Livny M, Foster I, Tuecke S (2002) Condor-G: a computation management agent for multi-institutional grids. Cluster Comput 5:237–246
Venugopal S, Buyya R, Winton L (2004) A grid service broker for scheduling distributed data-oriented applications on global grids. In: MGC ’04: proceedings of the 2nd workshop on middleware for grid computing. ACM Press, New York, pp 75–80
Chapin SJ, Katramatos D, Karpovich JF, Grimshaw AS (1999) The Legion resource management system. In: IPPS/SPDP ’99/JSSPP ’99: proceedings of the job scheduling strategies for parallel processing. Springer, London, pp 162–178
Butt AR, Zhang R, Hu YC (2006) A self-organizing flock of condors. J Parallel Distrib Comput 66:145–161
Casanova H, Obertelli G, Berman F, Wolski R (2000) The AppLeS parameter sweep template: user-level middleware for the grid. In: Supercomputing ’00: proceedings of the 2000 ACM/IEEE conference on supercomputing (CDROM). IEEE Computer Society, Washington, p 60
da Silva D, Cirne W, Brasileiro FV (2004) Trading cycles for information: using replication to schedule bag-of-tasks applications on computational grids. Springer, Berlin
Subramani V, Kettimuthu R, Srinivasan S, Sadayappan P (2002) Distributed job scheduling on computational grids using multiple simultaneous requests. In: HPDC ’02: proceedings of the 11th IEEE international symposium on high performance distributed computing. IEEE Computer Society, Washington, pp 359–368
Elizeu SN, Cirne W, Brasileiro F, Lima A (2005) Exploiting replication and data reuse to efficiently schedule data-intensive applications on grids, vol 3277. Springer, Berlin
Maheswaran M, Ali S, Siegel HJ, Hensgen D, Freund RF (1999) Dynamic mapping of a class of independent tasks onto heterogeneous computing systems. J Parallel Distrib Comput 59:107–131
Yeo CS, Buyya R (2006) Managing risk of inaccurate runtime estimates for deadline constrained job admission control in clusters. In: Proceedings of the 35th international conference on parallel processing (ICPP 2006), Columbus, OH. IEEE Computer Society, Los Alamitos, pp 451–458
Feitelson DG, Nitzberg B (1995) Job characteristics of a production parallel scientific workload on the NASA Ames iPSC/860. In: IPPS ’95 workshop on job scheduling strategies for parallel processing, vol 949. Springer, Berlin, pp 337–360
Nguyen TD, Vaswani R, Zahorjan J (1996) Using runtime measured workload characteristics in parallel processor scheduling. In: IPPS ’96: proceedings of the workshop on job scheduling strategies for parallel processing. Springer, London, pp 155–174
Tsafrir D, Etsion Y, Feitelson DG (2007) Backfilling using system-generated predictions rather than user runtime estimates. IEEE Trans Parallel Distrib Syst 18:789–803
Zhou S, Zheng X, Wang J, Delisle P (1993) Utopia: a load sharing facility for large, heterogeneous distributed computer systems. Softw Pract Exp 23:1305–1336
Fitzgerald S (2001) Grid information services for distributed resource sharing. In: HPDC ’01: proceedings of the 10th IEEE international symposium on high performance distributed computing. IEEE Computer Society, Washington, p 181
Aggarwal AK, Kent RD (2005) An adaptive generalized scheduler for grid applications. In: HPCS ’05: proceedings of the 19th international symposium on high performance computing systems and applications. IEEE Computer Society, Washington, pp 188–194
Mateescu G (2003) Quality of service on the grid via metascheduling with resource co-scheduling and co-reservation. Int J High Perform Comput Appl 17:209–218
Jang SH, Wu X, Taylor V (2004) Using performance prediction to allocate grid resources. Technical report, GriPhyN
Sun XH, Wu M (2003) Grid harvest service: a system for long-term, application-level task scheduling. In: IPDPS ’03: proceedings of the 17th international symposium on parallel and distributed processing. IEEE Computer Society, Washington, p 25.1
Derbal Y (2006) A probabilistic scheduling heuristic for computational grids. Multiagent Grid Syst 2:45–59
Yang L, Foster I, Schopf JM (2003) Homeostatic and tendency-based cpu load predictions. In: IPDPS ’03: proceedings of the 17th international symposium on parallel and distributed processing. IEEE Computer Society, Washington
Yang L, Schopf JM, Foster I (2003) Conservative scheduling: using predicted variance to improve scheduling decisions in dynamic environments
Gibbons R (1997) A historical application profiler for use by parallel schedulers. In: IPPS ’97: proceedings of the job scheduling strategies for parallel processing. Springer, London, pp 58–77
Downey AB (1997) Predicting queue times on space-sharing parallel computers. In: IPPS ’97: proceedings of the 11th international symposium on parallel processing. IEEE Computer Society, Washington, pp 209–218
Iverson MA, Özgüner F, Potter L (1999) Statistical prediction of task execution times through analytic benchmarking for scheduling in a heterogeneous environment. IEEE Trans Comput 48:1374–1379
Kapadia NH, Fortes JAB, Brodley CE (1999) Predictive application-performance modeling in a computational grid environment. In: HPDC ’99: proceedings of the 8th IEEE international symposium on high performance distributed computing. IEEE Computer Society, Washington, p 6
Wolski R, Spring NT, Hayes J (1999) The network weather service: a distributed resource performance forecasting service for metacomputing. Future Gener Comput Syst 15:757–768
Legrand IC, Newman HB et al (2004) MonALISA: an agent based, dynamic service system to monitor, control and optimize grid based applications. In: CHEP 2004
Dinda PA (2001) Online prediction of the running time of tasks. In: SIGMETRICS/performance, pp 336–337
Conrad M, Hof HJ (2007) A generic, self-organizing, and distributed bootstrap service for peer-to-peer networks. In: Self-organizing systems, second international workshop. Lecture notes in computer science, vol 4725. Springer, Berlin, pp 59–72
Messing F (1996) Predicting scheduling success. In: SpaceOps symposium, Germany
Chapin SJ, Cirne W, Feitelson DG et al. (1999) Benchmarks and standards for the evaluation of parallel job schedulers. In: Job scheduling strategies for parallel processing. Lecture notes in computer science, vol 1659. Springer, Berlin, pp 66–89
Cite this article
Rao, I., Huh, EN. A probabilistic and adaptive scheduling algorithm using system-generated predictions for inter-grid resource sharing. J Supercomput 45, 185–204 (2008). https://doi.org/10.1007/s11227-007-0169-6
DOI: https://doi.org/10.1007/s11227-007-0169-6