In large Grids, such as the National Grid Service (NGS), and in other large distributed architectures, several scheduling entities are involved. Although a global scheduling approach would achieve higher performance and could increase the utilization of the global system, in these scenarios independent schedulers usually make their own scheduling decisions. In this paper we present how coordinated scheduling among all the different centers, using data mining prediction techniques, can substantially improve the performance of the global distributed infrastructure and can provide users with uniform access to all the heterogeneous Grid resources. We present the Grid Backfilling meta-scheduling policy, which optimizes the global utilization of the system resources and substantially improves the response time of jobs. We also show how data mining techniques applied to historical information can provide suitable inputs for the Grid Backfilling meta-scheduling decisions.
Copyright information
© 2008 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Rodero, I., Guim, F., Corbalan, J., Goyeneche, A. (2008). The Grid Backfilling: a Multi-Site Scheduling Architecture with Data Mining Prediction Techniques. In: Grid Middleware and Services. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-78446-5_10
DOI: https://doi.org/10.1007/978-0-387-78446-5_10
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-78445-8
Online ISBN: 978-0-387-78446-5
eBook Packages: Computer Science, Computer Science (R0)