Data Intensive and Network Aware (DIANA) Grid Scheduling

Journal of Grid Computing

Abstract

In Grids, scheduling decisions are often made on the basis of jobs being either data-intensive or computation-intensive: in data-intensive situations jobs may be pushed to the data, and in computation-intensive situations data may be pulled to the jobs. This kind of scheduling, which does not consider network characteristics, can degrade performance in a Grid environment and may result in large processing queues and job execution delays due to site overloads. In this paper we describe a Data Intensive and Network Aware (DIANA) meta-scheduling approach, which takes into account data, processing power and network characteristics when making scheduling decisions across multiple sites. Through a practical implementation on a Grid testbed, we demonstrate that queue and execution times of data-intensive jobs can be significantly improved when we introduce our proposed DIANA scheduler. The basic scheduling decisions are dictated by a weighting factor for each potential target location, calculated as a function of network characteristics, processing cycles, and data location and size. The job scheduler provides a global ranking of the computing resources and then selects an optimal one on the basis of this overall access and execution cost. The DIANA approach considers the Grid as a combination of active network elements and takes network characteristics as a first-class criterion in the scheduling decision matrix, along with computations and data. The scheduler can then make informed decisions by taking into account the changing state of the network, the locality and size of the data, and the pool of available processing cycles.
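
The ranking idea described in the abstract can be illustrated with a minimal sketch. This is not the DIANA cost model itself, which the abstract does not spell out. It assumes a simple additive cost in seconds: a data transfer term driven by data size and loss-adjusted bandwidth, plus a computation term driven by queue wait and relative CPU speed. The names `Site`, `site_cost`, `rank_sites`, the weights `w_data` and `w_comp`, and all field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Hypothetical snapshot of one candidate execution site."""
    name: str
    holds_data: bool       # True if the job's input data is already at the site
    bandwidth_mbps: float  # measured throughput from the data source to the site
    loss_rate: float       # packet loss as a fraction in [0, 1)
    queue_wait_s: float    # estimated time a new job waits in the site queue
    cpu_factor: float      # relative processing speed (1.0 = reference CPU)


def site_cost(site: Site, data_size_mb: float, job_cpu_s: float,
              w_data: float = 1.0, w_comp: float = 1.0) -> float:
    """Return an assumed overall access-plus-execution cost in seconds."""
    # Network characteristics: discount raw bandwidth by the observed loss rate.
    effective_mbps = site.bandwidth_mbps * (1.0 - site.loss_rate)
    # Data location and size: no transfer is needed if the data is local.
    transfer_s = 0.0 if site.holds_data else data_size_mb * 8.0 / effective_mbps
    # Processing cycles: queue wait plus run time scaled by CPU speed.
    compute_s = site.queue_wait_s + job_cpu_s / site.cpu_factor
    return w_data * transfer_s + w_comp * compute_s


def rank_sites(sites: list[Site], data_size_mb: float,
               job_cpu_s: float) -> list[tuple[float, str]]:
    """Return (cost, site name) pairs sorted from cheapest to most expensive."""
    return sorted((site_cost(s, data_size_mb, job_cpu_s), s.name) for s in sites)
```

Under these assumptions, a site that already holds the data but has a long queue can rank below a remote site with a fast, clean network link, which is the kind of trade-off a scheduler that ignores network state cannot see. The first entry of the list returned by `rank_sites` is the site such a scheduler would select.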

Author information

Corresponding author

Correspondence to Richard McClatchey.

About this article

Cite this article

McClatchey, R., Anjum, A., Stockinger, H. et al. Data Intensive and Network Aware (DIANA) Grid Scheduling. J Grid Computing 5, 43–64 (2007). https://doi.org/10.1007/s10723-006-9059-z

  • DOI: https://doi.org/10.1007/s10723-006-9059-z
