Journal of Grid Computing, Volume 8, Issue 1, pp 133–155

Algorithms for Divisible Load Scheduling of Data-intensive Applications

  • Chen Yu
  • Dan C. Marinescu

Abstract

In this paper we introduce the Divisible Load Scheduling (DLS) family of algorithms for data-intensive applications. These polynomial-time algorithms partition the input data and generate optimal mappings to a collection of autonomous and heterogeneous computational systems. We prove the optimality of the solution and report a simulation study of the algorithms.
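
As a rough illustration of what partitioning a divisible load can look like, the sketch below splits a workload across heterogeneous workers in proportion to their processing rates, so that all workers finish at the same time. It is not the DLS algorithm of the paper: communication costs are ignored, and the function names and the example rates are hypothetical.

```python
from fractions import Fraction


def partition_load(total_units, speeds):
    """Split total_units of work across workers in proportion to speed.

    speeds holds hypothetical processing rates (units of data per second).
    Communication cost is ignored, which is a simplifying assumption.
    """
    if not speeds or any(s <= 0 for s in speeds):
        raise ValueError("speeds must be a non-empty list of positive rates")
    total_speed = sum(Fraction(s) for s in speeds)
    # Each worker receives a share proportional to its rate.
    return [Fraction(total_units) * Fraction(s) / total_speed for s in speeds]


def finish_times(shares, speeds):
    """Time each worker needs to process its share (share / rate)."""
    return [share / Fraction(s) for share, s in zip(shares, speeds)]


# Illustrative values only: 600 units of data, three workers with rates 1, 2, 3.
example_speeds = [1, 2, 3]
shares = partition_load(600, example_speeds)
times = finish_times(shares, example_speeds)
```

With proportional shares every worker needs the same amount of time, which is the equal-finish-time condition commonly used in divisible load theory when transfer delays are negligible. Exact fractions are used so that the shares always sum to the full workload.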

Keywords

Divisible Load Scheduling; Divisible Load Theory (DLT); Load balancing; Grid computing

Copyright information

© Springer Science+Business Media B.V. 2009

Authors and Affiliations

  1. School of Electrical Engineering & Computer Science, University of Central Florida, Orlando, USA
