Minimizing the stretch when scheduling flows of divisible requests

Abstract

In this paper, we consider the problem of scheduling distributed biological sequence comparison applications. This problem lies in the divisible load framework with negligible communication costs. Thus far, very few results have been proposed for this model. We discuss and select relevant metrics for this framework, namely max-stretch and sum-stretch. We explain the relationship between our model and the preemptive single processor case, and we show how to extend algorithms that have been proposed in the literature for the single processor model to the divisible multi-processor problem domain. We recall known results on closely related problems and show how to minimize the max-stretch on unrelated machines, either in the divisible load model or with preemption. We derive new lower bounds on the competitive ratio of any online algorithm, present new competitiveness results for existing algorithms, and develop several new online heuristics. We also address the Pareto optimization of max-stretch. Then, we extensively study the performance of these algorithms and heuristics under realistic scenarios. Our study shows that all previously proposed guaranteed heuristics for max-stretch for the single processor model are inefficient in practice. In contrast, we show that our online algorithms based on linear programming are near-optimal in practice for max-stretch. Our study also clearly suggests heuristics that are efficient for both metrics, although a combined optimization is in theory not possible in the general case.
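The abstract names max-stretch and sum-stretch without defining them. The sketch below assumes the usual definition from the stretch-scheduling literature: a job's stretch is its flow time (completion time minus release time) divided by its size, that is, its processing time on an otherwise idle platform. The `Job` class, the function names, and the two toy schedules are hypothetical illustrations and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Job:
    release: float     # time the request arrives
    size: float        # processing time on an otherwise idle platform (assumed definition)
    completion: float  # time the request finishes in a given schedule


def stretch(job: Job) -> float:
    """Flow time (completion - release) divided by the job's size."""
    return (job.completion - job.release) / job.size


def max_stretch(jobs) -> float:
    """Worst slowdown experienced by any single job."""
    return max(stretch(j) for j in jobs)


def sum_stretch(jobs) -> float:
    """Total slowdown over all jobs."""
    return sum(stretch(j) for j in jobs)


# Toy instance on one processor: a long job released at t=0 (size 4)
# and a short job released at t=1 (size 1).

# Schedule 1: the short job preempts the long one on arrival.
# Long job runs on [0,1] and [2,5]; short job runs on [1,2].
preemptive = [Job(0.0, 4.0, 5.0), Job(1.0, 1.0, 2.0)]

# Schedule 2: first come, first served, no preemption.
# Long job runs on [0,4]; short job runs on [4,5].
fifo = [Job(0.0, 4.0, 4.0), Job(1.0, 1.0, 5.0)]

if __name__ == "__main__":
    for name, sched in (("preemptive", preemptive), ("fifo", fifo)):
        print(name, max_stretch(sched), sum_stretch(sched))
```

In this toy instance the preemptive schedule gives the short job a much smaller slowdown at a modest cost to the long one, which is the kind of trade-off a stretch metric is meant to capture.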

Author information

Corresponding author

Correspondence to Frédéric Vivien.

About this article

Cite this article

Legrand, A., Su, A. & Vivien, F. Minimizing the stretch when scheduling flows of divisible requests. J Sched 11, 381–404 (2008). https://doi.org/10.1007/s10951-008-0078-4

Keywords

  • Bioinformatics
  • Heterogeneous computing
  • Scheduling
  • Divisible load
  • Linear programming
  • Stretch