Journal of Scheduling, Volume 11, Issue 5, pp 381–404

Minimizing the stretch when scheduling flows of divisible requests

Article

Abstract

In this paper, we consider the problem of scheduling distributed biological sequence comparison applications. This problem lies in the divisible load framework with negligible communication costs. Thus far, very few results have been proposed for this model. We discuss and select relevant metrics for this framework: namely max-stretch and sum-stretch. We explain the relationship between our model and the preemptive single processor case, and we show how to extend algorithms that have been proposed in the literature for the single processor model to the divisible multi-processor problem domain. We recall known results on closely related problems, we show how to minimize the max-stretch on unrelated machines either in the divisible load model or with preemption, we derive new lower bounds on the competitive ratio of any online algorithm, we present new competitiveness results for existing algorithms, and we develop several new online heuristics. We also address the Pareto optimization of max-stretch. Then, we extensively study the performance of these algorithms and heuristics under realistic scenarios. Our study shows that all previously proposed guaranteed heuristics for max-stretch for the single processor model are inefficient in practice. In contrast, we show that our online algorithms based on linear programming are in practice near-optimal solutions for max-stretch. Our study also clearly suggests heuristics that are efficient for both metrics, although a combined optimization is in theory not possible in the general case.
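The two metrics named in the abstract can be illustrated with a minimal sketch. It assumes the usual single-processor definition from the stretch literature, where the stretch of a job is its flow time (completion time minus release time) divided by its processing time; in the divisible multi-processor setting, the size would presumably be the time the job needs when it has the platform to itself. The names `Job`, `stretch`, and `stretch_metrics` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Job:
    release: float     # time at which the job arrives
    size: float        # processing time if the job ran alone (assumed definition)
    completion: float  # time at which the job finishes in the given schedule


def stretch(job: Job) -> float:
    """Flow time of the job divided by its size."""
    return (job.completion - job.release) / job.size


def stretch_metrics(jobs: list[Job]) -> tuple[float, float]:
    """Return (max-stretch, sum-stretch) of a completed schedule."""
    values = [stretch(j) for j in jobs]
    return max(values), sum(values)


# Two schedules of the same instance on one processor:
# a long job (size 4) released at time 0 and a short job (size 1) released at time 1.
# First come, first served: the short job waits until time 4 and finishes at 5.
fcfs = [Job(release=0, size=4, completion=4), Job(release=1, size=1, completion=5)]
# Preemptive: the short job runs during [1, 2], so the long job finishes at 5.
preemptive = [Job(release=0, size=4, completion=5), Job(release=1, size=1, completion=2)]

print(stretch_metrics(fcfs))        # max-stretch and sum-stretch without preemption
print(stretch_metrics(preemptive))  # same metrics when the short job preempts
```

In this toy instance, letting the short job preempt the long one lowers both metrics, which is the intuition behind using stretch rather than plain flow time when job sizes differ widely.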

Keywords

Bioinformatics; Heterogeneous computing; Scheduling; Divisible load; Linear programming; Stretch

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  1. CNRS, France; Université de Grenoble, France; LIG, UMR 5217 CNRS–Grenoble INP–INRIA–UJF–UPMF, Grenoble, France
  2. Google Inc., Cambridge, USA
  3. INRIA, France; Université de Lyon, France; LIP, UMR 5668 ENS-Lyon–CNRS–INRIA–UCBL, Lyon, France
