Backfilling with Lookahead to Optimize the Performance of Parallel Job Scheduling

  • Edi Shmueli
  • Dror G. Feitelson
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2862)

Abstract

The utilization of parallel computers depends on how jobs are packed together: if the jobs are not packed tightly, resources are lost due to fragmentation. The problem is that the goal of high utilization may conflict with goals of fairness or even progress for all jobs. The common solution is to use backfilling, which combines a reservation for the first job in the interest of progress with packing of later jobs to fill in holes and increase utilization. However, backfilling considers the queued jobs one at a time, and thus might miss better packing opportunities. We propose the use of dynamic programming to find the best packing possible given the current composition of the queue. Simulation results show that this indeed improves utilization, and thereby reduces the average response time and average slowdown of all jobs.
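The contrast between one-at-a-time backfilling and packing over the whole queue can be illustrated with a small sketch. It is not the authors' algorithm: it ignores the reservation for the first queued job and all runtime estimates, and looks only at how many processors each queued job requests. The function names `greedy_pack` and `lookahead_pack`, and the example queue, are hypothetical.

```python
def greedy_pack(free, sizes):
    """Consider queued jobs one at a time, in queue order, starting each that fits."""
    chosen = []
    for i, size in enumerate(sizes):
        if size <= free:
            chosen.append(i)
            free -= size
    return chosen


def lookahead_pack(free, sizes):
    """Choose the subset of queued jobs that uses the most free processors.

    Assumption: this is a plain subset-sum dynamic program over processor
    counts (positive integers), with no reservation for the first job.
    """
    # best[c] holds job indices that use exactly c processors, or None if unreachable.
    best = [None] * (free + 1)
    best[0] = []
    for i, size in enumerate(sizes):
        # Walk capacities downward so each job is selected at most once.
        for c in range(free, size - 1, -1):
            if best[c] is None and best[c - size] is not None:
                best[c] = best[c - size] + [i]
    # Return the selection with the highest reachable processor count.
    for c in range(free, -1, -1):
        if best[c] is not None:
            return best[c]
    return []


queue = [7, 5, 5]  # processors requested by each queued job (hypothetical)
free = 10          # processors currently idle (hypothetical)

for name, pack in (("greedy", greedy_pack), ("lookahead", lookahead_pack)):
    picked = pack(free, queue)
    print(name, picked, sum(queue[i] for i in picked))
# greedy [0] 7
# lookahead [1, 2] 10
```

With 10 idle processors, taking the 7-processor job first leaves 3 processors unusable, while examining the whole queue finds the two 5-processor jobs that fill the machine.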



Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Edi Shmueli (1)
  • Dror G. Feitelson (2)
  1. Department of Computer Science, IBM Haifa Research Labs, Haifa University, Haifa, Israel
  2. School of Computer Science & Engineering, Hebrew University, Jerusalem, Israel
