Metrics for Parallel Job Scheduling and Their Convergence

  • Conference paper
Job Scheduling Strategies for Parallel Processing (JSSPP 2001)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2221)

Abstract

The arrival process of jobs submitted to a parallel system is bursty, leading to fluctuations in the load at many time scales. In particular, rare events of extreme load may occur. Such events lead to an increase in the standard deviation of performance metrics, and thus delay the convergence of simulations used to evaluate the scheduling. Different performance metrics have been proposed in an effort to reduce this variability, and indeed display different rates of convergence. However, there is no single metric that outperforms the others under all conditions. Rather, the convergence of different metrics depends on the system being studied.
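
The effect of a rare extreme event on convergence can be illustrated with a minimal sketch. It is not the paper's method: the function name `relative_half_width`, the 95% normal-approximation confidence interval, and the toy sample values are all assumptions made for illustration. The sketch measures how wide the confidence interval of a metric's mean is relative to the mean itself, a common way to judge whether a simulation has run long enough.

```python
import math
import statistics


def relative_half_width(samples, z=1.96):
    """Half-width of an approximate 95% confidence interval for the mean,
    expressed as a fraction of the mean (assumes a normal approximation)."""
    n = len(samples)
    mean = statistics.mean(samples)
    std = statistics.stdev(samples)  # sample standard deviation
    return z * std / math.sqrt(n) / mean


# Hypothetical per-job metric values under steady load (illustrative only).
calm = [9.0, 11.0] * 50

# The same run, except that one job is caught in an extreme load burst.
bursty = calm[:-1] + [1000.0]

print(f"steady load: {relative_half_width(calm):.3f}")
print(f"with burst:  {relative_half_width(bursty):.3f}")
```

In this toy run a single extreme sample out of 100 inflates the standard deviation enough that the interval becomes far wider relative to the mean, so many more samples would be needed to reach the same precision.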

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Feitelson, D.G. (2001). Metrics for Parallel Job Scheduling and Their Convergence. In: Feitelson, D.G., Rudolph, L. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2001. Lecture Notes in Computer Science, vol 2221. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45540-X_11

  • DOI: https://doi.org/10.1007/3-540-45540-X_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42817-6

  • Online ISBN: 978-3-540-45540-0

  • eBook Packages: Springer Book Archive
