Abstract
The arrival process of jobs submitted to a parallel system is bursty, leading to fluctuations in the load at many time scales. In particular, rare events of extreme load may occur. Such events increase the standard deviation of performance metrics, and thus delay the convergence of the simulations used to evaluate schedulers. Different performance metrics have been proposed in an effort to reduce this variability, and they do indeed display different rates of convergence. However, no single metric outperforms the others under all conditions. Rather, the convergence of each metric depends on the system being studied.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Feitelson, D.G. (2001). Metrics for Parallel Job Scheduling and Their Convergence. In: Feitelson, D.G., Rudolph, L. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2001. Lecture Notes in Computer Science, vol 2221. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45540-X_11
DOI: https://doi.org/10.1007/3-540-45540-X_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42817-6
Online ISBN: 978-3-540-45540-0
eBook Packages: Springer Book Archive