Parallel job scheduling: Issues and approaches

Feitelson, Dror G.; Rudolph, Larry

doi:10.1007/3-540-60153-8_20

Dror G. Feitelson¹ & Larry Rudolph²

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 949)

Included in the following conference series:

Workshop on Job Scheduling Strategies for Parallel Processing


Abstract

Parallel job scheduling is beginning to gain recognition as an important topic that is distinct from the scheduling of tasks within a parallel job by the programmer or runtime system. The main issue is how to share the resources of the parallel machine among a number of competing jobs, giving each the required level of service. This level of scheduling is done by the operating system. The four most commonly used or advocated techniques are to use a global queue, use variable partitioning, use dynamic partitioning, and use gang scheduling. These techniques are surveyed, and the benefits and shortcomings of each are identified. Then additional requirements that are not addressed by current systems are outlined, followed by considerations for evaluating various scheduling schemes.
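To make one of the four techniques concrete, here is a minimal sketch of variable partitioning, under assumptions that are not taken from the chapter: each job asks for a fixed number of processors when it is submitted, keeps that partition until it finishes, and jobs are served strictly first-come-first-served. The function name `fcfs_variable_partitioning` and the job tuples are hypothetical and chosen only for illustration.

```python
import heapq


def fcfs_variable_partitioning(total_procs, jobs):
    """Toy space-sharing simulator (illustrative assumption, not the authors' model).

    jobs: list of (name, procs_requested, runtime), all submitted at time 0
          and served in list order (strict FCFS, no backfilling).
    Returns a dict mapping job name to (start_time, end_time).
    """
    free = total_procs
    now = 0
    running = []  # min-heap of (end_time, procs_held)
    schedule = {}
    for name, procs, runtime in jobs:
        if procs > total_procs:
            raise ValueError(f"job {name} requests more processors than exist")
        # The job at the head of the queue waits until enough processors
        # have been released by jobs that finish earlier.
        while free < procs:
            end_time, released = heapq.heappop(running)
            now = max(now, end_time)
            free += released
        # Carve out a partition of exactly the requested size.
        schedule[name] = (now, now + runtime)
        heapq.heappush(running, (now + runtime, procs))
        free -= procs
    return schedule


if __name__ == "__main__":
    demo = [("A", 4, 10), ("B", 4, 5), ("C", 8, 3), ("D", 2, 1)]
    for job, span in fcfs_variable_partitioning(8, demo).items():
        print(job, span)
    # A (0, 10)
    # B (0, 5)
    # C (10, 13)
    # D (13, 14)
```

In this toy run, four processors sit idle from time 5 to time 10 because job C needs the whole machine and job D is queued behind it. This kind of fragmentation is one plausible example of the shortcomings that the survey weighs against the simplicity of the scheme.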

Author information

Larry Rudolph
Present address: MIT Laboratory for Computer Science, USA

Authors and Affiliations

IBM T. J. Watson Research Center, P. O. Box 218, 10598, Yorktown Heights, NY
Dror G. Feitelson
Institute of Computer Science, The Hebrew University, 91904, Jerusalem, Israel
Larry Rudolph

Corresponding author

Correspondence to Larry Rudolph.

Editor information

Dror G. Feitelson, Larry Rudolph

About this paper

Cite this paper

Feitelson, D.G., Rudolph, L. (1995). Parallel job scheduling: Issues and approaches. In: Feitelson, D.G., Rudolph, L. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 1995. Lecture Notes in Computer Science, vol 949. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60153-8_20

DOI: https://doi.org/10.1007/3-540-60153-8_20
Published: 02 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60153-1
Online ISBN: 978-3-540-49459-1
eBook Packages: Springer Book Archive
