Abstract
Parallel job scheduling is beginning to gain recognition as an important topic that is distinct from the scheduling of tasks within a parallel job by the programmer or runtime system. The main issue is how to share the resources of the parallel machine among a number of competing jobs, giving each the required level of service. This level of scheduling is done by the operating system. The four most commonly used or advocated techniques are a global queue, variable partitioning, dynamic partitioning, and gang scheduling. These techniques are surveyed, and the benefits and shortcomings of each are identified. Additional requirements that current systems do not address are then outlined, followed by considerations for evaluating scheduling schemes.
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Feitelson, D.G., Rudolph, L. (1995). Parallel job scheduling: Issues and approaches. In: Feitelson, D.G., Rudolph, L. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 1995. Lecture Notes in Computer Science, vol 949. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60153-8_20
DOI: https://doi.org/10.1007/3-540-60153-8_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60153-1
Online ISBN: 978-3-540-49459-1