Using queue time predictions for processor allocation

  • Allen B. Downey
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1291)

Abstract

When a moldable job is submitted to a space-sharing parallel computer, it must choose whether to begin execution on a small, available cluster or wait in queue for more processors to become available. To make this decision, it must predict how long it will have to wait for the larger cluster. We propose statistical techniques for predicting these queue times, and develop an allocation strategy that uses these predictions. We present a workload model based on observed workloads at the San Diego Supercomputer Center and the Cornell Theory Center, and use this model to drive simulations of various allocation strategies. We find that prediction-based allocation not only improves the turnaround time of individual jobs; it also improves the utilization of the system as a whole.
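The decision the abstract describes can be sketched as a simple comparison of predicted turnaround times. The code below is an illustrative toy, not the statistical prediction techniques or the workload model from the paper: the speedup function and the predicted queue time are placeholder inputs, and `choose_allocation` is a hypothetical helper name.

```python
# Toy sketch of prediction-based processor allocation for a moldable job:
# start now on the available processors, or wait in queue for a larger
# cluster, whichever minimizes predicted turnaround time.

def runtime(sequential_work, n_procs, speedup):
    """Predicted runtime on n_procs, given a speedup model."""
    return sequential_work / speedup(n_procs)

def choose_allocation(sequential_work, n_available, n_desired,
                      predicted_wait, speedup):
    """Return ('run_now', n) or ('wait', n), minimizing predicted turnaround."""
    t_now = runtime(sequential_work, n_available, speedup)
    t_wait = predicted_wait + runtime(sequential_work, n_desired, speedup)
    if t_now <= t_wait:
        return ("run_now", n_available)
    return ("wait", n_desired)

# Example: linear speedup up to 32 processors, then flat.
speedup = lambda n: min(n, 32)

decision = choose_allocation(
    sequential_work=3600.0,   # one hour of sequential work
    n_available=8,            # processors free right now
    n_desired=32,             # size of the larger cluster
    predicted_wait=300.0,     # predicted queue time, in seconds
    speedup=speedup,
)
# With these numbers, waiting 300 s for 32 processors (300 + 112.5 s)
# beats starting immediately on 8 (450 s), so the job waits.
```

A real predictor would replace `predicted_wait` with an estimate conditioned on the current queue state, which is where the paper's statistical techniques come in; the point of the sketch is only the structure of the comparison.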


Copyright information

© Springer-Verlag 1997

Authors and Affiliations

  • Allen B. Downey
    1. University of California, Berkeley