Advertisement

An Integrated Approach to Parallel Scheduling Using Gang-Scheduling, Backfilling, and Migration

  • Y. Zhang
  • H. Franke
  • J. E. Moreira
  • A. Sivasubramaniam
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2221)

Abstract

Effective scheduling strategies to improve response times, throughput, and utilization are an important consideration in large supercomputing environments. Such machines have traditionally used space-sharing strategies to accommodate multiple jobs at the same time. This approach, however, can result in low system utilization and large job wait times. This paper discusses three techniques that can be used beyond simple space-sharing to greatly improve the performance figures of large parallel systems. The first technique we analyze is backfilling, the second is gang-scheduling, and the third is migration. The main contribution of this paper is an evaluation of the benefits from combining the above techniques. We demonstrate that, under certain conditions, a strategy that combines backfilling, gang-scheduling, and migration is always better than the individual strategies for all quality of service parameters that we consider.

Keywords

Schedule Strategy Migration Cost Parallel Schedule Actual Execution Time Synthetic Workload 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    A. B. Downey. Using Queue Time Predictions for Processor Allocation. In IPPS’97 Workshop on Job Scheduling Strategies for Parallel Processing, volume 1291 of Lecture Notes in Computer Science, pages 35–57. Springer-Verlag, April 1997.Google Scholar
  2. 2.
    D. G. Feitelson. ASurvey of Scheduling in Multiprogrammed Parallel Systems. Technical Report RC 19790 (87657), IBM T. J. Watson Research Center, October 1994.Google Scholar
  3. 3.
    D. G. Feitelson. Packing schemes for gang scheduling. In Job Scheduling Strategies for Parallel Processing, IPPS’96Workshop, volume 1162 of Lecture Notes in Computer Science, pages 89–110, Berlin, March 1996. Springer-Verlag.CrossRefGoogle Scholar
  4. 4.
    D. G. Feitelson and M. A. Jette. Improved Utilization and Responsiveness with Gang Scheduling. In IPPS’97 Workshop on Job Scheduling Strategies for Parallel Processing, volume 1291 of Lecture Notes in Computer Science, pages 238–261. Springer-Verlag, April 1997.Google Scholar
  5. 5.
    D. G. Feitelson, L. Rudolph, U. Schwiegelshohn, K. C. Sevcik, and P. Wong. Theory and Practice in Parallel Job Scheduling. In IPPS’97 Workshop on Job Scheduling Strategies for Parallel Processing, volume 1291 of Lecture Notes in Computer Science, pages 1–34. Springer-Verlag, April 1997.Google Scholar
  6. 6.
    D. G. Feitelson and A. Mu’alem Weil. Utilization and predictability in scheduling the IBM SP2 with backfilling. In 12th International Parallel Processing Symposium, pages 542–546, April 1998.Google Scholar
  7. 7.
    H. Franke, J. Jann, J. E. Moreira, and P. Pattnaik. An Evaluation of Parallel Job Scheduling for ASCI Blue-Pacific. In Proceedings of SC99, Portland, OR, November 1999. IBM Research Report RC21559.Google Scholar
  8. 8.
    H. Franke, P. Pattnaik, and L. Rudolph. Gang Scheduling for Highly Effficient Multiprocessors. In Sixth Symposium on the Frontiers of Massively Parallel Computation, Annapolis, Maryland, 1996.Google Scholar
  9. 9.
    R. Gibbons. A Historical Application Profiler for Use by Parallel Schedulers. In IPPS’97 Workshop on Job Scheduling Strategies for Parallel Processing, volume 1291 of Lecture Notes in Computer Science, pages 58–77. Springer-Verlag, April 1997.Google Scholar
  10. 10.
    B. Gorda and R. Wolski. Time Sharing Massively Parallel Machines. In International Conference on Parallel Processing, volume II, pages 214–217, August 1995.Google Scholar
  11. 11.
    N. Islam, A. L. Prodromidis, M. S. Squillante, L. L. Fong, and A. S. Gopal. Extensible Resource Management for Cluster Computing. In Proceedings of the 17th International Conference on Distributed Computing Systems, pages 561–568, 1997.Google Scholar
  12. 12.
    J. Jann, P. Pattnaik, H. Franke, F. Wang, J. Skovira, and J. Riordan. Modeling of Workload in MPPs. In Proceedings of the 3rd Annual Workshop on Job Scheduling Strategies for Parallel Processing, pages 95–116, April 1997. In Conjuction with IPPS’97, Geneva, Switzerland.Google Scholar
  13. 13.
    H. D. Karatza. A Simulation-Based Performance Analysis of Gang Scheduling in a Distributed System. In Proceedings 32nd Annual Simulation Symposium, pages 26–33, San Diego, CA, April 11-15 1999.Google Scholar
  14. 14.
    D. Lifka. The ANL/IBM SP scheduling system. In IPPS’95 Workshop on Job Scheduling Strategies for Parallel Processing, volume 949 of Lecture Notes in Computer Science, pages 295–303. Springer-Verlag, April 1995.Google Scholar
  15. 15.
    J. E. Moreira, W. Chan, L. L. Fong, H. Franke, and M. A. Jette. An Infrastructure for Efficient Parallel Job Execution in Terascale Computing Environments. In Proceedings of SC98, Orlando, FL, November 1998.Google Scholar
  16. 16.
    J. K. Ousterhout. Scheduling Techniques for Concurrent Systems. In Third International Conference on Distributed Computing Systems, pages 22–30, 1982.Google Scholar
  17. 17.
    U. Schwiegelshohn and R. Yahyapour. Improving First-Come-First-Serve Job Scheduling by Gang Scheduling. In IPPS’98 Workshop on Job Scheduling Strategies for Parallel Processing, March 1998.Google Scholar
  18. 18.
    J. Skovira, W. Chan, H. Zhou, and D. Lifka. The EASY-LoadLeveler API project. In IPPS’96 Workshop on Job Scheduling Strategies for Parallel Processing, volume 1162 of Lecture Notes in Computer Science, pages 41–47. Springer-Verlag, April 1996.CrossRefGoogle Scholar
  19. 19.
    W. Smith, V. Taylor, and I. Foster. Using Run-Time Predictions to Estimate Queue Wait Times and Improve Scheduler Performance. In Proceedings of the 5th Annual Workshop on Job Scheduling Strategies for Parallel Processing, April 1999. In conjunction with IPPS/SPDP’99, Condado Plaza Hotel & Casino, San Juan, Puerto Rico.Google Scholar
  20. 20.
    K. Suzaki and D. Walsh. Implementation of the Combination of Time Sharing and Space Sharing on AP/Linux. In IPPS’98 Workshop on Job Scheduling Strategies for Parallel Processing, March 1998.Google Scholar
  21. 21.
    K. K. Yue and D. J. Lilja. Comparing Processor Allocation Strategies in Multiprogrammed Shared-Memory Multiprocessors. Journal of Parallel and Distributed Computing, 49(2):245–258, March 1998.zbMATHCrossRefGoogle Scholar
  22. 22.
    B. B. Zhou, R. P. Brent, C. W. Jonhson, and D. Walsh. Job Re-packing for Enhancing the Performance of Gang Scheduling. In Job Scheduling Strategies for Parallel Processing, IPPS’99 Workshop, pages 129–143, April 1999. LNCS 1659.Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Y. Zhang
    • 1
  • H. Franke
    • 2
  • J. E. Moreira
    • 2
  • A. Sivasubramaniam
    • 1
  1. 1.Department of Computer Science & EngineeringThe Pennsylvania State UniversityUniversity Park
  2. 2.IBM Thomas J. Watson Research CenterYorktown Heights

Personalised recommendations