
Enhancements to the Decision Process of the Self-Tuning dynP Scheduler

  • Achim Streit
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3277)

Abstract

The self-tuning dynP scheduler for modern cluster resource management systems switches dynamically between different basic scheduling policies at run time. This allows it to react to changing characteristics of the waiting jobs. In this paper we present two enhancements to the decision process of the self-tuning dynP scheduler and evaluate their impact on performance. (i) During a self-tuning step, a performance metric is needed to rank the schedules generated by the different basic scheduling policies. The choice of metric allows different objectives for the self-tuning process, e.g. more user-centric by improving the response time, or more owner-centric by improving the makespan. (ii) Furthermore, the self-tuning process can be invoked at different points in the scheduling process: only when the characteristics of the waiting jobs change, i.e. when new jobs are submitted (half self-tuning); or whenever the schedule changes, i.e. when jobs are submitted or running jobs terminate (full self-tuning).

We use discrete event simulations to evaluate the achieved performance. The simulations are driven by original job traces from real supercomputer installations. The evaluation of the two enhancements shows that good performance is achieved if the self-tuning metric is the same as the metric used to measure the overall performance at the end of the simulation. Additionally, calling the self-tuning process only when new jobs are submitted is sufficient in most scenarios, and the performance difference to full self-tuning is small.
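
The self-tuning step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not name the basic policies, so FCFS, SJF and LJF are assumed here, the planner is a deliberately simplified list scheduler without backfilling, and all identifiers (`Job`, `plan`, `self_tuning_step`, `should_tune`) are hypothetical. The sketch plans one schedule per policy, ranks the schedules by the chosen self-tuning metric, and shows which events trigger a step under half and full self-tuning.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Job:
    name: str
    submit: float   # submission time
    width: int      # number of processors requested
    runtime: float  # estimated run time


Schedule = List[Tuple[Job, float, float]]  # (job, planned start, planned end)

# Assumed basic policies; each is a sort key over the waiting jobs.
POLICIES: Dict[str, Callable[[Job], float]] = {
    "FCFS": lambda j: j.submit,
    "SJF": lambda j: j.runtime,
    "LJF": lambda j: -j.runtime,
}


def plan(waiting: List[Job], policy: str, now: float, free_at: List[float]) -> Schedule:
    """Plan all waiting jobs in policy order (simplified, no backfilling).

    free_at holds, per processor, the time at which it becomes free.
    Assumes every job's width is at most the number of processors.
    """
    free = [max(now, t) for t in free_at]
    schedule: Schedule = []
    for job in sorted(waiting, key=POLICIES[policy]):
        free.sort()
        start = free[job.width - 1]  # earliest time `width` processors are free
        end = start + job.runtime
        free[:job.width] = [end] * job.width
        schedule.append((job, start, end))
    return schedule


def avg_response_time(schedule: Schedule) -> float:
    """User-centric metric: mean of (planned end - submission time)."""
    return sum(end - job.submit for job, _, end in schedule) / len(schedule)


def makespan(schedule: Schedule) -> float:
    """Owner-centric metric: planned end time of the last job."""
    return max(end for _, _, end in schedule)


METRICS = {"avg_response_time": avg_response_time, "makespan": makespan}


def self_tuning_step(waiting: List[Job], now: float, free_at: List[float], metric: str) -> str:
    """Return the policy whose planned schedule has the lowest metric value."""
    score = METRICS[metric]
    return min(POLICIES, key=lambda p: score(plan(waiting, p, now, free_at)))


def should_tune(event: str, mode: str) -> bool:
    """Half self-tuning reacts to submissions only; full also to job ends."""
    if mode == "half":
        return event == "submit"
    if mode == "full":
        return event in ("submit", "finish")
    raise ValueError(f"unknown mode: {mode}")


jobs = [Job("J1", 0.0, 1, 4.0), Job("J2", 1.0, 1, 1.0),
        Job("J3", 2.0, 1, 10.0), Job("J4", 3.0, 1, 2.0)]
print(self_tuning_step(jobs, 3.0, [0.0, 0.0], "avg_response_time"))  # SJF
print(self_tuning_step(jobs, 3.0, [0.0, 0.0], "makespan"))           # LJF
```

In this toy workload on two processors, the two metrics select different policies for the same set of waiting jobs, which is the effect the choice of self-tuning metric is meant to expose.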

Keywords

Schedule Policy; Average Response Time; Decider Mechanism; Simple Decider; Resource Management System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Achim Streit (1)
  1. PC2 – Paderborn Center for Parallel Computing, Paderborn University, Paderborn, Germany
