Improving Parallel Job Scheduling Using Runtime Measurements

  • Fabricio Alves Barbosa da Silva
  • Isaac D. Scherson
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1911)

Abstract

We investigate the use of runtime measurements to improve job scheduling on a parallel machine. Emphasis is on gang-scheduling-based strategies. With the information gathered at runtime, we define a task classification scheme based on fuzzy logic and Bayesian estimators. The resulting local task classification is used to provide better service to I/O-bound and interactive jobs under gang scheduling. This is achieved through the use of idle times and by controlling the spinning time of a task in the spin-block mechanism depending on the node's workload. Simulation results show considerable improvements in both throughput and machine utilization, in particular for I/O-bound workloads, for a gang scheduler that uses runtime information compared with gang schedulers that do not have access to such information.
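The abstract describes the approach only at a high level. The following Python sketch illustrates the general idea of classifying a task from runtime measurements and scaling the spin time of a spin-block synchronization accordingly. The class names, likelihood table, thresholds, and the spin-time policy are hypothetical assumptions for illustration and are not taken from the paper.

```python
# Illustrative sketch only: a Bayesian-style update of a per-task class
# distribution from observed CPU-burst lengths, and a spin-block timeout
# that shrinks as the node's load grows. All constants are assumptions.

CLASSES = ("io_bound", "interactive", "cpu_bound")

# Assumed likelihoods P(short CPU burst | class); purely illustrative.
P_SHORT_BURST = {"io_bound": 0.9, "interactive": 0.7, "cpu_bound": 0.2}

def update_class_probabilities(prior, burst_us, short_threshold_us=500):
    """Update the task's class distribution from one observed CPU burst (us)."""
    short = burst_us < short_threshold_us
    posterior = {}
    for c in CLASSES:
        likelihood = P_SHORT_BURST[c] if short else 1.0 - P_SHORT_BURST[c]
        posterior[c] = prior[c] * likelihood
    total = sum(posterior.values())
    return {c: p / total for c, p in posterior.items()}

def spin_time_us(class_probs, node_load, base_spin_us=200.0):
    """Choose a spin time for the spin-block mechanism: favor tasks that look
    I/O bound or interactive, and spin less as the node load grows so the CPU
    is released to other runnable tasks. This policy is an assumption."""
    responsiveness = class_probs["io_bound"] + class_probs["interactive"]
    load_factor = 1.0 / (1.0 + node_load)  # more runnable tasks -> less spinning
    return base_spin_us * responsiveness * load_factor

# Example: a task issuing repeated short bursts drifts toward "I/O bound".
probs = {c: 1.0 / len(CLASSES) for c in CLASSES}
for burst in (120, 80, 300, 90):  # observed CPU bursts in microseconds
    probs = update_class_probabilities(probs, burst)
print(probs, spin_time_us(probs, node_load=3))
```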


Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Fabricio Alves Barbosa da Silva (1)
  • Isaac D. Scherson (1, 2)
  1. Laboratoire ASIM, LIP6, Université Pierre et Marie Curie, Paris, France
  2. Information and Computer Science, University of California, Irvine, USA
