Job Scheduling Simulator for Assisting the Mapping Configuration Between Queue and Computing Nodes

  • Yuki Matsui
  • Yasuhiro Watashiba
  • Susumu Date
  • Takashi Yoshikawa
  • Shinji Shimojo
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 926)

Abstract

In computer centers that provide users with high-performance computing resources such as computer clusters, job throughput is an important criterion for satisfying users’ computing needs. The settings and configuration parameters of a job scheduler affect this criterion. In particular, the mapping between the queues deployed in the job scheduler and the computing nodes is closely related to job throughput. In most cases, however, the mapping is configured based on administrators’ experience and know-how, partly because tools that facilitate determining the mapping are not available. In this paper, we propose a job scheduling simulator that allows administrators to investigate how the mapping affects job throughput. In the evaluation, the behavior of the proposed job scheduling simulator is assessed through a comparison with an actual computer cluster. In addition, real use cases of the proposed job scheduling simulator are discussed.
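
To make the idea of a queue-to-node mapping concrete, the following toy sketch shows how two different mappings can yield different throughput for the same workload. It is not the simulator proposed in the paper: the `Job` class, the `simulate` function, the queue names, and the workload are all hypothetical, and the sketch assumes single-node jobs served first-come, first-served on whichever mapped node becomes free earliest.

```python
from dataclasses import dataclass


@dataclass
class Job:
    queue: str      # queue the job was submitted to (hypothetical names)
    submit: float   # submission time in seconds
    runtime: float  # execution time in seconds


def simulate(jobs, mapping):
    """Toy FCFS simulation of single-node jobs.

    mapping: dict of queue name -> list of node ids that queue may use.
    Returns (makespan in seconds, throughput in jobs per second).
    """
    # Time at which each node next becomes free.
    free_at = {node: 0.0 for nodes in mapping.values() for node in nodes}
    makespan = 0.0
    # Process jobs in submission order (stable for equal submit times).
    for job in sorted(jobs, key=lambda j: j.submit):
        # Among the nodes mapped to this job's queue, pick the one free earliest.
        node = min(mapping[job.queue], key=lambda n: free_at[n])
        start = max(job.submit, free_at[node])
        free_at[node] = start + job.runtime
        makespan = max(makespan, free_at[node])
    return makespan, len(jobs) / makespan


# Assumed workload: eight 10-second jobs in "short", one 40-second job in "long".
jobs = [Job("short", 0.0, 10.0) for _ in range(8)] + [Job("long", 0.0, 40.0)]

# Two candidate mappings over the same four nodes.
mapping_a = {"short": [0], "long": [1, 2, 3]}
mapping_b = {"short": [0, 1, 2], "long": [3]}

for name, mapping in (("A", mapping_a), ("B", mapping_b)):
    makespan, throughput = simulate(jobs, mapping)
    print(f"mapping {name}: makespan={makespan:.0f}s, throughput={throughput:.4f} jobs/s")
```

In this assumed workload, mapping A leaves two of the three nodes assigned to the "long" queue idle while the short jobs wait on a single node, so mapping B completes the same jobs sooner. A real scheduler also has to handle multi-node jobs, priorities, and backfilling, which this sketch omits.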

Notes

Acknowledgement

This work was partially supported by JSPS KAKENHI Grant Numbers JP16H02802, JP17K00101.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Yuki Matsui (1)
  • Yasuhiro Watashiba (2)
  • Susumu Date (2)
  • Takashi Yoshikawa (2)
  • Shinji Shimojo (2)

  1. Graduate School of Information Science and Technology, Osaka University, Osaka, Japan
  2. Cybermedia Center, Osaka University, Osaka, Japan
