Supercomputer Efficiency: Complex Approach Inspired by Lomonosov-2 History Evaluation

  • Sergei Leonenkov
  • Sergey Zhumatiy
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 965)

Abstract

The number of supercomputer users and the number of jobs they run are growing rapidly, especially on supercomputers that provide computing time to external users. Supercomputers and their computing time are very expensive, so their efficiency is crucial for both users and owners. There are several ways to increase operational efficiency; however, in most cases they involve a trade-off between efficiency metrics. This creates a need to define “efficiency” in each specific case. We use historical data from the two largest Russian supercomputers to construct a number of metrics that together define resource management “efficiency”. The data from both the Lomonosov and Lomonosov-2 supercomputers cover more than one year of job execution history. The efficiency of Lomonosov and Lomonosov-2 in terms of CPU-hour utilization is already high; nevertheless, our overall goal is to offer a way to maintain or improve this metric while maximizing the other metrics examined in the paper.

Keywords

High-performance computing · Resource management · Supercomputer job scheduling efficiency

Notes

Acknowledgments

This material is based upon work supported by the Russian Foundation for Basic Research (Agreement N 17-07-00664 A).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Research Computing Center, Lomonosov Moscow State University, Moscow, Russia
  2. Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Moscow, Russia
