Analyzing IO Usage Patterns of User Jobs to Improve Overall HPC System Efficiency

  • Syed Sadat Nazrul
  • Cherie Huang
  • Mahidhar Tatineni
  • Nicole Wolter
  • Dmitry Mishin
  • Trevor Cooper
  • Amit Majumdar
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 964)

Abstract

This work analyzes the I/O traffic of users’ jobs on an HPC machine over a period of time. Monitoring tools collect the data on a continuous basis on the HPC system. We looked at the aggregate I/O usage patterns of users’ jobs on both the parallel shared Lustre file system and the node-local SSDs. Data mining tools are then applied to the I/O usage pattern data in an attempt to tie the observed I/O behaviors to the particular codes that produced them.

Keywords

Aggregate I/O usage data, Data mining, Application I/O behavior

Notes

Acknowledgement

The authors acknowledge funding support from Engility Corporation’s High Performance Computing Center of Excellence (HPC CoE), which was used to support the student research. The authors thank Dr. Rajiv Bendale, Engility Corporation, for many valuable suggestions for this project.

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Syed Sadat Nazrul (1)
  • Cherie Huang (1)
  • Mahidhar Tatineni (1)
  • Nicole Wolter (1)
  • Dmitry Mishin (1)
  • Trevor Cooper (1)
  • Amit Majumdar (1), corresponding author

  1. San Diego Supercomputer Center, University of California San Diego, La Jolla, USA
