Analyzing IO Usage Patterns of User Jobs to Improve Overall HPC System Efficiency
This work analyzes the I/O traffic of users’ jobs on an HPC machine over a period of time. Monitoring tools collect the data continuously on the HPC system. We examine aggregate I/O usage patterns of users’ jobs on both the parallel shared Lustre file system and the node-local SSDs. Data mining tools are then applied to the I/O usage pattern data in an attempt to tie the observed I/O behaviors to the particular codes that produced them.
Keywords: Aggregate I/O usage data, Data mining, Application I/O behavior
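The analysis described in the abstract can be illustrated with a minimal sketch. The abstract does not say which data mining method, features, or parameters were used, so everything here is an assumption for illustration: the sample records, the field layout, the log scaling, the choice of density-based clustering (DBSCAN), and the `eps` and `min_pts` values are all hypothetical. The sketch sums per-job read and write bytes on the Lustre file system and the node-local SSDs, then groups jobs with similar aggregate I/O behavior.

```python
import math
from collections import defaultdict

# Hypothetical monitoring samples: (job_id, file_system, bytes_read, bytes_written).
# These values are made up for illustration; they are not data from the paper.
SAMPLES = [
    ("job1", "lustre", 2e9, 5e8), ("job1", "lustre", 2e9, 5e8),
    ("job2", "lustre", 5e9, 2e9),
    ("job3", "lustre", 3e9, 1e9),
    ("job4", "lustre", 1e6, 1e6), ("job4", "ssd", 2e9, 3e9),
    ("job5", "lustre", 2e6, 1e6), ("job5", "ssd", 3e9, 3e9),
    ("job6", "lustre", 1e6, 2e6), ("job6", "ssd", 2e9, 4e9),
    ("job7", "lustre", 1e3, 1e3), ("job7", "ssd", 1e3, 1e3),
]

def aggregate_per_job(samples):
    """Sum read/write bytes per job into [lustre_rd, lustre_wr, ssd_rd, ssd_wr]."""
    offset = {"lustre": 0, "ssd": 2}
    totals = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0])
    for job, fs, rd, wr in samples:
        base = offset[fs]
        totals[job][base] += rd
        totals[job][base + 1] += wr
    # Assumed scaling: log10 keeps byte counts that span many orders
    # of magnitude comparable under Euclidean distance.
    return {job: [math.log10(v + 1.0) for v in vec] for job, vec in totals.items()}

def dbscan(points, eps, min_pts):
    """Return {key: cluster_id}, with -1 marking noise (plain O(n^2) DBSCAN)."""
    keys = list(points)
    labels = {}

    def neighbors(k):
        # Includes k itself, as in the usual DBSCAN definition.
        return [o for o in keys if math.dist(points[k], points[o]) <= eps]

    cluster = -1
    for k in keys:
        if k in labels:
            continue
        nbrs = neighbors(k)
        if len(nbrs) < min_pts:
            labels[k] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[k] = cluster
        queue = [n for n in nbrs if n != k]
        while queue:
            q = queue.pop()
            if labels.get(q) == -1:
                labels[q] = cluster  # noise reachable from a core point: border point
            if q in labels:
                continue
            labels[q] = cluster
            q_nbrs = neighbors(q)
            if len(q_nbrs) >= min_pts:
                queue.extend(q_nbrs)  # q is a core point, so keep expanding
    return labels

features = aggregate_per_job(SAMPLES)
labels = dbscan(features, eps=1.0, min_pts=3)
for job in sorted(labels):
    print(job, labels[job])
```

With this made-up data, the Lustre-heavy jobs and the SSD-heavy jobs fall into separate clusters, and the job with negligible I/O is left as noise. Tying a cluster to a particular application code would need job metadata that this sketch does not model.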
The authors acknowledge funding support and sponsorship from Engility Corporation’s High Performance Computing Center of Excellence (HPC CoE), which was used to support the student research. The authors thank Dr. Rajiv Bendale, Engility Corporation, for many valuable suggestions for this project.