
Design and implementation of dynamic I/O control scheme for large scale distributed file systems

Published in: Cluster Computing

Abstract

In this work, we analyzed the input/output (I/O) activities of Cori, a high-performance computing system at the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory. Our analysis indicates that most users do not adjust storage configurations but instead use the default settings. In addition, owing to interference from many applications running simultaneously, I/O performance varies with the system state. To configure file systems autonomously in such complex environments, we developed DCA-IO, a dynamic distributed file system configuration adjustment algorithm that uses system log information to adjust storage configurations automatically. Our scheme aims to improve application performance and avoid interference from other applications without user intervention. Moreover, DCA-IO relies on existing system logs and requires no code modifications, additional libraries, or user intervention. To demonstrate the effectiveness of DCA-IO, we performed experiments using I/O kernels of real applications in both an isolated small-scale Lustre environment and on Cori. Our experimental results show that our scheme can improve the performance of HPC applications by up to 263% compared with the default Lustre configuration.
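To make the idea concrete, the kind of adjustment described above can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: `choose_stripe_count` and its one-stripe-per-GiB heuristic are assumptions for the example, while `lfs setstripe` with `-c` (stripe count) and `-S` (stripe size) is the standard Lustre command for setting striping on a directory.

```python
import subprocess

def choose_stripe_count(file_size_bytes: int, n_osts: int, chunk: int = 1 << 30) -> int:
    """Pick a stripe count from a job's logged aggregate file size:
    roughly one stripe per 1 GiB of data, at least 1, at most the
    number of available OSTs. (Heuristic chosen for illustration.)"""
    wanted = max(1, file_size_bytes // chunk)
    return int(min(wanted, n_osts))

def apply_striping(directory: str, stripe_count: int, stripe_size: str = "1m") -> None:
    """Set the default striping for new files created in `directory`
    using the standard Lustre `lfs setstripe` command."""
    subprocess.run(
        ["lfs", "setstripe", "-c", str(stripe_count), "-S", stripe_size, directory],
        check=True,
    )

if __name__ == "__main__":
    # e.g. a 100 GiB checkpoint file on a system with 248 OSTs
    print(choose_stripe_count(100 << 30, 248))  # 100
```

A scheme like DCA-IO would derive the inputs to such a decision function from system logs of previous runs rather than from user-supplied hints.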



Data availability

Raw data were generated at NERSC. Derived data supporting the findings of this study are available from the corresponding author on request.

Notes

  1. Note that executions of applications with identical names can exhibit different I/O behavior, since that behavior can be affected by input data, algorithms, and more. However, DCA-IO assumes that such executions will have similar I/O behavior.
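The assumption in the note above can be illustrated with a small sketch. This is a hypothetical example, not the paper's code: the log records and application names (`vpic_io`, `h5boss`) are made up, and grouping runs by executable name stands in for how a log-driven scheme might treat repeated runs of the same application as having similar I/O behavior.

```python
from collections import defaultdict

# Made-up log entries: (executable name, bytes written per run)
records = [
    ("vpic_io", 8 << 30),
    ("vpic_io", 9 << 30),
    ("h5boss", 2 << 30),
]

# Group runs by executable name.
by_app = defaultdict(list)
for name, nbytes in records:
    by_app[name].append(nbytes)

# The average per-run write volume becomes the prediction for the
# next run of the same application.
predicted = {name: sum(sizes) // len(sizes) for name, sizes in by_app.items()}
print(predicted["h5boss"] >> 30)  # 2
```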



Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. 2021R1C1C1010861), and in part by the Korea Institute for Advancement of Technology (KIAT) grant funded by the Korean government (MOTIE) (P0012724, The Competency Development Program for Industry Specialist). It was also supported by the Office of Advanced Scientific Computing Research, Office of Science, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231, and used resources of the National Energy Research Scientific Computing Center (NERSC). This study was additionally supported financially by Seoul National University of Science and Technology.

Author information

Authors and Affiliations

Authors

Contributions

SK contributed to the paper through conceptualization, methodology, software, and writing. AS, KW, and SB contributed through conceptualization, discussion, and supervision. YS contributed through conceptualization, discussion, writing, and supervision.

Corresponding author

Correspondence to Yongseok Son.

Ethics declarations

Conflict of interest

The authors have not disclosed any competing interests.

Ethical approval

Ethical approval was not required for this research.

Informed consent

All the authors listed have approved the manuscript for publication.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Kim, S., Sim, A., Wu, K. et al. Design and implementation of dynamic I/O control scheme for large scale distributed file systems. Cluster Comput 25, 4423–4438 (2022). https://doi.org/10.1007/s10586-022-03640-0

