Alleviating I/O Interference Through Workload-Aware Striping and Load-Balancing on Parallel File Systems

  • Conference paper
High Performance Computing (ISC High Performance 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10266)

Abstract

Parallel file systems are now widely used in supercomputers. Lustre is one of the most widely used, and an enhanced version of it, FEFS (Fujitsu Exabyte File System), is used on the K computer. The K computer adopts a two-layered file system, consisting of a local file system and a shared global file system, together with a data-staging scheme that guarantees sufficient I/O throughput on the local file system during computation. However, large data-staging operations on the shared file system have sometimes caused heavy I/O interference with light-weight file accesses taking place at the same time. Alleviating such I/O interference is an important issue in managing large-scale parallel file systems in shared use. In this paper, we focus on alleviating I/O interference through workload-aware striping and load-balancing. An appropriate striping configuration, combined with effective load-balancing in the allocation of service threads to incoming I/O requests, improved the performance of light-weight file accesses in the presence of huge data accesses, without excessively sacrificing data-staging performance on the K computer. We expect that the proposed optimization can be used as a system-wide approach to mitigating I/O interference.
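To make the role of the striping configuration concrete, the following minimal sketch models the generic round-robin (RAID-0 style) striping used by Lustre-like file systems. It is not the authors' method and does not use any FEFS interface. The function names, the 1 MiB stripe size, and the stripe counts are illustrative assumptions that do not come from the paper. The sketch only shows how many of a file's stripe objects a single contiguous access touches, which is one reason the stripe count chosen for staging data affects how widely a large transfer spreads across storage targets.

```python
from __future__ import annotations

MIB = 1024 * 1024  # illustrative unit; the paper's actual settings are not given here


def stripe_index(offset: int, stripe_size: int, stripe_count: int) -> int:
    """Return which of the file's stripe objects holds the byte at `offset`.

    Assumes plain round-robin striping: stripe k of the file is stored on
    object (k mod stripe_count).
    """
    return (offset // stripe_size) % stripe_count


def objects_touched(offset: int, length: int,
                    stripe_size: int, stripe_count: int) -> set[int]:
    """Return the stripe-object indices touched by one contiguous access."""
    if length <= 0:
        return set()
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    # Once the access spans at least stripe_count stripes, every object is hit.
    if last - first + 1 >= stripe_count:
        return set(range(stripe_count))
    return {s % stripe_count for s in range(first, last + 1)}


if __name__ == "__main__":
    # Hypothetical comparison: a 64 MiB staging transfer versus a 64 KiB access.
    for count in (1, 4, 12):
        big = objects_touched(0, 64 * MIB, MIB, count)
        small = objects_touched(0, 64 * 1024, MIB, count)
        print(f"stripe_count={count}: large access hits {len(big)} object(s), "
              f"small access hits {len(small)} object(s)")
```

Under this simplified model, a large transfer on a widely striped file reaches every stripe object, while a small access stays on one, so the two compete only on the target they share. The model ignores server-side service threads, which the abstract names as the second half of the proposed approach.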

Acknowledgment

The authors would like to thank Fujitsu for providing useful technical information about FEFS.

Author information

Corresponding author

Correspondence to Yuichi Tsujita.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Tsujita, Y., Yoshizaki, T., Yamamoto, K., Sueyasu, F., Miyazaki, R., Uno, A. (2017). Alleviating I/O Interference Through Workload-Aware Striping and Load-Balancing on Parallel File Systems. In: Kunkel, J.M., Yokota, R., Balaji, P., Keyes, D. (eds) High Performance Computing. ISC High Performance 2017. Lecture Notes in Computer Science, vol 10266. Springer, Cham. https://doi.org/10.1007/978-3-319-58667-0_17

  • DOI: https://doi.org/10.1007/978-3-319-58667-0_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-58666-3

  • Online ISBN: 978-3-319-58667-0

  • eBook Packages: Computer Science, Computer Science (R0)
