
Optimizing Data Placement on Hierarchical Storage Architecture via Machine Learning

  • Peng Cheng
  • Yutong Lu (Email author)
  • Yunfei Du
  • Zhiguang Chen
  • Yang Liu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11783)

Abstract

As storage hierarchies grow deeper on modern high-performance computing systems, intelligent data placement strategies that dynamically choose the optimal storage tier are key to realizing the potential of hierarchical storage architectures. However, providing a general solution that applies across different storage architectures and diverse applications is challenging. In this paper, we propose the adaptive storage learner (ASL), which uses machine learning techniques to mine the relationship between data placement strategies and I/O performance under varied workflow characteristics and system statuses, and applies the learned model to choose the optimal storage tier intelligently. We implement a prototype and integrate it into an existing data management system. Empirical comparisons based on tests with real scientific workflows show that ASL can combine workflow characteristics and real-time system status to make optimal data placement decisions.
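The core idea described above, namely learning a mapping from workflow characteristics and system status to a storage-tier choice, can be illustrated with a minimal sketch. The feature set, tier names, training samples, and the nearest-neighbour classifier below are all illustrative assumptions for exposition; they are not the actual ASL model or features.

```python
# Minimal sketch: a 1-nearest-neighbour "model" that maps features of a
# pending file (workflow characteristics plus current system status) to a
# storage tier. All features, tiers, and samples are illustrative.
import math

TIERS = ["burst_buffer", "ssd", "parallel_fs"]

# (features, tier) pairs collected from hypothetical past runs; features are
# [file_size_mb, reuse_count, fast_tier_free_pct]
TRAINING = [
    ([10,   5, 80], "burst_buffer"),   # small hot file, fast tier mostly free
    ([50,   4, 70], "burst_buffer"),
    ([300,  2, 50], "ssd"),
    ([500,  3, 60], "ssd"),
    ([2000, 1, 20], "parallel_fs"),    # large cold file, fast tier nearly full
    ([4000, 0, 10], "parallel_fs"),
]

def choose_tier(features):
    """Return the tier of the closest past sample (1-NN on raw features)."""
    best_tier, best_dist = None, math.inf
    for sample, tier in TRAINING:
        dist = math.dist(features, sample)
        if dist < best_dist:
            best_tier, best_dist = tier, dist
    return best_tier

print(choose_tier([20, 6, 75]))   # a small, frequently reused file
```

In a real system the learned model would be retrained as new (placement, observed I/O performance) pairs accumulate, and the system-status features would be sampled at decision time rather than fixed in a table.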

Keywords

Storage optimization · Machine learning · Hierarchical storage · Data placement

Notes

Acknowledgment

This work was supported by the National Key R&D Program of China under Grants No. 2017YFB0202204 and No. 2017YFB0202201, the National Science Foundation of China under Grant No. U1811464, and the Program for Guangdong Introducing Innovative and Entrepreneurial Teams under Grant No. 2016ZT06D211.


Copyright information

© IFIP International Federation for Information Processing 2019

Authors and Affiliations

  • Peng Cheng (1, 2)
  • Yutong Lu (3), Email author
  • Yunfei Du (3)
  • Zhiguang Chen (3)
  • Yang Liu (4)
  1. College of Computer, National University of Defense Technology, Changsha, China
  2. State Key Laboratory of High Performance Computing, Changsha, China
  3. National Supercomputer Center in Guangzhou, School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, China
  4. Department of Computer Science and Technology, Tsinghua University, Beijing, China
