
A GPU-Accelerated In-Memory Metadata Management Scheme for Large-Scale Parallel File Systems

  • Regular Paper
  • Published in: Journal of Computer Science and Technology

Abstract

Driven by the increasing demands of high-performance computing applications, supercomputers keep growing to ever larger numbers of computing nodes. Applications running on such large-scale systems can spawn millions of parallel processes, which often issue bursts of I/O requests and pose a serious challenge to the metadata management of the underlying parallel file systems. The traditional way to meet this challenge is to scale out by adding metadata servers, which inevitably introduces network and consistency problems. This work instead enhances metadata performance by scaling up: we improve the performance of each individual metadata server by employing a GPU to handle metadata requests in parallel. We design a novel metadata server architecture in which the CPU interacts with file system clients while the metadata processing tasks are offloaded to the GPU. To take full advantage of the parallelism available in the GPU, we redesign the in-memory data structure that represents the file system namespace. The new data structure fits the GPU memory architecture well, allowing the large number of GPU threads to serve bursty metadata requests concurrently. We implement a prototype based on BeeGFS and conduct extensive experiments to evaluate our proposal. The results demonstrate that our GPU-based solution outperforms the CPU-based scheme by more than 50% on typical metadata operations, and the advantage grows further in highly concurrent scenarios, e.g., high-performance computing systems running millions of parallel processes.
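As a rough illustration of the idea only (the abstract does not describe the authors' actual data structure), the following CUDA sketch assumes the namespace is flattened into a sorted array of 64-bit full-path hashes resident in GPU memory; the CPU side batches incoming lookup requests and launches one kernel in which each GPU thread resolves one request. The DirEntry layout, the hash-of-full-path key, and the binary-search lookup are all illustrative assumptions, not the scheme evaluated in the paper.

// Illustrative sketch: batched metadata lookups with one GPU thread per request.
// The namespace layout (sorted path-hash array) is an assumption for this example.
#include <cstdio>
#include <cstdint>
#include <cuda_runtime.h>

struct DirEntry {          // hypothetical on-GPU metadata record
    uint64_t pathHash;     // hash of the full path (assumed lookup key)
    uint64_t inode;        // inode number returned to the CPU side
};

// Each thread binary-searches the sorted entry array for one request's path hash.
__global__ void lookupKernel(const DirEntry* entries, int numEntries,
                             const uint64_t* reqHashes, uint64_t* results,
                             int numReqs)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= numReqs) return;

    uint64_t key = reqHashes[i];
    int lo = 0, hi = numEntries - 1;
    uint64_t found = UINT64_MAX;           // UINT64_MAX means "not found"
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        uint64_t h = entries[mid].pathHash;
        if (h == key) { found = entries[mid].inode; break; }
        if (h < key) lo = mid + 1; else hi = mid - 1;
    }
    results[i] = found;
}

int main()
{
    // Tiny in-memory namespace, sorted by pathHash (values are placeholders).
    DirEntry hostEntries[] = { {10, 100}, {25, 101}, {42, 102}, {77, 103} };
    uint64_t hostReqs[]    = { 42, 25, 99 };       // a "burst" of lookup requests
    const int nEntries = 4, nReqs = 3;

    DirEntry* dEntries; uint64_t* dReqs; uint64_t* dRes;
    cudaMalloc(&dEntries, sizeof(hostEntries));
    cudaMalloc(&dReqs, sizeof(hostReqs));
    cudaMalloc(&dRes, nReqs * sizeof(uint64_t));
    cudaMemcpy(dEntries, hostEntries, sizeof(hostEntries), cudaMemcpyHostToDevice);
    cudaMemcpy(dReqs, hostReqs, sizeof(hostReqs), cudaMemcpyHostToDevice);

    // The CPU batches client requests and launches a single kernel for the batch.
    lookupKernel<<<1, 128>>>(dEntries, nEntries, dReqs, dRes, nReqs);

    uint64_t hostRes[nReqs];
    cudaMemcpy(hostRes, dRes, sizeof(hostRes), cudaMemcpyDeviceToHost);
    for (int i = 0; i < nReqs; ++i)
        printf("req %d -> inode %llu\n", i, (unsigned long long)hostRes[i]);

    cudaFree(dEntries); cudaFree(dReqs); cudaFree(dRes);
    return 0;
}

In a real metadata server the batch would come from the RPC layer and the structure would also have to support inserts, renames, and directory scans; the sketch only shows how a burst of lookups can be mapped onto GPU threads.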


Author information

Corresponding author

Correspondence to Yu-Tong Lu.

Supplementary Information

ESM 1 (PDF 320 kb)


About this article


Cite this article

Chen, ZG., Liu, YB., Wang, YF. et al. A GPU-Accelerated In-Memory Metadata Management Scheme for Large-Scale Parallel File Systems. J. Comput. Sci. Technol. 36, 44–55 (2021). https://doi.org/10.1007/s11390-020-0783-9


