HydraFS: an efficient NUMA-aware in-memory file system

Cluster Computing

Abstract

Emerging persistent file systems are designed to achieve high-performance data processing by effectively exploiting the advanced features of non-volatile memory (NVM). Non-uniform memory access (NUMA) architectures are widely used in high-performance computing and data centers because of their scalability. However, existing NVM-based in-memory file systems are all designed for uniform memory access systems. Their performance on NUMA machines is unsatisfactory because they consider neither the multi-node architecture nor the asymmetric memory access speed. In this paper, we design an efficient NUMA-aware in-memory file system that distributes file data across all nodes to balance the load of file requests. We propose three approaches for improving file system performance on NUMA machines: a Node-oriented File Creation algorithm that dispatches files over multiple nodes, a File-oriented Thread Binding algorithm that binds threads to the most beneficial nodes, and a buffer assignment technique that allocates the user buffer from the proper node. Based on this design, we implement a functional NUMA-aware in-memory file system, HydraFS, in the Linux kernel. Extensive experiments show that HydraFS significantly outperforms existing representative in-memory file systems on NUMA machines. On average, the performance of HydraFS is 76.6%, 91.9%, and 26.7% higher than that of EXT4-DAX, PMFS, and SIMFS, respectively.
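To make the three ideas concrete, the following is a minimal toy sketch in Python, not the HydraFS implementation. The abstract does not specify the algorithms' details, so everything here is an assumption made for illustration: the names `Node`, `ToyNumaFS`, `create_file`, `bind_thread`, and `assign_buffer` are hypothetical, files are placed on the least-loaded node, a thread is bound to the node holding its file, and the user buffer is taken from the thread's node.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A NUMA node in the toy model; tracks the files stored on it."""
    node_id: int
    files: dict = field(default_factory=dict)  # file name -> size

    @property
    def load(self) -> int:
        # Load is approximated by total stored bytes (an assumption).
        return sum(self.files.values())


class ToyNumaFS:
    """Hypothetical illustration of NUMA-aware placement decisions."""

    def __init__(self, num_nodes: int):
        self.nodes = [Node(i) for i in range(num_nodes)]
        self.file_home = {}  # file name -> node id

    def create_file(self, name: str, size: int) -> int:
        # Assumed policy: dispatch the new file to the least-loaded node,
        # breaking ties by the lowest node id.
        target = min(self.nodes, key=lambda n: (n.load, n.node_id))
        target.files[name] = size
        self.file_home[name] = target.node_id
        return target.node_id

    def bind_thread(self, name: str) -> int:
        # Assumed policy: run the accessing thread on the node that
        # holds the file, so file accesses stay node-local.
        return self.file_home[name]

    def assign_buffer(self, thread_node: int) -> int:
        # Assumed policy: allocate the user buffer on the thread's node.
        return thread_node


fs = ToyNumaFS(num_nodes=2)
placements = [fs.create_file(f"f{i}", size)
              for i, size in enumerate([40, 10, 10, 30])]
loads = [node.load for node in fs.nodes]
```

In this toy run the four files end up on nodes 0, 1, 1, and 1, giving loads of 40 and 50, and a thread opening `f3` would be bound to node 1 with its buffer allocated there. A real kernel implementation would have to account for factors this sketch ignores, such as per-node request rates and remote-access latency.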

Acknowledgements

We thank Mr. Lin Wu for his careful proofreading and constructive comments, which have significantly improved the paper. This work was partially supported by the National Natural Science Foundation of China (Grant Nos. 61872049, 61472052 and 61502061) and the Postdoctoral Research Foundation of China (Grant No. 2017M620412).

Author information

Corresponding author

Correspondence to Kai Liu.

About this article

Cite this article

Wu, T., Chen, X., Liu, K. et al. HydraFS: an efficient NUMA-aware in-memory file system. Cluster Comput 23, 705–724 (2020). https://doi.org/10.1007/s10586-019-02952-y

  • DOI: https://doi.org/10.1007/s10586-019-02952-y
