Cluster Computing, Volume 18, Issue 3, pp 1075–1086

Design and evaluation of a user-level file system for fast storage devices

  • Yongseok Son
  • Nae Young Song
  • Hyuck Han
  • Hyeonsang Eom
  • Heon Young Yeom

Abstract

Fast storage devices are increasingly deployed in social network services, cloud platforms, and similar environments. The traditional Linux I/O stack, however, is designed to maximize performance on disk-based storage. Emerging byte-addressable, low-latency non-volatile memory technologies (e.g., phase-change memory, MRAM, and memristors) have very different characteristics, so a disk-oriented I/O stack cannot deliver high performance on them. This paper presents a high-performance I/O stack for fast storage devices. Our scheme removes the concept of a block and simplifies the whole I/O path and software stack, leaving only two layers: a byte-capable interface and a byte-aware file system called BAFS. We aim to minimize I/O latency and maximize bandwidth by eliminating unnecessary layers and supporting byte-addressable I/O without requiring changes to applications. We have implemented a prototype and evaluated it with multiple benchmarks. The experimental results show that our I/O stack achieves performance gains of 6.2 times on average and up to 17.5 times over the existing Linux I/O stack.
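
To make the motivation for removing the block concept concrete, the following minimal sketch compares how many bytes a small request moves when I/O must cover whole blocks versus when the device accepts byte-granular requests. It is an illustration only and not the BAFS implementation: the 4096-byte block size and the helper names `block_io_bytes` and `byte_io_bytes` are assumptions introduced here, not details taken from the paper.

```python
# Illustrative only: BLOCK_SIZE and both helper functions are assumptions
# made for this sketch; they are not taken from the paper or from BAFS.
BLOCK_SIZE = 4096  # assumed block size in bytes


def block_io_bytes(offset: int, length: int, block_size: int = BLOCK_SIZE) -> int:
    """Bytes transferred when a request must cover whole blocks."""
    if length <= 0:
        return 0
    first_block = offset // block_size
    last_block = (offset + length - 1) // block_size
    return (last_block - first_block + 1) * block_size


def byte_io_bytes(offset: int, length: int) -> int:
    """Bytes transferred when the device accepts byte-granular requests."""
    return max(length, 0)


if __name__ == "__main__":
    # Each pair is (offset, length) in bytes.
    for offset, length in [(0, 64), (4090, 16), (0, 4096)]:
        blk = block_io_bytes(offset, length)
        byt = byte_io_bytes(offset, length)
        print(f"offset={offset} length={length}: block={blk} B, byte={byt} B")
```

Under these assumptions, a 16-byte request that straddles a block boundary touches two full blocks in the block-granular model, while the byte-granular model moves only the 16 bytes requested. The sketch counts transferred bytes only; it says nothing about latency, which the paper also targets.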

Keywords

  • File system
  • Fast storage device
  • I/O stack
  • Low latency I/O

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Yongseok Son (1)
  • Nae Young Song (1)
  • Hyuck Han (1, 2)
  • Hyeonsang Eom (1)
  • Heon Young Yeom (1)

  1. Department of Computer Science and Engineering, Seoul National University, Seoul, South Korea
  2. Department of Computer Science, Dongduk Women’s University, Seoul, South Korea
