Science China Information Sciences, Volume 58, Issue 3, pp 1–14

Mlock: building delegable metadata service for the parallel file systems

  • Quan Zhang
  • Dan Feng
  • Fang Wang
  • Sen Wu
Research Paper


The ever-growing demand for high-performance computing calls for progressively larger parallel distributed file systems. These file systems achieve high performance for large I/O operations by distributing load across numerous data servers. However, they fail to provide good service for applications dominated by small files. In this paper, we propose a delegable metadata service (DMS) that hides the latency of metadata accesses and optimizes small-file performance. In addition, four techniques are designed to maintain consistency and efficiency in DMS: pre-allocation of serial metahandles, directory-based metadata replacement, packing of transaction operations, and fine-grained lock revocation. These schemes have been implemented in the Cappella parallel distributed file system, and various experiments complying with industry standards have been conducted to evaluate their efficiency. The results show that our design significantly improves the performance of both metadata operations and small-file access. Moreover, the scheme can be integrated into many other distributed file systems.


delegable metadata service, metadata performance, consistency, small file, parallel file system



parallel file system, delegable metadata service, metadata throughput, small-file performance optimization, consistency





Copyright information

© Science China Press and Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  1. Wuhan National Laboratory for Optoelectronics, Wuhan, China
  2. School of Computer, Huazhong University of Science and Technology, Wuhan, China
  3. Information & Telecommunication Company, State Grid Hubei Electric Power Company, Wuhan, China
