Abstract
Today’s file systems must store PB-scale or even EB-scale data across thousands of servers. At this scale, distributed metadata management schemes, which store metadata on a group of metadata servers (MDSs), are used to relieve the load on any single server. Such schemes face a significant challenge: the group of MDSs should maintain both high metadata locality and good load balance, two goals that in practice conflict with each other. In this paper we propose AngleCut, a novel hashing scheme designed to partition the metadata namespace tree and serve large-scale distributed storage systems. AngleCut first uses a locality-preserving hashing (LPH) function to project the namespace tree onto a linear keyspace, i.e., multiple Chord-like rings. We then design a history-based allocation strategy that dynamically adjusts the workload of the MDSs. AngleCut also adopts a metadata cache mechanism to improve query efficiency. Overall, our scheme preserves metadata locality while maintaining good load balance among the MDSs. A theoretical proof and extensive experiments on trace data demonstrate the advantages of AngleCut over previous work.
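To make the two ideas in the abstract concrete, the following is a minimal sketch and not the AngleCut algorithm itself, whose LPH function and allocation rule are not given here. It assumes one simple locality-preserving projection (each directory owns an arc of a unit ring and splits it among its children) and one simple history-based rule (place arc boundaries so that past access counts are split evenly). The names `project`, `choose_boundaries`, and `server_for`, and the nested-dict tree format, are hypothetical.

```python
from bisect import bisect_right


def project(tree, lo=0.0, hi=1.0, path="/", out=None):
    """Map every path of a nested-dict namespace tree to a key in [0, 1).

    A directory owns the arc [lo, hi). It takes the key `lo` for itself and
    divides the rest of the arc among its children in sorted order, so every
    subtree occupies one contiguous arc of the ring (assumed LPH, for
    illustration only).
    """
    if out is None:
        out = {}
    out[path] = lo
    children = sorted(tree)
    if children:
        # One slice is reserved for the directory itself, one per child.
        width = (hi - lo) / (len(children) + 1)
        for i, name in enumerate(children, start=1):
            child_path = path.rstrip("/") + "/" + name
            project(tree[name], lo + i * width, lo + (i + 1) * width,
                    child_path, out)
    return out


def choose_boundaries(keys, history, num_servers):
    """Pick ring boundaries so each server's arc saw a similar past load.

    `history` maps a path to its past access count (hypothetical input).
    Returns a sorted list of num_servers - 1 boundary keys (fewer if the
    namespace has too few paths).
    """
    ordered = sorted(keys.items(), key=lambda kv: kv[1])
    total = sum(history.get(p, 0) for p, _ in ordered)
    boundaries = []
    running = 0
    for path, key in ordered:
        quota = total * (len(boundaries) + 1) / num_servers
        # Start a new arc once the load seen so far reaches the next quota.
        if len(boundaries) < num_servers - 1 and running >= quota:
            boundaries.append(key)
        running += history.get(path, 0)
    return boundaries


def server_for(boundaries, key):
    """Return the index of the server whose arc contains `key`."""
    return bisect_right(boundaries, key)


if __name__ == "__main__":
    tree = {"home": {"alice": {"a.txt": {}, "b.txt": {}},
                     "bob": {"c.txt": {}}},
            "var": {"log": {}}}
    keys = project(tree)
    history = {"/home/alice/a.txt": 6, "/home/alice/b.txt": 2,
               "/home/bob/c.txt": 2, "/var/log": 2}
    boundaries = choose_boundaries(keys, history, num_servers=2)
    for path in sorted(keys, key=keys.get):
        print(server_for(boundaries, keys[path]), path)
```

Because each subtree maps to a contiguous arc, moving a boundary shifts whole runs of neighbouring paths between servers, which is the sense in which such a projection can trade locality against load balance.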
This work has been supported in part by the program of International S&T Cooperation (2016YFE0100300), the China 973 Project (2014CB3-40303), and National Natural Science Foundation of China (Nos. 61672353, 61472252 and 61322208).
Notes
1. Data source: http://iotta.snia.org/traces/158.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Liu, J., Wang, R., Gao, X., Yang, X., Chen, G. (2017). AngleCut: A Ring-Based Hashing Scheme for Distributed Metadata Management. In: Candan, S., Chen, L., Pedersen, T., Chang, L., Hua, W. (eds) Database Systems for Advanced Applications. DASFAA 2017. Lecture Notes in Computer Science, vol 10177. Springer, Cham. https://doi.org/10.1007/978-3-319-55753-3_5
DOI: https://doi.org/10.1007/978-3-319-55753-3_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-55752-6
Online ISBN: 978-3-319-55753-3
eBook Packages: Computer Science, Computer Science (R0)