Abstract
There has been considerable recent interest in highly available configuration management services based on the Paxos family of algorithms as a way to address long-standing problems in the management of large-scale heterogeneous distributed systems. These problems include distributed locking, group membership, leader election, and the management of configuration parameters. While these services are finding their way into the management of distributed middleware systems and data centers in general, some areas of applicability remain largely unexplored. One such area is the management of metadata in distributed file systems. In this paper we show that a Paxos-based approach to building metadata services in distributed file systems can achieve high availability without incurring a performance penalty. Moreover, we demonstrate that it is easy to retrofit such an approach to existing systems (such as PVFS and HDFS) that currently use different approaches to availability. Our overall approach is based on a general-purpose Paxos-compatible component (the embedded Oracle Berkeley database) along with a methodology for making it interoperate with existing distributed file system metadata services.
Keywords
- Directory Object
- Hadoop Distributed File System
- Distributed File System
- Metadata Server
- Operating System Principle
Copyright information
© 2012 IFIP International Federation for Information Processing
About this paper
Cite this paper
Stamatakis, D., Tsikoudis, N., Smyrnaki, O., Magoutis, K. (2012). Scalability of Replicated Metadata Services in Distributed File Systems. In: Göschka, K.M., Haridi, S. (eds) Distributed Applications and Interoperable Systems. DAIS 2012. Lecture Notes in Computer Science, vol 7272. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30823-9_3
DOI: https://doi.org/10.1007/978-3-642-30823-9_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30822-2
Online ISBN: 978-3-642-30823-9
eBook Packages: Computer Science (R0)
