FaTLease: scalable fault-tolerant lease negotiation with Paxos
A lease is a token that grants its owner exclusive access to a resource for a defined span of time. To tolerate failures, leases must be coordinated by distributed processes. We present FaTLease, an algorithm for fault-tolerant lease negotiation in distributed systems. It is built on the Paxos algorithm for distributed consensus, but avoids Paxos’ main performance bottleneck of requiring persistent state. This property makes our algorithm particularly useful for applications that cannot spare any disk bandwidth. Our experiments show that FaTLease scales up to tens of thousands of concurrent leases and can negotiate thousands of leases per second in both LAN and WAN environments.
Keywords: Leases, High-availability, Locks, Paxos, Replication