International Journal of Parallel Programming, Volume 34, Issue 2, pp 143–170

Light-Weight Leases for Storage-Centric Coordination

  • Gregory Chockler
  • Dahlia Malkhi


Reaching agreement among processes sharing read/write memory is possible only in the presence of an eventual unique leader. A leader that fails must be recoverable, but on the other hand, a live and well-performing leader should never be decrowned. This paper presents the first leader algorithm in shared memory environments that guarantees an eventual leader following global stabilization time. The construction is built using light-weight lease and renew primitives. The implementation is simple, yet efficient. It is uniform, in the sense that the number of potentially contending processes for leadership is not a priori known.
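To make the lease and renew primitives mentioned above concrete, here is a minimal illustrative sketch. It is not the algorithm from the paper: the names `LeaseRegister`, `Lease`, `acquire`, `renew`, and `leader` are hypothetical, time is passed in explicitly as `now`, and each operation on the shared record is assumed to be atomic. That assumption sidesteps the hard part the paper addresses, namely contention over plain read/write memory with an unknown number of contenders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeaseRegister:
    """Hypothetical shared record: the current holder and when its lease expires."""
    holder: Optional[str] = None
    expires_at: float = 0.0


class Lease:
    """Toy timed lease over a single record (assumes atomic access to it)."""

    def __init__(self, register: LeaseRegister, duration: float) -> None:
        self.register = register
        self.duration = duration

    def _expired(self, now: float) -> bool:
        return self.register.holder is None or now >= self.register.expires_at

    def acquire(self, pid: str, now: float) -> bool:
        """Take the lease only if nobody holds it or the previous lease lapsed."""
        if not self._expired(now):
            return False
        self.register.holder = pid
        self.register.expires_at = now + self.duration
        return True

    def renew(self, pid: str, now: float) -> bool:
        """Extend the lease, but only for the current, still-valid holder."""
        if self._expired(now) or self.register.holder != pid:
            return False
        self.register.expires_at = now + self.duration
        return True

    def leader(self, now: float) -> Optional[str]:
        """Return the holder of a valid lease, or None if the lease has lapsed."""
        return None if self._expired(now) else self.register.holder


if __name__ == "__main__":
    lease = Lease(LeaseRegister(), duration=10.0)
    print(lease.acquire("p1", now=0.0))   # p1 takes the free lease
    print(lease.acquire("p2", now=5.0))   # p2 is refused while p1's lease is valid
    print(lease.renew("p1", now=8.0))     # p1 extends its lease
    print(lease.leader(now=17.0))         # p1 is still the leader
```

In this toy model, a holder that keeps calling `renew` before expiry is never displaced, while a holder that stops (for example, because it crashed) loses leadership once its lease lapses and another process can call `acquire`.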


Leases, file systems, mutual exclusion, consensus





Copyright information

© Springer Science+Business Media, Inc. 2006

Authors and Affiliations

  1. IBM Haifa Labs, Haifa University Campus, Haifa, Israel
  2. School of Computer Science and Engineering, The Hebrew University of Jerusalem, and Microsoft Research, Silicon Valley, USA
