A Timing Assumption and Two t-Resilient Protocols for Implementing an Eventual Leader Service in Asynchronous Shared Memory Systems
This paper considers the problem of electing an eventual leader in an asynchronous shared memory system. While this problem has received a lot of attention in message-passing systems, very few solutions have been proposed for shared memory systems. As an eventual leader cannot be elected in a purely asynchronous system prone to process crashes, the paper first proposes to enrich the asynchronous system model with an additional assumption. That assumption (denoted AWB) is particularly weak and is made up of two complementary parts. More precisely, it requires that, after some time, (1) there is a process whose write accesses to some shared variables are timely, and (2) the timers of (t−f) other processes are asymptotically well-behaved (t denotes the maximal number of processes that may crash, and f the actual number of process crashes in a run). The notion of an asymptotically well-behaved timer is new: it generalizes and weakens the traditional notion of timers whose durations are required to increase monotonically when the values they are set to increase (a timer works incorrectly when it expires at arbitrary times, i.e., independently of the value it has been set to).
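The abstract does not give the formal definition of an asymptotically well-behaved timer. As a rough illustration only, the sketch below contrasts the traditional monotonicity requirement with one possible weaker reading, which is an assumption made here and not taken from the paper: durations may fluctuate as long as, from some point on, they stay at or above a non-decreasing lower-bound function. The function names and the sample data are hypothetical.

```python
def is_monotone(durations):
    """Traditional requirement: a larger set value never yields a shorter duration."""
    return all(a <= b for a, b in zip(durations, durations[1:]))


def stays_above(durations, lower_bound, start=0):
    """Assumed weaker reading: for every set value x >= start, the observed
    duration is at least lower_bound(x). The index in `durations` plays the
    role of the value the timer was set to."""
    return all(d >= lower_bound(x) for x, d in enumerate(durations) if x >= start)


# Hypothetical measurements: durations[x] is how long the timer ran when set to x.
durations = [5, 1, 7, 3, 4, 9, 6, 8]

print(is_monotone(durations))                    # False: durations go up and down
print(stays_above(durations, lambda x: x // 2))  # True: never below x // 2
```

Under this reading, a timer can fail the monotonicity test and still be usable, because what matters to a protocol is only that large set values eventually guarantee long waits.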
The paper then focuses on the design of t-resilient AWB-based eventual leader protocols. “t-resilient” means that each protocol can cope with up to t process crashes (taking t=n−1 provides wait-free protocols, i.e., protocols that can cope with any number of process failures). Two protocols are presented. The first enjoys the following noteworthy properties: after some time only the elected leader has to write to the shared memory, and all but one of the shared variables have a bounded domain, whether the execution is finite or infinite. This protocol is consequently optimal with respect to the number of processes that have to write to the shared memory. The second protocol guarantees that all the shared variables have a bounded domain. This is obtained at the following additional price: t+1 processes are required to write to the shared memory forever. A theorem is proved which states that this price has to be paid by any protocol that elects an eventual leader in a bounded shared memory model. This second protocol is consequently optimal with respect to the number of processes that have to write in such a constrained memory model. Interestingly, these protocols exhibit an inherent tradeoff between the number of processes that have to write to the shared memory and the bounded/unbounded attribute of that memory.
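To make the general idea of timer-based eventual leader election over shared registers concrete, here is a minimal single-threaded simulation. It is not either of the paper's protocols: the register layout (`progress`, `suspicions`), the leader rule (fewest suspicions, ties broken by smallest identifier), and the round-robin scheduler are all assumptions chosen for illustration. The round-robin scheduling stands in for the timing assumption by making the leader's writes timely.

```python
class SharedMemory:
    """Hypothetical register layout: each entry has a single writer."""

    def __init__(self, n):
        self.progress = [0] * n                         # progress[i] written only by i
        self.suspicions = [[0] * n for _ in range(n)]   # row j written only by j


class Process:
    def __init__(self, pid, mem):
        self.pid = pid
        self.mem = mem
        self.last_seen = [0] * len(mem.progress)  # local copy of progress values
        self.timeout = 1                          # grows after each false alarm
        self.countdown = 1

    def leader(self):
        """Pick the least-suspected process, smallest identifier first."""
        n = len(self.mem.progress)
        totals = [sum(row[k] for row in self.mem.suspicions) for k in range(n)]
        return min(range(n), key=lambda k: (totals[k], k))

    def step(self):
        ld = self.leader()
        if ld == self.pid:
            self.mem.progress[self.pid] += 1      # only a self-elected leader writes
            return
        self.countdown -= 1
        if self.countdown > 0:
            return                                # local timer has not expired yet
        if self.mem.progress[ld] == self.last_seen[ld]:
            self.mem.suspicions[self.pid][ld] += 1  # leader made no progress
            self.timeout += 1                       # wait longer next time
        self.last_seen[ld] = self.mem.progress[ld]
        self.countdown = self.timeout


def run(n, crashed, rounds):
    """Simulate `rounds` round-robin rounds; crashed processes never take a step."""
    mem = SharedMemory(n)
    procs = [Process(i, mem) for i in range(n) if i not in crashed]
    for _ in range(rounds):
        for p in procs:
            p.step()
    return {p.pid: p.leader() for p in procs}


views = run(n=4, crashed={0}, rounds=50)
print(views)  # every surviving process should report the same, non-crashed leader
```

In this toy run the suspicion counters stop changing once a live leader is agreed upon, after which only that leader keeps writing (its `progress` counter, which grows without bound). This loosely mirrors the flavor of the first protocol's properties as described above, but the sketch carries none of the paper's guarantees.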
Keywords: Asynchronous system, Atomic register, Eventual leader, Fault-tolerance, Omega, Process crash, Shared memory, System model, Timer property, Timing assumptions, t-resilient protocol