Distributed Universality

Abstract

A notion of a universal construction suited to distributed computing was introduced by Herlihy in his celebrated paper “Wait-free synchronization” (ACM Trans Program Lang Syst 13(1):124–149, 1991). A universal construction is an algorithm that can be used to wait-free implement any object defined by a sequential specification. Herlihy’s paper shows that the basic system model, which supports only atomic read/write registers, has to be enriched with consensus objects to allow the design of universal constructions. The generalized notion of a k-universal construction was recently introduced by Gafni and Guerraoui (Proceedings of 22nd international conference on concurrency theory (CONCUR’11), Springer LNCS 6901, pp 17–27, 2011). A k-universal construction is an algorithm that can be used to simultaneously implement k objects (instead of just one object), with the guarantee that at least one of the k constructed objects progresses forever. While Herlihy’s universal construction relies on atomic registers and consensus objects, a k-universal construction relies on atomic registers and k-simultaneous consensus objects (which are wait-free equivalent to k-set agreement objects in the read/write system model). This paper significantly extends the universality results introduced by Herlihy and Gafni–Guerraoui. In particular, we present a k-universal construction which satisfies the following five desired properties, which are not satisfied by the previous k-universal construction: (1) among the k objects that are constructed, at least \(\ell \) objects (and not just one) are guaranteed to progress forever; (2) the progress condition for processes is wait-freedom, which means that each correct process executes an infinite number of operations on each object that progresses forever; (3) if any of the k constructed objects stops progressing, all its copies (one at each process) stop in the same state; (4) the proposed construction is contention-aware, in the sense that it uses only read/write registers in the absence of contention; and (5) it is generous with respect to the obstruction-freedom progress condition, which means that each process is able to complete any one of its pending operations on the k objects if all the other processes hold still long enough. The proposed construction, which is based on new design principles, is called a \((k,\ell )\)-universal construction. It uses a natural extension of k-simultaneous consensus objects, called \((k,\ell )\)-simultaneous consensus objects (\((k,\ell )\)-SC). Together with atomic registers, \((k,\ell )\)-SC objects are shown to be necessary and sufficient for building a \((k,\ell )\)-universal construction, and, in that sense, \((k,\ell )\)-SC objects are \((k,\ell )\)-universal.



Notes

  1. This is no longer the case in asynchronous message-passing systems: there, k-simultaneous consensus is strictly stronger than k-set agreement (as shown using different techniques in [6, 30]).

  2. Let us recall that, in worst-case scenarios, hardware operations such as \(\mathsf{compare\&swap}()\) can be \(1000\times \) more expensive than a read or a write.

  3. It is possible to express \((k,\ell )\)-UC as an object accessed by appropriate operations. This is not done here because such an object formulation would be complicated without providing more insight into the question we are interested in.

References

  1. Afek, Y., Gafni, E., Rajsbaum, S., Raynal, M., Travers, C.: The \(k\)-simultaneous consensus problem. Distrib. Comput. 22(3), 185–195 (2010)

  2. Anderson, J.H., Moir, M.: Universal constructions for large objects. IEEE Trans. Parallel Distrib. Syst. 10(12), 1317–1332 (1999)

  3. Attiya, H., Guerraoui, R., Hendler, D., Kuznetsov, P.: The complexity of obstruction-free implementations. J. ACM 56(4), Article 24 (2009)

  4. Attiya, H., Welch, J.L.: Distributed Computing: Fundamentals, Simulations and Advanced Topics, 2nd edn. Wiley-Interscience, Hoboken (2004). ISBN 0-471-45324-2

  5. Borowsky, E., Gafni, E.: Generalized FLP impossibility results for \(t\)-resilient asynchronous computations. In: Proceedings of 25th ACM Symposium on Theory of Computing (STOC’93), pp. 91–100. ACM Press (1993)

  6. Bouzid, Z., Travers, C.: Simultaneous consensus is harder than set agreement in message-passing. In: Proceedings of 33rd International IEEE Conference on Distributed Computing Systems (ICDCS’13), pp. 611–620. IEEE Press (2013)

  7. Capdevielle, C., Johnen, C., Milani, A.: Solo-fast universal constructions for deterministic abortable objects. In: Proceedings of 28th International Symposium on Distributed Computing (DISC’14), Springer LNCS 8784, pp. 288–302 (2014)

  8. Chaudhuri, S.: More choices allow more faults: set consensus problems in totally asynchronous systems. Inf. Comput. 105(1), 132–158 (1993)

  9. Chuong, Ph., Ellen, F., Ramachandran, V.: A universal construction for wait-free transaction friendly data structures. In: Proceedings of 22nd International ACM Symposium on Parallelism in Algorithms and Architectures (SPAA’10), pp. 335–344. ACM Press (2010)

  10. Crain, T., Imbs, D., Raynal, M.: Towards a universal construction for transaction-based multiprocess programs. Theor. Comput. Sci. 496, 154–169 (2013)

  11. Ellen, F., Fatourou, P., Kosmas, E., Milani, A., Travers, C.: Universal constructions that ensure disjoint-access parallelism and wait-freedom. In: Proceedings of 31st ACM Symposium on Principles of Distributed Computing (PODC), pp. 115–124. ACM Press (2012)

  12. Fatourou, P., Kallimanis, N.D.: A highly-efficient wait-free universal construction. In: Proceedings of 23rd ACM Symposium on Parallel Algorithms and Architectures (SPAA), pp. 325–334. ACM Press (2012)

  13. Fischer, M.J., Lynch, N.A., Paterson, M.S.: Impossibility of distributed consensus with one faulty process. J. ACM 32(2), 374–382 (1985)

  14. Gafni, E.: Round-by-round fault detectors: unifying synchrony and asynchrony. In: Proceedings of 17th ACM Symposium on Principles of Distributed Computing (PODC), pp. 143–152. ACM Press (1998)

  15. Gafni, E., Guerraoui, R.: Generalizing universality. In: Proceedings of 22nd International Conference on Concurrency Theory (CONCUR’11), Springer LNCS 6901, pp. 17–27 (2011)

  16. Guerraoui, R., Kapalka, M., Kouznetsov, P.: The weakest failure detectors to boost obstruction-freedom. Distrib. Comput. 20(6), 415–433 (2008)

  17. Guerraoui, R., Lynch, N.A.: A general characterization of indulgence. ACM Trans. Auton. Adapt. Syst. 3(4), Article 20 (2008)

  18. Herlihy, M.P.: Wait-free synchronization. ACM Trans. Program. Lang. Syst. 13(1), 124–149 (1991)

  19. Herlihy, M.P., Luchangco, V., Moir, M.: Obstruction-free synchronization: double-ended queues as an example. In: Proceedings of 23rd International IEEE Conference on Distributed Computing Systems (ICDCS’03), pp. 522–529. IEEE Press (2003)

  20. Herlihy, M., Luchangco, V., Moir, M., Scherer, W.M. III: Software transactional memory for dynamic-sized data structures. In: Proceedings of 22nd International ACM Symposium on Principles of Distributed Computing (PODC’03), pp. 92–101. ACM Press (2003)

  21. Herlihy, M.P., Moss, J.E.B.: Transactional memory: architectural support for lock-free data structures. In: Proceedings of 20th ACM International Symposium on Computer Architecture (ISCA’93), pp. 289–300. ACM Press (1993)

  22. Herlihy, M.P., Shavit, N.: The topological structure of asynchronous computability. J. ACM 46(6), 858–923 (1999)

  23. Herlihy, M.P., Wing, J.M.: Linearizability: a correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst. 12(3), 463–492 (1990)

  24. Lamport, L.: On inter-process communications. Part I: Basic formalism. Distrib. Comput. 1(2), 77–85 (1986)

  25. Loui, M., Abu-Amara, H.: Memory requirements for agreement among unreliable asynchronous processes. Adv. Comput. Res. 4, 163–183 (1987)

  26. Lynch, N.A.: Distributed Algorithms. Morgan Kaufmann (1996)

  27. Luchangco, V., Moir, M., Shavit, N.: On the uncontended complexity of consensus. In: Proceedings of 17th International Symposium on Distributed Computing (DISC’03), Springer LNCS 2848, pp. 45–59 (2003)

  28. Merritt, M., Taubenfeld, G.: Resilient consensus for infinitely many processes. In: Proceedings of 17th International Symposium on Distributed Computing (DISC’03), Springer LNCS 2848, pp. 1–15 (2003)

  29. Raynal, M.: Concurrent Programming: Algorithms, Principles, and Foundations. Springer, Berlin (2013). ISBN 978-3-642-32026-2

  30. Raynal, M., Stainer, J.: Simultaneous consensus vs set agreement: a message-passing-sensitive hierarchy of agreement problems. In: Proceedings of 20th International Colloquium on Structural Information and Communication Complexity (SIROCCO 2013), Springer LNCS 8179, pp. 298–309 (2013)

  31. Raynal, M., Stainer, J., Taubenfeld, G.: Distributed universality. In: Proceedings of 18th International Conference on Principles of Distributed Systems (OPODIS’14), Springer LNCS 8878, pp. 469–484 (2014)

  32. Saks, M., Zaharoglou, F.: Wait-free \(k\)-set agreement is impossible: the topology of public knowledge. SIAM J. Comput. 29(5), 1449–1483 (2000)

  33. Shavit, N., Touitou, D.: Software transactional memory. Distrib. Comput. 10(2), 99–116 (1997)

  34. Taubenfeld, G.: Contention-sensitive data structures and algorithms. In: Proceedings of 23rd International Symposium on Distributed Computing (DISC’09), Springer LNCS 5805, pp. 157–171 (2009)


Acknowledgments

This work has been partially supported by the French ANR project DISPLEXITY devoted to computability and complexity in distributed computing, and the Franco-German ANR project DISCMAT devoted to connections between mathematics and distributed computing. We want to thank Reviewer 1 and Reviewer 3 for their constructive comments, which helped us improve the content and the presentation of the paper.

Author information

Corresponding author

Correspondence to Michel Raynal.

Additional information

A preliminary version of some results presented in this paper appeared in the Proceedings of the 18th International Conference on Principles of Distributed Systems (OPODIS 2014) [31].

Appendices

Appendix 1: Gafni and Guerraoui’s Lock-Free k-Universal Construction

Gafni and Guerraoui’s Construction

This section presents Gafni and Guerraoui’s generalized non-blocking k-universal construction introduced in [15], and denoted GG in the following. To make reading easier, we use the same variable names as in the construction presented in Fig. 1 for local and shared objects that have the same meaning in both constructions. The objects considered in GG are deterministic state machines, and “operations” are accordingly called “commands”.

Principle The algorithm GG is based on local replication, namely, the only shared objects are the control objects \(KSC[1\ldots ]\) and \(AC[1\ldots ][1\ldots k]\). Each process \(p_i\) manages a copy of every state machine m, denoted \(machine_i[m]\), which contains the last state of machine m as known by \(p_i\). The invocation by \(p_i\) of \(machine_i[m].\mathsf{execute}(c)\) applies the command c to its local copy of machine m.
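The local-replication principle can be illustrated by the following minimal Python sketch. It is not taken from [15]: the class `LocalMachine`, the helper `make_replicas`, and the toy counter machine are hypothetical names introduced here only to show what "each process applies commands to its own copy" means for deterministic state machines.

```python
from typing import Callable, Dict, List

# Hypothetical model: a deterministic state machine is a pure transition
# function (state, command) -> new state. None of these names come from [15].
Transition = Callable[[int, str], int]


def counter_transition(state: int, command: str) -> int:
    """Toy deterministic machine: a counter driven by 'inc' and 'double'."""
    if command == "inc":
        return state + 1
    if command == "double":
        return state * 2
    raise ValueError(f"unknown command: {command}")


class LocalMachine:
    """Stands in for machine_i[m], the local copy held by one process."""

    def __init__(self, transition: Transition, initial_state: int = 0) -> None:
        self.transition = transition
        self.state = initial_state
        self.applied: List[str] = []  # commands applied so far, in order

    def execute(self, command: str) -> None:
        """Apply `command` to this local copy only; no shared memory is touched."""
        self.state = self.transition(self.state, command)
        self.applied.append(command)


def make_replicas(k: int) -> Dict[int, LocalMachine]:
    """One local copy per machine m in 1..k, as kept by a single process."""
    return {m: LocalMachine(counter_transition) for m in range(1, k + 1)}


# Two processes that apply the same commands in the same order end in the
# same state, which is what the shared control objects must enforce.
machine_1 = make_replicas(k=2)
machine_2 = make_replicas(k=2)
for replicas in (machine_1, machine_2):
    replicas[1].execute("inc")
    replicas[1].execute("double")
```

Because the machines are deterministic, agreement on the sequence of commands is enough to keep the copies consistent; the sketch says nothing about how that agreement is reached.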

As explained in [15], a naive strategy for updating the local copies of the state machines makes the following bad scenario possible. During a round r, a process \(p_1\) executes a command c1 on its copy of machine m1, while a process \(p_2\) executes a command c2 on a different machine m2. Then, during round \(r+1\), \(p_1\) executes a command \(c2'\) on the machine m2 without having first executed c2 on its copy of m2. This bad behavior is prevented in [15] by the combined use of adopt-commit objects and an appropriate marking mechanism. When a process \(p_i\) applies a command c to its local copy of a machine m, it has necessarily received the pair \((commit, c)\) from the adopt-commit object associated with the current round, and consequently the other processes have received \((commit, c)\) or \((adopt, c)\). The process \(p_i\) then attaches to its next command for machine m, namely \(oper_i[m]\), the indication that \(oper_i[m]\) has to be applied to m after c, so that no process executes \(oper_i[m]\) without having previously executed c.

Algorithm As before, \(my\_list_i[m]\) defines the list of commands that \(p_i\) wants to apply to the machine m. Moreover, \(my\_list_i[m].\mathsf{first}()\) sets the read head to point to the first element of this list and returns its value; \(my\_list_i[m].\mathsf{current}()\) returns the command under the read head; finally, \(my\_list_i[m].\mathsf{next}()\) advances the read head before returning the command pointed to by the read head.
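The three list operations can be sketched as follows in Python. The class name `CommandList` and the choice of returning `None` when the read head moves past the last command are assumptions made for this illustration; the text does not specify what happens at the end of the list.

```python
from typing import List, Optional


class CommandList:
    """Sketch of my_list_i[m]: a list of commands with a read head."""

    def __init__(self, commands: List[str]) -> None:
        self._commands = list(commands)
        self._head = 0  # index of the command under the read head

    def first(self) -> Optional[str]:
        """Set the read head to the first element and return its value."""
        self._head = 0
        return self.current()

    def current(self) -> Optional[str]:
        """Return the command under the read head."""
        if self._head < len(self._commands):
            return self._commands[self._head]
        return None  # assumption: reading past the end yields None

    def next(self) -> Optional[str]:
        """Advance the read head, then return the command it points to."""
        self._head += 1
        return self.current()


my_list = CommandList(["c1", "c2"])
```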

The algorithm is described in Fig. 5. Like the algorithm of Fig. 1, it is round-based and has the same first four lines. When a process \(p_i\) enters a new asynchronous round (line 1), it first executes lines 2–4, which are the lines involving the k-simultaneous consensus object and the adopt-commit object associated with the current round r.

Fig. 5 Gafni–Guerraoui’s generalized universality lock-free algorithm (code of \(p_i\)) [15]

After the execution of these lines, for \(1\le m\le k\), the pair \((tag_i[m],ac\_op_i[m])\) contains the command that \(p_i\) has to consider for the machine m. For each machine m, \(p_i\) does the following. First, if \(ac\_op_i[m]\) is marked “to be executed after” \(oper_i[m]\), \(p_i\) applies \(oper_i[m]\) to \(machine_i[m]\) (lines 6–8). Then, if \(tag_i[m]=adopt\), \(p_i\) adopts \(ac\_op_i[m]\) as its next proposal for machine m (lines 9–10). Otherwise, \(tag_i[m]=commit\). In this case \(p_i\) first applies \(ac\_op_i[m]\) to its local copy of the machine m (line 11). Then, if \(ac\_op_i[m]\) was a command it has issued, \(p_i\) computes its next proposal \(oper_i[m]\) for the machine m (lines 12–15). Finally, to prevent the bad behavior previously described, it attaches to \(oper_i[m]\) the fact that this command cannot be applied to any copy of the machine m before the command \(ac\_op_i[m]\) (line 16).
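The per-machine processing described above can be paraphrased by the following Python sketch. It is only a loose reading of the prose, not the code of Fig. 5: the types `Command` and `MachineView`, the field names, and the representation of the mark as an `after` field are all assumptions introduced for illustration, and the sketch assumes a process always has a further pending command when one of its own commands is committed.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Command:
    name: str                    # command identifier
    owner: int                   # index of the issuing process
    after: Optional[str] = None  # mark: "to be executed after <name>"


@dataclass
class MachineView:
    """What p_i keeps for one machine m (hypothetical layout)."""
    pid: int
    pending: List[Command]  # stands in for the rest of my_list_i[m]
    oper: Command           # stands in for oper_i[m]
    executed: List[str]     # commands applied to machine_i[m], in order


def process_machine(view: MachineView, tag: str, ac_op: Command) -> None:
    """One round of processing for machine m, given (tag_i[m], ac_op_i[m])."""
    # Lines 6-8: ac_op must follow oper_i[m], so apply oper_i[m] first.
    if ac_op.after is not None and ac_op.after == view.oper.name:
        view.executed.append(view.oper.name)
    if tag == "adopt":
        # Lines 9-10: adopt ac_op as the next proposal for m.
        view.oper = ac_op
        return
    # tag == "commit", line 11: apply the committed command locally.
    view.executed.append(ac_op.name)
    if ac_op.owner == view.pid and view.pending:
        # Lines 12-15: our own command was committed, move to the next one.
        view.oper = view.pending.pop(0)
    # Line 16: the next proposal must not be applied before ac_op.
    view.oper = replace(view.oper, after=ac_op.name)


# p1 commits its command c1; its next command c1b is marked "after c1".
p1 = MachineView(pid=1, pending=[Command("c1b", 1)],
                 oper=Command("c1", 1), executed=[])
process_machine(p1, "commit", Command("c1", 1))

# p2 had only adopted c1; seeing c1b marked "after c1" makes it apply c1 first.
p2 = MachineView(pid=2, pending=[], oper=Command("c1", 1), executed=[])
process_machine(p2, "adopt", p1.oper)
```

The usage at the end shows the purpose of the mark: a process that never obtained \((commit, c1)\) still applies c1 before it can consider c1b.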

Discussion: Gafni–Guerraoui’s Construction Revisited

The GG algorithm has two flaws. The first is that it does not prevent a process from executing the same command twice on a given machine. The second is that, when a state machine stops progressing, it may stop in different states at different processes. While the first can be easily fixed (see below), the second seems more difficult to fix.

Let us consider the following execution of the GG algorithm (Fig. 5). During some round r, a process \(p_i\) applies a command c to its local copy of the machine m (hence, \(p_i\) obtained \((commit, c)\) from \( AC [r][m]\), and each other process has obtained either \((commit, c)\) or \((adopt, c)\)). It follows from line 16 that \(p_i\) marks its next command on m (\(c'=oper_i[m]\)) “to be executed after c”. Let us consider now two distinct scenarios for the round \(r+1\).

Scenario 1 It is possible that all the processes, except \(p_i\), have received \((adopt, c)\) during the round r and propose c to \( AC [r+1][m]\). Moreover, according to the specification of an adopt-commit object, nothing prevents \( AC [r+1][m]\) from outputting \((commit, c)\) at all the processes. In this case \(p_i\) will execute the command c twice on \(machine_i[m]\). This erroneous behavior can be easily fixed by adding the following filtering after line 8:

[Figure: filtering test inserted after line 8 (not reproduced here)]

This filtering amounts to checking whether the command \(ac\_op_i[m]\) has already been locally executed. The fact that \(ac\_op_i[m]\) has been previously committed is encoded in \(oper_i[m]\) by the marking mechanism.
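Scenario 1 and the effect of such a filter can be replayed with the following Python sketch. It is an illustration under assumptions, not the authors’ filtering code: the functions `apply_committed` and `run_scenario_1` are hypothetical, and the filter is modeled as "skip \(ac\_op_i[m]\) when \(oper_i[m]\) is marked to be executed after it", which is one plausible reading of the explanation above.

```python
from typing import List, Optional


def apply_committed(executed: List[str], marked_after: Optional[str],
                    ac_op: str, use_filter: bool) -> None:
    """Apply a committed command ac_op to the local command log.

    `marked_after` is the command that oper_i[m] is marked "to be executed
    after" (None if unmarked). With `use_filter`, ac_op is skipped when that
    mark shows it has already been committed and applied locally.
    """
    if use_filter and marked_after == ac_op:
        return
    executed.append(ac_op)


def run_scenario_1(use_filter: bool) -> List[str]:
    """p_i obtains (commit, c) in round r and again in round r+1."""
    executed: List[str] = []
    marked_after: Optional[str] = None
    for _round in ("r", "r+1"):
        apply_committed(executed, marked_after, "c", use_filter)
        marked_after = "c"  # line 16: next command is marked "after c"
    return executed


without_filter = run_scenario_1(use_filter=False)
with_filter = run_scenario_1(use_filter=True)
```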

Scenario 2 Let us again consider the round \(r+1\), and consider the possible case where the pair \((m,-)\) is not output by \( KSC [r+1]\) (let us remember that \( KSC [r+1]\) outputs one pair per process and globally at least one and at most k pairs). According to the specification of \( AC [r+1][m]\), it is possible to have \((tag_j[m],ac\_op_j[m])=(adopt,c)\) at any process \(p_j\ne p_i\), and \((tag_i[m],ac\_op_i[m])=(adopt,c')\) where \(c'\) is the new command that \(p_i\) wants to apply to the machine m. Hence, as far as m is concerned, all the processes execute lines 9–10, and we are in the same configuration as at the end of round r. It follows that this can repeat forever. If this is the case, \(p_i\) has executed one more command on its local copy of machine m than the other processes. This means that state machine m stops progressing in different states at distinct processes.

Appendix 2: Proofs of the Lemmas of Sect. 3.3

To make this appendix self-contained, some definitions and explanations of Sect. 3.3 are repeated here.

Lemma 1 \(\forall ~i,m\): \((op\in GSTATE [i][m])\,\Rightarrow \,(\exists ~j{:}\; op\in my\_list_j[m])\) (i.e., if an operation op is applied to an object m, then op has been proposed by a process).

Proof

Before being written into \( GSTATE [i][m]\) (line 31), an operation op is appended to m’s local history for the first time at line 18. It follows from lines 2–4 that this operation was proposed to an adopt-commit object by some process \(p_j\) in \(oper_j[m]\). If \(oper_j[m]\) was updated in the initialization phase, at line 24 or line 27, it is an operation of \(my\_list_j[m]\). If \(oper_j[m]\) was updated at line 25, it was proposed to an adopt-commit object by another process \(p_x\), and (by a simple induction) the previous reasoning shows that this operation then belongs to some \(my\_list_z[m]\). \(\square \)

Lemma 2 \(\forall ~i,j,m\): \((op\in my\_list_j[m])\,\Rightarrow \,(op \text{ appears at most once in } GSTATE [i][m])\) (i.e., an operation is executed at most once).

Proof

Suppose by contradiction that, at a given time and for an object m, a history \( GSTATE [-][m]\) contains the same operation op twice. Let \(p_i\) be the first process that wrote such a history with op appearing twice in \( GSTATE [i][m]\), and let \(\tau \) be the time instant at which \(p_i\) does it. Since \( GSTATE [i][m]\) is written only at line 31 with the content of \(\ell \_hist_i[m]\), \(p_i\) necessarily stored, before \(\tau \), a history containing op twice in \(\ell \_hist_i[m]\). As \(\ell \_hist_i[m]\) is initially empty, it does not contain op twice in the initial state of \(p_i\). Since \(\ell \_hist_i[m]\) is updated only at line 7 or line 18, \(p_i\) sets it to a history containing op twice at one of these lines. According to the predicate of line 16, \(p_i\) cannot append op to \(\ell \_hist_i[m]\) at line 18 if op already appears in that sequence. It follows that \(p_i\) updates \(\ell \_hist_i[m]\) before \(\tau \) at line 7 with one of the longest local histories of m which contains op twice. Consequently, when \(p_i\) read (non-atomically) \( GSTATE \) at line 5, it retrieved that history from one of the \( GSTATE [j][m]\), also before \(\tau \). But this contradicts the fact that no process writes a history containing op twice before \(\tau \). It follows that no history containing the same operation several times can ever be written into one of the registers \( GSTATE [-][-]\). \(\square \)
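The two ways in which a local history changes in the proof above (adoption of a longest history read from \( GSTATE \), and a guarded append) can be sketched in Python as follows. This is a minimal illustration assuming histories are plain lists of operation names; `adopt_longest` and `append_if_new` are hypothetical helpers, not lines of Fig. 1.

```python
from typing import Dict, List

History = List[str]


def adopt_longest(local: History, snapshot: Dict[int, History]) -> History:
    """Return a longest history among the local one and those read from GSTATE[.][m].

    `snapshot` maps a process index j to the history read from GSTATE[j][m].
    The local history is kept unless a strictly longer one was read.
    """
    best = local
    for hist in snapshot.values():
        if len(hist) > len(best):
            best = hist
    return list(best)


def append_if_new(local: History, op: str) -> History:
    """Guarded append: op is added only if it does not already appear."""
    if op in local:
        return local
    return local + [op]


h = adopt_longest(["a"], {1: ["a", "b"], 2: []})
h_same = append_if_new(h, "b")   # already present: unchanged
h_more = append_if_new(h, "c")   # new operation: appended
```

With these two update rules, a duplicate can enter a local history only by being copied from another process, which is the observation the contradiction argument relies on.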

The Sequence \((op_r^m)_{r\ge 1}\) of Committed Operations According to the specification of the adopt-commit object, for any round r and any object m there is at most one operation returned with the tag commit by the object \( AC [r][m]\) to some processes. Let \(op_r^m\) denote this unique operation if at least one process obtains a pair with the tag commit, and let \(op_r^m\) be \(\bot \) if all the pairs returned by \( AC [r][m]\) contain the tag adopt.
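As an illustration, the definition of \(op_r^m\) can be written as a small Python function over the pairs returned by \( AC [r][m]\). The function name `committed_operation` and the use of `None` to stand for \(\bot \) are assumptions of this sketch.

```python
from typing import List, Optional, Tuple

Output = Tuple[str, str]  # (tag, operation) with tag in {"commit", "adopt"}


def committed_operation(outputs: List[Output]) -> Optional[str]:
    """Return op_r^m given the pairs returned by AC[r][m].

    The result is the unique operation returned with tag "commit", or None
    (standing for bottom) if every returned pair carries the tag "adopt".
    """
    committed = {op for tag, op in outputs if tag == "commit"}
    if len(committed) > 1:
        # Cannot happen for a correct adopt-commit object.
        raise ValueError("adopt-commit specification violated")
    return committed.pop() if committed else None


op_some = committed_operation([("commit", "a"), ("adopt", "a")])
op_none = committed_operation([("adopt", "a"), ("adopt", "b")])
```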

From the Sequence \((op_r^m)_{r\ge 1}\) to the Notion of Valid Histories Considering an execution of the algorithm of Fig. 1, the following lemmas show that, for any process \(p_i\) and any object m, all the sequences of operations appearing in \(\ell \_hist_i[m]\) are finite prefixes of a unique valid sequence depending only on the sequence \((op_r^m)_{r\ge 1}\) of committed operations.

More precisely, given a sequence \((op_r^m)_{r\ge 1}\), a history \((vh_x^m)_{1\le x\le xmax}\) is valid if it is equal to a sequence \((op_r^m)_{1\le r\le R}\) from which the \(\bot \) values and the repetitions have been removed. More formally, \((vh_x^m)_{1\le x\le xmax}\) is valid if there is a round number R and a strictly increasing function \(\sigma {:}\;\{1,\ldots ,xmax\}\rightarrow \{1,\ldots ,R\}\) such that for all x in \(\{1,\ldots ,xmax\}\): (a) \(vh_x^m=op_{\sigma (x)}^m\), (b) \(vh_x^m\ne \bot \), (c) for all x in \(\{1,\ldots ,xmax-1\}\): \(vh_x^m\ne vh_{x+1}^m\), and (d) the sets \(\{vh_1^m,\ldots ,vh_{xmax}^m\}\) and \(\{op_1^m,\ldots ,op_R^m\}\setminus \{\bot \}\) are equal.

Let us remark that this definition has two consequences: (i) the value of R for which item (d) is verified defines unambiguously the sequence \((vh_x^m)_{1\le x\le xmax}\) (and accordingly this sequence is denoted \( VH^m(R) \) in the following), and (ii) for any two valid histories \((vh_x^m)_{1\le x\le xmax1}\) and \((vh_x^m)_{1\le x\le xmax2}\), one is a prefix of the other.
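The construction of \( VH^m(R) \) and the prefix property can be made concrete with the following Python sketch. It assumes that "removing repetitions" means keeping the first occurrence of each operation, and it uses `None` for \(\bot \); the names `valid_history` and `is_prefix` are introduced here for illustration only.

```python
from typing import List, Optional, Sequence


def valid_history(committed: Sequence[Optional[str]], R: int) -> List[str]:
    """Compute VH^m(R) from op_1^m, ..., op_R^m (None stands for bottom).

    Bottom values are dropped and, as an assumption of this sketch, only the
    first occurrence of a repeated operation is kept.
    """
    history: List[str] = []
    for op in committed[:R]:
        if op is not None and op not in history:
            history.append(op)
    return history


def is_prefix(a: List[str], b: List[str]) -> bool:
    """True if history a is a prefix of history b."""
    return a == b[:len(a)]


# Committed operations of rounds 1..6 for some object m.
ops = ["a", None, "a", "b", None, "c"]
histories = [valid_history(ops, R) for R in range(len(ops) + 1)]
```

Since \( VH^m(R) \) is computed by scanning rounds in increasing order, \( VH^m(R_1) \) is a prefix of \( VH^m(R_2) \) whenever \(R_1\le R_2\), which matches consequence (ii).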

Lemma 3 For any process \(p_i\) and any object m, at any time the local history \(\ell \_hist_i[m]\) is valid.

Proof

Let us suppose by contradiction that a process \(p_j\) updates \(\ell \_hist_j[m]\) with a sequence that is not valid. Let \(p_i\) be the first process that writes an invalid sequence (denoted s) into its variable \(\ell \_hist_i[m]\). Let \(\rho \) be the round and \(\tau \) the time at which it does it.

Since \(p_i\) is the first process that writes s into its local history \(\ell \_hist_i[m]\), it cannot do it at line 7 (this would imply that \(p_i\) retrieved s in some \(g\_state_i[j][m]\) obtained from its previous non-atomic read of \( GSTATE \) at line 5, which in turn implies that a process \(p_j\) would have written s into its local history \(\ell \_hist_j[m]\) before \(\tau \)). Consequently \(p_i\) writes s into \(\ell \_hist_i[m]\) at line 18. It follows that the adopt-commit object \( AC [\rho ][m]\) returned to \(p_i\) the pair \((commit, op)\) (where op is the last operation in s) at line 3 or 4 during round \(\rho \), hence \(op_\rho ^m=op\).

Let us recall that, by assumption, before \(p_i\) appended op to \(\ell \_hist_i[m]\) at line 18 of round \(\rho \), \(\ell \_hist_i[m]\) was valid; let \(s'\) denote that history. Moreover, as \(p_i\) executes line 18 of round \(\rho \), it fulfilled the condition of line 16, hence we have \(op\notin s'\). Let \(R_1\) be the smallest (resp. \(R_2\) the largest) round number R such that \(s'= VH ^m(R)\). It follows from the previous observation that \(R_2<\rho \), and from the definition of \(R_1\), that \(op_{R_1}^m\ne \bot \) (\(op_{R_1}^m\) is the last operation appearing in \( VH ^m(R_1)= VH ^m(R_2)\)). Let us remark that, since \(s'\) is valid while s is not, there is necessarily a round number r such that \(R_2<r<\rho \), \(op_r^m\ne \bot \), and \(s'= VH ^m(R_2)\ne VH ^m(r)\) (intuitively, \(p_i\) “missed” a committed operation). Let \(r_0\) be the smallest round number verifying these conditions. According to this definition, \(op_{r_0}^m\ne op_{R_1}^m\).

Let us first show that \(op_{r_0}^m\notin VH ^m(R_1)= VH ^m(R_2)\). Suppose by contradiction that there exists a round \(r_1<R_2\) such that \(op_{r_1}^m=op_{r_0}^m\) and consider a process \(p_j\) executing round \(r_1\). The proof boils down to showing that such a process \(p_j\) cannot propose \(op_{r_1}^m=op_{r_0}^m\) to a \( KSC [r]\) object with \(r>r_1+1\) before \(\tau \), which entails that this operation cannot be committed during round \(r_0\) and leads to a contradiction. If \(p_j\) commits \(op_{r_1}^m=op_{r_0}^m\) during that round, then, after the execution of lines 16–28, it has \(op_{r_1}\) in its variable \(\ell \_hist_j[m]\), has set its variable \(oper_j[m]\) to a different operation and will never propose \(op_{r_1}\) further in the execution. If \(p_j\) adopts \(op_{r_1}\) during round \(r_1\), then two cases are possible: (a) \(p_j\) returns from its invocation of \( AC [r_1+1][m].\mathsf {propose}(-)\) before any process that has committed \(op_{r_1}\) during round \(r_1\) invokes \( KSC [r_1+1][m].\mathsf {propose}(-)\), or (b) one of the processes that committed \(op_{r_1}\) during round \(r_1\) invokes \( KSC [r_1+1][m].\mathsf {propose}(-)\) before \(p_j\) returns from its invocation of \( AC [r_1+1][m].\mathsf {propose}(-)\). In case (a), according to the validity properties of the k-simultaneous consensus and adopt-commit objects, \(p_j\) commits \(op_{r_1}\) during round \(r_1+1\) and, as before, will not propose this operation further in the execution since it appears in its local history. In case (b), one of the processes that committed \(op_{r_1}\) during round \(r_1\) wrote a history containing it before \(p_j\) executes line 5 of round \(r_1+1\). If this happens before \(\tau \), then both this history and the history of \(p_j\) are valid, thus \(p_j\) adopts that history, which strictly contains its own local history. It follows that \(p_j\) executes lines 16–28 of round \(r_1+1\) with a history containing \(op_{r_1}\) and consequently never proposes this operation further in the execution. This ends the proof of the fact that \(op_{r_0}^m\notin VH ^m(R_1)= VH ^m(R_2)\).

From the previous remark, it follows that, before \(\tau ,\,p_i\) never retrieves any history \( VH ^m(r)\) with \(r\ge r_0\) during its non-atomic read of \( GSTATE \) (or it would have set its variable \(\ell \_hist_i[m]\) to one of these histories at line 7 and never reset it to \(s'\), since these histories contain \( VH ^m(r_0)\), and are consequently strictly longer than \(s'\)).

Let us consider the execution of round \(r_0\) by \(p_i\) (since \(p_i\) reaches line 18 of round \(\rho >r_0\), this occurs). Let us suppose that \(p_i\) obtains the pair \((commit, op_{r_0}^m)\) from \( AC [r_0][m]\). As, (a) before \(\tau \), the values of \(\ell \_hist_i[m]\) are valid (hence they can only increase), and (b) \(op_{r_0}^m\notin VH ^m(R_2)\), it follows that \(p_i\) appends \(op_{r_0}^m\) to \(\ell \_hist_i[m]\) at line 18 of round \(r_0\), contradicting the fact that, just before \(\tau ,\,\ell \_hist_i[m]=s'= VH ^m(R_2)\). Consequently, according to the definition of \(r_0\) and the specification of the adopt-commit object, \( AC [r_0][m]\) returns \((adopt, op_{r_0}^m)\) to \(p_i\).

During round \(r_0\), since \(op_{r_0}^m\ne \bot \), all the processes that do not crash before obtain one of the two pairs \((adopt, op_{r_0}^m)\) or \((commit, op_{r_0}^m)\) from \( AC [r_0][m]\). Let \({\mathcal {C}}\) denote the ones that obtain \((commit, op_{r_0}^m)\), and \({\mathcal {A}}\) the ones that obtain \((adopt, op_{r_0}^m)\). Among the processes of \({\mathcal {A}}\), some fulfill the condition of line 16 during round \(r_0\), namely those which do not have \(op_{r_0}^m\) in their local history. Let \({\mathcal {A}}_-\) denote this set of processes and let \({\mathcal {A}}_+\) be \({\mathcal {A}}\setminus {\mathcal {A}}_-\). As previously shown, \(p_i\) cannot have \(op_{r_0}^m\) in \(\ell \_hist_i[m]\) before \(\tau \); consequently \(p_i\in {\mathcal {A}}_-\). Let \(\mu \) be the first time at which a process of \({\mathcal {C}}\cup {\mathcal {A}}_+\) (the set of processes that have \(op_{r_0}^m\) in their local histories at the end of round \(r_0\)) executes line 31 of round \(r_0\). Let \(\mu '\) be the first time at which one of these processes invokes \( KSC [r_0+1][m].\mathsf {propose}(-)\) at round \(r_0+1\). Let \(\tau _i\) be the time at which \(p_i\) terminates its invocation of \( AC [r_0+1][m].\mathsf {propose}(-)\), and \(\tau '_i\) the time at which it terminates its read of line 5 during round \(r_0+1\).

Let us remark that any process \(p_j\) of \({\mathcal {A}}_-\) (including \(p_i\)) starts round \(r_0+1\) with \(oper_j[m]=op_{r_0}^m\). It follows from the k-simultaneous consensus and adopt-commit specifications and the structure of lines 2–4 that, if \(\tau _i<\mu '\), then \(p_i\) necessarily obtains the pair \((commit, op_{r_0}^m)\) from \( AC [r_0+1][m]\). As this happens before \(\tau \), \(op_{r_0}^m\notin \ell \_hist_i[m]\) when \(p_i\) checks the condition of line 16, and it consequently appends \(op_{r_0}\) to \(\ell \_hist_i[m]\) at line 18 of round \(r_0+1\). This contradicts the fact that \(s'= VH ^m(R_2)\), except for the case \(r_0+1=\rho \). But, for \(r_0+1=\rho \), we should have \(op_{r_0}^m=op_{\rho }^m=op\), and, by definition of \(r_0\), s would be valid, which contradicts the fact that (due to the definition of s) it is not.

The only remaining case is thus \(\mu '<\tau _i\), but since \(\mu <\mu '\) and \(\tau _i<\tau '_i\), it follows that \(\mu <\tau '_i\) which implies that \(p_i\) obtains a valid history containing \(op_{r_0}\) during its read of \( GSTATE \) at round \(r_0+1\) and consequently updates \(\ell \_hist_i[m]\) to one of these histories at line 7, thus before \(\tau \). This leads to a contradiction which concludes the proof of the lemma. \(\square \)

The execution on an object m of an operation op, issued by a process \(p_i\), starts when the process \(p_i\) proposes op to a k-simultaneous consensus object \( KSC [-][m]\) for the first time (i.e., when \(p_i\) makes op public), and terminates when a set res including \((m, op, output[m])\) is returned by \(p_i\) at line 10 or line 31. The next lemma shows that any execution is linearizable.

Lemma 4 The execution of an operation op issued by a process \(p_i\) on an object m can be linearized at the first time at which a process \(p_j\) writes into \( GSTATE [j][m]\) a local history \(\ell \_hist_j[m]\) such that \(op\in \ell \_ hist_j[m]\).

Proof

Let op be an operation applied on an object m and \(p_i\) be the process such that \(op\in my\_list_i[m]\). Let us first show that op cannot appear in the local history \(\ell \_hist_j[m]\) before being proposed by \(p_i\) to one of the k-simultaneous consensus objects \( KSC [-][m]\). Let \(p_j\) be the first process that adds op to its local history \(\ell \_hist_j[m]\) and \(\tau \) the time at which this occurs. It follows that time \(\tau \) cannot occur at line 7, but occurs when \(p_j\) executes line 18 and appends op to \(\ell \_hist_j[m]\) during some round r. Process \(p_j\) consequently obtained the pair \((commit, op)\) from the adopt-commit object \( AC [r][m]\) at line 3 or line 4 of round r. According to the validity properties of k-simultaneous consensus and adopt-commit objects and to the structure of lines 2–4, it follows that a process proposed op to \( KSC [r][m]\) before \(\tau \).

There are two ways for a process to propose op to \( KSC [r][m]\): either (a) it adopted it at line 25 of round \(r-1\) (if \(r>1\)) or (b) the process is \(p_i\), \(op\in my\_list_i[m]\), and \(p_i\) wrote op into \(oper_i[m]\) at line 24 or line 27 of round \(r-1\) (if \(r>1\)), or during initialization (if \(r=1\)). With the same reasoning as in the previous paragraph, case (a) implies that a process proposed op to \( KSC [r-1][m]\) before \(\tau \). This can be explained by case (a) at round \(r-2\) only if \(r>2\), or by case (b) at round \(r-2\). By iterating this reasoning, in the worst case until reaching round 1, it follows that case (b) happened at some round, and that \(p_i\) necessarily proposed op to one of the \( KSC [-][m]\) objects before \(\tau \). Consequently, no process \(p_j\) has op in \(\ell \_hist_j[m]\) before \(p_i\) proposed it to one of the \( KSC [-][m]\) objects, thus the linearization point of op is after \(p_i\) has made public the operation op.

On the other hand, if it terminates, the operation op issued by \(p_i\) ends at line 10 or 31, after \(p_i\) has computed an output for op. It can do so only at line 9 or 20, and, in both cases, thanks to line 8 or lines 18–19, this happens only when op appears in \(\ell \_hist_i[m]\). This implies that \(p_i\) either obtained a history containing op at line 5 of the same round, or writes a history containing op in \( GSTATE [i][m]\) at line 30 of the same round before executing line 31, which proves that the linearization point of op is before op terminates at \(p_i\) (if it ever terminates).

Finally, according to Lemma 3, all the processes construct the same history of operations on m. Since the results locally returned are appropriately computed with \(\mathsf {compute\_output}()\) on the right prefix of the local history of m, the sequential specification of the object m is satisfied. This establishes that there is a linearization of the sequence of operations applied on any object m. As any object m is linearizable, and as linearizability is a local property [23], it follows that the execution is linearizable, which ends the proof of the lemma. \(\square \)
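The last step of the proof can be illustrated with a minimal sketch, under stated assumptions: the counter object, the operation encoding (process id, sequence number, name), and the replay-based body of `compute_output` are hypothetical choices made for this illustration, not the paper's code. The sketch only shows that replaying the prefix of an agreed history up to op yields the same result at every process, whatever suffix a given process has already appended.

```python
from typing import List, Tuple

# Hypothetical operation encoding: (process id, sequence number, name).
Op = Tuple[int, int, str]


def apply(state: int, op: Op) -> Tuple[int, int]:
    """Sequential specification of a counter: return (new_state, result)."""
    _pid, _seq, name = op
    if name == "inc":
        return state + 1, state + 1
    if name == "read":
        return state, state
    raise ValueError(f"unknown operation {name!r}")


def compute_output(history: List[Op], op: Op) -> int:
    """Replay the prefix of `history` that ends with `op`; return op's result."""
    state = 0
    for h in history:
        state, result = apply(state, h)
        if h == op:
            return result
    raise KeyError("op does not appear in the history")


# Two local histories of the same object: one is a prefix of the other.
hist_i: List[Op] = [(1, 0, "inc"), (2, 0, "inc")]
hist_j: List[Op] = hist_i + [(1, 1, "read"), (2, 1, "inc")]

out_i = compute_output(hist_i, (2, 0, "inc"))
out_j = compute_output(hist_j, (2, 0, "inc"))
```

Because the output depends only on the prefix ending with op, a process whose local history is longer computes the same result as one that has just appended op.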

Lemma 5 \(\forall ~r\ge 1\), there is a process \(p_i\) such that at least one operation op output by \( KSC [r].\mathsf{propose}()\) at \(p_i\) (line 2) is such that the invocation of \( AC [r][-].\mathsf{propose}()\) by \(p_i\) returns \((commit,op)\) (line 3 or 4).

Proof

The proof is based on an observation presented in [15]. Let us first notice that, after it has received a pair \((ksc\_obj_1,ksc\_op_1)\) from \( KSC [r].\mathsf{propose}()\) at line 2, a process \(p_{i1}\) invokes first the operation \( AC [r][ksc\_obj_1].\mathsf{propose}(ksc\_op_1)\) at line 3 before invoking \( AC [r][ksc\_obj].\mathsf{propose}(-)\) at line 4 for any object \(ksc\_obj \ne ksc\_obj_1\). If the invocation \( AC [r][ksc\_obj_1].\mathsf{propose}(ksc\_op_1)\) issued by \(p_{i1}\) returns the pair \((commit,-)\), the lemma follows.

Hence, let us assume that the invocation by \(p_{i1}\) of \( AC [r][ksc\_obj_1].\mathsf{propose}(ksc\_op_1)\) at line 3 returns the pair \((adopt,-)\). It follows from the “non-conflicting values” property of the adopt-commit object \( AC [r][ksc\_obj_1]\), that a process \(p_{i2}\) has necessarily invoked \( AC [r][ksc\_obj_1].\mathsf{propose}(op')\), with \(op'\ne ksc\_op_1\), and this invocation was issued at line 4 (if both \(p_{i1}\) and \(p_{i2}\) had invoked the operation \( AC [r][ksc\_obj_1].\mathsf{propose}()\) at line 3, they would have obtained the same pair from the object \( KSC [r]\) at line 2, and consequently, \(p_{i2}\) could not prevent \(p_{i1}\) from obtaining \((commit,-)\) from the adopt-commit object \( AC [r][ksc\_obj_1]\)). It follows that \(p_{i2}\) starts line 4 before \(p_{i1}\) terminates line 3. The invocation by \(p_{i2}\) of \( AC [r][-]\) at line 3 involved some object \(ksc\_obj_2\) obtained by \(p_{i2}\) from its invocation of \( KSC [r].\mathsf{propose}()\) at line 2 (as seen previously, we necessarily have \(ksc\_obj_2\ne ksc\_obj_1\)).

If the invocation by \(p_{i2}\) of \( AC [r][ksc\_obj_2].\mathsf{propose}()\) returns \((commit,-)\), the lemma follows. Otherwise, due to the “non-conflicting values” property of adopt-commit, there is a process \(p_{i3}\) that prevented \(p_{i2}\) from obtaining \((commit,-)\) from its invocation of \( AC [r][ksc\_obj_2].\mathsf{propose}()\) at line 3. Let us notice that \(p_{i3}\ne p_{i1}\) (this follows from the observation that \(p_{i3}\) started line 4 before \(p_{i2}\) terminates line 3, which itself started line 4 before \(p_{i1}\) terminates line 3, hence \(p_{i3}\) started line 4 before \(p_{i1}\) terminates line 3). The execution pattern between \(p_{i2}\) and \(p_{i3}\) is then the same as the previous pattern between \(p_{i1}\) and \(p_{i2}\). While this pattern can be reproduced between \(p_{i3}\) and another process \(p_{i4}\), then between \(p_{i4}\) and \(p_{i5}\), etc., its number of occurrences is necessarily bounded because the number of processes is bounded. It then follows that there is a process \(p_{ix}\) that obtains the pair \((commit,-)\) when it invokes \( AC [r][ksc\_obj_{ix}].\mathsf{propose}()\) at line 3 (where \(ksc\_obj_{ix}\) is the object returned to \(p_{ix}\) by its invocation \( KSC [r].\mathsf{propose}()\) at line 2). \(\square \)
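The proof uses the adopt-commit object only through its properties: a process can be denied \((commit,-)\) only if a different value was proposed to the same object. The following sketch shows that behaviour on a classic two-phase read/write adopt-commit. It is an illustration under assumptions, not necessarily the implementation used in the paper: the class name `AdoptCommit`, the register names `a` and `b`, and the restriction to schedules in which each `propose()` runs to completion before the next one starts are all choices made for this example.

```python
from typing import Dict, Hashable, Tuple


class AdoptCommit:
    """Two-phase adopt-commit over simulated read/write registers.

    Assumption: each propose() call runs to completion before the next
    one starts, so only sequential schedules are modelled here.
    """

    def __init__(self) -> None:
        self.a: Dict[int, Hashable] = {}               # phase-1 registers
        self.b: Dict[int, Tuple[bool, Hashable]] = {}  # phase-2 registers

    def propose(self, pid: int, value: Hashable) -> Tuple[str, Hashable]:
        # Phase 1: announce the value, then check whether it is unopposed.
        self.a[pid] = value
        unopposed = all(v == value for v in self.a.values())

        # Phase 2: publish the flag, then read all published flags.
        self.b[pid] = (unopposed, value)
        flags = list(self.b.values())

        if all(flag for flag, _ in flags):
            return ("commit", value)
        for flag, v in flags:
            if flag:
                return ("adopt", v)  # someone may have committed v
        return ("adopt", value)


# Same value proposed by everyone: nobody can be denied a commit.
same = AdoptCommit()
same_results = [same.propose(pid, "op_a") for pid in (1, 2, 3)]

# A conflicting proposal arrives after a commit: it must adopt "op_a".
mixed = AdoptCommit()
first = mixed.propose(1, "op_a")
second = mixed.propose(2, "op_b")
```

In the second schedule the process proposing `op_b` receives `("adopt", "op_a")`, which mirrors the step of the proof where a tag adopt reveals that another process proposed a different operation on the same object.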

Lemma 6 There is at least one object on which an infinite number of operations are executed.

Proof

This lemma follows from (a) the fact that an operation committed during some round at some process is eventually made globally visible in \( GSTATE \) (lines 17, 18, and 30), (b) Lemma 5 (at every round an operation is committed at some process), and (c) the fact that the number of objects is bounded. \(\square \)

Appendix 3: Contention Awareness: Reducing the Number of Uses of k-SC Objects

As announced in Sect. 4.1, it is possible to reduce the number of uses of the underlying k-SC synchronization objects. This is obtained by replacing the lines N1–N3 in Fig. 2 by the lines as described in Fig. 6. There is one modified line (N2M) and three new lines (NN1, NN2, and NN3).

Fig. 6 Efficient contention-aware lock-free (k, 1)-universal construction (code for \(p_i\))

More precisely, if, after it has used the adopt-commit objects \( AC [2r_i-1][m]\) for each constructed object m, \(p_i\) has received only adopt tags (modified line N2M), it executes the lines 2M, 3, and 4M, as in the basic contention-aware construction of Fig. 2. Otherwise, if it has received the tag commit for at least one constructed object, it invokes \( AC [2r][m]\) for all the objects m for which it has received the tag adopt (new lines NN1–NN3).
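The branching described in this paragraph can be sketched as follows. This is a hedged illustration of the control flow only: the function name `second_step`, the string labels it returns, and the dictionary of tags are hypothetical, and the sketch does not reproduce the lines of Fig. 6.

```python
from typing import Dict, List, Tuple


def second_step(tags: Dict[int, str]) -> Tuple[str, List[int]]:
    """Decide what follows the first adopt-commit step of a round.

    `tags` maps each constructed object m to the tag ("adopt" or "commit")
    returned by the first adopt-commit object used for m in this round.
    """
    adopted = sorted(m for m, tag in tags.items() if tag == "adopt")
    if len(adopted) == len(tags):
        # Only adopt tags were received: fall back on the k-SC object.
        return ("use_ksc", [])
    # At least one commit: skip the k-SC object and use the second
    # adopt-commit object only for the objects that returned adopt.
    return ("second_ac", adopted)


only_adopt = second_step({1: "adopt", 2: "adopt"})
one_commit = second_step({1: "commit", 2: "adopt", 3: "adopt"})
```

The point of the design is that the k-SC object is invoked only on the first branch, that is, only when no object committed during the first adopt-commit step of the round.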

Cite this article

Raynal, M., Stainer, J. & Taubenfeld, G. Distributed Universality. Algorithmica 76, 502–535 (2016). https://doi.org/10.1007/s00453-015-0053-3

Keywords

  • Asynchronous read/write system
  • Universal construction
  • Consensus
  • Distributed computability
  • k-Set agreement
  • k-Simultaneous consensus
  • Wait-freedom
  • Obstruction-freedom
  • Contention-awareness
  • Crash failures
  • State machine replication