Advertisement

Eventual Consistency for CRDTs

  • Radha Jagadeesan
  • James Riely
Open Access
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10801)

Abstract

We address the problem of validity in eventually consistent (EC) systems: In what sense does an EC data structure satisfy the sequential specification of that data structure? Because EC is a very weak criterion, our definition does not describe every EC system; however it is expressive enough to describe any Convergent or Commutative Replicated Data Type (CRDT).

1 Introduction

In a replicated implementation of a data structure, there are two impediments to requiring that all replicas achieve consensus on a global total order of the operations performed on the data structure (Lamport 1978): (a) the associated serialization bottleneck negatively affects performance and scalability (e.g. see (Ellis and Gibbs 1989)), and (b) the cap theorem imposes a tradeoff between consistency and partition-tolerance (Gilbert and Lynch 2002).

In systems based on optimistic replication (Vogels 2009; Saito and Shapiro 2005), a replica may execute an operation without synchronizing with other replicas. If the operation is a mutator, the other replicas are updated asynchronously. Due to the vagaries of the network, the replicas could receive and apply the updates in possibly different orders.

For sequential systems, the correctness problem is typically divided into two tasks: proving termination and proving partial correctness. Termination requires that the program eventually halt on all inputs, whereas partial correctness requires that the program only returns results that are allowed by the specification.

For replicated systems, the analogous goals are convergence and validity. Convergence requires that all replicas eventually agree. Validity requires that they agree on something sensible. In a replicated list, for example, if the only value put into the list is 1, then convergence ensures that all replicas eventually see the same value for the head of the list; validity requires that the value be 1.

Convergence has been well-understood since the earliest work on replicated systems. Convergence is typically defined as eventual consistency, which requires that once all messages are delivered, all replicas have the same state. Strong eventual consistency (sec) additionally requires convergence for all subsets of messages: replicas that have seen the same messages must have the same state.

Perhaps surprisingly, finding an appropriate definition of validity for replicated systems remains an open problem. There are solutions which use concurrent specifications, discussed below. But, as Shavit (2011) noted:

“It is infinitely easier and more intuitive for us humans to specify how abstract data structures behave in a sequential setting, where there are no interleavings. Thus, the standard approach to arguing the safety properties of a concurrent data structure is to specify the structure’s properties sequentially, and find a way to map its concurrent executions to these ‘correct’ sequential ones.”

In this paper we give the first definition of validity that is both (1) derived from standard sequential specifications and (2) validates the examples of interest.

We take the “examples of interest” to be Convergent/Commutative Replicated Data Types (crdts). These are replicated structures that obey certain monotonicity or commutativity properties. As an example of a crdt, consider the add-wins set, also called an “observed remove” set in Shapiro et al. (2011a). The add-wins set behaves like a sequential set if add and remove operations on the same element are ordered. The concurrent execution of an add and remove result in the element being added to the set; thus the remove is ignored and the “add wins.” This concurrent specification is very simple, but as we will see in the next section, it is quite difficult to pin down the relationship between the crdt and the sequential specification used in the crdt’s definition. This paper is the first to successfully capture this relationship.

Many replicated data types are crdts, but not all (Shapiro et al. 2011a). Notably, Amazon’s Dynamo (DeCandia et al. 2007) is not a crdt. Indeed, interest in crdts is motivated by a desire to avoid the well-know concurrency anomalies suffered by Dynamo and other ad hoc systems (Bieniusa et al. 2012).

Shapiro et al. (2011b) introduced the notion of crdt and proved that every crdt has an sec implementation. Their definition of sec includes convergence, but not validity.

The validity requirement can be broken into two components. We describe these below using the example of a list data type that supports only two operations: the mutator put, which adds an element to the end of the list, and the query q, which returns the state of the list. This structure can be specified as a set of strings such as Open image in new window and Open image in new window .

  • Linearization requires that a response be consistent with some specification string. A state that received put(1) and put(3), may report q=[1,3] or q=[3,1], but not q=[2,1,3], since 2 has not been put into the list.

  • Monotonicity requires that states evolve in a sensible way. We might permit the state q=[1,3] to evolve into q=[1,2,3], due to the arrival of action put(2). But we would not expect that q=[1,3] could evolve into q=[3,1], since the data type does not support deletion or reordering.

Burckhardt et al. (2012) provide a formal definition of validity using partial orders over events: linearizations respect the partial order on events; monotonicity is ensured by requiring that evolution extends the partial order. Similar definitions can be found in Jagadeesan and Riely (2015) and Perrin et al. (2015). Replicated data structures that are sound with respect to this definition enjoy many good properties, which we discuss throughout this paper. However, this notion of correctness is not general enough to capture common crdts, such as the add-wins set.

This lack of expressivity lead Burckhardt et al. (2014) to abandon notions of validity that appeal directly to a sequential specification. Instead they work directly with concurrent specifications, formalizing the style of specification found informally in Shapiro et al. (2011b). This has been a fruitful line of work, leading to proof rules (Gotsman et al. 2016) and extensions (Bouajjani et al. 2014). See (Burckhardt 2014; Viotti and Vukolic 2016) for a detailed treatment.

Positively, concurrent specifications can be used to validate any replicated structure, including crdts as well as anomalous structures such as Dynamo. Negatively, concurrent specifications have no the clear connection to their sequential counterparts. In this paper, we restore this connection. We arrive at a definition of sec that admits crdts, but rejects Dynamo.

The following “corner cases” are a useful sanity-check for any proposed notion of validity.

  • The principle of single threaded semantics (psts) (Haas et al. 2015) states that if an execution uses only a single replica, it should behave according to the sequential semantics.

  • The principle of single master (psm) (Budhiraja et al. 1993) states that if all mutators in an execution are initiated at a single replica, then the execution should be linearizable (Herlihy and Wing 1990).

  • The principle of permutation equivalence (ppe) (Bieniusa et al. 2012) states that “if all sequential permutations of updates lead to equivalent states, then it should also hold that concurrent executions of the updates lead to equivalent states.”

psts and psm say that a replicated structure should behave sequentially when replication is not used. ppe says that the order of independent operations should not matter. Our definition implies all three conditions. Dynamo fails ppe (Bieniusa et al. 2012), and thus fails to pass our definition of sec.

In the next section, we describe the validity problem and our solution in detail, using the example of a binary set. The formal definitions follow in Sect. 3. We state some consequences of the definition and prove that the add-wins set satisfies our definition. In Sect. 4, we describe a collaborative text editor and prove that it is sec. In Sect. 5 we characterize the programmer’s view of a crdt by defining the most general crdt that satisfies a given sequential specification. We show that any program that is correct using the most general crdt will be correct using a more restricted crdt. We also show that our validity criterion for sec is local in the sense of Herlihy and Wing (1990): independent structures can be verified independently. In Sect. 6, we apply these results to prove the correctness of a graph that is implemented using two sec sets.

Our work is inspired by the study of relaxed memory, such as (Alglave 2012). In particular, we have drawn insight from the rmo model of Higham and Kawash (2000).

2 Understanding Replicated Sets

In this section, we motivate the definition of sec using replicated sets as an example. The final definition is quite simple, but requires a fresh view of both executions and specifications. We develop the definition in stages, each of which requires a subtle shift in perspective. Each subsection begins with an example and ends with a summary.

2.1 Mutators and Non-mutators

An implementation is a set of executions. We model executions abstractly as labelled partial orders (lpos). The ordering of the lpo captures the history that precedes an event, which we refer to as visibility.Here the events are \(a\) through \(j\), with labels Open image in new window , Open image in new window , etc., and order represented by arrows. The lpo describes an execution with two replicas, shown horizontally, with time passing from left to right. Initially, the top replica receives a request to add 0 to the set ( Open image in new window ). Concurrently, the bottom replica receives a request to add 1 ( Open image in new window ). Then each replica is twice asked to report on the items contained in the set. At first, the top replica replies that 0 is present and 1 is absent ( Open image in new window ), whereas the bottom replica answers with the reverse ( Open image in new window ). Once the add operations are visible at all replicas, however, the replicas give the same responses ( Open image in new window and Open image in new window ).

lpos with non-interacting replicas can be denoted compactly using sequential and parallel composition. For example, the prefix of (1) that only includes the first three events at each replica can be written Open image in new window .

A specification is a set of strings. Let \({\textsc {set}}\) be the specification of a sequential set with elements 0 and 1. Then we expect that set includes the string “ Open image in new window ”, but not “ Open image in new window ”. Indeed, each specification string can uniquely be extended with either Open image in new window or Open image in new window and either Open image in new window or Open image in new window .

There is an isomorphism between strings and labelled total orders. Thus, specification strings correspond to the restricted class of lpos where the visibility relation provides a total order.

Linearizability (Herlihy and Wing 1990) is the gold standard for concurrent correctness in tightly coupled systems. Under linearizability, an execution is valid if there exists a linearization \(\tau \) of the events in the execution such that for every event e, the prefix of e in \(\tau \) is a valid specification string.

Execution (1) is not linearizable. The failure can already be seen in the sub-lpo Open image in new window . Any linearization must have either Open image in new window before Open image in new window or Open image in new window before Open image in new window . In either case, the linearization is invalid for set.

Although it is not linearizable, execution (1) is admitted by every crdt set in Shapiro et al. (2011a). To validate such examples, Burckhardt et al. (2012) develop a weaker notion of validity by dividing labels into mutators and accessors (also known as non-mutators). Similar definitions appear in Jagadeesan and Riely (2015) and Perrin et al. (2015). Mutators change the state of a replica, and accessors report on the state without changing it. For \({\textsc {set}}\), the mutators \(\mathbf {M}\) and non-mutators Open image in new window are as follows.Define the mutator prefix of an event e to include e and the mutators visible to e. An execution is valid if there exists a linearization of the execution, \(\tau \), such that for every event e, the mutator prefix of e in \(\tau \) is a valid specification string.

It is straightforward to see that execution (1) satisfies this weaker criterion. For both Open image in new window and Open image in new window , the mutator prefix is Open image in new window . This includes Open image in new window but not Open image in new window , and thus their answers are validated. Symmetrically, the mutator prefixes of Open image in new window and Open image in new window only include Open image in new window . The mutator prefixes for the final four events include both Open image in new window and Open image in new window , but none of the prior accessors.

Summary: Convergent states must agree on the final order of mutators, but intermediate states may see incompatible subsequences of this order. By restricting attention to mutator prefixes, the later states need not linearize these incompatible views of the partial past.

This relaxation is analogous to the treatment of non-mutators in update serializability (Hansdah and Patnaik 1986; Garcia-Molina and Wiederhold 1982), which requires a global serialization order for mutators, ignoring non-mutators.

2.2 Dependency

The following lpo is admitted by the add-wins set discussed in the introduction.In any crdt implementation, the effect of Open image in new window is negated by the subsequent Open image in new window The same reasoning holds for Open image in new window and Open image in new window . In an add-wins set, however, the concurrent adds, Open image in new window and Open image in new window , win over the deletions. Thus, in the final state both 0 and 1 are present.

This lpo is not valid under the definition of the previous subsection: Since Open image in new window and Open image in new window see the same mutators, they must agree on a linearization of Open image in new window . Any linearization must end in either Open image in new window or Open image in new window ; thus it is not possible for both Open image in new window and Open image in new window to be valid.

Similar issues arise in relaxed memory models, where program order is often relaxed between uses of independent variables (Alglave et al. 2014). Generalizing, we write Open image in new window to indicate that labels \(m\) and \(n\) are dependent. Dependency is a property of a specification, not an implementation. Our results only apply to specifications that support a suitable notion of dependency, as detailed in Sect. 3. For set, Open image in new window is an equivalence relation with two equivalence classes, corresponding to actions on the independent values 0 and 1.While the dependency relation for set is an equivalence, this is not required: In Sect. 4 we establish the correctness of collaborative text editing protocol with an intransitive dependency relation.
The dependent restriction of (2) is as follows.In the previous subsection, we defined validity using the mutator prefix of an event. We arrive at a weaker definition by restricting attention to the mutator prefix of the dependent restriction.

Under this definition, execution (2) is validated: Any interleaving of the strings Open image in new window and Open image in new window linearizes the dependent restriction of (2) given in (3).

Summary: crdts allow independent mutators to commute. We formalize this intuition by restricting attention to mutator prefixes of the dependent restriction. The crdt must respect program order between dependent operations, but is free to reorder independent operations.

This relaxation is analogous to the distinction between program order and preserved program order (ppo) in relaxed memory models (Higham and Kawash 2000; Alglave 2012). Informally, ppo is the suborder of program order that removes order between independent memory actions, such as successive reads on different locations without an intervening memory barrier.

2.3 Puns

The following lpo is admitted by the add-wins set.As in execution (2), the add Open image in new window is undone by the following remove Open image in new window , but the concurrent add Open image in new window wins over Open image in new window , allowing Open image in new window . In effect, Open image in new window sees the order of the mutators as Open image in new window . Symmetrically, Open image in new window sees the order as Open image in new window . While this is very natural from the viewpoint of a crdt, there is no linearization of the events that includes both Open image in new window and Open image in new window since Open image in new window and Open image in new window must appear in different orders.

Indeed, this lpo is not valid under the definition of the previous subsection. First note that all events are mutually dependent. To prove validity we must find a linearization that satisfies the given requirements. Any linearization of the mutators must end in either Open image in new window or Open image in new window . Suppose we choose Open image in new window and look for a mutator prefix to satisfy Open image in new window . (All other choices lead to similar problems.) Since Open image in new window precedes Open image in new window and is the last mutator in our chosen linearization, every possible witness for Open image in new window must end with mutator Open image in new window . Indeed the only possible witness is Open image in new window . However, this is not a valid specification string.

The problem is that we are linearizing events, rather than labels. If we shift to linearizing labels, then execution (4) is allowed. Fix the final order for the mutators to be Open image in new window . The execution is allowed if we can find a subsequence that linearizes the labels visible at each event. It suffices to choose the witnesses as follows. In the table, we group events with a common linearization together.

Each of these is a valid specification string. In addition, looking only at mutators, each is a subsequence of Open image in new window .

In execution (4), each of the witnesses is actually a prefix of the final mutator order, but, in general, it is necessary to allow subsequences.Execution (5) is admitted by the add-wins set. It is validated by the final mutator sequence Open image in new window . The mutator prefix Open image in new window of \(b\) is a subsequence of Open image in new window , but not a prefix.

Summary: While dependent events at a single replica must be linearized in order, concurrent events may slip anywhere into the linearization. A crdt may pun on concurrent events with same label, using them in different positions at different replicas. Thus a crdt may establish a final total over the labels of an execution even when there is no linearization of the events.

2.4 Frontiers

In the introduction, we mentioned that the validity problem can be decomposed into the separate concerns of linearizability and monotonicity. The discussion thus far has centered on the appropriate meaning of linearizability for crdts. In this subsection and the next, we look at the constraints imposed by monotonicity.

Consider the prefix Open image in new window of execution (4), extended with action Open image in new window , with visibility order as follows.This execution is not strong ec, since Open image in new window and Open image in new window see exactly the same mutators, yet provide incompatible answers.

Unfortunately, execution (6) is valid by the definition given in the previous section: The witnesses for \(a\)\(f\) are as before. In particular, the witness for Open image in new window is “ Open image in new window ”. The witness for Open image in new window is “ Open image in new window ”. In each case, the mutator prefix is a subsequence of the global mutator order “ Open image in new window ”.

It is well known that punning can lead to bad jokes. In this case, the problem is that Open image in new window is punning on a concurrent Open image in new window that cannot be matched by a visible Open image in new window in its history: the execution Open image in new window that is visible to Open image in new window must appear between the two Open image in new window operations; the specification Open image in new window that is used by Open image in new window must appear after. The final states of execution (4) have seen both remove operations, therefore the pun is harmless there. But Open image in new window and Open image in new window have seen only one remove. They must agree on how it is used.

Up to now, we have discussed the linearization of each event in isolation. We must also consider the relationship between these linearizations. When working with linearizations of events, it is sufficient to require that the linearization chosen for each event be a subsequence for the linearization chosen for each visible predecessor; since events are unique, there can be no confusion in the linearization about which event is which. Execution (6) shows that when working with linearizations of labels, it is insufficient to consider the relationship between individual events. The linearization “ Open image in new window ” chosen for Open image in new window is a supersequence of those chosen for its predecessors: “ Open image in new window ” for Open image in new window and “ Open image in new window ” for Open image in new window . The linearization “ Open image in new window ” chosen for Open image in new window is also a supersequence for the same predecessors. And yet, Open image in new window and Open image in new window are incompatible states.

Sequential systems have a single state, which evolves over time. In distributed systems, each replica has its own state, and it is this set of states that evolves. Such a set of states is called a (consistent) cut (Chandy and Lamport 1985).

A cut of an lpo is a sub-lpo that is down-closed with respect to visibility. The frontier of cut is the set of maximal elements. For example, there are 14 frontiers of execution (6): the singletons Open image in new window , Open image in new window , Open image in new window , Open image in new window , Open image in new window , Open image in new window , the pairs Open image in new window , Open image in new window , Open image in new window , Open image in new window , Open image in new window , Open image in new window , Open image in new window , and the triple Open image in new window . As we explain below, we consider non-mutators in isolation. Thus we do not consider the last four cuts, which include a non-mutator with other events. That leaves 10 frontiers. The definition of the previous section only considered the 6 singletons. Singleton frontiers are generated by pointed cuts, with a single maximal element.

When applied to frontiers, the monotonicity requirement invalidates execution (6). Monotonicity requires that the linearization chosen for a frontier be a subsequence of the linearization chosen for any extension of that frontier. If we are to satisfy state Open image in new window in execution (6), the frontier Open image in new window must linearize to “ Open image in new window ”. If we are to satisfy state Open image in new window , the frontier Open image in new window must linearize to “ Open image in new window ”. Since we require a unique linearization for each frontier, the execution is disallowed.

Since crdts execute non-mutators locally, it is important that we ignore frontiers with multiple non-mutators. Recall execution (4):
There is no specification string that linearizes the cut with frontier Open image in new window , since we cannot have Open image in new window immediately after Open image in new window . If we consider only pointed cuts for non-mutators, then the execution is sec, with witnesses as follows.

In order to validate non-mutators, we must consider singleton non-mutator frontiers. The example shows that we must not consider frontiers with multiple non-mutators. There is some freedom in the choices otherwise. For set, we can “saturate” an execution with accessors by augmenting the execution with accessors that witness each cut of the mutators. In a saturated execution, it is sufficient to consider only the pointed accessor cuts, which end in a maximal accessor. For non-saturated executions, we are forced to examine each mutator cut: it is possible that a future accessor extension may witness that cut. The status of “mixed” frontiers, which include mutators with a single maximal non-mutator, is open for debate. We choose to ignore them, but the definition does not change if they are included.

Summary: A crdt must have a strategy for linearizing all mutator labels, even in the face of partitions. In order to ensure strong ec, the definition must consider sets of events across multiple replicas. Because non-mutators are resolved locally, sec must ignore frontiers with multiple non-mutators.

Cuts and frontiers are well-known concepts in the literature of distributed systems (Chandy and Lamport 1985). It is natural to consider frontiers when discussing the evolving correctness of a crdt.

2.5 Stuttering

Consider the following execution.This lpo represents a partitioned system with events \(a\)\(e\) in one partition and \(x\)\(z\) in the other. As the partition heals, we must be able to account for the intermediate states. Because of the large number of events in this example, we have elided all accessors. We will present the example using the semantics of the add-wins set. Recall that the add-wins set validates Open image in new window if and only if there is a maximal Open image in new window beforehand. Thus, a replica that has seen the cut with frontier Open image in new window must answer Open image in new window , whereas a replica that has seen Open image in new window must answer Open image in new window .

Any linearization of Open image in new window must end in Open image in new window , since the add-win set must reply Open image in new window : the only possibility is “ Open image in new window ”. The linearization of Open image in new window must end in Open image in new window . If it must be a supersequence, the only possibility is “ Open image in new window ”. Taking one more step on the left, Open image in new window must linearize to “ Open image in new window ”. Thus the final state Open image in new window must linearize to “ Open image in new window ”. Reasoning symmetrically, the linearization of Open image in new window must be “ Open image in new window ”, and thus the final Open image in new window must linearize to “ Open image in new window ”. The constraints on the final state are incompatible. Each of these states can be verified in isolation; it is the relation between them that is not satisfiable.

Recall that monotonicity requires that the linearization chosen for a frontier be a subsequence of the linearization chosen for any extension of that frontier. The difficulty here is that subsequence relation ignores the similarity between “ Open image in new window ” and “ Open image in new window ”. Neither of these is a subsequence of the other, yet they capture exactly the same sequence of states, each with six alternations between Open image in new window and Open image in new window . The canonical state-based representative for these sequences is “ Open image in new window ”.

crdts are defined in terms of states. In order to relate crdts to sequential specifications, it is necessary to extract information about states from the specification itself. Adapting Brookes (1996), we define strings as stuttering equivalent (notation Open image in new window ) if they pass through the same states. So Open image in new window but Open image in new window If we consider subsequences up to stuttering, then execution (7) is sec, with witnesses as follow:

Recall that without stuttering, we deduced that Open image in new window must linearize to “ Open image in new window ” and Open image in new window must linearize to “ Open image in new window ”. Under stuttering equivalence, these are the same, with canonical representative “ Open image in new window ”. Thus, monotonicity under stuttering allows both linearizations to be extended to satisfy the final state Open image in new window , which has canonical representative “ Open image in new window ”.

Summary: crdts are described in terms of convergent states, whereas specifications are described as strings of actions. Actions correspond to labels in the lpo of an execution. Many strings of actions may lead to equivalent states. For example, idempotent actions can be applied repeatedly without modifying the state.

The stuttering equivalence of Brookes (1996) addresses this mismatch. In order to capture the validity of crdts, the definition of subsequence must change from a definition over individual specification strings to a definition over equivalence classes of strings up to stuttering.

3 Eventual Consistency for CRDTs

This section formalizes the intuitions developed in Sect. 2. We define executions, specifications and strong eventual consistency (sec). We discuss properties of eventual consistency and prove that the add-wins set is sec.

3.1 Executions

An execution realizes causal delivery if, whenever an event is received at a replica, all predecessors of the event are also received. Most of the crdts in Shapiro et al. (2011a) assume causal delivery, and we assumed it throughout the introductory section. There are costs to maintaining causality, however, and not all crdts assume that executions incur these costs. In the formal development, we allow non-causal executions.

Shapiro et al. (2011a) draw executions as timelines, explicitly showing the delivery of remote mutators. Below left, we give an example of such a timeline.

This is a non-causal execution: at the bottom replica, Open image in new window is received before Open image in new window , even though Open image in new window precedes Open image in new window at the top replica.

Causal executions are naturally described as Labelled Partial Orders (lpos), which are transitive and antisymmetric. Section 2 presented several examples of lpos. To capture non-causal systems, we move to Labelled Visibility Orders (lvos), which are merely acyclic. Acyclicity ensures that the transitive closure of an lvo is an lpo. The right picture above shows the lvo corresponding to the timeline on the left. The zigzag arrow represents an intransitive communication. When drawing executions, we use straight lines for “transitive” edges, with the intuitive reading that “this and all preceding actions are delivered”.

lvos arise directly due to non-causal implementations. As we will see in Sect. 4, they also arise via projection from an lpo.

lvos are unusual in the literature. To make this paper self-contained, we define the obvious generalizations of concepts familiar from lpos, including isomorphism, suborder, restriction, maximality, downclosure and cut.

Fix a set \(\mathbf {L}\) of labels. A Labelled Visibility Order (lvo, also known as an execution) is a triple Open image in new window where Open image in new window is a finite set of events, Open image in new window and Open image in new window is reflexive and acyclic.

Let \({u}\), \(v\) range over lvos. Many concepts extend smoothly from lpos to lvos.

Replica-Specific Properties. In the literature on replicated data types, some properties of interest (such as “read your writes” (Tanenbaum and Steen 2007)) require the concept of “session” or a distinction between local and remote events. These can be accommodated by augmenting lvos with a replica labelling Open image in new window which maps events to a set \(\mathbf {R}\) of replica identifiers.

Executions can be generated operationally as follows: Replicas receive mutator and accessor events from the local client; they also receive mutator events that are forwarded from other replicas. Each replica maintains a set of seen events: an event that is received is added to this set. When an event is received from the local client, the event is additionally added to the execution, with the predecessors in the visibility relation corresponding to the current seen set. If we wish to restrict attention to causal executions, then we require that replicas forward all the mutators in their seen sets, rather than individual events, and, thus, the visibility relation is transitive over mutators.

All executions that are operationally generated satisfy the additional property that Open image in new window is per-replica total: Open image in new window . We do not demand per-replica totality because our results do not rely on replica-specific information.

3.2 Specifications and Stuttering Equivalence

Specifications are sets of strings, equipped with a distinguished set of mutators and a dependency relation between labels. Specifications are subject to some constraints to ensure that the mutator set and dependency relations are sensible; these are inspired by the conditions on Mazurkiewicz executions (Diekert and Rozenberg 1995). Every specification set yields a derived notion of stuttering equivalence. This leads to the definition of observational subsequence ( Open image in new window ).

We use standard notation for strings: Let \(\sigma \) and \(\tau \) range over strings. Then Open image in new window denotes concatenation, \(\sigma ^*\) denotes Kleene star, \(\sigma |||\tau \) denotes the set of interleavings, \(\varepsilon \) denotes the empty string and \(\sigma ^{i}\) denotes the \(i^{\text {th}}\) element of \(\sigma \). These notations lift to sets of strings via set union.

A specification is a quadruple Open image in new window where
  • \(\mathbf {L}\) is a set of actions (also known as labels),

  • \(\mathbf {M}\subseteq \mathbf {L}\) is a distinguished set of mutator actions,

  • Open image in new window is a symmetric and reflexive dependency relation, and

  • \({\varSigma }\subseteq \mathbf {L}^*\) is a set of valid strings.

Let Open image in new window be the sets of non-mutators.

A specification must satisfy the following properties:
  1. (a)

    prefix closed: Open image in new window

     
  2. (b)

    non-mutators are closed under stuttering, and commutation:

    Open image in new window

    Open image in new window

     
  3. (c)

    independent actions commute:

    Open image in new window

     

Property (b) ensures that non-mutators do not affect the state of the data structure. Property (c) ensures that commuting of independent actions does not affect the state of the data structure.

Recall that the set specification takes Open image in new window , representing addition and removal of bits 0 and 1, and Open image in new window , representing membership tests returning false or true. The dependency relation is Open image in new window , where \(D^2=D\times D\).

The dependency relation for set is an equivalence, but this need not hold generally. We will see an example in Sect. 4.

The definitions in the rest of the paper assume that we have fixed a specification Open image in new window . In the examples of this section, we use set.

State and Stuttering Equivalence. Specification strings \(\sigma \) and \(\tau \) are state equivalence (notation Open image in new window ) if every valid extension of \(\sigma \) is also a valid extension of \(\tau \), and vice versa. For example, Open image in new window and Open image in new window but Open image in new window . In particular, state equivalent strings agree on the valid accessors that can immediately follow them: either Open image in new window or Open image in new window and either Open image in new window or Open image in new window . Formally, we define state equivalence, Open image in new window , as follows2.

From specification property (b), we know that non-mutators do not affect the state. Thus we have that Open image in new window whenever Open image in new window and Open image in new window . From specification property (c), we know that independent actions commute. Thus we have that Open image in new window whenever Open image in new window and Open image in new window .

Two strings are stuttering equivalent3 if they only differ in operations that have no effect on the state of the data structure, as given by \({\varSigma }\). Adapting Brookes (1996) to our notion of state equivalence, we define stuttering equivalence, Open image in new window , to be the least equivalence relation generated by the following rules, where Open image in new window ranges over Open image in new window .The first rule above handles the empty string. The second rule allows stuttering in any context. The third rule motivates the name stuttering equivalence, for example, allowing Open image in new window . The last case captures the equivalence generated by independent labels, for example, allowing Open image in new window but not Open image in new window Using the properties of Open image in new window discussed above, we can conclude, for example, that Open image in new window

Consider specification strings for a unary set over value 0. Since stuttering equivalence allows us to remove both accessors and adjacent mutators with the same label we deduce that the canonical representatives of the equivalence classes induced by \(\sim \) are generated by the regular expression Open image in new window .

Observational Subsequence. Recall that Open image in new window is a subsequence of Open image in new window , although it is not a prefix. We write Open image in new window for subsequence and Open image in new window for observational subsequence, defined as follows.Note that observational subsequence includes both subsequence and stuttering equivalence Open image in new window .

Open image in new window can be understood in isolation, whereas Open image in new window can only be understood with respect to a given specification. In the remainder of the paper, the implied specification will be clear from context. Open image in new window is a partial order, whereas Open image in new window is only a preorder, since it is not antisymmetric.

Let \(\sigma \) and \(\tau \) be strings over the unary set with canonical representatives Open image in new window and Open image in new window . Then we have that Open image in new window exactly when either Open image in new window and Open image in new window or Open image in new window and Open image in new window . Thus, observational subsequence order is determined by the number of alternations between the mutators.

Specification strings for the binary set, then, are stuttering equivalent exactly when they yield the same canonical representatives when restricted to 0 and to 1. Thus, observational subsequence order is determined by the number of alternations between the mutators, when restricted to each dependent subsequence. (The final rule in the definition of stuttering, which allows stuttering across independent labels, is crucial to establishing this canonical form.)

3.3 Eventual Consistency

Eventual consistency is defined using the cuts of an execution and the observational subsequence order of the specification. As noted in Sects. 2.2 and 2.4, it is important that we not consider all cuts. Thus, before we define sec, we must define dependent cuts.

The dependent restriction of an execution is defined: Open image in new window where Open image in new window when Open image in new window . See Sect. 2.2 for an example of dependent restriction.

The dependent cuts of \(v\) are cuts of the dependent restriction. As discussed in Sect. 2.4, we only consider pointed cuts (with a single maximal element) for non-mutators. See Sect. 2.4 for an example.
An execution \(v\) is Eventually Consistent (sec) for specification Open image in new window iff there exists a function Open image in new window that satisfies the following.  
Linearization:

Open image in new window and

Monotonicity:

Open image in new window

  A data structure implementation is \(\textsc {sec}{}\) if all of its executions are \(\textsc {sec}{}\).

In Sect. 2, we gave several examples that are sec. See Sects. 2.4 and 2.5 for examples where Open image in new window is given explicitly. Section 2.4 also includes an example that is not sec.

The concerns raised in Sect. 2 are reflected in the definition.

  • Non-mutators are ignored by the dependent restriction of other non-mutators. As discussed in Sect. 2.1, this relaxation is similar that of update-serializability (Hansdah and Patnaik 1986; Garcia-Molina and Wiederhold 1982).

  • Independent events are ignored by the dependent restriction of an event. As discussed in Sect. 2.2, this relaxation is similar to preserved program order in relaxed memory models (Higham and Kawash 2000; Alglave 2012).

  • As discussed in Sect. 2.3, punning is allowed: each cut \(p\) is linearized separately to a specification string Open image in new window .

  • As discussed in Sect. 2.4, we constrain the power puns by considering cuts of the distributed system (Chandy and Lamport 1985).

  • Monotonicity ensures that the system evolves in a sensible way: new order may be introduced, but old order cannot be forgotten. As discussed in Sect. 2.5, the preserved order is captured in the observational subsequence relation, which allows stuttering (Brookes 1996).

3.4 Properties of Eventual Consistency

We discuss some basic properties of sec. For further analysis, see Sect. 5.

An important property of crdts is prefix closure: If an execution is valid, then every prefix of the execution should also be valid. Prefix closure follows immediately from the definition, since whenever \({u}\) is a prefix of \(v\) we have that Open image in new window .

Prefix closure looks back in time. It is also possible to look forward: A system satisfies eventual delivery if every valid execution can be extended to a valid execution with a maximal element that sees every mutator. If one assumes that every specification string can be extended to a longer specification string by adding non-mutators, then eventual delivery is immediate.

The properties psts, psm and ppe are discussed in the introduction. An sec implementation must satisfies ppe since every dependent set of mutators is linearized: sec enforces the stronger property that there are no new intermediate states, even when executing all mutators in parallel. For causal systems, where Open image in new window is transitive, psts and psm follow by observing that if there is a total order on the mutators of \({u}\) then any linearization of \({u}\) is a specification string.

Burckhardt (2014, Sect. 5) provides a taxonomy of correctness criteria for replicated data types. Our definition implies NoCircularCausality and CausalArbitration, but does not imply either ConsistentPrefix or CausalVisibility. For lpos, which model causal systems, our definition implies CausalVisibility. ReadMyWrites and MonotonicReads require a distinction between local and remote events. If one assumes the replica-specific constraints given in Sect. 3.1, then our definition satisfies these properties; without them, our definition is too abstract.

3.5 Correctness of the Add-Wins Set

The add-wins set is defined to answer Open image in new window for a cut \({u}\) exactly when

It answers Open image in new window otherwise. The add-wins set is called the “observed-remove” set.

We show that any lpo that meets this specification is sec with respect to set. We restrict attention to lpos since causal delivery is assumed for the add-wins set in (Shapiro et al. 2011a).

For set, the dependency relation is an equivalence. For an equivalence relation R, let Open image in new window denote the set of (disjoint) equivalence classes for R. For set, Open image in new window . When dependency is an equivalence, then every interleaving of independent actions is valid if any interleaving is valid. Formally, we have the following, where \(|||\) denotes interleaving.

Using the forthcoming composition result (Theorem 2), it suffices for us to address the case when \({u}\) only involves operations on a single element, say 0. For any such lvo \({u}\), we choose a linearization Open image in new window that has a maximum number of alternations between Open image in new window and Open image in new window . If there is a linearization that begins with Open image in new window , then we choose one of these. Below, we summarize some of the key properties of such a linearization.

The first property above ensures that the accessors are validated correctly, i.e., 0 is deemed to be present iff there is an Open image in new window that is not followed by any Open image in new window .

We are left with proving monotonicity, i.e., if \({u}\subseteq v\), then Open image in new window . Consider Open image in new window and Open image in new window .

4 A Collaborative Text Editing Protocol

In this section we consider a variant of the collaborative text editing protocol defined by Attiya et al. (2016). After stating the sequential specification, text, we sketch a correctness proof with respect to our definition of eventual consistency. This example is interesting formally: the dependency relation is not an equivalence, and therefore the dependent projection does not preserve transitivity. The generality of intransitive lvos is necessary to understand text, even assuming a causal implementation.

Specification. Let \(a\), \(b\) range over nodes, which contain some text, a unique identifier, and perhaps other information. Labels have the following forms:

We demonstrate the correct answers to queries by example. Initially, the document is empty, whereas after initialization, the document contains a single node; thus the specification contains strings such as Open image in new window , where \(\varepsilon \) represents the empty document. Nodes can be added either before or after other nodes; thus Open image in new window results in the document Open image in new window . Nodes are always added adjacent to the target; thus, order matters in Open image in new window which results in Open image in new window rather than Open image in new window . Removal does what one expects; thus Open image in new window results in Open image in new window .

Attiya et al. (2016) define the interface for text using integer indices as targets, rather than nodes. Using the unique correspondence between the nodes and it indices (since node are unique), one can easily adapt an implementation that satisfies our specification to their interface.

We say that node \(a\) is a added in the actions Open image in new window , Open image in new window and Open image in new window . Node Open image in new window is a target in Open image in new window and Open image in new window . In addition to correctly answering queries, specifications must satisfy the following constraints:
  • Initialization may occur at most once,

  • each node may be added at most once,

  • a node may be removed only after it is added, and

  • a node may be used as a target only if it has been added and not removed.

These constraints forbid adding to a target that has been removed; thus Open image in new window is a valid string, but Open image in new window is not. It also follows that initialization must precede any other mutators.

Because add operations use unique identifiers, punning and stuttering play little role in this example. In order to show the implementation correct, we need only choose an appropriate notion of dependency. As we will see, it is necessary that removes be independent of adds with disjoint label sets, but otherwise all actions may be dependent. Let Open image in new window be the set of add and query labels, and let Open image in new window return the set of nodes that appear in a label. Then we define dependency as follows.

Implementation. We consider executions that satisfy the same four conditions above imposed on specifications. We refer the reader to the algorithm of Attiya et al. (2016) that provides timestamps for insertions that are monotone with respect to causality.

As an example, Attiya et al. (2016) allow the execution given on the left below. In this case, the dependent restriction is an intransitive lvo, even though the underlying execution is an lpo: in particular, Open image in new window does not precede Open image in new window in the dependent restriction. We give the order considered by dependent cuts on the right—this is a restriction of the dependent restriction: since we only consider pointed accessor cuts, we can safely ignore order out of non-mutators.
This execution is not linearizable, but it is sec, choosing witnesses to be subsequences of the mutator string Open image in new window . Here, the document is initialized to \(b\), then \(c\) and \(d\) are added after \(b\), resulting in Open image in new window . The order of \(c\) and \(d\) is determined by their timestamps. Afterwards, the top replica removes \(d\) and adds \(a\); the bottom replica removes \(b\) and adds \(e\), resulting in the final state Open image in new window . In the right execution, the removal of order out of the non-mutators shows the “update serializability” effect; the removal of order between Open image in new window and Open image in new window (and between Open image in new window and Open image in new window ) shows the “preserved program order” effect.

Correctness. Given an execution, we can find a specification string \(s_1 s_2\) that linearizes the mutators in the dependent restriction of the execution such that \(s_1\) contains only adds and \(s_2\) contains only removes. Such a specification string exists because by the conditions on executions, deletes do not have any outgoing edges to other mutators in the dependent restriction; so, they can be moved to the end in the matching specification string. In order to find \(s_1\) that linearizes the add events, any linearization that respects causality and timestamps (yielded by the algorithm of Attiya et al. (2016)) suffices for our purposes. The conditions required by sec follow immediately.

5 Compositional Reasoning

The aim of this section is to establish compositional methods to reason about replicated data structures. We do so using Labelled Transition Systems (ltss), where the transitions are labelled by dependent cuts. We show how to derive an lts from an execution, Open image in new window . We also define an lts for the most general crdt that validates a specification, Open image in new window . We show that \({u}\) is sec for \({\varSigma }\) exactly when Open image in new window is a refinement of Open image in new window . We use this alternative characterization to establish composition and abstraction results.

LTSs. An lts is a triple consisting of a set a states, an initial state and a labelled transition function between states. We first define the ltss for executions and specifications, then provide examples and discussion.

For both executions and specifications, the labels of the lts are dependent cuts: for executions, these are dependent cuts of the execution itself; for specifications, they are drawn from the set Open image in new window of all possible dependent cuts. We compare lts labels up to isomorphism, rather than identity. Thus it is safe to think of lts labels as (potentially intransitive) pomsets (Plotkin and Pratt 1997).

The states of the lts are different for the execution and specification. For executions, the states are cuts of the execution \({u}\) itself, Open image in new window ; these are general cuts, not just dependent cuts. For specifications, the states are the stuttering equivalence classes of strings allowed by the specification, Open image in new window .

There is an isomorphism between strings and total orders. We make use of this in the definition, treating strings as totally-ordered lvos.

Define Open image in new window where Open image in new window if Open image in new window andDefine Open image in new window where Open image in new window if Open image in new window andWe explain the definitions using examples from set, first for executions, then for specifications. Consider the execution on the left below. The derived lts is given on the right.

The states of the lts are cuts of the execution. The labels on transitions are dependent cuts. The requirements for execution transitions relate the source \(p\), target \(q\) and label \(v\). The leftmost requirements state that the target state must extend both the source and the label; thus the target state must be a combination of events and order from source and label. The middle requirements state that the maximal elements of the label must be new in the target; only the maximal elements of the label are added when moving from source to target. The upper right requirement states that the non-maximal order of the label must be respected by the source; thus the causal history reported by the label cannot contradict the causal history of the source. The lower right requirement ensures that maximal elements of the label are also maximal in the target. The restriction to dependent cuts explains the labels on transitions Open image in new window and Open image in new window . By definition, there is a self-transition labelled with the empty lvo at every state; we elide these transitions in drawings.

The specification lts for set is infinite, of course. To illustrate, below we give two sub-ltss with limitations on mutators. On the left, we only allow Open image in new window and Open image in new window . On the right, we only allow Open image in new window and Open image in new window and only consider the case in which there is at most one alternation between them. The states are shown using their canonical representatives. Because of the number of transitions, we show all dependent accessors as a single transition, with labels separated by commas.

The requirements for specification transitions are similar to those for implementations, but the states are equivalence classes over specification strings: with source \({[}\sigma {]}\) and target \({[}\tau {]}\). There is a transition between the states if there are members of the equivalence classes, \(\sigma \) and \(\tau \), that satisfy the requirements. Since these are total orders, the leftmost requirements state that there must be linearizations of the source and label that are subsequences of the target. Similarly, the upper right requirement states that the non-maximal order of the label must be respected by the source; thus we have Open image in new window but not Open image in new window for any \(\sigma \). The use of sub-order rather than subsequence allows Open image in new window but prevents nonsense transitions such as Open image in new window . Because the states are total orders, we drop the implementation lts requirement that maximal events of the label must be maximal in the target. If we were to impose this restriction, we would disallow Open image in new window .

It is worth noting that the specification of the add-wins set removes exactly three edges from the right lts: Open image in new window Open image in new window and Open image in new window .

Refinement. Refinement is a functional form of simulation (Hoare 1972; Lamport 1983; Lynch and Vaandrager 1995). Let Open image in new window and Open image in new window be ltss. A function Open image in new window is a (strong) refinement if Open image in new window imply that there exist Open image in new window and Open image in new window such that Open image in new window . Then \({P}\) refines \({Q}\) (notation Open image in new window ) if there exists a refinement Open image in new window such that the initial states are related, i.e., \(f(p_{0}) =q_{0}\).

We now prove that sec can be characterized as a refinement. We write Open image in new window when \(p_n\) is reachable from \(p_0\) via a finite sequence of steps Open image in new window .

Theorem 1

Open image in new window is EC for the specification \({\varSigma }\) iff Open image in new window .

Proof

For the forward direction, assume \({u}\) is EC and therefore there exists a function Open image in new window such that Open image in new window . For each cut Open image in new window , we start with the dependent restriction, Open image in new window . We further restriction attention to mutators, Open image in new window . The required refinement maps \(p\) to the equivalence class of the linearization of Open image in new window chosen by Open image in new window : Open image in new window . We abuse notation below by identifying each equivalence class with a canonical element of the class.

We show that Open image in new window implies Open image in new window . Since \(p\subseteq q\), we deduce that Open image in new window and by monotonicity, Open image in new window .

We show that Open image in new window implies Open image in new window . Suppose \(v\) only contains mutators. Since \(v\subseteq q\), we deduce that Open image in new window and by monotonicity, Open image in new window . On the other hand, suppose \(v\) contains the non-mutator \(a\). Let \(A= \mathbf {M}\cup \{a\}\). Since \(v\subseteq q\), we deduce that Open image in new window . By monotonicity, Open image in new window . Since Open image in new window we have Open image in new window as required.

Thus Open image in new window , completing this direction of the proof.

For the reverse direction, we are given a refinement Open image in new window . For any Open image in new window , define Open image in new window to be a string in the equivalence class Open image in new window that includes any non-mutator found in \(p\).

We first prove that Open image in new window is a linearization of \(p\). A simple inductive proof demonstrates that for any Open image in new window , there is a transition sequence of the form Open image in new window . Thus, we deduce from the label on the final transition into \(p\) that the Open image in new window related to \(p\) is a linearization of \(p\).

We now establish monotonicity. A simple inductive proof shows that for any Open image in new window , Open image in new window . Thus Open image in new window , by the properties of Open image in new window and the definition of Open image in new window .

Composition. Given two non-interacting data structures whose replicated implementations satisfy their sequential specifications, the implementation that combines them satisfies the interleaving of their specifications. We formalize this as a composition theorem in the style of Herlihy and Wing (1990).

Given an execution \({u}\) and \(L\subseteq \mathbf {L}\), write Open image in new window for the execution that results by restricting \({u}\) to events with labels in L: Open image in new window . This notation lifts to sets in the standard way: Open image in new window . Write Open image in new window to indicate that \({u}\) is sec for \({\varSigma }\).

Theorem 2 (Composition)

Let \(L_1\) and \(L_2\) be mutually independent subsets of \(\mathbf {L}\). For \(i\in \{1,2\}\), let \({\varSigma }_i\) be a specification with labels chosen from \(L_i\), such that \({\varSigma }_1|||{\varSigma }_2\) is also a specification. If Open image in new window and Open image in new window then Open image in new window (equivalently Open image in new window ).

The proof is immediate. Since \(L_1\) and \(L_2\) are mutually independent, any interleaving of the labels will satisfy the definition.

Abstraction. We describe a process algebra with parallel composition and restriction and establish congruence results. We ignore syntactic details and work directly with ltss. Replica identities do not play a role in the definition; thus, we permit implicit mobility of the client amongst replicas with the only constraint being that the replica has at least as much history on the current item of interaction as the client. This constraint is enforced by the synchronization of the labels, defined below. While the definition includes the case where the client itself is replicated, it does not provide for out-of-band interaction between the clients at different replicas: All interaction is assumed to happen through the data structure.

The relation Open image in new window is defined between ltss so that Open image in new window describes the system that results when client \({P}\) interacts with data structure \({Q}\). For ltss \({P}\) and \({Q}\), define Open image in new window inductively, as follows, where Open image in new window represents the empty lvo.Let

The Open image in new window operator is asymmetric between the client and data structure in two ways. First, note that every action of the client must be matched by the data structure. The condition of client quiescence in the definition of \(S_{\times }\), that all of the actions of the client \({P}\) must be matched by \({Q}\); otherwise Open image in new window . However, the first rule for Open image in new window explicitly permits actions of the data structure that may not be matched by the client. This asymmetry permits the composition of the data structure with multiple clients to be described incrementally, one client at a time. Thus, we expect that Open image in new window .

Second, note that right rule for Open image in new window interaction permits the data structure \({Q}\) to introduce order not found in the clients. This is clearly necessary to ensure that that the composition of client Open image in new window with the set data structure is nonempty. In this case, the client has no order between Open image in new window and Open image in new window whereas the data structure orders Open image in new window after Open image in new window . In this paper, we do not permit the client to introduce order that is not seen in the data structure. For a discussion of this issue, see (Jagadeesan and Riely 2015).

We can also define restriction for some set Open image in new window of labels, a lá CCS. Open image in new window . The definitions lift to sets: Open image in new window and Open image in new window .

It suffices to show that: Open image in new window . The proof proceeds in the traditional style of such proofs in process algebra. We illustrate by sketching the case for client parallel composition. Let \(f\) be the witness for Open image in new window . The proof proceeds by constructing a “product” refinement \(\mathrel {\mathcal {S}}\) relation of the identity on the states of \({P}\) with \(f\), i.e.: Open image in new window .

Thus, an sec implementation can be replaced by the specification.

Theorem 4 (Abstraction)

If \({u}\) is sec for \({\varSigma }\), then Open image in new window .

6 A Replicated Graph Algorithm

We describe a graph implemented with sets for vertices and edges, as specified by Shapiro et al. (2011a). The graph maintains the invariant that the vertices of an edge are also part of the graph. Thus, an edge may be added only if the corresponding vertices exist; conversely, a vertex may be removed only if it supports no edge. In the case of a concurrent addition of an edge with the deletion of either of its vertices, the deletion takes precedence.

The vertices \(v,w, \ldots \) are drawn from some universe \({\mathcal U}\). An edge \(e, e', \ldots \) is a pair of vertices. Let \(\mathsf{vert}(e) = \{ v,w\}\) be the vertices of edge \(e=(v,w)\). The vocabulary of the set specification includes mutators for the addition and removal of vertices and edges and non-mutators for membership tests.Valid graph specification strings answer queries like sets. In addition, we require the following.
Graph Implementation. We rewrite the graph program of Shapiro et al. (2011a) in a more abstract form. Our distributed graph implementation is written as a client of two replicate set: for vertices (V) and for edges (E). The implementation uses usets, which require that an element be added at most once and that each remove causally follow the corresponding add. Here we show the graph implementation for various methods as client code that runs at each replica. At each replica, the code accesses its local copy of the usets. All the message passing needed to propagate the updates is handled by the uset implementations of the sets V, E. For several methods, we list preconditions, which prescribe the natural assumptions that need to satisfied when these client methods are invoked. For example, an edge operation requires the presence of the vertices at the current replica.
We assume a causal transition system (as needed in Shapiro et al. (2011a)).

Correctness Using the Set Specification. We first show the correctness of the graph algorithm, using the set specification for the vertex and edge sets. We then apply the abstraction and composition theorems to show the correctness of the algorithm using a set implementation.

Let \({u}\) be a lvo generated in an execution of the graph implementation. The preconditions ensure that \({u}\) has the following properties:
  1. (a)

    For any \(v\), Open image in new window is never ordered after Open image in new window , and likewise for \(e\).

     
  2. (b)
     
  3. (c)
     

Define \(\sigma _1\), \(\sigma _2\) and \(\sigma _3\) as follows.

Then \({u}\) is sec with witness Open image in new window .

Full Correctness of the Implementation. We now turn to proving the correctness of the algorithm when the two sets are replaced by their implementations.

Consider two (distributed implementations of) separate and independent sets for vertices and edges, i.e. Open image in new window . Suppose we have two implementations, each of which is correct individually: Open image in new window . By composition, we have that they are correct when composed together: Open image in new window . Let \(\mathcal {P}\) be the graph implementation, which is a client of the two sets. By abstraction, we know that Open image in new window implies Open image in new window . By congruence, we deduce:
Thus, in order to validate the full graph implementation, it is sufficient to establish the correctness of the graph client when interacting with the specification of the two independent sets for edges and vertices, which we have already done in the previous treatment of abstract correctness.

7 Conclusions

We have provided a definition of strong eventual consistency that captures validity with respect to a sequential specification. Our definition reflects an attempt to resolve the tension between expressivity (cover the extant examples in the literature) and facilitating reasoning (by retaining a direct relationship with the sequential specification). The notion of concurrent specification developed by Burckhardt et al. (2014) has been used to prove the validity of several replicated data structure implementations. In future work, we would like to discover sufficient conditions relating concurrent and sequential specifications such that any implementation that is correct under the concurrent specification (as defined by Burckhardt et al. (2014)) will also be correct under the sequential counterpart (as defined here).

Footnotes

  1. 1.

    We use the standard definitions for restriction on functions and relations. Given a function Open image in new window and \(D\subseteq E\), define Open image in new window and Open image in new window .

  2. 2.

    To extend the definition to non-specification strings, we allow Open image in new window when Open image in new window .

  3. 3.

    Readers of Brookes (1996) should note that mumbling is not relevant here, since all mutators are visible.

Notes

Acknowledgements

This paper has been greatly improved by the comments of the anonymous reviewers.

This material is based upon work supported by the National Science Foundation under Grant No. 1617175. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.

References

  1. Alglave, J.: A formal hierarchy of weak memory models. Formal Methods Syst. Des. 41(2), 178–210 (2012)CrossRefGoogle Scholar
  2. Alglave, J., Maranget, L., Tautschnig, M.: Herding cats: modelling, simulation, testing, and data mining for weak memory. ACM Trans. Program. Lang. Syst. 36(2), 7:1–7:74 (2014)CrossRefGoogle Scholar
  3. Attiya, H., Burckhardt, S., Gotsman, A., Morrison, A., Yang, H., Zawirski, M.: Specification and complexity of collaborative text editing. In: Proceedings of the 2016 ACM Symposium on Principles of Distributed Computing, PODC 2016, Chicago, IL, USA, pp. 259–268, 25–28 July 2016Google Scholar
  4. Bieniusa, A., Zawirski, M., Preguiça, N., Shapiro, M., Baquero, C., Balegas, V., Duarte, S.: Brief announcement: semantics of eventually consistent replicated sets. In: Aguilera, M.K. (ed.) DISC 2012. LNCS, vol. 7611, pp. 441–442. Springer, Heidelberg (2012).  https://doi.org/10.1007/978-3-642-33651-5_48CrossRefGoogle Scholar
  5. Bouajjani, A., Enea, C., Hamza, J.: Verifying eventual consistency of optimistic replication systems. In: POPL 2014, pp. 285–296 (2014)Google Scholar
  6. Brookes, S.D.: Full abstraction for a shared-variable parallel language. Inf. Comput. 127(2), 145–163 (1996)MathSciNetCrossRefGoogle Scholar
  7. Budhiraja, N., Marzullo, K., Schneider, F.B., Toueg, S.: The primary-backup approach. In: Mullender, S. (ed.) Distributed Systems, 2nd edn., pp. 199–216 (1993)Google Scholar
  8. Burckhardt, S.: Principles of eventual consistency. Found. Trends Program. Lang. 1(1–2), 1–150 (2014). ISSN 2325-1107Google Scholar
  9. Burckhardt, S., Leijen, D., Fähndrich, M., Sagiv, M.: Eventually consistent transactions. In: Seidl, H. (ed.) ESOP 2012. LNCS, vol. 7211, pp. 67–86. Springer, Heidelberg (2012).  https://doi.org/10.1007/978-3-642-28869-2_4CrossRefGoogle Scholar
  10. Burckhardt, S., Gotsman, A., Yang, H., Zawirski, M.: Replicated data types: specification, verification, optimality. In: POPL 2014, pp. 271–284 (2014)Google Scholar
  11. Chandy, K.M., Lamport, L.: Distributed snapshots: determining global states of distributed systems. ACM Trans. Comput. Syst. 3(1), 63–75 (1985)CrossRefGoogle Scholar
  12. DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., Vogels, W.: Dynamo: Amazon’s highly available key-value store. SIGOPS Oper. Syst. Rev. 41(6), 205–220 (2007)CrossRefGoogle Scholar
  13. Diekert, V., Rozenberg, G. (eds.): The Book of Traces. World Scientific Publishing Co., Inc., River Edge (1995). ISBN 9810220588Google Scholar
  14. Ellis, C.A., Gibbs, S.J.: Concurrency control in groupware systems. ACM SIGMOD Rec. 18(2), 399–407 (1989)CrossRefGoogle Scholar
  15. Garcia-Molina, H., Wiederhold, G.: Read-only transactions in a distributed database. ACM Trans. Database Syst. 7(2), 209–234 (1982)CrossRefGoogle Scholar
  16. Gilbert, S., Lynch, N.: Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services. SIGACT News 33(2), 51–59 (2002)CrossRefGoogle Scholar
  17. Gotsman, A., Yang, H., Ferreira, C., Najafzadeh, M., Shapiro, M.: ‘Cause i’m strong enough: reasoning about consistency choices in distributed systems. In: Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT POPL, pp. 371–384 (2016)Google Scholar
  18. Haas, A., Henzinger, T.A., Holzer, A., Kirsch, C.M., Lippautz, M., Payer, H., Sezgin, A., Sokolova, A., Veith, H.: Local linearizability. CoRR, abs/1502.07118 (2015)Google Scholar
  19. Hansdah, R.C., Patnaik, L.M.: Update serializability in locking. In: Ausiello, G., Atzeni, P. (eds.) ICDT 1986. LNCS, vol. 243, pp. 171–185. Springer, Heidelberg (1986).  https://doi.org/10.1007/3-540-17187-8_36CrossRefGoogle Scholar
  20. Herlihy, M., Wing, J.M.: Linearizability: a correctness condition for concurrent objects. ACM TOPLAS 12(3), 463–492 (1990)CrossRefGoogle Scholar
  21. Higham, L., Kawash, J.: Memory consistency and process coordination for SPARC multiprocessors. In: Valero, M., Prasanna, V.K., Vajapeyam, S. (eds.) HiPC 2000. LNCS, vol. 1970, pp. 355–366. Springer, Heidelberg (2000).  https://doi.org/10.1007/3-540-44467-X_32CrossRefGoogle Scholar
  22. Hoare, C.A.R.: Proof of correctness of data representations. Acta Informatica 1(4), 271–281 (1972)CrossRefGoogle Scholar
  23. Jagadeesan, R., Riely, J.: From sequential specifications to eventual consistency. In: Halldórsson, M.M., Iwama, K., Kobayashi, N., Speckmann, B. (eds.) ICALP 2015, Part II. LNCS, vol. 9135, pp. 247–259. Springer, Heidelberg (2015).  https://doi.org/10.1007/978-3-662-47666-6_20CrossRefGoogle Scholar
  24. Lamport, L.: Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21(7), 558–565 (1978)CrossRefGoogle Scholar
  25. Lamport, L.: Specifying concurrent program modules. ACM Trans. Program. Lang. Syst. 5(2), 190–222 (1983)CrossRefGoogle Scholar
  26. Lynch, N., Vaandrager, F.: Forward and backward simulations. Inf. Comput. 121(2), 214–233 (1995)MathSciNetCrossRefGoogle Scholar
  27. Perrin, M., Mostéfaoui, A., Jard, C.: Update consistency for wait-free concurrent objects. In: 2015 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2015, Hyderabad, India, pp. 219–228, 25–29 May 2015Google Scholar
  28. Plotkin, G., Pratt, V.: Teams can see pomsets. In: Workshop on Partial Order Methods in Verification. DIMACS Series, vol. 29, pp. 117–128. AMS (1997)CrossRefGoogle Scholar
  29. Saito, Y., Shapiro, M.: Optimistic replication. ACM Comput. Surv. 37(1), 42–81 (2005)CrossRefGoogle Scholar
  30. Shapiro, M., Preguiça, N., Baquero, C., Zawirski, M.: A comprehensive study of Convergent and Commutative Replicated Data Types. TR 7506, Inria (2011a)Google Scholar
  31. Shapiro, M. , Preguiça, N.M., Baquero, C., Zawirski, M.: Conflict-free replicated data types. In: Proceedings of the 13th International Symposium on Stabilization, Safety, and Security of Distributed Systems, pp. 386–400 (2011b)Google Scholar
  32. Shavit, N.: Data structures in the multicore age. Commun. ACM 54(3), 76–84 (2011)CrossRefGoogle Scholar
  33. Tanenbaum, A., Steen, M.V.: Distributed systems. Pearson Prentice Hall, Upper Saddle River, NJ (2007)zbMATHGoogle Scholar
  34. Viotti, P., Vukolic, M.: Consistency in non-transactional distributed storage systems. ACM Comput. Surv. 49(1), 19 (2016)CrossRefGoogle Scholar
  35. Vogels, W.: Eventually consistent. Commun. ACM 52(1), 40–44 (2009)CrossRefGoogle Scholar

Copyright information

© The Author(s) 2018

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  1. 1.DePaul UniversityChicagoUSA

Personalised recommendations