Automatic Generation of Precise and Useful Commutativity Conditions
Abstract
Reasoning about commutativity between data-structure operations is an important problem with applications including parallelizing compilers, optimistic parallelization and, more recently, Ethereum smart contracts. There have been research results on automatic generation of commutativity conditions, yet we are unaware of any fully automated technique to generate conditions that are both sound and effective.
We have designed such a technique, driven by an algorithm that iteratively refines a conservative approximation of the commutativity (and noncommutativity) condition for a pair of methods into an increasingly precise version. The algorithm terminates if/when the entire state space has been considered, and can be aborted at any time to obtain a partial yet sound commutativity condition. We have generalized our work to left-/right-movers [27] and proved relative completeness. We describe aspects of our technique that lead to useful commutativity conditions, including how predicates are selected during refinement and heuristics that impact the output shape of the condition.
We have implemented our technique in a prototype open-source tool Servois. Our algorithm produces quantifier-free queries that are dispatched to a backend SMT solver. We evaluate Servois through two case studies: (i) We synthesize commutativity conditions for a range of data structures including Set, HashTable, Accumulator, Counter, and Stack. (ii) We consider an Ethereum smart contract called BlockKing, and show that Servois can detect serious concurrency-related vulnerabilities and guide developers to construct robust and efficient implementations.
1 Introduction
Reasoning about the conditions under which data-structure operations commute is an important problem. The ability to derive sound yet effective commutativity conditions unlocks the potential of multicore architectures, including parallelizing compilers [30, 34], speculative execution (e.g. transactional memory [19]), peephole partial-order reduction [37], futures, etc. Another important application domain that has emerged recently is Ethereum [1] smart contracts: efficient execution of such contracts hinges on exploiting their commutativity [14] and block-wise concurrency can lead to vulnerabilities [31]. Intuitively, commutativity is an important property because linearizable data-structure operations that commute can be executed concurrently: their effects do not interfere with each other in an observable way. When using a linearizable HashTable, for example, knowledge that put(x,‘a’) commutes with get(y) provided that \(\texttt {x}\ne \texttt {y}\) enables significant parallelization opportunities. Indeed, it’s important for the commutativity condition to be sufficiently granular so that parallelism can be exploited effectively [12]. At the same time, to make safe use of a commutativity condition, it must be sound [23, 24]. Achieving both of these goals using manual reasoning is burdensome and error prone.
In light of that, researchers have investigated ways of verifying user-provided commutativity conditions [22] as well as synthesizing such conditions automatically, e.g. based on random interpretation [6], profiling [33] or sampling [18]. None of these approaches, however, meets the goal of computing a commutativity condition that is both sound and granular in a fully automated manner.
In this paper, we present a refinement-based technique for synthesizing commutativity conditions. Our technique builds on well-known descriptions and representations of abstract data types (ADTs) in terms of logical \(({ Pre}_{m},{ Post}_{m})\) specifications [10, 16, 17, 20, 26, 28] for each method m. Our algorithm iteratively relaxes underapproximations of the commutativity and noncommutativity conditions of methods m and n, starting from false, into increasingly precise versions. At each step, we conjunctively subdivide the symbolic state space into regions, searching for areas where m and n commute and where they don’t. Counterexamples to both the positive side and the negative side are used in the next symbolic subdivision. Throughout this recursive process, we accumulate the commutativity condition as a growing disjunction of these regions. The output of our procedure is a logical formula \(\varphi _m^n\) which specifies when method m commutes with method n. We have proven that the algorithm is sound, and can also be aborted at any time to obtain a partial, yet useful [19, 33], commutativity condition. We show that, under certain conditions, termination is guaranteed (relative completeness).
We address several challenges that arise in using an iterative refinement approach to generating precise and useful commutativity conditions. First, we show how to pose the commutativity question in a way that does not introduce additional quantifiers. We also show how to generate the predicate vocabulary for expressing the condition \(\varphi _m^n\), as well as how to choose the predicates throughout the refinement loop. A further question that we address is how predicate selection impacts the conciseness and readability of the generated commutativity conditions. Finally, we have generalized our algorithm to left-/right-movers [27], a more precise version of commutativity.
We have implemented our approach as the Servois tool, whose code and documentation are available online [2]. Servois is built on top of the CVC4 SMT solver [11]. We evaluate Servois through two case studies. First, we generate commutativity conditions for a collection of popular data structures, including Set, HashTable, Accumulator, Counter, and Stack. The conditions typically combine multiple theories, such as sets, integers, arrays, etc. We show the conditions to be comparable in granularity to manually specified conditions [22]. Second, we consider BlockKing [31], an Ethereum smart contract, with its known vulnerability. We demonstrate how a developer can be guided by Servois to create a more robust implementation.

Our contributions are as follows:
(i) The first sound and precise technique to automatically generate commutativity conditions (Sect. 5).
(ii) Proof of soundness and relative completeness (Sect. 5).
(iii) An implementation that takes an abstract code specification and automatically generates commutativity conditions using an SMT solver (Sect. 6).
(iv) A novel technique for selecting refinement predicates that improves scalability and the simplicity of the generated formulae (Sect. 6).
(v) Demonstrated efficacy for several key data structures as well as the BlockKing Ethereum smart contract [31] (Sect. 7).
An extended version of this paper can be found in [8].
Related Work. The closest to our contribution in this paper is a technique by Gehr et al. [18] for learning, or inference, of commutativity conditions based on blackbox sampling. They draw concrete arguments, extract relevant predicates from the sampled set of examples, and then search for a formula over the predicates. There are no soundness or completeness guarantees.
Both Aleen and Clark [6] and Tripp et al. [33] identify sequences of actions that commute (via random interpretation and dynamic analysis, respectively). However, neither technique yields an explicit commutativity condition. Kulkarni et al. [25] point out that varying degrees of commutativity specification precision are useful. Kim and Rinard [22] use Jahob to verify manually specified commutativity conditions of several different linked data structures. Commutativity specifications are also found in dynamic analysis techniques [15]. More distantly related is work on synthesis of programs [32] and of synchronization [35, 36].
2 Example
Specifying commutativity conditions is generally nontrivial and it is easy to miss subtle corner cases. Additionally, it has to be done pairwise for all methods. For ease of illustration, we will focus on the relatively simple Set ADT, whose state consists of a single set \(S\) that stores an unordered collection of unique elements. Let us consider one pair of operations: (i) \(\mathtt{contains(}x\mathtt{)/bool}\), a side-effect-free check whether the element \(x\) is in \(S\); and (ii) \(\mathtt{add(}y\mathtt{)/bool}\), which adds \(y\) to \(S\) if it is not already there and returns true, or otherwise returns false. add and contains clearly commute if they refer to different elements in the set. There is another case that is less obvious: add and contains commute if they refer to the same element e, as long as in the pre-state \(e \in S\). In this case, under both orders of execution, add and contains leave the set unmodified and return false and true, respectively. The algorithm we describe in this paper completes within a few seconds, producing a precise logical formula \(\varphi \) that captures this commutativity condition, i.e. the disjunction of the two cases above: \(\varphi \equiv x\ne y\vee (x= y\wedge x\in S)\). The algorithm also generates the conditions under which the methods do not commute: \(\tilde{\varphi }\equiv x=y\wedge x\notin S\). These are precise, since \(\varphi \) is the negation of \(\tilde{\varphi }\).
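As a sanity check on the condition above, the following sketch enumerates a small element universe and compares \(\varphi \) against an operational test that runs contains and add in both orders. It is only an illustration: the universe, the function names, and the use of brute-force enumeration in place of an SMT query are assumptions made here, not part of Servois.

```python
from itertools import chain, combinations

UNIVERSE = [0, 1, 2]  # small illustrative element domain (assumption)

def add(s, y):
    """add(y)/bool: insert y and return True iff y was absent."""
    if y in s:
        return s, False
    return s | {y}, True

def contains(s, x):
    """contains(x)/bool: side-effect-free membership test."""
    return s, x in s

def commute(s, x, y):
    """True iff both execution orders agree on return values and final state."""
    s1, r_c1 = contains(s, x)
    s1, r_a1 = add(s1, y)
    s2, r_a2 = add(s, y)
    s2, r_c2 = contains(s2, x)
    return (s1, r_c1, r_a1) == (s2, r_c2, r_a2)

def phi(s, x, y):
    """The commutativity condition from the text."""
    return x != y or (x == y and x in s)

def all_subsets(items):
    return (frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)))

def phi_is_precise():
    """phi holds exactly on the pre-states and arguments where the calls commute."""
    return all(commute(s, x, y) == phi(s, x, y)
               for s in all_subsets(UNIVERSE)
               for x in UNIVERSE for y in UNIVERSE)
```

Calling `phi_is_precise()` checks every pre-state and argument pair in the bounded universe; a bounded check of this kind can refute a candidate condition but does not replace the solver-backed argument used in the paper.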
Capturing precise conditions such as these by hand, and doing so for many pairs of operations, is tedious and error prone. This paper instead presents a way to automate this. Our algorithm recursively subdivides the state space via predicates until, at the base case, regions are found that are either entirely commutative or else entirely noncommutative. Returning to our Set example, the conditions we incrementally generate are denoted \(\varphi \) and \(\tilde{\varphi }\), respectively. The following diagram illustrates how our algorithm proceeds to generate the commutativity conditions for add and contains (abbreviated as m and n).
In this diagram, each subsequent panel depicts a partitioning of the state space into regions of commutativity (\(\varphi \)) or noncommutativity (\(\tilde{\varphi }\)). The counterexamples \(\chi _\text {c},\chi _\text {nc}\) give values for the arguments x, y and the current state \(S\).
We denote by H the logical formula that describes the current state space at a given recursive call. We begin with \(H_0=\textsf {true}\), \(\varphi =\textsf {false}\), and \(\tilde{\varphi }=\textsf {false}\). There are three cases for a given H: (i) H describes a precondition for m and n in which they always commute; (ii) H describes a precondition for m and n in which they never commute; or (iii) neither of the above. The latter case drives the algorithm to subdivide the region by choosing a new predicate.
We now detail the run of this refinement loop on our earlier Set example. We elaborate on the other challenges that arise in later sections. At each step of the algorithm, we determine which case we are in via carefully designed validity queries to an SMT solver (Sect. 4). For \(H_0\), the solver returns the commutativity counterexample: \(\chi _\text {c}= \{ x=0,y=0,S=\emptyset \}\) as well as the noncommutativity counterexample \(\chi _\text {nc}= \{ x=0,y=1,S=\{0\} \}\). Since \(H_0=\textsf {true}\) is therefore neither a commutativity nor a noncommutativity condition, we must refine \(H_0\) into regions (or stronger conditions). In particular, we would like to perform a useful subdivision: divide \(H_0\) into an \(H_1\) that allows \(\chi _\text {c}\) but disallows \(\chi _\text {nc}\), and an \(H'_1\) that allows \(\chi _\text {nc}\) but not \(\chi _\text {c}\). So we must choose a predicate p (from a suitable set of predicates \(\mathcal{P}\), discussed later), such that \(\chi _\text {c}\Rightarrow H_0 \wedge p\) while \(\chi _\text {nc}\Rightarrow H_0 \wedge \lnot p\) (or vice versa). The predicate \(x=y\) satisfies this property. The algorithm then makes the next two recursive calls, adding p as a conjunct to H, as shown in the second column of the diagram above: one with \(H_1 \equiv \textsf {true}\wedge x=y\) and one with \(H'_1 \equiv \textsf {true}\wedge x\ne y\). Taking the \(H'_1\) case, our algorithm makes another SMT query and finds that \(x\ne y\) implies that add always commutes with contains. At this point, it can update the commutativity condition \(\varphi \), letting \(\varphi := \varphi \vee H'_1\), adding this \(H'_1\) region to the growing disjunction. On the other hand, \(H_1\) is neither a sufficient commutativity nor a sufficient noncommutativity condition, and so our algorithm, again, produces the respective counterexamples: \(\chi _\text {c}= \{ x=0,y=0,S=\emptyset \}\) and \(\chi _\text {nc}= \{ x=0,y=0,S=\{0\} \}\).
In this case, our algorithm selects the predicate \(x\in S\), and makes two further recursive calls: one with \(H_2 \equiv x=y\wedge x\in S\) and another with \(H'_2 \equiv x=y\wedge x\notin S\). It finds that \(H_2\) is a sufficiently strong precondition for commutativity, while \(H'_2\) is a strong enough precondition for noncommutativity. Consequently, \(H_2\) is added as a new disjunct to \(\varphi \), yielding \(\varphi \equiv x\ne y \vee (x=y\wedge x\in S)\). Similarly, \(\tilde{\varphi }\) is updated to be: \(\tilde{\varphi }\equiv (x=y\wedge x\notin S)\). No further recursive calls are made so the algorithm terminates and we have obtained a precise (complete) commutativity/noncommutativity specification: \(\varphi \vee \tilde{\varphi }\) is valid (Lemma 2).
Challenges and Outline. While the algorithm outlined so far is a relatively standard refinement, the above generated conditions were not immediate. We now discuss challenges involved in generating sound and useful conditions.
(Section 4) A first question is how to pose the underlying commutativity queries for each subsequent H in a way that avoids the introduction of additional quantifiers, so that we can remain in fragments for which the solver has complete decision procedures. Thus, if the data structure can be encoded using theories that are decidable, then the queries we pose to the SMT solver are guaranteed to be decidable as well. \({ Pre}_{m}/{ Post}_{m}\) specifications that are partial would introduce quantifier alternation, but we show how this can be avoided by, instead, transforming them into total specifications.
(Section 5) We have proved that our algorithm is sound even if aborted or if the ADT description involves undecidable theories. We further show that termination implies completeness, and specify broad conditions that imply termination.
(Section 6) Another challenge is to prioritize predicates during the refinement loop. This choice impacts not only the algorithm’s performance, but also the quality/conciseness of the resulting conditions. Our choice of next predicate p is governed by two requirements. First, for progress, p/\(\lnot p\) must eliminate the counterexamples to commutativity/noncommutativity due to the last iteration. This may still leave multiple choices, and we propose two heuristics, called simple and poke, with different trade-offs to break ties.
(Section 7) We conclude with an evaluation on a range of popular data structures and a case study on boosting the security of an Ethereum smart contract.
3 Preliminaries
States, Actions, Methods. We will work with a state space \(\varSigma \), with decidable equality and a set of actions A. For each \(\alpha \in A\), we have a transition function \((\! \alpha \!) : \varSigma \rightharpoondown \varSigma \). We denote a single transition as \(\sigma \xrightarrow {\alpha }\sigma '\). We assume that each such action arc completes in finite time. Let \(\mathfrak {T}\equiv (\varSigma ,A,(\! \bullet \!))\). We say that two actions \(\alpha _1\) and \(\alpha _2\) commute [15], denoted \(\alpha _1 \bowtie \alpha _2\), provided that \((\! \alpha _1 \!) \circ (\! \alpha _2 \!) = (\! \alpha _2 \!) \circ (\! \alpha _1 \!)\). Note that \(\bowtie \) is with respect to \(\mathfrak {T}=(\varSigma ,A,(\! \bullet \!))\). Our formalism, implementation, and evaluation all extend to a more fine-grained notion of commutativity: an asymmetric version called left-movers and right-movers [27], where a method commutes in one direction and not the other. Details can be found in [8]. Also, in our evaluation (Sect. 7) we show left-/right-mover conditions that were generated by our implementation.
An action \(\alpha \in A\) is of the form \(m(\bar{x})/\bar{r}\), where m, \(\bar{x}\) and \(\bar{r}\) are called a method, arguments and return values respectively. As a convention, for actions corresponding to a method n, we use \(\bar{y}\) for arguments and \(\bar{s}\) for return values. The set of methods will be finite, inducing a finite partitioning of A. We refer to an action, say \(m(\bar{a})/\bar{v}\), as corresponding to method m (where \(\bar{a}\) and \(\bar{v}\) are vectors of values). The set of actions corresponding to a method m, denoted \(A_m\), might be infinite as arguments and return values may be from an infinite domain.
Definition 1
Methods m and n commute, denoted \(m\ \bowtie \ n\) provided that \( \forall \bar{x}\ \bar{y}\ \bar{r}\ \bar{s}.\;\; m(\bar{x})/\bar{r}\bowtie n(\bar{y})/\bar{s}\).
The quantification \(\forall \bar{x}\bar{r}\) above means \(\forall m(\bar{x})/\bar{r}\in A_m\), i.e., all vectors of arguments and return values that constitute an action in \(A_m\).
Abstract Specifications. We symbolically describe the actions of a method m as precondition \({ Pre}_{m}\) and postcondition \({ Post}_{m}\). Preconditions are logical formulae over method arguments and the initial state: \([\![ { Pre}_{m} ]\!] : \bar{x}\rightarrow \varSigma \rightarrow \mathbb {B}\). Postconditions are over method arguments, and return values, initial state and final state: \([\![ { Post}_{m} ]\!] : \bar{x}\rightarrow \bar{r}\rightarrow \varSigma \rightarrow \varSigma \rightarrow \mathbb {B}\). Given \(({ Pre}_{m},{ Post}_{m})\) for every method m, we define a transition system \(\mathfrak {T}=(\varSigma ,A,(\! \bullet \!))\) such that \(\sigma \xrightarrow {m(\bar{a})/\bar{v}} \sigma '\) iff \([\![ { Pre}_{m} ]\!]\ \bar{a}\ \sigma \) and \([\![ { Post}_{m} ]\!]\ \bar{a}\ \bar{v}\ \sigma \ \sigma '\).
Since our approach works on deterministic transition systems, we have implemented an SMT-based check (Sect. 6) that ensures the input transition system is deterministic. Deterministic specifications were sufficient in our examples. This is unsurprising given the inherent difficulty of creating efficient concurrent implementations of nondeterministic operations, whose effects are hard to characterize. Reducing nondeterministic data-structure methods to deterministic ones through symbolic partial determinization [5, 13] is left as future work.
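To make the \(({ Pre}_{m},{ Post}_{m})\) reading of a transition and the determinism requirement concrete, here is a minimal sketch for a hypothetical Counter decrement method over a bounded range of states. The names, the bound, and the brute-force enumeration (standing in for the SMT-based check) are assumptions for illustration only.

```python
from itertools import product

STATES = range(0, 4)      # bounded counter values, for enumeration only (assumption)
RETURNS = [True, False]   # candidate return values

def pre_decrement(sigma):
    """Pre: decrement is only defined on a positive counter."""
    return sigma >= 1

def post_decrement(ret, sigma, sigma_next):
    """Post: the counter drops by one and the call returns True."""
    return ret is True and sigma_next == sigma - 1

def transition(pre, post, ret, sigma, sigma_next):
    """sigma --m()/ret--> sigma_next iff Pre holds in sigma and Post relates the two states."""
    return pre(sigma) and post(ret, sigma, sigma_next)

def successors(pre, post, sigma):
    """All (return value, next state) pairs the specification allows from sigma."""
    return [(ret, nxt) for ret, nxt in product(RETURNS, STATES)
            if transition(pre, post, ret, sigma, nxt)]

def is_deterministic(pre, post):
    """A specification is deterministic if every state has at most one successor."""
    return all(len(successors(pre, post, s)) <= 1 for s in STATES)
```

Here `successors(pre_decrement, post_decrement, 0)` is empty, which reflects that the specification is partial; the lifting of Sect. 4 is what makes such a transition function total.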
Logical Commutativity Formulae. We will generate a commutativity condition for methods m and n as logical formulae over initial states and the arguments/return values of the methods. We denote a logical commutativity formula as \(\varphi \) and assume a decidable interpretation of formulae: \([\![ \varphi ]\!] : (\sigma ,\bar{x},\bar{y},\bar{r},\bar{s}) \rightarrow \mathbb {B}\). (We tuple the arguments for brevity.) The first argument is the initial state. Commutativity post and midconditions can also be written [22] but here, for simplicity, we focus on commutativity preconditions. We may write \([\![ \varphi ]\!]\) as \(\varphi \) when it is clear from context that \(\varphi \) is meant to be interpreted.
4 Commutativity Without Quantifier Alternation
Definition 2
Intuitively, \((\!\!] \alpha [\!\!)\) wraps \((\! \alpha \!)\) so that \(\textsf {err}{}\) loops back to \(\textsf {err}{}\), and the (potentially partial) \((\! \alpha \!)\) is made to be total by mapping elements to \(\textsf {err}{}\) when they are undefined in \((\! \alpha \!)\). It is not necessary to lift the actions (or, indeed, the methods), but only the states and transition function. Once lifted, for a given state \(\hat{\sigma }_0\), the question of some successor state becomes equivalent to all successor states because there is exactly one successor state.
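The lifting described above can be pictured with a small sketch, assuming a partial transition function is modelled as a Python function that returns None where it is undefined. The sentinel `ERR`, the helper `lift`, and the stack methods are hypothetical names chosen for illustration; return values and \({ Pre}\)/\({ Post}\) formulas are omitted for brevity.

```python
ERR = "err"  # hypothetical sentinel for the added error state

def lift(step):
    """Wrap a partial transition function into a total one over states plus ERR."""
    def lifted(state):
        if state == ERR:
            return ERR                 # err loops back to err
        result = step(state)
        return ERR if result is None else result  # undefined inputs map to err
    return lifted

def pop(stack):
    """Partial: undefined (None) on the empty stack."""
    return stack[:-1] if stack else None

def push(value):
    """Total: always defined."""
    return lambda stack: stack + (value,)
```

With these definitions, `lift(pop)(())` yields `ERR`, and composing further lifted actions after it stays in `ERR`, so every lifted state has exactly one successor.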
Lemma 1
\(m\ \bowtie \ n \text { if and only if } m\ \hat{\bowtie }\ n\). (All proofs in [8].)
5 Iterative Refinement
We now present an iterative refinement strategy that, when given a lifted abstract transition system, generates the commutativity and the noncommutativity conditions. We then discuss soundness and relative completeness and, in Sects. 6 and 7, challenges in generating precise and useful commutativity conditions.
The refinement algorithm symbolically searches the state space for regions where the operations commute (or do not commute) in a conjunctive manner, adding on one predicate at a time. We add each subregion H (described conjunctively) in which commutativity always holds to a growing disjunctive description of the commutativity condition \(\varphi \), and each subregion H in which commutativity never holds to a growing disjunctive description of the noncommutativity condition \(\tilde{\varphi }\).
We now need to subdivide H into two regions. This is accomplished by selecting a new predicate p via the Choose method. For now, let the method Choose and the choice of predicate vocabulary \(\mathcal{P}\) be parametric. Refine is sound regardless of the behavior of Choose. Below we give the conditions on Choose that ensure relative completeness, and in Sect. 6 we discuss our particular strategy. Regardless of what p is returned by Choose, two recursive calls are made to Refine, one with argument \(H \wedge p\), and the other with argument \(H\wedge \lnot p\). The algorithm is exponential in the number of predicates. In Sect. 6 we discuss prioritizing predicates.
The refinement algorithm generates commutativity conditions in disjunctive normal form. Hence, any finite logical formula can be represented. This logical language is more expressive than previous commutativity logics that, because they were designed for runtime purposes, were restricted to conjunctions of inequalities [25] and boolean combinations of predicates over finite domains [15].
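The recursive structure of the refinement can be sketched as follows for the Set example (contains(x) versus add(y)). This is a simplified illustration under stated assumptions rather than the Servois implementation: validity queries are answered by enumerating a tiny universe instead of calling an SMT solver, the predicate vocabulary is written by hand, and the choice function simply takes the first predicate that separates the two counterexamples.

```python
from itertools import chain, combinations, product

UNIVERSE = [0, 1]  # tiny element domain standing in for the solver's search (assumption)

def subsets(items):
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]

# Each model fixes the pre-state S, the argument x of contains, and the argument y of add.
MODELS = [dict(S=s, x=x, y=y)
          for s, x, y in product(subsets(UNIVERSE), UNIVERSE, UNIVERSE)]

def commutes(m):
    """Compare (contains result, add result, final state) under both orders."""
    s, x, y = m["S"], m["x"], m["y"]
    after_add = s | {y}
    contains_then_add = (x in s, y not in s, after_add)
    add_then_contains = (x in after_add, y not in s, after_add)
    return contains_then_add == add_then_contains

# Hand-written predicate vocabulary (assumption).
PREDICATES = [
    ("x = y", lambda m: m["x"] == m["y"]),
    ("x in S", lambda m: m["x"] in m["S"]),
    ("y in S", lambda m: m["y"] in m["S"]),
]

def refine(H, preds, phi, phi_tilde):
    """H is a list of (name, predicate, polarity) conjuncts describing a region."""
    region = [m for m in MODELS if all(p(m) == pol for _, p, pol in H)]
    chi_c = next((m for m in region if not commutes(m)), None)   # refutes "H => commute"
    chi_nc = next((m for m in region if commutes(m)), None)      # refutes "H => not commute"
    if chi_c is None:
        phi.append(H)          # the whole region commutes
        return
    if chi_nc is None:
        phi_tilde.append(H)    # the whole region does not commute
        return
    choice = next(((n, p) for n, p in preds if p(chi_c) != p(chi_nc)), None)
    if choice is None:
        return                 # vocabulary exhausted: results so far remain sound
    name, p = choice
    rest = [(n, q) for n, q in preds if n != name]
    refine(H + [(name, p, True)], rest, phi, phi_tilde)
    refine(H + [(name, p, False)], rest, phi, phi_tilde)

def show(regions):
    """Render each region as a conjunction; the list as a whole is a disjunction."""
    return [" and ".join(n if pol else f"not({n})" for n, _, pol in H) or "true"
            for H in regions]
```

Running `refine([], PREDICATES, phi, phi_tilde)` with two empty lists fills `phi` with the regions `x = y and x in S` and `not(x = y)`, and `phi_tilde` with `x = y and not(x in S)`, mirroring the walkthrough in Sect. 2.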
Theorem 1
(Soundness). For each \(\text {Refine}^m_n\) iteration: \(\varphi \Rightarrow m\ \hat{\bowtie }\ n\), and \(\tilde{\varphi }\Rightarrow m\ \hat{\not \bowtie }\ n\).
All proofs are available in [8]. Soundness holds regardless of what Choose returns and even when the theories used to model the underlying data structure are incomplete. Next we show that termination implies completeness:
Lemma 2
If Refine\(^m_n\) terminates, then \(\varphi \vee \tilde{\varphi }\) is valid.
Theorem 2
(Conditions for Termination). Refine\(^m_n\) terminates if 1. (expressiveness) the state space \(\varSigma \) is partitionable into a finite set of regions \(\varSigma _1,...,\varSigma _N\), each described by a finite conjunction of predicates \(\psi _i\), such that either \(\psi _i\Rightarrow m\ \hat{\bowtie }\ n\) or \(\psi _i\Rightarrow m\ \hat{\not \bowtie }\ n\); and 2. (fairness) for every \(p\in \mathcal{P}\), Choose eventually picks p (note that this does not imply that \(\mathcal{P}\) is finite).
Note that while these conditions ensure termination, the bound on the number of iterations depends on the predicate language and behavior of Choose.
6 The Servois Tool and Practical Considerations
Input. We use an input specification language building on YAML (which has parser and printer support for all common programming languages) with SMT-LIB as the logical language. This can be automatically generated relatively easily, thus enabling the integration with other tools [10, 16, 17, 20, 26, 28]. In [8], we show the Counter ADT specification, which was derived from the \({ Pre}_{}\) and \({ Post}_{}\) conditions used in earlier work [22]. The states of a transition system describing an ADT are encoded as a list of variables (each as a name/type pair), and each method specification requires a list of argument types, a return type, and \({ Pre}_{}\)/\({ Post}_{}\) conditions. Again, the Counter example can be seen in [8].
Implementation. We have developed the open-source Servois tool [3], which implements Refine, Lift, predicate generation, and a method for selecting predicates (Choose) discussed below. Servois uses CVC4 [11] as a backend SMT solver. Servois begins by performing some preprocessing on the input transition system. It checks that the transition system is deterministic. Next, in case the transition system is partial, Servois performs the Lift transformation (Sect. 4). An example of Lift applied to Counter is in [8].
Next, Servois automatically generates the predicate language (PGen) in addition to user-provided hints. If the predicate vocabulary is not sufficiently expressive, then the algorithm will not be able to converge on precise commutativity and noncommutativity conditions (Sect. 5). We generate predicates by using terms and operators that appear in the specification, and generating well-typed atoms that are not trivially true or false. As we demonstrate in Sect. 7, this strategy works well in practice. Intuitively, \({ Pre}_{}\) and \({ Post}_{}\) formulas suffice to express the footprint of an operation. So, the atoms comprising them are an effective vocabulary to express when operations do or do not interfere.
Predicate Selection (Choose). Even though the number of computed predicates is relatively small, our algorithm is exponential in the number of predicates, so it is essential to identify the predicates that are relevant. To this end, in addition to filtering trivial predicates, we prioritize predicates based on the two counterexamples generated by the validity checks in Refine. Predicates that distinguish between the given counterexamples are tried first (call these distinguishing predicates). Choose must return a predicate such that \(\chi _\text {c}\Rightarrow \; H \wedge p\) and \(\chi _\text {nc}\Rightarrow \; H \wedge \lnot p\). This guarantees progress on both recursive calls. When combined with a heuristic to favor less complex atoms, this ensured timely termination on our examples. We refer to this as the simple heuristic.
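A minimal sketch of this selection step is given below, assuming counterexamples are dictionaries and predicates are named Python functions. The complexity measure (length of the predicate's name) is a placeholder chosen for illustration, since the text does not specify how atom complexity is measured, and the sketch accepts either polarity of a separating predicate.

```python
def distinguishing(preds, chi_c, chi_nc):
    """Keep predicates whose truth value differs on the two counterexamples."""
    return [(name, p) for name, p in preds if p(chi_c) != p(chi_nc)]

def choose_simple(preds, chi_c, chi_nc, complexity=len):
    """Among distinguishing predicates, favor the least complex one.

    `complexity` maps a predicate name to a number; using the name's length
    is an illustrative stand-in, not the measure used by Servois.
    """
    candidates = distinguishing(preds, chi_c, chi_nc)
    return min(candidates, key=lambda c: complexity(c[0]), default=None)

# Example vocabulary and counterexamples from the Set walkthrough (region x = y).
PREDICATES = [
    ("x = y", lambda m: m["x"] == m["y"]),
    ("x in S or y in S", lambda m: m["x"] in m["S"] or m["y"] in m["S"]),
    ("x in S", lambda m: m["x"] in m["S"]),
]
CHI_C = dict(S=frozenset(), x=0, y=0)      # add and contains do not commute here
CHI_NC = dict(S=frozenset({0}), x=0, y=0)  # add and contains commute here
```

On this input, `x = y` is filtered out because it holds in both counterexamples, and `choose_simple(PREDICATES, CHI_C, CHI_NC)` selects `x in S` as the shorter of the two remaining candidates.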
Though this produced precise conditions, they were not always concise, and conciseness is desirable for human understanding and inspection. We thus introduced a new heuristic which significantly improves the qualitative aspect of our algorithm. We found that doing a lookahead (recursing on each predicate one level deep, or poke) and computing the number of distinguishing predicates for the two branches is a good indicator of the importance of the predicate. More precisely, we pick the predicate with the lowest sum of the remaining number of distinguishing predicates over the two calls. As an aside, those familiar with decision tree learning might see a connection with the notion of entropy gain. This requires more calls to the SMT solver at each call, but it cuts down the total number of branches to be explored. Also, all individual queries were relatively simple for CVC4. The heuristic converges much faster to the relevant predicates, and produces smaller, more concise conditions.
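The lookahead can be sketched as follows on the Set example. Everything here is an illustrative assumption: `find_cex` is a brute-force stand-in for the two solver queries, the vocabulary is hand-written, and ties are broken by list order, which the text does not specify.

```python
from itertools import product

# Brute-force stand-in for the SMT solver on contains(x) versus add(y) (assumption).
SETS = [frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})]
MODELS = [dict(S=s, x=x, y=y) for s, x, y in product(SETS, [0, 1], [0, 1])]

def commutes(m):
    """Only the return value of contains can differ between the two orders."""
    s, x, y = m["S"], m["x"], m["y"]
    return (x in s) == (x in (s | {y}))

def find_cex(H):
    """Return (chi_c, chi_nc) for region H; an entry is None if no such model exists."""
    region = [m for m in MODELS if all(p(m) == pol for p, pol in H)]
    chi_c = next((m for m in region if not commutes(m)), None)
    chi_nc = next((m for m in region if commutes(m)), None)
    return chi_c, chi_nc

def remaining(H, preds):
    """Number of predicates that would still be distinguishing inside region H."""
    chi_c, chi_nc = find_cex(H)
    if chi_c is None or chi_nc is None:
        return 0  # region already decided, nothing left to separate
    return sum(1 for _, q in preds if q(chi_c) != q(chi_nc))

def poke(preds, H):
    """Pick the distinguishing predicate with the lowest lookahead score."""
    chi_c, chi_nc = find_cex(H)
    if chi_c is None or chi_nc is None:
        return None
    best, best_score = None, None
    for name, p in preds:
        if p(chi_c) == p(chi_nc):
            continue  # not distinguishing, so no progress is guaranteed
        rest = [(n, q) for n, q in preds if n != name]
        score = remaining(H + [(p, True)], rest) + remaining(H + [(p, False)], rest)
        if best_score is None or score < best_score:
            best, best_score = (name, p), score
    return best

PREDICATES = [
    ("y = 1", lambda m: m["y"] == 1),
    ("x = y", lambda m: m["x"] == m["y"]),
    ("x in S", lambda m: m["x"] in m["S"]),
]
```

At the root region, both `y = 1` and `x = y` separate the two counterexamples, but splitting on `x = y` immediately decides one branch, so `poke(PREDICATES, [])` prefers it even though `y = 1` is listed first.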
7 Case Studies
Depending on the pair of methods, the number of predicates generated by PGen was (count after filtering in parentheses): Counter: 25–25 (12–12), Accumulator: 1–20 (0–20), Set: 17–55 (17–34), HashTable: 18–36 (6–36), Stack: 41–61 (41–42). We did not provide any hints to the algorithm for this case study. On all our examples, the simple heuristic terminated with precise commutativity conditions. In Fig. 2, we give the number of solver queries and total time (in paren.) consumed by this heuristic. The experiments were run on a 2.53 GHz Intel Core 2 Duo machine with 8 GB RAM. The conditions in Fig. 2 are those generated by the poke heuristic, and the interested reader may compare them with the simple heuristic in [7]. On the theoretical side, our Choose implementation is fair (satisfies condition 2 of Theorem 2, as Lines 9–10 of the algorithm remove from \(\mathcal {P}\) the predicate being tried). From our experiments we conclude that our choice of predicates satisfies condition 1 of Theorem 2.
The BlockKing Ethereum Smart Contract. We further validated our approach by examining a real-world situation in which noncommutativity opens the door for attacks that exploit interleavings. We examined “smart contracts”, which are programs written in the Solidity programming language [4] and executed on the Ethereum blockchain [1]. Eliding many details, smart contracts are like objects, and blockchain participants can invoke methods on these objects. Although the initial intuition is that smart contracts are executed sequentially, practitioners and academics [31] are increasingly realizing that the blockchain is a concurrent environment due to the fact that the execution of one actor’s smart contract can be split across multiple blocks, with other actors’ smart contracts interleaved. Therefore, the execution model of the blockchain has been compared to that of concurrent objects [31]. Unfortunately, many smart contracts are not written with this in mind, and attackers can exploit interleavings to their benefit.
As an example, we study the BlockKing smart contract. Figure 3 provides a simplification of its description, as discussed in [31]. This is a simple game in which the players, each identified by an address \(\textsf {sendr}_{}\), participate by making calls to BlockKing.enter(), sending money \(\textsf {val}_{}\) to the contract. (The grey variables are external input that we have lifted to be parameters. \(\textsf {bk}_{}\) reflects the caller’s current block number and \(\textsf {rnd}_{}\) is the outcome of a random number generation, described shortly.) The variables on Line 1 are globals, writable in any call to enter. On Line 3 there is a trivial case when the caller hasn’t put enough value into the game, and the money is simply returned. Otherwise, the caller stores their address and value into the shared state. A random number is then generated and, since this requires complex algorithms, it is done via a remote procedure call to a third party on Line 5, with a callback method provided on Line 7. If the randomly generated number is equal to a modulus of the current block number, then the caller is the winner, and the warrior’s (caller’s) details are stored to king and kingBlock on Line 10.
Since random number generation is done via an RPC, players’ invocations of enter can be interleaved. Moreover, these calls all write \(\textsf {sendr}_{}\) and \(\textsf {val}_{}\) to shared variables, so the RPC callback will always roll the dice for whomever most recently wrote to warriorBlock. An attacker can use this to leverage other players’ investments to increase his/her own chance to win.
In summary, if we assume that \(\textsf {sendr}_{1}\ne \textsf {sendr}_{2}\), the noncommutativity of the original version is \(\textsf {val}_{1}\ge 50 \vee \textsf {val}_{2}\ge 50\) (very strong). By contrast, the noncommutativity of the fixed version is \(\textsf {val}_{1}\ge 50 \wedge \textsf {val}_{2}\ge 50 \wedge \textsf {md}(\textsf {bk}_{2}) = \textsf {rnd}_{2}\wedge \textsf {md}(\textsf {bk}_{1}) = \textsf {rnd}_{1}\). We have thus demonstrated that the commutativity (and noncommutativity) conditions generated by Servois can help developers understand the model of interference between two concurrent calls.
8 Conclusions and Future Work
This paper demonstrates that it is possible to automatically generate sound and effective commutativity conditions, a task that has so far been done manually or without soundness. Our commutativity conditions are applicable in a variety of contexts including transactional boosting [19], open nested transactions [29], and other nontransactional concurrency paradigms such as race detection [15], parallelizing compilers [30, 34], and, as we show, robustness of Ethereum smart contracts [31]. It has been shown that understanding the commutativity of data-structure operations provides a key avenue to improved performance [12] or ease of verification [23, 24].
This work opens several avenues of future research. Examples include leveraging the internal state of the SMT solver (beyond counterexamples) in order to generate new predicates [21]; automatically building abstract representations or making inferences such as the one we made for the stack example; and exploring strategies to compute commutativity conditions directly from the program’s code, without the need for an intermediate abstract representation [34].
References
 1.Ethereum. https://ethereum.org/
 2.Servois homepage. http://cs.nyu.edu/~kshitij/projects/servois
 3.Servois source code. https://github.com/kbansal/servois
 4.Solidity programming language. https://solidity.readthedocs.io/en/develop/
 5.Abadi, M., Lamport, L.: The existence of refinement mappings. Theor. Comput. Sci. 82, 253–284 (1991)
 6.Aleen, F., Clark, N.: Commutativity analysis for software parallelization: letting program transformations see the big picture. In: Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-XII), pp. 241–252. ACM (2009)
 7.Bansal, K.: Decision procedures for finite sets with cardinality and local theory extensions. Ph.D. thesis, New York University, January 2016
 8.Bansal, K., Koskinen, E., Tripp, O.: Automatic generation of precise and useful commutativity conditions (extended version). CoRR, abs/1802.08748 (2018). https://arxiv.org/abs/1802.08748
 9.Bansal, K., Reynolds, A., Barrett, C., Tinelli, C.: A new decision procedure for finite sets and cardinality constraints in SMT. In: Olivetti, N., Tiwari, A. (eds.) IJCAR 2016. LNCS (LNAI), vol. 9706, pp. 82–98. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-40229-1_7
 10.Barnett, M., Leino, K.R.M., Schulte, W.: The Spec# programming system: an overview. In: Barthe, G., Burdy, L., Huisman, M., Lanet, J.-L., Muntean, T. (eds.) CASSIS 2004. LNCS, vol. 3362, pp. 49–69. Springer, Heidelberg (2005). https://doi.org/10.1007/978-3-540-30569-9_3
 11.Barrett, C., Conway, C.L., Deters, M., Hadarean, L., Jovanović, D., King, T., Reynolds, A., Tinelli, C.: CVC4. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 171–177. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22110-1_14
 12.Clements, A.T., Kaashoek, M.F., Zeldovich, N., Morris, R.T., Kohler, E.: The scalable commutativity rule: designing scalable software for multicore processors. ACM Trans. Comput. Syst. 32(4), 10 (2015)
 13.Cook, B., Koskinen, E.: Making prophecies with decision predicates. In: Proceedings of the 38th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2011, Austin, TX, USA, 26–28 January 2011, pp. 399–410 (2011)
 14.Dickerson, T., Gazzillo, P., Herlihy, M., Koskinen, E.: Adding concurrency to smart contracts. In: Proceedings of the ACM Symposium on Principles of Distributed Computing, PODC 2017, pp. 303–312. ACM, New York (2017)
 15.Dimitrov, D., Raychev, V., Vechev, M.T., Koskinen, E.: Commutativity race detection. In: O’Boyle, M.F.P., Pingali, K. (eds.) ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2014, Edinburgh, United Kingdom, 09–11 June 2014, p. 33. ACM (2014)
 16.Ernst, G.W., Ogden, W.F.: Specification of abstract data types in Modula. ACM Trans. Program. Lang. Syst. 2(4), 522–543 (1980)
 17.Flon, L., Misra, J.: A unified approach to the specification and verification of abstract data types. In: Proceedings of the Specifications of Reliable Software Conference. IEEE Computer Society (1979)
 18.Gehr, T., Dimitrov, D., Vechev, M.: Learning commutativity specifications. In: Kroening, D., Păsăreanu, C.S. (eds.) CAV 2015. LNCS, vol. 9206, pp. 307–323. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21690-4_18
 19.Herlihy, M., Koskinen, E.: Transactional boosting: a methodology for highly concurrent transactional objects. In: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2008) (2008)
 20.Hoare, C.A.R.: Proof of correctness of data representations. In: Broy, M., Denert, E. (eds.) Software Pioneers, pp. 385–396. Springer, New York (2002). https://doi.org/10.1007/978-3-642-59412-0_24
 21.Hu, Y., Barrett, C., Goldberg, B.: Theory and algorithms for the generation and validation of speculative loop optimizations. In: Proceedings of the 2nd IEEE International Conference on Software Engineering and Formal Methods (SEFM 2004), pp. 281–289. IEEE Computer Society, September 2004
 22.Kim, D., Rinard, M.C.: Verification of semantic commutativity conditions and inverse operations on linked data structures. In: Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2011, pp. 528–541. ACM (2011)
 23.Koskinen, E., Parkinson, M.J.: The push/pull model of transactions. In: ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2015, Portland, OR, USA, June 2015 (2015)
 24.Koskinen, E., Parkinson, M.J., Herlihy, M.: Coarse-grained transactions. In: Hermenegildo, M.V., Palsberg, J. (eds.) Proceedings of the 37th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2010, pp. 19–30. ACM (2010)
 25.Kulkarni, M., Nguyen, D., Prountzos, D., Sui, X., Pingali, K.: Exploiting the commutativity lattice. In: Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2011, pp. 542–555. ACM (2011)
 26.Leino, K.R.M.: Specifying and verifying programs in Spec#. In: Virbitskaite, I., Voronkov, A. (eds.) PSI 2006. LNCS, vol. 4378, p. 20. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-70881-0_3
 27.Lipton, R.J.: Reduction: a method of proving properties of parallel programs. Commun. ACM 18(12), 717–721 (1975)
 28.Meyer, B.: Applying “design by contract”. IEEE Comput. 25(10), 40–51 (1992)
 29.Ni, Y., Menon, V., Adl-Tabatabai, A., Hosking, A.L., Hudson, R.L., Moss, J.E.B., Saha, B., Shpeisman, T.: Open nesting in software transactional memory. In: Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2007, pp. 68–78. ACM (2007)
 30.Rinard, M.C., Diniz, P.C.: Commutativity analysis: a new analysis technique for parallelizing compilers. ACM Trans. Program. Lang. Syst. (TOPLAS) 19(6), 942–991 (1997)
 31.Sergey, I., Hobor, A.: A concurrent perspective on smart contracts. In: Brenner, M., Rohloff, K., Bonneau, J., Miller, A., Ryan, P.Y.A., Teague, V., Bracciali, A., Sala, M., Pintore, F., Jakobsson, M. (eds.) FC 2017. LNCS, vol. 10323, pp. 478–493. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70278-0_30
 32.Solar-Lezama, A., Jones, C.G., Bodík, R.: Sketching concurrent data structures. In: Proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation PLDI 2008, pp. 136–148 (2008)
 33.Tripp, O., Manevich, R., Field, J., Sagiv, M.: JANUS: exploiting parallelism via hindsight. In: Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2012, pp. 145–156. ACM, New York (2012)
 34.Tripp, O., Yorsh, G., Field, J., Sagiv, M.: HAWKEYE: effective discovery of dataflow impediments to parallelization. In: Proceedings of the 26th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2011, pp. 207–224 (2011)
 35.Vechev, M.T., Yahav, E.: Deriving linearizable fine-grained concurrent objects. In: Proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation, pp. 125–135 (2008)
 36.Vechev, M.T., Yahav, E., Yorsh, G.: Abstraction-guided synthesis of synchronization. In: Proceedings of the 37th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2010, pp. 327–338 (2010)
 37.Wang, C., Yang, Z., Kahlon, V., Gupta, A.: Peephole partial order reduction. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 382–396. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78800-3_29
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.