1 Introduction

Weak memory models [1, 4] are now a standard feature of concurrent systems and programmers may choose to exploit them at both the level of hardware (e.g., Intel TSO, Arm) and the level of programming languages (e.g., C11, Java). However, these models differ significantly from each other and are generally incomparable (i.e., allowed behaviours in one memory model are not necessarily a subset of the behaviours allowed by another [21, 38]). This means that reasoning about a particular memory model can be challenging since one needs bespoke logics and assertions for verification. A variety of separation logics (e.g., [13, 19, 37]) and (timestamp-based) Owicki-Gries logics (e.g., [7, 11, 12, 27, 39]) have been developed for specific weak memory models, but none of these constitutes a generic technique.

In this paper, we aim to simplify weak memory reasoning by developing a unifying framework that captures the behaviours of different weak memory models. Our motivation is similar to prior works [3, 14, 16, 22], which also aim to uniformly reason about programs under different memory models within a single verification framework. Our point of departure is a new interval-based framework, \({\textsf{Piccolo}}\)  [25], that uses a notion of potentials to describe a program’s behaviour, which is both intuitive to use and simple to describe. Thus far, \({\textsf{Piccolo}}\) has been applied to a memory model known as strong release-acquire (\({\textsf{SRA}}\)  [23]), which strengthens the release-acquire memory model used by C11 [24].

We extend \({\textsf{Piccolo}}\) and show that potential-based reasoning can also be used in other memory models, namely sequential consistency (\({\textsf{SC}}\)) [28] and total store order (\({\textsf{TSO}}\)) [31, 35]. While the extension to \({\textsf{SC}}\) is straightforward, the \({\textsf{TSO}}\) memory model presents a new set of challenges. Namely, unlike \({\textsf{SRA}}\), \({\textsf{TSO}}\) is a weak memory model that guarantees multi-copy atomicity (MCA), which means that all threads see the writes to each location in the same order. As we shall see, our logic provides a novel insight into reasoning about memory models that satisfy MCA. In particular, we develop a proof rule showing that, for particular memory configurations, one can make a deduction about one thread based on the observations made by another thread.

Related Work. A number of works have proposed program logics for reasoning about concurrent programs on weak memory models [7, 11, 19, 25, 27, 36], all specific to one memory model. Bargmann and Wehrheim [6] build proof rules on top of the generic approach of [14], using the program logic proposed in [11]. This logic is, however, not able to express sequences of values seen by threads, as is possible in \({\textsf{Piccolo}}\) (and needed for IRIW). Rely-guarantee reasoning on weak memory models (without defining specific logics) is furthermore studied in [9, 10].

Alglave et al. [3] enhance Owicki-Gries reasoning [32] with so-called Pythia variables and communication-based proof obligations between different read and write events. This however introduces additional complexity since validity of the communication assertions must be proved in addition to the local correctness and non-interference proof obligations of the Owicki-Gries method. Doherty et al. [14] work with a timestamp-based operational semantics using a large set of axioms to characterise the properties of each memory model. By introducing assertions that directly describe the communication state, this method avoids an extra set of checks, allowing correctness to be proved by establishing local correctness and interference freedom of the assertions (as in the standard setting [32]). However, the timestamp model tends to induce a large set of bespoke assertions that describe a range of different phenomena and state configurations [7, 11, 12, 14, 39]. Besides these deductive approaches, Gavrilenko et al. [16] and Kokologiannakis et al. [22] propose model checking techniques for weak memory models, both parametric in the memory model, but only provide a bounded proof (i.e., a proof of correctness for paths of bounded length).

Our work encompasses a logic for \({\textsf{TSO}}\). While many prior works have studied verification under \({\textsf{TSO}}\), only a handful [9, 14, 34] consider program logics.

Contributions. The main contribution of this article is the use of potentials and its associated logical framework as a unifying model for reasoning across \({\textsf{SC}}\), \({\textsf{TSO}}\) and \({\textsf{SRA}}\). While \({\textsf{SRA}}\)  is already defined in terms of potentials [25], we provide a novel technique for potential-based reasoning for \({\textsf{SC}}\)  and \({\textsf{TSO}}\) by mapping their existing operational semantics into a potential domain. This unification requires two extensions to the existing logic for potentials: (a) assertions for reasoning about views of threads (e.g., view-maximality), and (b) a new proof rule for reasoning about the behaviour of reads in the presence of multi-copy atomicity (as guaranteed by memory models \({\textsf{SC}} \) and \({\textsf{TSO}} \)). Finally, we show how our proof rules can be applied to reason about key examples from the literature.

2 Motivation

As a first illustration of our reasoning framework, consider the concurrent program called the message-passing litmus test (see Fig. 1), typically used for demonstrating the causal consistency of a memory model. Thread \(\texttt{T}_1\) updates \(\texttt{x}\) (representing some data) to 1, then updates \(\texttt{y}\) (representing a flag) to 1. Thread \(\texttt{T}_2\) reads from \(\texttt{y}\), then from \(\texttt{x}\), and guarantees that, upon seeing the flag set (i.e., \(\texttt{a} = 1\)), it also reads the data written by \(\texttt{T}_1\). Causal consistency holds for all three memory models that we consider, but does not hold for weaker models such as C11 (when using relaxed atomics) or Arm [15]. That is, for these weaker memory models, even if \(\texttt{T}_2\) reads 1 for \(\texttt{y}\), it may subsequently read a stale (in this case initial) value for \(\texttt{x}\), missing the update to \(\texttt{x}\) at line 1.

Fig. 1.
figure 1

Message-passing proof using potentials that is valid for \({\textsf{SC}}\), \({\textsf{TSO}}\) and \({\textsf{SRA}}\), adapted from [25]. The assertions are new for our unified proof.
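To make the claimed \({\textsf{SC}}\) behaviour of MP concrete, the following sketch (our own Python encoding; all names are ours, not from the paper) enumerates every sequentially consistent interleaving of the four operations and checks that the postcondition \(\texttt{a}=1 \Rightarrow \texttt{b}=1\) holds:

```python
from itertools import permutations

# MP under SC: T1 stores x := 1 then y := 1; T2 loads a := y then b := x.
# Under sequential consistency every execution is an interleaving of the
# four operations over a single shared memory, so we enumerate them all.

def run(schedule):
    mem = {"x": 0, "y": 0}
    regs = {"a": 0, "b": 0}
    ops = {
        ("T1", 1): lambda: mem.update(x=1),
        ("T1", 2): lambda: mem.update(y=1),
        ("T2", 1): lambda: regs.update(a=mem["y"]),
        ("T2", 2): lambda: regs.update(b=mem["x"]),
    }
    for step in schedule:
        ops[step]()
    return regs

def all_schedules():
    labels = [("T1", 1), ("T1", 2), ("T2", 1), ("T2", 2)]
    for perm in permutations(labels):
        # keep only interleavings respecting per-thread program order
        if perm.index(("T1", 1)) < perm.index(("T1", 2)) and \
           perm.index(("T2", 1)) < perm.index(("T2", 2)):
            yield perm

# Causal consistency for MP: a = 1 implies b = 1 in every SC execution.
violations = [s for s in all_schedules()
              if (r := run(s))["a"] == 1 and r["b"] != 1]
assert violations == []
```

Such brute-force enumeration only covers \({\textsf{SC}}\); for \({\textsf{TSO}}\) and \({\textsf{SRA}}\) the proof outline of Fig. 1 is exactly what replaces it.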

We seek to develop a correctness proof that demonstrates causal consistency (i.e., a proof showing that the postcondition \(\texttt{a}=1 \Rightarrow \texttt{b}=1\) holds) and which uniformly applies to several memory models. The proof outline in Fig. 1 is slightly adapted from the proof outline by Lahav et al. [25] for the \({\textsf{SRA}}\) memory model. It shows correctness of message passing using a notion of potentials and an extension of the logic \({\textsf{Piccolo}}\) to reason over potentials. The logic exploits an operational semantics whose state domain consists of mappings from threads to potentials. Potentials are lists (sequences) of stores (which themselves are mappings from shared locations to values). Thereby, the semantic domain accounts for the fact that in weak memory models (a) threads do not all see the same value of a shared location at the same time, and (b) threads see written values in a certain order. Assertions formalise such states using an interval-based logic. The assertion \({\texttt{T}_2} \!\ltimes \! [ \texttt{y} \ne 1] \) states that each list of stores corresponding to \(\texttt{T}_2\) is such that, for all stores in the list, \(\texttt{y} \ne 1\) holds. The values of other shared locations (including \(\texttt{x}\)) are unconstrained. Similarly, \({\texttt{T}_2} \!\ltimes \! [\texttt{y}\ne 1]\mathbin {;}[ \texttt{x}=1] \) states that the list of stores corresponding to \(\texttt{T}_2\) may be split into an initial (possibly empty) list, say \(L_1\), such that \(\texttt{y}\ne 1\) for all stores in \(L_1\), and a remaining (possibly empty) list, say \(L_2\), such that \(\texttt{x} = 1\) for all stores in \(L_2\). Shared locations different from \(\texttt{y}\) (resp., \(\texttt{x}\)) are unconstrained in \(L_1\) (resp., \(L_2\)). Finally, the assertion \({\texttt{T}_1} \uparrow {\texttt{x}}\) states that thread \(\texttt{T}_1\) is currently viewing the last update to shared location \(\texttt{x}\).

The prior work on \({\textsf{Piccolo}}\)  [25] employs a Hoare logic for atomic steps (stores and loads) and potential-based assertions, allowing one to discharge the standard (Owicki-Gries [32]) proof obligationsFootnote 1 generated by the proof outline in Fig. 1. Namely, the framework requires that we establish local correctness of the assertions within a thread, as well as interference freedom from other thread(s). As an example, consider again Fig. 1. Local correctness of the assertions in \(\texttt{T}_1\) for instance is straightforward. The only non-trivial assertion is the precondition to line 2, which we refer to as \(\texttt{T}_1.2\) (second assertion in \(\texttt{T}_1\)). Local correctness of \(\texttt{T}_1.2\) is straightforward since execution of \({\textsf{STORE}}(\texttt{x},1)\) directly establishes \(\texttt{T}_1.2\), while interference freedom holds because \(\texttt{T}_2\) only contains loads, which cannot affect the potentials of \(\texttt{T}_1\). In thread \(\texttt{T}_2\), local correctness of \(\texttt{T}_2.1\) is established by the precondition of the program (since the second interval of \(\texttt{T}_2.1\) is allowed to be empty). Interference freedom against line 1 holds because line 1 executes in a state in which \(\texttt{T}_1\) is view maximal on \(\texttt{x}\), and hence can only introduce a store with \(\texttt{x} = 1\) at the end of \(\texttt{T}_2\)’s potential. These and similar correctness arguments are captured as proof rules in the reasoning framework (see Sect. 4 and Sect. 5).

Our main motivation for this paper is to generalise and unify this approach. Namely, is it possible for the same proof outline to be valid for several memory models? Showing this would mean that a verifier only needs to understand a single proof system, and for the resulting proof to apply to multiple memory models. We seek to answer this question in the context of potentials and the logic \({\textsf{Piccolo}}\), avoiding the shortcomings of previous approaches as discussed in the introduction.

To this end, we provide a mapping from the operational semantics of both \({\textsf{SC}}\) and \({\textsf{TSO}}\) to a potential-based semantics, allowing one to interpret (extended) \({\textsf{Piccolo}}\) assertions and proof rules for these memory models. Using this technique we show that the proof outline in Fig. 1 also holds for \({\textsf{SC}}\) and \({\textsf{TSO}}\), allowing us to validate that both models satisfy causal consistency. Later, from Sect. 5 onwards, we shall see proof outlines that only hold in some of our memory models (i.e., for \({\textsf{SC}}\) and \({\textsf{TSO}}\), but not for \({\textsf{SRA}}\)). A distinguishing feature of these memory models is that some proof rules used to construct proof outlines are sound in one model but unsound in another.

3 Background

In this section, we define the program syntax, and present the potential-based domain to unify weak memory models. Later, in Sect. 4, we present a logic over this domain.

Notation. Lists over an alphabet A are written as \(L= \langle a_1 \cdot \ldots \cdot a_n \rangle \) where \(a_1, \ldots , a_n\in A\). We use \(\cdot \) to concatenate lists, write \(\langle \rangle \) for the empty list, \(L[i]\) for the i-th element of \(L\) and \(\# L\) for the length of \(L\). We assume the first element to be \(L[1]\) and write \(a \in L\) to say that element a occurs in the list \(L\). We furthermore use \(\mathbb {Q}^{+}\) to denote the non-negative rational numbers (i.e., including 0). Given a function f, we let \(f[y \mapsto v] = \lambda x.\ \mathbf{if\ } x = y \mathbf{\ then\ } v \mathbf{\ else\ } f(x)\) denote functional override.
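The functional override \(f[y \mapsto v]\) can be transcribed directly; a minimal Python sketch (the name `override` is ours):

```python
def override(f, y, v):
    """Functional override f[y -> v]: agrees with f everywhere except y."""
    return lambda x: v if x == y else f(x)

f = lambda x: 0          # the constant-0 function
g = override(f, "a", 7)  # g = f[a -> 7]

assert g("a") == 7
assert g("b") == 0
```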

Fig. 2.
figure 2

Program syntax

Fig. 3.
figure 3

Local semantics of commands (\({l}\in \textsf{Lab}, {l}_\varepsilon \in \textsf{Lab}\cup \{\varepsilon \}\))

Program Syntax. The syntax of programs, given in Fig. 2, is mostly standard, comprising primitive (atomic) commands \(c\) and compound commands \(C\). The non-standard components are instrumented commands \(\tilde{c}\) (typically used to support auxiliary variables), which atomically execute a primitive command \(c\) and a local assignment \({a}\;{:=}\;e\). Atomic commands (such as \({\textsf{CAS}}\)) are elided since they induce a different set of proof rules. Rules for compound statements such as if-then-else and loops are straightforward to derive [25].

We assume top-level parallelismFootnote 2, i.e., that programs are of the form \(\mathcal {C}\triangleq (\lambda {\tau }\in \textsf{Tid}.\ C)\), mapping threads (of type \(\textsf{Tid}\)) to sequential commands. Often, we write \(C_1 \Vert C_2 \Vert \ldots \Vert C_n\) (ignoring thread ids) for a program \(\mathcal {C}\).

Semantics. As in prior works (e.g., [7, 14, 25, 26]), we present the semantics of the language in three steps.

Local Semantics. Here, the label (of type \(\textsf{Lab}= \{{{\texttt{R}}} ({{x}},{v_{\texttt{R}}}), {{\texttt{W}}} ({{x}},{v_{\texttt{W}}})\}\)) for each action (read/write) associated with each command is extracted. This semantics (see Fig. 3) also tracks and updates a local register store, \(\gamma \in \textsf{Reg}\rightarrow \textsf{Val}\). In this semantics, in the read rule, the value read is parametric and determined by the transition label. Later, in the combined program semantics, this value will be fixed so that the read value is consistent with the memory semantics.

Memory semantics. The semantics of a memory model is given by a labelled transition system (LTS), \(\mathcal {M}\), with set of states denoted by \(\mathcal {M}.{\texttt{Q}}\), initial states \(\mathcal {M}.{\texttt{Q}_0}\), and a labelled transition relation. Transition labels, k, of \(\mathcal {M}\) consist of program transition labels (elements of \(\textsf{Tid}\times (\textsf{Lab}\cup \{\varepsilon \})\)) and a (disjoint) set \(\mathcal {M}.{\mathbf {\Theta }}\) of internal memory steps. As an example, we present the \({\textsf{SC}}\) memory model below. The \({\textsf{SRA}}\) model is presented in Example 2 and the \({\textsf{TSO}}\) model in Sect. 6.

Example 1

( \({\textsf{SC}}\) memory model). The memory model \({\textsf{SC}} \) simply tracks the most recent value written to each variable (plus the id of the writing thread). \({\textsf{SC}} \) has no internal memory transitions (i.e., \({\textsf{SC}}.{\mathbf {\Theta }}\triangleq \emptyset \)), the initial state maps each variable to value 0 written by a special initialising thread \(\texttt{T}_0\), and the transitions are given by:

figure d
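Since the transition rules above are displayed in a figure, the following Python sketch approximates them from the textual description of Example 1 (our own encoding; the names `initial`, `write` and `read_enabled` are ours):

```python
from typing import Dict, Tuple

# An SC state maps each location to the last value written and the id of
# the writing thread; T0 is the special initialising thread.
State = Dict[str, Tuple[int, str]]

def initial(locs, t0="T0") -> State:
    return {x: (0, t0) for x in locs}

def write(state: State, tid: str, x: str, v: int) -> State:
    """Transition for label (tid, W(x, v)): overwrite the last write to x."""
    new = dict(state)
    new[x] = (v, tid)
    return new

def read_enabled(state: State, x: str, v: int) -> bool:
    """A read R(x, v) is enabled iff v is the most recently written value."""
    return state[x][0] == v

s0 = initial(["x", "y"])
s1 = write(s0, "T1", "x", 1)
assert read_enabled(s1, "x", 1) and not read_enabled(s1, "x", 0)
assert s1["x"] == (1, "T1") and s1["y"] == (0, "T0")
```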

Combined Program Semantics. This semantics combines the local with the memory semantics using the three generic rules below for steps corresponding to the external memory (left), non-memory (middle) and internal memory (right):

figure e

Potential Domain. Under weak memory models, a thread may read from several possible writes to a location when determining the location’s value, and different semantics have been developed to capture this phenomenon. In this paper, our unifying model is based on the notion of potentials [24, 25]. Each potential store maps shared locations to values, together with the thread that performed the write and some auxiliary information required by the specific memory model. This auxiliary information differs between memory models: \({\textsf{SC}}\) requires no auxiliary information, \({\textsf{TSO}}\) keeps track of timestamps, while \({\textsf{SRA}}\) keeps track of update flags. As we shall see, the \({\textsf{SRA}}\) memory model is defined directly over potentials, whereas for \({\textsf{SC}}\) and \({\textsf{TSO}}\), we develop a mapping from the memory model to the potential domain.

Definition 1

A potential store is a function \(\delta : \textsf{Loc}\rightarrow \textsf{Val}\times \textsf{Tid}\times \textsf{Aux}\), where \(\textsf{Aux}\) captures the auxiliary information required by the memory model at hand.

We use \(\delta ({x}).{\texttt{val}}\) and \(\delta ({x}).{\texttt{tid}}\) to retrieve the value and thread id of \(\delta ({x})\), respectively. Additionally, in \({\textsf{TSO}}\), we use \(\delta ({x}).{\texttt{ts}}\) to retrieve the (auxiliary) timestamp and in \({\textsf{SRA}}\), we use \(\delta ({x}).{\texttt{flag}}\) to retrieve the (auxiliary) update flag.
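A potential store of Definition 1 can be sketched as follows (our own Python encoding; the name `Entry` is ours — the auxiliary component is a timestamp for \({\textsf{TSO}}\), an update flag for \({\textsf{SRA}}\), and unused for \({\textsf{SC}}\)):

```python
from collections import namedtuple

# One entry per location: value, writer thread id, and model-specific
# auxiliary component (δ(x).val, δ(x).tid and δ(x).ts / δ(x).flag).
Entry = namedtuple("Entry", ["val", "tid", "aux"])

# A potential store for SRA: x last written by T1, flag RMW; y initial.
delta = {"x": Entry(1, "T1", "RMW"), "y": Entry(0, "T0", "R")}

assert delta["x"].val == 1 and delta["x"].tid == "T1"
assert delta["y"].aux == "R"   # the SRA update flag δ(y).flag
```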

Definition 2

A potential is a non-empty set of store lists. We let \({\mathcal {{L}}}\) be the set of all store lists. A potential mapping (of the set of all potential mappings \(\mathcal{P}\)) is a partial function \({\mathcal {{D}}}: \textsf{Tid}\rightarrow 2^{{\mathcal {{L}}}} \setminus \{\emptyset \}\) that maps thread identifiers to potentials such that all lists agree on the last store.

Fig. 4.
figure 4

\({\textsf{SRA}}\) semantics of [25] (\(L[{x}\mapsto {\texttt{R}}]\) changes the update flag of \({x}\) to \({\texttt{R}}\) in \(L\))

Example 2

( \({\textsf{SRA}}\) memory model). The operational semantics of the memory model \({\textsf{SRA}}\) is directly defined on the potential domain using update flags as auxiliary information, i.e., \({\textsf{SRA}}.\textsf{Aux}\triangleq \{\texttt{R}, \texttt{RMW}\}\). With this, \({\textsf{SRA}}.{\texttt{Q}}\triangleq \mathcal{P}\), \({\textsf{SRA}}.{\texttt{Q}_0}\triangleq \lambda {\tau }.\{ \langle \lambda {x}. \langle 0, \texttt{T}_0, \texttt{RMW}\rangle \rangle \}\), \(\textsf{SRA}.{\mathbf {\Theta }}\triangleq \{\textsf{lose}, \textsf{dup}\}\) and the transitions are defined in Fig. 4. Reading requires all lists in a thread’s potential to agree on the first value of a location. Writing changes the value of a location in all stores in the writer thread \({\tau }\), and on a suffix of the store lists in other threads. Potential stores can furthermore be arbitrarily dropped from store lists in potentials (and can thus enable reading) as well as duplicated. This is modelled by two internal transitions, lose and dup. For this, we employ two relations on store lists, \(L' \sqsubseteq L\) for losing (e.g., \(\delta _1 \cdot \delta _2 \cdot \delta _3 \sqsubseteq \delta _2 \cdot \delta _3\)) and \(L\preceq L'\) for duplication (e.g., \(\delta _1 \cdot \delta _2 \preceq \delta _1 \cdot \delta _2 \cdot \delta _2\)). The relations are lifted to potential mappings as expected.

We refer the interested reader to [25] for full details of the \({\textsf{SRA}}\) semantics.
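The lose and dup relations can be rendered as executable predicates. The sketch below is our own encoding: potential stores are modelled abstractly as hashable values, and we additionally require losing to preserve the last store, in line with Definition 2's agreement condition (the exact side conditions are in the elided Fig. 4):

```python
def loses_to(L, L2):
    """L ⊑ L2: L2 arises from L by dropping stores (an order-preserving
    subsequence with the same last store)."""
    if not L or not L2 or L[-1] != L2[-1]:
        return False
    it = iter(L)  # consuming the iterator enforces left-to-right order
    return all(any(d == d2 for d in it) for d2 in L2)

def dups_to(L, L2):
    """L ≼ L2: L2 repeats each store of L in place, at least once each."""
    if not L:
        return not L2
    if not L2 or L[0] != L2[0]:
        return False
    return dups_to(L[1:], L2[1:]) or dups_to(L, L2[1:])

d1, d2, d3 = "δ1", "δ2", "δ3"
assert loses_to([d1, d2, d3], [d2, d3])          # example from the text
assert dups_to([d1, d2], [d1, d2, d2])           # example from the text
assert not loses_to([d1, d2, d3], [d2, d1, d3])  # order must be preserved
```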

Example 3

Consider the MP litmus test from Fig. 1. After executing instructions 1 and 2 of \(\texttt{T}_1\), thread \(\texttt{T}_2\) could have the potential:

figure f

in which it currently sees both \(\texttt{x}\) and \(\texttt{y}\) to have the value 0. In the future (i.e., after some lose steps) \(\texttt{T}_2\) will first observe \(\texttt{x}\) to become 1, then \(\texttt{y}\) to become 1. Note that once \(\texttt{T}_2\) reads 1 for \(\texttt{y}\), it can only read 1 for the value of \(\texttt{x}\).

4 A Logic for Potentials

In this section, we present an extension of \({\textsf{Piccolo}}\)  [25], an interval-based logic for weak memory models formalised using a notion of potentials. \({\textsf{Piccolo}}\) (originally developed for \({\textsf{SRA}}\)  [25]) comprises a set of assertions over potential-based states and a set of proof rules that allow one to formalise the values that a thread may see now, and in the future. The extended version of \({\textsf{Piccolo}}\) that we develop enables reasoning about \({\textsf{SC}}\) and \({\textsf{TSO}}\) in addition to \({\textsf{SRA}}\).

Figure 5 gives the syntax of our extension to \({\textsf{Piccolo}} \).  The extension concerns two concepts: First, we add assertions for specifying view maximality. Informally, a thread \({\tau }\) is view maximal on a location \({x}\), \({{\tau }} \uparrow {{x}}\), if it can only see the “last” write to \({x}\). Second, we incorporate the possibility for specifying a writer thread’s id within the logic (by stating the writer to a location \({x}\) to be \({\tau }\), \({x}.{\texttt{tid}}= {\tau }\)).  We require this to later be able to formulate the proof rule stating MCA. Besides these new concepts, the other operators inherited from \({\textsf{Piccolo}}\) are intervals: a list fulfills an interval assertion \([E]\) when all elements in the list satisfy \(E\), and a list \(L\) satisfies \([I_1] \mathbin {;}[I_2]\) (where \(\mathbin {;}\) is the chop operator [8, 29]) iff \(L\) can be split into lists \(L_1\) and \(L_2\) such that \(L_1\) satisfies \([I_1]\) and \(L_2\) satisfies \([I_2]\).

Fig. 5.
figure 5

Assertions of \({\textsf{Piccolo}}\) (extended)

Notation. For an assertion \(\varphi \), we let \( fv (\varphi ) \subseteq \textsf{Reg}\cup \textsf{Loc}\cup \textsf{Tid}\) be the set of registers, locations and thread identifiers occurring in \(\varphi \). Instead of writing \({x}.{\texttt{val}}= e\), we often simply write \({x}= e\).

Next, we formally define the interpretation of \({\textsf{Piccolo}} \) on the domain of potentials.

Definition 3

Let \({\gamma }\) be a register store, \(\delta \) a potential store, \(L\) a store list, and \({\mathcal {{D}}}\) a potential mapping. We let \(\llbracket e \rrbracket _{{\langle {{\gamma },\delta }\rangle }} \triangleq {\gamma }(e)\), \(\llbracket {x}.{\texttt{val}} \rrbracket _{{\langle {{\gamma },\delta }\rangle }} \triangleq \delta ({x}).{\texttt{val}}\) and \(\llbracket {x}.{\texttt{tid}} \rrbracket _{{\langle {{\gamma },\delta }\rangle }} \triangleq \delta ({x}).{\texttt{tid}}\). The extension of this notation to any extended expression \(E\) is standard. The validity of assertions in \({\langle {{\gamma },{\mathcal {{D}}}}\rangle }\), denoted by \({\langle {{\gamma },{\mathcal {{D}}}}\rangle } \models \varphi \), is defined as follows:

1.

    \({\langle {{\gamma },L}\rangle } \models {[E]}\) if \(\llbracket E \rrbracket _{{\langle {{\gamma },\delta }\rangle }} = true \) for every \(\delta \in L\).

2.

    \({\langle {{\gamma },L}\rangle } \models {I_1 \mathbin {;}I_2}\) if \({\langle {{\gamma },L_1}\rangle } \models I_1\) and \({\langle {{\gamma },L_2}\rangle } \models I_2\) for some (possibly empty) \(L_1\) and \(L_2\) such that \(L=L_1 \cdot L_2\).

3.

    \({\langle {{\gamma },L}\rangle } \models {I_1 \wedge I_2}\) if \({\langle {{\gamma },L}\rangle } \models I_1\) and \({\langle {{\gamma },L}\rangle } \models I_2\) (similarly for \(\vee \)).

4.

    \({\langle {{\gamma },{\mathcal {{D}}}}\rangle } \models {{\tau }} \!\ltimes \! I \) if \({\langle {{\gamma },L}\rangle } \models I\) for every \(L\in {\mathcal {{D}}}({\tau })\).

5.

    \({\langle {{\gamma }, {\mathcal {{D}}}}\rangle } \models {{\tau }} \uparrow {{x}}\) if \(L[i]({x})=L[1]({x})\) for every \(L\in {\mathcal {{D}}}({\tau }), 1 \le i \le \# L\).

6.

    \({\langle {{\gamma },{\mathcal {{D}}}}\rangle } \models e\) if \({\gamma }(e)= true \).

7.

    \({\langle {{\gamma }, {\mathcal {{D}}}}\rangle } \models \varphi _1 \wedge \varphi _2\) if \({\langle {{\gamma }, {\mathcal {{D}}}}\rangle } \models \varphi _1\) and \({\langle {{\gamma }, {\mathcal {{D}}}}\rangle } \models \varphi _2\) (similarly for \(\vee \)).

View maximality of a thread is determined by inspecting its entries for a location \({x}\): if they are all the same (including thread id and auxiliary information), the thread can only see the value of the last update to \({x}\).
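The clauses of Definition 3 translate directly into executable checks. The following sketch is our own encoding (potential stores as dictionaries mapping locations to (value, thread-id) pairs, ignoring the auxiliary component; all function names are ours), evaluated on the MP potential of Example 3:

```python
def holds_interval(pred, L):
    """⟨γ, L⟩ ⊨ [E]: pred holds of every store in the list (clause 1)."""
    return all(pred(d) for d in L)

def holds_chop(pred1, pred2, L):
    """⟨γ, L⟩ ⊨ [E1];[E2]: some split L = L1·L2, both parts possibly
    empty, with [E1] on L1 and [E2] on L2 (clause 2)."""
    return any(holds_interval(pred1, L[:i]) and holds_interval(pred2, L[i:])
               for i in range(len(L) + 1))

def holds_for_thread(check, D, tau):
    """⟨γ, D⟩ ⊨ τ ⋉ I: I holds of every list in τ's potential (clause 4)."""
    return all(check(L) for L in D[tau])

def view_maximal(D, tau, x):
    """⟨γ, D⟩ ⊨ τ ↑ x: every list of τ is constant on x (clause 5)."""
    return all(all(d[x] == L[0][x] for d in L) for L in D[tau])

# T2's potential after both stores in MP (cf. Example 3):
L = [{"x": (0, "T0"), "y": (0, "T0")},
     {"x": (1, "T1"), "y": (0, "T0")},
     {"x": (1, "T1"), "y": (1, "T1")}]
D = {"T2": [L]}

# T2 ⋉ [y ≠ 1];[x = 1] holds, T2 ⋉ [y ≠ 1] does not, and T2 is not
# view-maximal on x:
assert holds_for_thread(
    lambda L: holds_chop(lambda d: d["y"][0] != 1,
                         lambda d: d["x"][0] == 1, L), D, "T2")
assert not holds_for_thread(
    lambda L: holds_interval(lambda d: d["y"][0] != 1, L), D, "T2")
assert not view_maximal(D, "T2", "x")
```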

Before discussing the concrete rules, we note an important property of program logics for weak memory, namely the stability of assertions under internal memory transitions [14, 25]. That is, for every assertion \(\varphi \), register store \({\gamma }\) and memory state \({m}\) the following must be satisfied:

figure k

This property holds for all assertions described by Fig. 5 with respect to the lose and dup steps of \({\textsf{SRA}}\) (see [25]), and trivially for \({\textsf{SC}}\). This property also holds for \({\textsf{TSO}}\) and its internal memory transition (flush), see Sect. 6.

Fig. 6.
figure 6

\({\textsf{Piccolo}}\) proof rules (\(\varphi ({a}:=e)\) means replacement of \({a}\) by \(e\) in \(\varphi \)), where

Proof Rules. The proof rules we introduce here solely concern primitive instructions. They can be used either within an Owicki-Gries-like proof framework [32], constructing proof outlines and showing them to be interference-free, or within a rely-guarantee approach [18, 40].

Figure 6 gives the proof rulesFootnote 3. Note that we do not explicitly state a proof rule for instrumented primitive commands; for these, we can employ combinations of rules for primitive commands with rule Subst. First, we have rules for stability, Stable-Ld and Stable-St, stating that \({\textsf{Piccolo}}\) formulae not referring to registers or locations, respectively, are not affected by load and store instructions. Rule Subst next states the standard axiom of assignment of Hoare logic [17], which is here only defined with respect to registers and local expressions.

The next three rules concern store instructions. Rule St-Own describes the effect a store has on the potential of the writing thread: if the thread is view-maximal, the only value it can see for location \({x}\) after the store is its own value (and the id of the writer is its own id). Rule St-Other1 states a similar effect for the non-writing threads, which, however, can also still see “old” values for \({x}\) after the store instruction. Rule St-Other2 states that properties of suffixes of lists are preserved when the writing thread \({\tau }\) satisfies the same property. This rule is essential for proving message-passing-like properties (e.g., in Fig. 1).

Rules Ld-Single and Ld-Shift describe the loading of values of shared locations into registers when the thread sees a list satisfying an interval assertion (consisting of one interval or several intervals, respectively). These rules are, for instance, required for the load instructions in the proof outline of \(\texttt{T}_2\) in Fig. 1.

Finally, the novel rule Mca describes the property of multi-copy atomicity. It does not occur in [25], as the memory model \({\textsf{SRA}}\) studied there is not multi-copy atomic. It captures the fact that in multi-copy atomic memory models all threads (other than the writer) get to see a written value at the same time. Here, we formulate it via intervals: if thread \({\pi }_1\) loads the value \(e\) to \({a}\), then thread \({\pi }_2\) is also able to see this value. This rule is essential for building a proof outline for the litmus test IRIW (see Sect. 5). It is the only rule requiring the specification of thread identifiers: we need to be able to state that threads \({\pi }_1\) and \({\pi }_2\) are different from the writing thread \({\tau }\), and that \({\pi }_1\) loads the value written by \({\tau }\).

5 Example Proofs

As examples we employ two standard litmus tests for weak memory models: the message-passing example MP already seen in Fig. 1, and a concurrent program called Independent-Reads-of-Independent-Writes (IRIW). For both litmus tests, we give proof outlines (programs interspersed with assertions) which can be derived using our proof rules. As the underlying reasoning technique, we employ Owicki-Gries reasoning [5, 32], replacing the normal rule of assignment by our proof rules. Owicki-Gries reasoning requires performing two correctness checks:

Local Correctness.:

For every command \(\tilde{c}\) of thread \({\tau }\) with pre-assertion \(\varphi \) and post-assertion \(\psi \), we need to prove \(\{\varphi \}\ \tilde{c}\ \{\psi \}\).

Global Correctness.:

For every assertion \(\varphi \) in the proof outline of a thread \({\tau }\) and every command \(\tilde{c}\) in a thread \({\pi }\) (\({\tau }\ne {\pi }\)) with pre-assertion \(\psi \), we need to show \(\{\varphi \wedge \psi \}\ \tilde{c}\ \{\varphi \}\) (non-interference).

Each proof rule employed in these checks must furthermore be shown to be sound w.r.t. the memory model of interest; if this is not the case, the proof outline is not valid for the particular memory model. In Sect. 7, we study soundness of our proof rules for \({\textsf{SC}}\), \({\textsf{TSO}}\) and \({\textsf{SRA}}\).

Message-Passing. Figure 1 already gives the proof outline of MP. Note that we can also employ the standard rules of conjunction, disjunction and consequence of Hoare logic [17] for checking local and global correctness. The interesting cases in MP concern the non-interference checks of the first assertion in \(\texttt{T}_2\) with respect to the store instructions of thread \(\texttt{T}_1\). For this, we need to prove

figure p

(an instance of St-Other1) as well as the following (by St-Other2):

figure q
Fig. 7.
figure 7

\({\textsf{Piccolo}}\) proof of IRIW using \([\texttt{x}=0]\) as shorthand for \({{\tau }} \!\ltimes \! [\texttt{x}=0] \) for all \({\tau }\)

Independent-Reads-of-Independent-Writes. Our next litmus test IRIW (see Fig. 7) gives an example of a proof outline which is only valid for \({\textsf{SC}}\) and \({\textsf{TSO}}\) (as the employed proof rules are all sound in \({\textsf{SC}}\) and \({\textsf{TSO}}\), but one rule is not sound for \({\textsf{SRA}}\), see Sect. 7). IRIW is typically employed to show differences in the behaviour of multi-copy atomic and non-multi-copy atomic memory models. In IRIW, we have two writer threads and two reader threads, the two readers reading the values of \(\texttt{x}\) and \(\texttt{y}\) in opposite order. When IRIW runs on a memory model guaranteeing multi-copy atomicity, threads \(\texttt{T}_2\) and \(\texttt{T}_3\) either both see the write to \(\texttt{x}\) before the one to \(\texttt{y}\), or the other way around. In the first case, since the two reads in each thread are in program order, if \(\texttt{a}=1\) and \(\texttt{c}=1\), then \(\texttt{T}_3\) has to see the write to \(\texttt{x}\) when reading from it; hence \(\texttt{d}=1\). Similarly, in the second case, \(\texttt{b}=1\) when \(\texttt{a}=1\) and \(\texttt{c}=1\). Both cases together are described in the postcondition of Fig. 7 (\(\texttt{a}=1 \wedge \texttt{c}=1 \Rightarrow \texttt{b}=1 \vee \texttt{d}=1\)).

Fig. 8.
figure 8

Impossible reading order and values of auxiliary variables

Again, we use the notation \(\texttt{T}_k.i\) to describe the i-th assertion in thread \(\texttt{T}_k\). For reasoning about IRIW (and thus constructing a proof outline), we need to describe the possible orders in which the two reads can happen. To this end, we employ two auxiliary variables [32], \(\texttt{f}\) (for orderings on reads of \(\texttt{x}\)) and \(\texttt{g}\) (for \(\texttt{y}\)). These are set atomically together with their respective load instructions. If, at the end of the program, auxiliary variable \(\texttt{f}\) is 23, then thread \(\texttt{T}_2\) has read from \(\texttt{x}\) before \(\texttt{T}_3\) did. Therefore, \(\texttt{f}=23\) and \(\texttt{a}=1\) implies \(\texttt{d}=1\) (see line 2 of \(\texttt{T}_3.3\)). In the case where \(\texttt{f}=32\) at the end of the program, \(\texttt{T}_3\) has read from \(\texttt{x}\) first. Analogously, auxiliary variable \(\texttt{g}\) describes the ordering of reads from \(\texttt{y}\).
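The digit-appending trick behind the auxiliary variables can be checked by direct computation (a minimal sketch; the helper `record` is our own name for the instrumented update \(\texttt{f} := 10 * \texttt{f} + d\)):

```python
# Each reader appends its own digit to f when it loads x, so the final
# value of f records the order of the two reads (initially f = 0).

def record(f, digit):
    return 10 * f + digit

f = 0
f = record(f, 2)   # T2 loads x first ...
f = record(f, 3)   # ... then T3 loads x
assert f == 23     # f = 23: T2 read x before T3

f = 0
f = record(f, 3)   # T3 loads x first ...
f = record(f, 2)   # ... then T2 loads x
assert f == 32     # f = 32: T3 read x before T2
```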

The proof outline contains several assertions detailing possible values of the two auxiliary variables and the registers. They basically state that certain orders of reads and thus certain combinations of values of registers \(\texttt{a}, \texttt{b}, \texttt{c}\) and \(\texttt{d}\) are excluded. In particular, we cannot have the ordering (cycle) depicted by the graph in Fig. 8, and hence we cannot have \(\texttt{g}=23 \wedge \texttt{f}=32\) at the end of the program. We use this fact in our proof outline as for example seen in \(\texttt{T}_2.3\).

As an example, we show one correctness check required for the validity of the proof outline, namely the non-interference of \(\texttt{T}_2.1\) with respect to the store in \(\texttt{T}_1\). Its pre-assertion can be weakened to

figure t

For the upper part of the assertion we apply the St-Other1-rule twice and get

figure u

Since none of \(\texttt{c}\), \(\texttt{f}\), \(\texttt{g}\) and \(\texttt{y}\) is changed by the store instruction, the Stable-St-rule tells us that the lower part of the assertion remains unchanged.

The key rule making this proof outline sound for \({\textsf{TSO}}\) (and \({\textsf{SC}}\)) but not for \({\textsf{SRA}}\) is Mca. We need Mca to show the local correctness of \(\langle \texttt{a} \;{:=}\;{\textsf{LOAD}}({\texttt{x}}); \texttt{f}:=10 * \texttt{f} +2 \rangle \) (and analogously of \(\langle \texttt{c} \;{:=}\;{\textsf{LOAD}}({\texttt{y}}); \texttt{g}:=10 * \texttt{g} +3 \rangle \)). For this, we prove the required Hoare triple by dividing the pre-assertion into two parts. For the first part

figure w

we apply the Mca-rule and obtain (eliding the id of the executing thread)

figure x

For the second part, by applying the rules Stable-St and Subst, we get

figure y

By combining the two Hoare triples and weakening \(\texttt{f}\in \{2,32\}\) to \(\texttt{f}\in \{2,23,32\}\), we show local correctness.

6 Lifting \({\textsf{SC}}\) and \({\textsf{TSO}}\) to Potentials

The previous section has introduced a proof calculus for \({\textsf{Piccolo}}\) which allows us to construct proof outlines and thus enables reasoning about concurrent programs on weak memory models. The validity of a proof outline for a specific memory model depends on the soundness of the employed rules within that memory model. To this end, we first need to lift states of memory models to the level of potentials (and thus to \({\textsf{Piccolo}}\)), which we next do for \({\textsf{SC}}\) and \({\textsf{TSO}}\).

Fig. 9. Operational semantics of prophetic \({\textsf{TSO}}\) using colours to highlight the updated components

SC Memory Model. To interpret \({\textsf{Piccolo}}\) formulae on \({\textsf{SC}} \) states, we provide a mapping \(map_{\textsf{SC}}: {\textsf{SC}}.{\texttt{Q}}\rightarrow \mathcal{P}\). For \({\textsf{SC}}\), \(\textsf{Aux}\) is empty and every thread sees just a single value, the same for all threads. Thus, we define:

$$\begin{aligned} map_{\textsf{SC}} (m) \triangleq \lambda {\tau }. \{ \langle \lambda {x}. \langle m({x}).{\texttt{val}}, m({x}).{\texttt{tid}}\rangle \rangle \} \end{aligned}$$

Let \({\gamma }\) be a register store and \(\varphi \) a \({\textsf{Piccolo}}\) formula. Then \(\langle {\gamma }, m\rangle \models \varphi \) is defined as \(\langle {\gamma }, map_{\textsf{SC}} (m) \rangle \models \varphi \). In the memory model \({\textsf{SC}}\), all proof rules of Fig. 6 are sound (see §7) and assertions of \({\textsf{Piccolo}}\) are stable under internal memory transitions (since there are none).
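As a quick illustration (our own encoding, with Python dictionaries standing in for functions), the lifting can be sketched as follows: an \({\textsf{SC}}\) memory assigns each location a value and writing thread, and every thread's potential is the singleton list containing just this one store.

```python
def map_sc(memory):
    """Lift an SC memory (loc -> (val, tid)) to a potential:
    every thread sees the same single-element list of stores."""
    store = dict(memory)            # one store, shared by all threads
    return lambda tau: [store]

m = {"x": (1, "T1"), "y": (0, "T0")}
pot = map_sc(m)
# All threads agree on the one and only potential store.
assert pot("T1") == pot("T2") == [{"x": (1, "T1"), "y": (0, "T0")}]
```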

TSO Memory Model. Next, we consider \({\textsf{TSO}}\)  [31, 35]. For this, we define an operational semantics for \({\textsf{TSO}}\) and derive potentials from \({\textsf{TSO}}\) states. We base our semantics on the prophetic, timestamp-based version given in [14].

Operational Semantics. \({\textsf{TSO}} \) has one memory-model internal action, a flush, i.e. \({\textsf{TSO}}.{\mathbf {\Theta }}\triangleq \{\textsf{flush}\}\). A state \(\sigma ={\langle {s,wb}\rangle }\) in prophetic \({\textsf{TSO}}\) consists of the shared memory \(s: \textsf{Loc}\rightarrow (\textsf{Val}\times \textsf{Tid}\times {\mathbb {Q}^{+}})\) (recording value, writing thread and timestamp) and write buffers \(wb\) for all threads. The entries in write buffers record the location, written value and timestamp (to determine the order in which writes are flushed to shared memory). Together, \({\textsf{TSO}}.{\texttt{Q}}\triangleq (\textsf{Loc}\rightarrow (\textsf{Val}\times \textsf{Tid}\times {\mathbb {Q}^{+}})) \times (\textsf{Tid}\rightarrow (\textsf{Loc}\times \textsf{Val}\times {\mathbb {Q}^{+}})^{*})\). Initially, all write buffers are empty and shared memory holds the initial values written by \(\texttt{T}_0\), the thread initializing shared locations.

The transition relation  is given in Fig. 9. The read transition needs to determine the value which thread \({\tau }\) can read in state \(\sigma \) for location \({x}\) (either from its own write buffer or from shared memory):

$$\begin{aligned} val_\sigma ({\tau },{x}) & \triangleq \textbf{if}\ {x}\in \sigma .wb({\tau })\ \textbf{then}\ { wbVal}_\sigma ({\tau },{x})\ \textbf{else}\ \sigma .s({x}) \end{aligned}$$

with \({ wbVal_\sigma }({\tau }, {x})\) a partial function extracting values out of write buffers. It is defined iff \({\langle {{x}, \_,\_}\rangle } \in \sigma .wb({\tau })\). If defined, we have \({ wbVal_\sigma }({\tau }, {x}) \triangleq \textsf{last}((\sigma .wb({\tau }))_{|{x}}).{\texttt{val}}\), where \(\textsf{last}((\sigma .wb({\tau }))_{|{x}})\) extracts the last entry for \({x}\) in the write buffer of \({\tau }\).

The write transition writes the value to the writer’s write buffer and to this end has to choose a new timestamp (which determines the time of flushing). The timestamp has to be larger than all timestamps of entries in shared memory, distinct from every timestamp in any write buffer, and larger than all timestamps of writes of this thread:

$$\begin{aligned} fresh _\sigma ({\tau },q) \triangleq & (\forall {x}\in \textsf{Loc}.\ \sigma .s({x}).{\texttt{ts}}< q) \wedge \\ & (\forall {\pi }\in \textsf{Tid}.\ {\langle {\_,\_,q}\rangle } \notin \sigma .wb({{\pi }})) \wedge (\forall {\langle {\_,\_,q'}\rangle } \in \sigma .{ wb({\tau })}.\ q > q') \end{aligned}$$

Finally, flushing needs to determine which write buffer entry to flush next.

$$\begin{aligned} nextFlush _\sigma ({\tau }) \triangleq & \exists q.\ {\langle {\_, \_, q}\rangle } = \sigma .wb({\tau })[1]\wedge \\ & \forall {\pi }\in \textsf{Tid}\backslash \{{\tau }\}.\forall {\langle {\_,\_,q'}\rangle } \in \sigma .wb({{\pi }}).\ q' > q \end{aligned}$$
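To make these transitions concrete, here is a minimal executable sketch of prophetic \({\textsf{TSO}}\) (our own encoding: integer timestamps stand in for \(\mathbb {Q}^{+}\), the flush side condition mirrors \( nextFlush \), and all class and method names are ours).

```python
class PropheticTSO:
    """Shared memory: loc -> (val, tid, ts); write buffers: tid -> [(loc, val, ts)]."""

    def __init__(self, locs, threads):
        self.s = {x: (0, "T0", 0) for x in locs}   # T0 initializes all locations
        self.wb = {t: [] for t in threads}

    def fresh(self, tau):
        # Larger than every timestamp in shared memory and in tau's buffer,
        # and distinct from all others (here simply: a global maximum plus one).
        used = [ts for (_, _, ts) in self.s.values()]
        used += [ts for b in self.wb.values() for (_, _, ts) in b]
        return max(used) + 1

    def write(self, tau, x, v):
        self.wb[tau].append((x, v, self.fresh(tau)))

    def read(self, tau, x):
        own = [v for (y, v, _) in self.wb[tau] if y == x]
        return own[-1] if own else self.s[x][0]    # last own entry wins

    def flush(self, tau):
        x, v, ts = self.wb[tau][0]                 # oldest entry of tau
        # nextFlush: enabled only for the globally smallest pending timestamp
        assert all(ts < q for t, b in self.wb.items() if t != tau
                   for (_, _, q) in b)
        self.wb[tau] = self.wb[tau][1:]
        self.s[x] = (v, tau, ts)

# Store buffering: both reads return 0 while the writes sit in the buffers.
m = PropheticTSO(["x", "y"], ["T1", "T2"])
m.write("T1", "x", 1); m.write("T2", "y", 1)
sb = (m.read("T1", "y"), m.read("T2", "x"))
assert sb == (0, 0)   # weak outcome, impossible under SC
```

The final assertion illustrates why a rule letting all threads directly observe a store (such as St-SC below) cannot be sound for \({\textsf{TSO}}\): other threads keep reading the old value until the buffered write is flushed.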

Potentials of \({\textsf{TSO}}\) States. For \({\textsf{TSO}}\), the auxiliary information \(\textsf{Aux}\) in the potentials concerns timestamps, i.e. \(\textsf{Aux}= \mathbb {Q}^{+}\). A state in prophetic \({\textsf{TSO}}\) determines one potential per thread, where the order in which a thread sees values of shared locations depends on the timestamps. The first potential store a thread \({\tau }\) sees in a state \(\sigma \) is fixed by shared memory and its own write buffer.

$$\begin{aligned} \varDelta _{\tau }(\sigma ): \textsf{Loc} & \rightarrow (\textsf{Val}\times \textsf{Tid}\times \mathbb {Q}^{+}) \\ {x} & \mapsto {\langle {val_\sigma ({\tau },{x}), tid_\sigma ({\tau },{x}),ts_\sigma ({\tau },{x})}\rangle } \end{aligned}$$

where we let \(tid_\sigma ({\tau },{x}) \triangleq \textbf{if}\ {x}\in \sigma .wb({\tau })\ \textbf{then}\ {\tau }\ \textbf{else}\ \sigma .s({x}).{\texttt{tid}}\) and \(ts_\sigma ({\tau },{x}) \triangleq \textbf{if}\ {x}\in \sigma .wb({\tau })\ \textbf{then}\ \textsf{last}((\sigma .wb({\tau }))_{|{x}}).{\texttt{ts}}\ \textbf{else}\ \sigma .s({x}).{\texttt{ts}}\).

With this at hand, we can define a mapping which relates prophetic \({\textsf{TSO}}\) states to entire potentials: \(map_{\textsf{TSO}}: {\textsf{TSO}}.{\texttt{Q}}\rightarrow \mathcal{P}\) is defined as

figure ad

This definition recursively builds a potential by flushing the next write buffer entry of a state \(\sigma \) and then constructing the next element of the list. The else case applies when all write buffers are empty.
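The construction can be mirrored in a few lines of Python (our sketch; the function names echo \(\varDelta \) and \( mkLst \) from the text, everything else is ours). Since all timestamps are distinct, the recursive flushing amounts to processing all pending buffer entries in timestamp order:

```python
def delta(s, wb, tau):
    """First potential store of tau: own buffer entries override shared memory."""
    store = dict(s)
    for (x, v, ts) in wb[tau]:           # later entries overwrite earlier ones
        store[x] = (v, tau, ts)
    return store

def mk_lst(s, wb, tau):
    """Potential list of tau: flush the globally oldest entry, one at a time."""
    pending = sorted((ts, t, x, v) for t, b in wb.items() for (x, v, ts) in b)
    result = [delta(s, wb, tau)]
    s = dict(s)
    for ts, t, x, v in pending:          # flush order = timestamp order
        s[x] = (v, t, ts)
        wb = {u: [e for e in b if e[2] > ts] for u, b in wb.items()}
        result.append(delta(s, wb, tau))
    return result

s0 = {"x": (0, "T0", 0)}
wb0 = {"T1": [("x", 1, 1)], "T2": []}
# T1 already sees its own buffered write; T2 sees it only after the flush.
assert [d["x"][0] for d in mk_lst(s0, wb0, "T1")] == [1, 1]
assert [d["x"][0] for d in mk_lst(s0, wb0, "T2")] == [0, 1]
```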

As for \({\textsf{SC}}\), we can now fix \({\langle {{\gamma }, \sigma }\rangle } \models \varphi \) to be \({\langle {{\gamma }, map_{\textsf{TSO}} (\sigma )}\rangle } \models \varphi \). All assertions of \({\textsf{Piccolo}}\) are stable under the internal memory transition \(\textsf{flush}\).

7 Soundness of Rules in Memory Models

With the lifting for \({\textsf{SC}}\) and \({\textsf{TSO}}\) at hand, we can formally study the soundness of \({\textsf{Piccolo}}\) proof rules for our three memory models, \({\textsf{SC}}\), \({\textsf{TSO}}\) and \({\textsf{SRA}}\). A proof rule with pre-assertion \(\varphi \), instruction \(c\) and post-assertion \(\psi \) is sound for a memory model \({\textsf{MM}}\) if for all states \({\langle {{\gamma }, {m}}\rangle }\) satisfying \(\varphi \) and all states \({\langle {{\gamma }', {m}'}\rangle }\) reached by executing \(c\) in \({\textsf{MM}}\), the formula \(\psi \) holds in \({\langle {{\gamma }', {m}'}\rangle }\).

Sequential Consistency. As already stated, we get:

Theorem 1

Rules Stable-Ld, Stable-St, Subst, St-Own, St-Other1, St-Other2, Ld-Single, Ld-Shift and Mca are sound for \({\textsf{SC}}\).

The proof is straightforward and therefore elided. Moreover, for \({\textsf{SC}}\) we have a stronger proof rule for store instructions, reflecting the essential property of sequential consistency: all threads directly see written values.

figure ag

Total Store Order. For \({\textsf{TSO}}\), we get:

Theorem 2

Rules Stable-Ld, Stable-St, Subst, St-Own, St-Other1, St-Other2, Ld-Single, Ld-Shift and Mca are sound for \({\textsf{TSO}}\). Rule St-SC is not sound for \({\textsf{TSO}}\).

Proof

Due to space restrictions, we only provide a proof sketch for one rule here, the rule Mca. Let \({\langle {{\gamma }, \sigma }\rangle } \models {{\pi }_i} \!\ltimes \! [{x}\ne e] \mathbin {;}[{x}= e\wedge {x}.{\texttt{tid}}= {\tau }] \), i.e., there are lists \(L^i\) s.t. \( mkLst (\sigma )({\pi }_i)=L^i\) and there exist \(L_1^i, L_2^i\) with \(L^i = L_1^i \cdot L_2^i\), \({\langle {{\gamma }, L_1^i}\rangle } \models [{x}\ne e]\) and \({\langle {{\gamma },L_2^i}\rangle } \models [{x}= e\wedge {x}.{\texttt{tid}}= {\tau }]\). If \({a}=e\) after \({\pi }_1\) loads \({x}\), then at least \(L_2^1\) has to be non-empty. Moreover, \(L_1^1\) has to be empty, because load instructions read the value \(val_\sigma ({\pi }_1,{x})\) and by definition of \( mkLst \) this is the entry for \({x}\) in the first potential store. The question is thus why \(L_1^2\) has to be empty as well.

If \({{\pi }_1} \!\ltimes \! [{x}=e\wedge {x}.{\texttt{tid}}= {\tau }] \) holds and \({\tau }\ne {\pi }_1\), then \(\sigma .s({x}).{\texttt{val}}= e\) and \(\sigma .s({x}).{\texttt{tid}}= {\tau }\). Moreover, this holds as well for all states \(\sigma '\) reachable from \(\sigma \) via \(\textsf{flush}\) transitions. Hence, \({{\pi }_2} \!\ltimes \! [{x}= e\wedge {x}.{\texttt{tid}}= {\tau }] \) by definition of \( mkLst \) and \(L_1^2\) is empty. \(\Box \)

Strong Release-Acquire. As \({\textsf{SRA}}\) already has an operational semantics with potentials as semantic domain, no lifting is required here and we get:

Theorem 3

Rules Stable-Ld, Stable-St, Subst, St-Own, St-Other1, St-Other2, Ld-Single and Ld-Shift are sound for \({\textsf{SRA}}\). Rules Mca and St-SC are not sound for \({\textsf{SRA}} \).

Proof

The soundness follows from [25]. Rule Mca is not sound for \({\textsf{SRA}}\), because \({\textsf{SRA}}\) is not multi-copy atomic. As an example, consider a state \({\mathcal {{D}}}\) in which both \({\pi }_1\) and \({\pi }_2\) can see \([{x}=0]\mathbin {;}[{x}=1]\) (and for both intervals the lists are non-empty). Now assume step Lose makes \({\mathcal {{D}}}({\pi }_1)\) lose the entire list part satisfying \([{x}=0]\). Then \({\pi }_1\) can load \({x}\) and read 1, whereas \({\pi }_2\) is still able to see the old value 0. \(\Box \)
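The counterexample can be replayed with lists of plain values standing in for potentials (a deliberately crude encoding of ours, tracking only \({x}\)):

```python
# Both threads initially see the interval pattern [x=0];[x=1].
pots = {"pi1": [0, 0, 1], "pi2": [0, 0, 1]}

def load(tau):
    """A load reads the head of the thread's potential list."""
    return pots[tau][0]

# Step "Lose": pi1 drops the entire prefix satisfying [x=0].
pots["pi1"] = pots["pi1"][2:]

assert load("pi1") == 1   # pi1 reads the new value,
assert load("pi2") == 0   # while pi2 still reads the old one:
# pi2's potentials need not begin with [x=1], so rule Mca fails for SRA.
```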

Knowing the soundness of rules, we get:

Theorem 4

The proof outline in Fig. 1 is valid for \({\textsf{SC}}\), \({\textsf{TSO}}\) and \({\textsf{SRA}}\).

The proof outline in Fig. 7 is valid for \({\textsf{SC}}\) and \({\textsf{TSO}}\), but not for \({\textsf{SRA}}\).

8 Conclusion

This paper proposes the use of the domain of potentials and the logic \({\textsf{Piccolo}}\) to build unified proof calculi for concurrent programs on weak memory models. As future work, we envisage the study of other memory models and semantics (like C11 [26] and PSO [2]) and the treatment of read-modify-write operations. We do not, however, expect our technique to be applicable to promise-based semantics [20, 36]. We furthermore aim to develop tool support for reasoning, e.g. as in [12] or [33].