Layered Concurrent Programs
Abstract
We present layered concurrent programs, a compact and expressive notation for specifying refinement proofs of concurrent programs. A layered concurrent program specifies a sequence of connected concurrent programs, from most concrete to most abstract, such that common parts of different programs are written exactly once. These programs are expressed in the ordinary syntax of imperative concurrent programs using gated atomic actions, sequencing, choice, and (recursive) procedure calls. Each concurrent program is automatically extracted from the layered program. We reduce refinement to the safety of a sequence of concurrent checker programs, one each to justify the connection between every two consecutive concurrent programs. These checker programs are also automatically extracted from the layered program. Layered concurrent programs have been implemented in the Civl verifier which has been successfully used for the verification of several complex concurrent programs.
1 Introduction
Refinement is an approach to program correctness in which a program is expressed at multiple levels of abstraction. For example, we could have a sequence of programs \(\mathcal {P}_1, \ldots , \mathcal {P}_h, \mathcal {P}_{h+1}\) where \(\mathcal {P}_1\) is the most concrete and \(\mathcal {P}_{h+1}\) is the most abstract. Program \(\mathcal {P}_1\) can be compiled and executed efficiently, \(\mathcal {P}_{h+1}\) is obviously correct, and the correctness of \(\mathcal {P}_i\) is guaranteed by the correctness of \(\mathcal {P}_{i+1}\) for all \(i \in [1,h]\). These three properties together ensure that \(\mathcal {P}_1\) is both efficient and correct. To use the refinement approach, the programmer must come up with each version \(\mathcal {P}_i\) of the program and a proof that the correctness of \(\mathcal {P}_{i+1}\) implies the correctness of \(\mathcal {P}_i\). This proof typically establishes a connection from every behavior of \(\mathcal {P}_i\) to some behavior of \(\mathcal {P}_{i+1}\).
Refinement is an attractive approach to the verified construction of complex programs for a number of reasons. First, instead of constructing a single monolithic proof of \(\mathcal {P}_1\), the programmer constructs a collection of localized proofs establishing the connection between \(\mathcal {P}_i\) and \(\mathcal {P}_{i+1}\) for each \(i \in [1,h]\). Each localized proof is considerably simpler than the overall proof because it only needs to reason about the (relatively small) difference between adjacent programs. Second, different localized proofs can be performed using different reasoning methods, e.g., interactive deduction, automated testing, or even informal reasoning. Finally, refinement naturally supports a bidirectional approach to correctness—bottom-up verification of a concrete program via successive abstraction or top-down derivation from an abstract program via successive concretization.
A layered concurrent program \(\mathcal {LP}\) represents a sequence \(\mathcal {P}_1, \ldots , \mathcal {P}_h, \mathcal {P}_{h+1}\) of concurrent programs such that common parts of different programs are written exactly once. These programs are expressed not as flat transition systems but in the ordinary syntax of imperative concurrent programs using gated atomic actions [4], sequencing, choice, and (recursive) procedure calls. Our programming language is accompanied by a type system that allows each \(\mathcal {P}_i\) to be automatically extracted from \(\mathcal {LP}\). Finally, refinement between \(\mathcal {P}_i\) and \(\mathcal {P}_{i+1}\) is encoded as the safety of a checker program \(\mathcal {C}_i\) which is also automatically extracted from \(\mathcal {LP}\). Thus, the verification of \(\mathcal {P}_1\) is split into the verification of \(h\) concurrent checker programs \(\mathcal {C}_1,\ldots ,\mathcal {C}_h\) such that \(\mathcal {C}_i\) connects \(\mathcal {P}_i\) and \(\mathcal {P}_{i+1}\) (Fig. 1).
We highlight two crucial aspects of our approach. First, while the programs \(\mathcal {P}_i\) have an interleaved (i.e., preemptive) semantics, we verify the checker programs \(\mathcal {C}_i\) under a cooperative semantics in which preemptions occur only at procedure calls. Our type system [5], based on the theory of right and left movers [10], ensures that the cooperative behaviors of \(\mathcal {C}_i\) cover all preemptive behaviors of \(\mathcal {P}_i\). Second, establishing the safety of checker programs is not tied to any particular verification technique. Any applicable technique can be used. In particular, different layers can be verified using different techniques, allowing for great flexibility in verification options.
1.1 Related Work
This paper formalizes, clarifies, and extends the most important aspect of the design of Civl [6], a deductive verifier for layered concurrent programs. Hawblitzel et al. [7] present a partial explanation of Civl by formalizing the connection between two concurrent programs as sound program transformations. In this paper, we provide the first formal account of layered concurrent programs, which represent all concurrent programs in a multi-layered refinement proof, thereby establishing a new foundation for the verified construction of concurrent programs.
Civl is the successor to the Qed [4] verifier which combined a type system for mover types with logical reasoning based on verification conditions. Qed enabled the specification of a layered proof but required each layer to be expressed in a separate file leading to code duplication. Layered programs reduce redundant work in a layered proof by enabling each piece of code to be written exactly once. Qed also introduced the idea of abstracting an atomic action to enable attaching a stronger mover type to it. This idea is incorporated naturally in layered programs by allowing a concrete atomic action to be wrapped in a procedure whose specification is a more abstract atomic action with a more precise mover type.
Event-B [1] is a modeling language that supports refinement of systems expressed as an interleaved composition of events, each specified as a top-level transition relation. Verification of Event-B specifications is supported by the Rodin [2] toolset, which has been used to model and verify several systems of industrial significance. TLA+ [9] also specifies systems as a flat transition system, enables refinement proofs, and is more general because it supports liveness specifications. Our approach to refinement differs from Event-B and TLA+ in several respects. First, Event-B and TLA+ model different versions of the program as separate flat transition systems, whereas our work models them as different layers of a single layered concurrent program, exploiting the standard structuring mechanisms of imperative programs. Second, Event-B and TLA+ connect the concrete program to the abstract program via an explicitly specified refinement mapping. Thus, the guarantee provided by the refinement proof is contingent upon trusting both the abstract program and the refinement mapping. In our approach, once the abstract program is proved to be free of failures, the trusted part of the specification is confined to the gates of atomic actions in the concrete program. Furthermore, the programmer never explicitly specifies a refinement mapping and is only engaged in proving the correctness of checker programs.
The methodology of refinement mappings has been used for compositional verification of hardware designs [11, 12]. The focus in this work is to decompose a large refinement proof connecting two versions of a hardware design into a collection of smaller proofs. A variety of techniques including compositional reasoning (converting a large problem to several small problems) and customized abstractions (for converting infinite-state to finite-state problems) are used to create small and finite-state verification problems for a model checker. This work is mostly orthogonal to our contribution of layered programs. Rather, it could be considered an approach to decompose the verification of each (potentially large) checker program encoded by a layered concurrent program.
2 Concurrent Programs
Preliminaries. Let \( Val \) be a set of values containing the Booleans. The set of variables \( Var \) is partitioned into global variables \( GVar \) and local variables \( LVar \). A store \(\sigma \) is a mapping from variables to values, an expression \(e\) is a mapping from stores to values, and a transition \(t\) is a binary relation between stores.
Atomic Actions. A fundamental notion in our approach is that of an atomic action. An atomic action captures an indivisible operation on the program state together with its precondition, providing a universal representation for both low-level machine operations (e.g., reading a variable from memory) and high-level abstractions (e.g., atomic procedure summaries). Most importantly for reasoning purposes, our programming language confines all accesses to global variables to atomic actions. Formally, an atomic action is a tuple \((I,O,e,t)\). The semantics of an atomic action in an execution is to first evaluate the expression \(e\), called the gate, in the current state. If the gate evaluates to \( false \) the execution fails; otherwise the program state is updated according to the transition \(t\). Input variables in \(I\) can be read by \(e\) and \(t\), and output variables in \(O\) can be written by \(t\).
Remark 1
The table below shows how standard commands over variables \(x\) and \(y\) are expressed as atomic actions, with gate \(e\) and transition \(t\) (primed variables denote post-state values):

Command | \(e\) | \(t\)
\(x := x + y\) | \( true \) | \(x' = x + y \wedge y' = y\)
havoc x | \( true \) | \(y' = y\)
assert \(x < y\) | \(x < y\) | \(x' = x \wedge y' = y\)
assume \(x < y\) | \( true \) | \(x < y \wedge x' = x \wedge y' = y\)
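To make the gate/transition reading of the table concrete, here is a minimal Python sketch (our own encoding, not the paper's notation) that models an atomic action as a pair of a gate predicate and a transition relation over stores, and replays the four commands above:

```python
# A store maps variable names to values. An atomic action is a pair
# (gate, transition): the gate is a predicate over the pre-store, and the
# transition is a relation given as a predicate over (pre-store, post-store).

def execute(action, store, post):
    """Run `action` from `store`. Returns None on a failing execution
    (gate is false), the post-store if (store, post) is in the transition
    relation, and False if the proposed post-store is not allowed (blocking)."""
    gate, trans = action
    if not gate(store):
        return None  # failing execution: gate evaluates to false
    return post if trans(store, post) else False

# The four commands from the table, over variables x and y.
assign  = (lambda s: True,            lambda s, t: t["x"] == s["x"] + s["y"] and t["y"] == s["y"])
havoc_x = (lambda s: True,            lambda s, t: t["y"] == s["y"])  # x unconstrained
assert_ = (lambda s: s["x"] < s["y"], lambda s, t: t == s)
assume_ = (lambda s: True,            lambda s, t: s["x"] < s["y"] and t == s)
```

Executing assert_ from a store with x = 2, y = 1 fails because its gate is false, whereas assume_ from the same store merely blocks: its gate holds but no post-store satisfies the transition.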
Procedures. A procedure is a tuple \((I,O,L,s)\) where \(I,O,L\) are the input, output, and local variables of the procedure, and \(s\) is a statement composed from skip, sequencing, if, and parallel call statements. Since only atomic actions can refer to global variables, the variables accessed in if conditions are restricted to the inputs, outputs, and locals of the enclosing procedure. The meaning of skip, sequencing, and if is as expected and we focus on parallel calls.
Pcalls. A parallel call (pcall, for short) \(\mathtt {pcall}_{}\ \overline{(A,\iota ,o)}~\overline{(P,\iota ,o)}~\overline{(A,\iota ,o)}\) consists of a sequence of invocations of atomic actions and procedures. We refer to the invocations as the arms of the pcall. In particular, \((A,\iota ,o)\) is an atomic-action arm and \((P,\iota ,o)\) is a procedure arm. An atomic-action arm executes the called atomic action, and a procedure arm creates a child thread that executes the statement of the called procedure. The parent thread is blocked until all arms of the pcall finish. In the standard semantics the order of arms does not matter, but our verification technique will allow us to assume that the atomic-action arms before and after the procedure arms execute in the specified order. Parameter passing is expressed using partial mappings \(\iota ,o\) between local variables; \(\iota \) maps formal inputs of the callee to actual inputs of the caller, and \(o\) maps actual outputs of the caller to formal outputs of the callee. Since we do not want to introduce races on local variables, the outputs of all arms must be disjoint, and the output of one arm cannot be an input to another arm. Finally, notice that our general notion of a pcall subsumes sequential statements (single atomic-action arm), synchronous procedure calls (single procedure arm), and unbounded thread creation (recursive procedure arm).
Concurrent Programs. A concurrent program \(\mathcal {P}\) is a tuple \(( gs , as , ps ,m,\mathcal {I})\), where \( gs \) is a finite set of global variables used by the program, \( as \) is a finite mapping from action names \(A\) to atomic actions, \( ps \) is a finite mapping from procedure names \(P\) to procedures, \(m\) is either a procedure or action name that denotes the entry point for program executions, and \(\mathcal {I}\) is a set of initial stores. For convenience we will liberally use action and procedure names to refer to the corresponding atomic actions and procedures.
Semantics. Let \(\mathcal {P}= ( gs , as , ps ,m,\mathcal {I})\) be a fixed concurrent program. A state consists of a global store assigning values to the global variables and a pool of threads, each consisting of a local store assigning values to local variables and a statement that remains to be executed. An execution is a sequence of states, where from each state to the next some thread is selected to execute one step. Every step that switches the executing thread is called a preemption (also called a context switch). We distinguish between two semantics that differ in (1) preemption points, and (2) the order of executing the arms of a pcall.
In preemptive semantics, a preemption is allowed anywhere and the arms of a pcall are arbitrarily interleaved. In cooperative semantics, a preemption is allowed only at the call and return of a procedure, and the arms of a pcall are executed as follows. First, the leading atomic-action arms are executed from left to right without preemption, then all procedure arms are executed arbitrarily interleaved, and finally the trailing atomic-action arms are executed, again from left to right without preemption. In other words, a preemption is only allowed when a procedure arm of a pcall creates a new thread and when a thread terminates.
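The cooperative arm ordering can be sketched as a small helper (our own illustration, with arms tagged 'A' for atomic-action arms and 'P' for procedure arms) that splits a pcall into its three execution phases:

```python
def cooperative_phases(arms):
    """Split pcall arms into the three phases of cooperative execution:
    leading atomic-action arms (left to right, no preemption), interleaved
    procedure arms, trailing atomic-action arms (left to right)."""
    i = 0
    while i < len(arms) and arms[i][0] == 'A':
        i += 1
    j = len(arms)
    while j > i and arms[j - 1][0] == 'A':
        j -= 1
    leading, middle, trailing = arms[:i], arms[i:j], arms[j:]
    # In the cooperative reading, atomic-action arms occur only at the ends,
    # so the middle phase must consist of procedure arms.
    assert all(a[0] == 'P' for a in middle)
    return leading, middle, trailing
```

Only the middle phase admits preemptions, which matches the statement that preemption happens exactly when a procedure arm creates a thread and when a thread terminates.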
For \(\mathcal {P}\) we only consider executions that start with a single thread that executes \(m\) from a store in \(\mathcal {I}\). \(\mathcal {P}\) is called safe if there is no failing execution, i.e., an execution that executes an atomic action whose gate evaluates to \( false \). We write \( Safe (\mathcal {P})\) if \(\mathcal {P}\) is safe under preemptive semantics, and \( CSafe (\mathcal {P})\) if \(\mathcal {P}\) is safe under cooperative semantics.
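The safety definition can be illustrated by a brute-force sketch (our own; it uses deterministic state updates rather than general transition relations) that enumerates all preemptive interleavings of two threads of atomic actions and searches for a failing gate:

```python
def interleavings(t1, t2):
    """All interleavings of two action sequences (preemption anywhere)."""
    if not t1:
        yield t2
        return
    if not t2:
        yield t1
        return
    for rest in interleavings(t1[1:], t2):
        yield [t1[0]] + rest
    for rest in interleavings(t1, t2[1:]):
        yield [t2[0]] + rest

def safe(t1, t2, init):
    """Safe: no interleaving executes an action whose gate is false.
    Actions are (gate, update) pairs with update: store -> store."""
    for sched in interleavings(t1, t2):
        s = dict(init)
        for gate, update in sched:
            if not gate(s):
                return False  # failing execution found
            s = update(s)
    return True
```

In the spirit of the lock example, an action that releases an unheld lock (gate b) fails in some interleaving, so the program is not safe under preemptive semantics.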
2.1 Running Example
Proving \(\mathcal {P}^ lock _1\) safe amounts to showing that RESET is never called with b set to \( false \), which expresses that \(\mathcal {P}^ lock _1\) follows a locking discipline of releasing only previously acquired locks. Doing this proof directly on \(\mathcal {P}^ lock _1\) has two drawbacks. First, the proof must relate the possible values of b with the program counters of all running threads. In general, this approach requires the sound introduction of ghost code and results in complicated case distinctions in program invariants. Second, the proof is not reusable across different lock implementations. The correctness of the client does not specifically depend on using a spinlock over a Boolean variable, and thus neither should the proof. We show how our refinement-based approach addresses both problems.
Program \(\mathcal {P}^ lock _2\) in Fig. 3(b) is an abstraction of \(\mathcal {P}^ lock _1\) that introduces an abstract lock specification. The global variable b is replaced by lock which ranges over integer thread identifiers (0 is a dedicated value indicating that the lock is available). The procedures Alloc, Enter and Leave are replaced by the atomic actions ALLOC, ACQUIRE and RELEASE, respectively. ALLOC allocates unique and non-zero thread identifiers using a set of integers slots to store the identifiers not allocated so far. ACQUIRE blocks executions where the lock is not available (assume lock == 0) and sets lock to the identifier of the acquiring thread. RELEASE asserts that the releasing thread holds the lock and sets lock to 0. Thus, the connection between \(\mathcal {P}^ lock _1\) and \(\mathcal {P}^ lock _2\) is given by the invariant InvLock, which relates b and lock and justifies that Enter refines ACQUIRE and Leave refines RELEASE. The potential safety violation in \(\mathcal {P}^ lock _1\) by the gate of RESET is preserved in \(\mathcal {P}^ lock _2\) by the gate of RELEASE. In fact, the safety of \(\mathcal {P}^ lock _2\) expresses the stronger locking discipline that the lock can only be released by the thread that acquired it.
Reasoning in terms of ACQUIRE and RELEASE instead of Enter and Leave is more general, but it is also simpler! Figure 3(b) declares atomic actions with a mover type [5], right for right mover, and left for left mover. A right mover executed by a thread commutes to the right of any action executed by a different thread. Similarly, a left mover executed by a thread commutes to the left of any action executed by a different thread. A sequence of right movers followed by at most one non-mover followed by a sequence of left movers in a thread can be considered atomic [10]. The reason is that any interleaved execution can be rearranged (by commuting atomic actions) such that these actions execute consecutively. For \(\mathcal {P}^ lock _2\) this means that Worker is atomic and thus the gate of RELEASE can be discharged by pure sequential reasoning: ALLOC guarantees tid != 0, and after executing ACQUIRE we have lock == tid. As a result, we finally obtain that the atomic action SKIP in \(\mathcal {P}^ lock _3\) (Fig. 3(c)) is a sound abstraction of procedure Main in \(\mathcal {P}^ lock _2\). Hence, we showed that program \(\mathcal {P}^ lock _1\) is safe by soundly abstracting it to \(\mathcal {P}^ lock _3\), a program that is trivially safe.
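The commutativity obligation behind mover types can be illustrated by a brute-force check (our own sketch over a toy state space, with deterministic updates): two actions commute if executing them in either order from any state agrees on both failure behavior and final state.

```python
def commutes(a, b, states):
    """Check that a;b and b;a agree from every state in `states`:
    same failure behavior (some gate false) and same final state.
    Actions are (gate, update) pairs with update: store -> store."""
    def run(seq, s):
        for gate, update in seq:
            if not gate(s):
                return "fail"
            s = update(s)
        return tuple(sorted(s.items()))
    return all(run([a, b], s) == run([b, a], s) for s in states)
```

For example, increments of two distinct variables commute, while an increment and an overwrite of the same variable do not; the actual checks for ACQUIRE and RELEASE are discharged logically rather than by enumeration.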
The correctness of right and left annotations on ACQUIRE and RELEASE, respectively, depends on pairwise commutativity checks among atomic actions in \(\mathcal {P}^ lock _2\). These commutativity checks will fail unless we exploit the fact that every thread identifier allocated by Worker using the ALLOC action is unique. For instance, to show that ACQUIRE executed by a thread commutes to the right of RELEASE executed by a different thread, it must be known that the parameters tid to these actions are distinct from each other. The linear annotation on the local variables named tid and the global variable slots (which is a set of integers) is used to communicate this information.
The overall invariant encoded by the linear annotation is that the sets of values stored in slots and in the local linear variables of active stack frames across all threads are pairwise disjoint. This invariant is guaranteed by a combination of a linear type system [14] and logical reasoning on the code of all atomic actions. The linear type system uses a flow analysis to ensure that a value stored in a linear variable in an active stack frame is not copied into another linear variable via an assignment. Each atomic action must ensure that its state update preserves the disjointness invariant for linear variables. For actions ACQUIRE and RELEASE, which do not modify any linear variables, this reasoning is trivial. However, action ALLOC modifies slots and updates the linear output parameter tid. Its correctness depends on the (semantic) fact that the value put into tid is removed from slots; this reasoning can be done using automated theorem provers.
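The per-action obligation for linear variables can be sketched as follows (our own code; alloc mimics the ALLOC pattern of moving the allocated value out of slots, and disjointness_holds states the invariant being preserved):

```python
def alloc(state):
    """ALLOC-style action: remove an unallocated id from `slots`
    and return it as a fresh linear thread identifier."""
    tid = min(state["slots"])
    state["slots"] = state["slots"] - {tid}  # the allocated value leaves slots
    return tid

def disjointness_holds(state, live_tids):
    """Linear invariant: slots and all live linear tids are pairwise disjoint."""
    seen = set(state["slots"])
    for tid in live_tids:
        if tid in seen:
            return False
        seen.add(tid)
    return True
```

Because alloc removes the returned value from slots, repeated allocations yield distinct identifiers and the disjointness invariant is preserved after every step.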
3 Layered Concurrent Programs
A layered concurrent program represents a sequence of concurrent programs that are connected to each other. That is, the programs derived from a layered concurrent program share syntactic structure, but differ in the granularity of the atomic actions and the set of variables they are expressed over. In a layered concurrent program, we associate layer numbers and layer ranges with variables (both global and local), atomic actions, and procedures. These layer numbers control the introduction and hiding of program variables and the summarization of compound operations into atomic actions, and thus provide the scaffolding of a refinement relation. Concretely, this section shows how the concurrent programs \(\mathcal {P}^ lock _1\), \(\mathcal {P}^ lock _2\), and \(\mathcal {P}^ lock _3\) (Fig. 3) and their connections can all be expressed in a single layered concurrent program. In Sect. 4, we discuss how to check refinement between the successive concurrent programs encoded in a layered concurrent program.
Syntax. The syntax of layered concurrent programs is summarized in Fig. 4. Let \(\mathbb {N}\) be the set of non-negative integers and \(\mathbb {I}\) the set of non-empty intervals [a, b]. We refer to integers as layer numbers and intervals as layer ranges. A layered concurrent program \(\mathcal {LP}\) is a tuple \(( GS , AS , IS , PS ,m,\mathcal {I})\) which, similarly to concurrent programs, consists of global variables, atomic actions, and procedures, with the following differences.
1. \( GS \) maps global variables to layer ranges. For \( GS (v) = [a,b]\) we say that \(v\) is introduced at layer \(a\) and available up to layer \(b\).
2. \( AS \) assigns a layer range \(r\) to every atomic action, denoting the layers at which the action exists.
3. \( IS \) (with a domain disjoint from \( AS \)) distinguishes a special kind of atomic action called an introduction action. Introduction actions have a single layer number \(n\) and are responsible for assigning meaning to the variables introduced at layer \(n\). Correspondingly, statements in layered concurrent programs are extended with an icall statement for calling introduction actions.
4. \( PS \) assigns a layer number \(n\), a layer number mapping \( ns \) for local variables, and an atomic action \(A\) to every procedure. We call \(n\) the disappearing layer and \(A\) the refined atomic action. For every local variable v, \( ns (v)\) is the introduction layer of v.
   The pcall\(_\alpha \) statement in a layered concurrent program differs from the pcall statement in concurrent programs in two ways. First, it can only have procedure arms. Second, it has a parameter \(\alpha \), which is either \(\varepsilon \) (unannotated pcall) or the index of one of its arms (annotated pcall). We usually omit writing \(\varepsilon \) in unannotated pcalls.
5. \(m\) is a procedure name.
Intuition Behind Layer Numbers. Recall that a layered concurrent program \(\mathcal {LP}\) should represent a sequence of \(h+1\) concurrent programs \(\mathcal {P}_1,\cdots ,\mathcal {P}_{h+1}\) that are connected by a sequence of \(h\) checker programs \(\mathcal {C}_1,\cdots ,\mathcal {C}_h\) (cf. Fig. 1). Before we provide formal definitions, let us get some intuition on two core mechanisms: global variable introduction and procedure abstraction/refinement.
Let \(v\) be a global variable with layer range [a, b]. The meaning of this layer range is that the “first” program that contains \(v\) is \(\mathcal {C}_a\), the checker program connecting \(\mathcal {P}_a\) and \(\mathcal {P}_{a+1}\). In particular, \(v\) is not yet part of \(\mathcal {P}_a\). In \(\mathcal {C}_a\) the introduction actions at layer a can modify \(v\) and thus assign its meaning in terms of all other available variables. Then \(v\) is part of \(\mathcal {P}_{a+1}\) and all programs up to and including \(\mathcal {P}_b\). The “last” program containing \(v\) is \(\mathcal {C}_b\). In other words, when going from a program \(\mathcal {P}_i\) to \(\mathcal {P}_{i+1}\) the variables with upper bound i disappear and the variables with lower bound i are introduced; the checker program \(\mathcal {C}_i\) has access to both and establishes their relationship.
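This layer-range bookkeeping can be stated as two small predicates (a sketch, following the convention above that a variable with range [a, b] first appears in checker \(\mathcal {C}_a\) and belongs to programs \(\mathcal {P}_{a+1}\) through \(\mathcal {P}_b\)):

```python
def in_program(rng, layer):
    """Is a variable with layer range rng = (a, b) part of program P_layer?"""
    a, b = rng
    return a + 1 <= layer <= b

def in_checker(rng, layer):
    """Is a variable with layer range rng = (a, b) part of checker C_layer?"""
    a, b = rng
    return a <= layer <= b
```

This directly reproduces the later observation about the running example: a variable like pos with range [1, 1] is present in no extracted program at all, but is available in the checker \(\mathcal {C}_1\).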
Let \(P\) be a procedure with disappearing layer \(n\) and refined atomic action \(A\). The meaning of the disappearing layer is that \(P\) exists in all programs from \(\mathcal {P}_1\) up to and including \(\mathcal {P}_n\). In \(\mathcal {P}_{n+1}\) and above every invocation of \(P\) is replaced by an invocation of \(A\). To ensure that this replacement is sound, the checker program \(\mathcal {C}_n\) performs a refinement check that ensures that every execution of \(P\) behaves like \(A\). Observe that the body of procedure \(P\) itself changes from \(\mathcal {P}_1\) to \(\mathcal {P}_n\) according to the disappearing layer of the procedures it calls.
With the above intuition in mind it is clear that the layer annotations in a layered concurrent program cannot be arbitrary. For example, if procedure \(P\) calls a procedure \(Q\), then \(Q\) cannot have a higher disappearing layer than \(P\): otherwise \(Q\) could introduce further behaviors into the program after \(P\) was replaced by \(A\), and those behaviors would not be captured by \(A\).
3.1 Type Checker
(Atomic Action)/(Introduction Action). Global variables can only be accessed by atomic actions and introduction actions. For a global variable \(v\) with layer range [a, b], introduction actions with layer number a are allowed to modify \(v\) (for sound variable introduction), and atomic actions with a layer range contained in \([a+1,b]\) have access to \(v\). Introduction actions must be non-blocking, which means that from every state satisfying the gate, the transition relation must allow at least one step. This ensures that introduction actions only assign meaning to introduced variables but do not exclude any program behavior.
(If). Procedure bodies change from layer to layer because calls to procedures become calls to atomic actions. But the control-flow structure within a procedure is preserved across layers. Therefore (local) variables accessed in an if condition must be available on all layers to ensure that the if statement is well-defined on every layer.
(Introduction Call). Let \(A\) be an introduction action with layer number \(n\). Since \(A\) modifies global variables introduced at layer \(n\), icalls to \(A\) are only allowed from procedures with disappearing layer \(n\). Similarly, the formal output parameters of an icall to \(A\) must have introduction layer \(n\). The icall is only preserved in \(\mathcal {C}_n\).
(Parallel Call). All arms in a pcall must be procedure arms invoking a procedure with a disappearing layer less than or equal to the disappearing layer of the caller. Furthermore, above the disappearing layer of the callee its refined atomic action must be available up to the disappearing layer of the caller. Parameter passing can only be well-defined if the actual inputs exist before the formal inputs, and the formal outputs exist before the actual outputs. The sequence of disappearing layers of the procedures in a pcall must be monotonically increasing and then decreasing, such that the resulting pcall in the extracted programs consists of procedure arms surrounded by atomic-action arms on every layer.
Annotated pcalls are only used for invocations of procedures with the same disappearing layer \(n\) as the caller. In particular, during refinement checking in \(\mathcal {C}_n\) only the arm with index \(\alpha \) is allowed to modify the global state, and this modification must conform to the refined atomic action of the caller. The remaining arms must leave the global state unchanged.
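The "monotonically increasing and then decreasing" condition on the disappearing layers of a pcall's arms can be checked directly (a sketch with assumed names):

```python
def valid_arm_layers(layers):
    """Disappearing layers of a pcall's arms must rise (weakly) and then
    fall, so that on every extracted layer the remaining procedure arms form
    one contiguous block surrounded by atomic-action arms."""
    i, n = 0, len(layers)
    while i + 1 < n and layers[i] <= layers[i + 1]:
        i += 1  # consume the (weakly) increasing prefix
    while i + 1 < n and layers[i] >= layers[i + 1]:
        i += 1  # consume the (weakly) decreasing suffix
    return n == 0 or i == n - 1
```

A sequence like [1, 3, 1, 3] is rejected: at layer 2 the calls with layer 1 would already be atomic-action arms sitting between two procedure arms, which the extracted pcall shape does not allow.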
3.2 Concurrent Program Extraction
The extracted program \(\mathcal {P}_\ell \) includes the global and local variables introduced before \(\ell \) and the atomic actions with \(\ell \) in their layer range. It contains no introduction actions, and correspondingly all icall statements are removed. Every arm of a pcall statement, depending on the disappearing layer \(n\) of the called procedure \(Q\), either remains a procedure arm to \(Q\) or is replaced by an atomic-action arm to \(A\), the atomic action refined by \(Q\). The input and output mappings are restricted to the local variables at layer \(\ell \). The set of initial stores of \(\mathcal {P}_\ell \) is the same as for \(\mathcal {LP}\), since stores range over all program variables.
In our programming language, loops are subsumed by the more general mechanism of recursive procedure calls. Observe that \(\mathcal {P}_\ell \) can indeed have recursive procedure calls, because our type checking rules (Fig. 5) allow a pcall to invoke a procedure with the same disappearing layer as the caller.
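The arm-replacement rule during extraction can be sketched as follows (our names; a procedure is summarized by its name, disappearing layer, and refined atomic action):

```python
def extract_arm(layer, proc):
    """Extract a pcall arm at the given layer.
    proc = (name, disappearing_layer, refined_action_name): the call stays a
    procedure arm while layer <= disappearing_layer, and becomes an
    atomic-action arm to the refined action above it."""
    name, n, refined = proc
    if layer <= n:
        return ("procedure", name)
    return ("action", refined)
```

For the running example, a call to Enter (disappearing layer 1, refined action ACQUIRE) is a procedure arm in \(\mathcal {P}^ lock _1\) and an atomic-action arm to ACQUIRE in \(\mathcal {P}^ lock _2\).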
3.3 Running Example
First, observe that \(\mathcal {LP}^ lock \) is well-formed, i.e., \(\vdash \mathcal {LP}^ lock \). Then it is an easy exercise to verify that \(\varGamma _{\ell }(\mathcal {LP}^ lock ) = \mathcal {P}^ lock _\ell \) for \(\ell \in [1,3]\). Let us focus on procedure Worker. In \(\mathcal {P}^ lock _1\) (Fig. 3(a)) tid does not exist, and correspondingly Alloc has no output parameter and Enter and Leave have no input parameter. Furthermore, the icall in the body of Alloc is replaced with skip. In \(\mathcal {P}^ lock _2\) (Fig. 3(b)) we have tid, and the calls to Alloc, Enter, and Leave are replaced with their respective refined atomic actions ALLOC, ACQUIRE, and RELEASE. The only annotated pcall in \(\mathcal {LP}^ lock \) is the recursive call to Enter.
In addition to representing the concurrent programs in Fig. 3, the program \(\mathcal {LP}^ lock \) also encodes the connection between them via introduction actions and calls. The introduction action iSetLock updates lock to maintain the relationship between lock and b, expressed by the predicate InvLock. It is called in Enter in case the CAS operation successfully set b to \( true \), and in Leave when b is set to \( false \). The introduction action iIncr implements linear thread identifiers using the integer variable pos, which holds the next value that can be allocated. For every allocation, the current value of pos is returned as the new thread identifier and pos is incremented.
The variable slots is introduced at layer 1 to represent the set of unallocated identifiers. It contains all integers no less than pos, an invariant that is expressed by the predicate InvAlloc and maintained by the code of iIncr. The purpose of slots is to encode linear allocation of thread identifiers in a way that the body of iIncr can be locally shown to preserve the disjointness invariant for linear variables; slots plays a similar role in the specification of the atomic action ALLOC in \(\mathcal {P}^ lock _2\). The variable pos is both introduced and hidden at layer 1 so that it exists neither in \(\mathcal {P}^ lock _1\) nor \(\mathcal {P}^ lock _2\). However, pos is present in the checker program \(\mathcal {C}_1\) that connects \(\mathcal {P}^ lock _1\) and \(\mathcal {P}^ lock _2\).
The bodies of procedures Cas and Reset are not shown in Fig. 6 because they are not needed. They disappear at layer 0 and are replaced by the atomic actions CAS and RESET, respectively, in \(\mathcal {P}^ lock _1\).
The degree of compactness afforded by layered programs (as in Fig. 6) over separate specification of each concurrent program (as in Fig. 3) increases rapidly with the size of the program and the maximum depth of procedure calls. In our experience, for realistic programs such as a concurrent garbage collector [7] or a datarace detector [15], the saving in code duplication is significant.
4 Refinement Checking
Section 3 described how a layered concurrent program \(\mathcal {LP}\) encodes a sequence \(\mathcal {P}_1,\ldots ,\mathcal {P}_h,\mathcal {P}_{h+1}\) of concurrent programs. In this section, we show how the safety of any concurrent program in the sequence is implied by the safety of its successor, ultimately allowing the safety of \(\mathcal {P}_1\) to be established by the safety of \(\mathcal {P}_{h+1}\).
There are three ingredients to connecting \(\mathcal {P}_\ell \) to \(\mathcal {P}_{\ell +1}\) for any \(\ell \in [1,h]\)—reduction, projection, and abstraction. Reduction allows us to conclude the safety of a concurrent program under preemptive semantics by proving safety only under cooperative semantics.
Theorem 1 (Reduction)
Let \(\mathcal {P}\) be a concurrent program. If \( MSafe (\mathcal {P})\) and \( CSafe (\mathcal {P})\), then \( Safe (\mathcal {P})\).
The judgment \( MSafe (\mathcal {P})\) uses logical commutativity reasoning and mover types to ensure that cooperative safety is sufficient for preemptive safety (Sect. 4.1). We use this theorem to justify reasoning about \( CSafe (\mathcal {P}_\ell )\) rather than \( Safe (\mathcal {P}_\ell )\).
The next step in connecting \(\mathcal {P}_\ell \) to \(\mathcal {P}_{\ell +1}\) is to add the computation introduced at layer \(\ell \) to the cooperative semantics of \(\mathcal {P}_\ell \). This computation comprises global and local variables together with introduction actions and calls to them. We refer to the resulting program at layer \(\ell \) as \(\widetilde{\mathcal {P}}_\ell \).
Theorem 2 (Projection)
Let \(\mathcal {LP}\) be a layered concurrent program with top layer \(h\) and \(\ell \in [1,h]\). If \( CSafe (\widetilde{\mathcal {P}}_\ell )\), then \( CSafe (\mathcal {P}_\ell )\).
Since introduction actions are non-blocking and \(\widetilde{\mathcal {P}}_\ell \) is safe under cooperative semantics, every cooperative execution of \(\mathcal {P}_\ell \) can be obtained by projecting away the computation introduced at layer \(\ell \). This observation allows us to conclude that every cooperative execution of \(\mathcal {P}_\ell \) is also safe.
Finally, we check that the safety of the cooperative semantics of \(\widetilde{\mathcal {P}}_\ell \) is ensured by the safety of the preemptive semantics of the next concurrent program \(\mathcal {P}_{\ell +1}\). This connection is established by reasoning about the cooperative semantics of a concurrent checker program \(\mathcal {C}_\ell \) that is automatically constructed from \(\mathcal {LP}\).
Theorem 3 (Abstraction)
Let \(\mathcal {LP}\) be a layered concurrent program with top layer \(h\) and \(\ell \in [1,h]\). If \( CSafe (\mathcal {C}_\ell )\) and \( Safe (\mathcal {P}_{\ell +1})\), then \( CSafe (\widetilde{\mathcal {P}}_\ell )\).
The checker program \(\mathcal {C}_\ell \) is obtained by instrumenting the code of \(\widetilde{\mathcal {P}}_\ell \) with extra variables and procedures that enable checking that procedures disappearing at layer \(\ell \) refine their atomic action specifications (Sect. 4.2).
Our refinement check between two consecutive layers is summarized by the following corollary of Theorems 1–3.
Corollary 1
Let \(\mathcal {LP}\) be a layered concurrent program with top layer \(h\) and \(\ell \in [1,h]\). If \( MSafe (\mathcal {P}_\ell )\), \( CSafe (\mathcal {C}_\ell )\) and \( Safe (\mathcal {P}_{\ell +1})\), then \( Safe (\mathcal {P}_\ell )\).
The soundness of our refinement checking methodology for layered concurrent programs is obtained by repeated application of Corollary 1.
Corollary 2
Let \(\mathcal {LP}\) be a layered concurrent program with top layer \(h\). If \( MSafe (\mathcal {P}_\ell )\) and \( CSafe (\mathcal {C}_\ell )\) for all \(\ell \in [1,h]\) and \( Safe (\mathcal {P}_{h+1})\), then \( Safe (\mathcal {P}_1)\).
4.1 From Preemptive to Cooperative Semantics
We present the judgment \( MSafe (\mathcal {P})\) that allows us to reason about a concurrent program \(\mathcal {P}\) under cooperative semantics instead of preemptive semantics. Intuitively, we want to use the commutativity of individual atomic actions to rearrange the steps of any execution under preemptive semantics in such a way that it corresponds to an execution under cooperative semantics. We consider mappings \(M\in Action \rightarrow \{\mathrm {N},\mathrm {R},\mathrm {L},\mathrm {B}\}\) that assign mover types to atomic actions; \(\mathrm {N}\) for non-mover, \(\mathrm {R}\) for right mover, \(\mathrm {L}\) for left mover, and \(\mathrm {B}\) for both mover. The judgment \( MSafe (\mathcal {P})\) requires a mapping \(M\) that satisfies two conditions.
First, the atomic actions in \(\mathcal {P}\) must satisfy the following logical commutativity conditions [7], which can be discharged by a theorem prover.

Commutativity: If \(A_1\) is a right mover or \(A_2\) is a left mover, then the effect of \(A_1\) followed by \(A_2\) can also be achieved by \(A_2\) followed by \(A_1\).

Forward preservation: If \(A_1\) is a right mover or \(A_2\) is a left mover, then the failure of \(A_2\) after \(A_1\) implies that \(A_2\) must also fail before \(A_1\).

Backward preservation: If \(A_2\) is a left mover (and \(A_1\) is arbitrary), then the failure of \(A_1\) before \(A_2\) implies that \(A_1\) must also fail after \(A_2\).

Nonblocking: If \(A\) is a left mover, then \(A\) cannot block.
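The commutativity condition can be made concrete on a finite toy example. The following sketch (ours, not Civl's prover-based encoding) models atomic actions as transition relations over a small state space, so the right-mover condition can be checked by brute-force enumeration rather than by a theorem prover; the actions incr, decr, and reset are hypothetical illustrations.

```python
# Minimal sketch (not Civl's encoding): atomic actions as transition
# relations over a tiny finite state space, so the commutativity
# condition for a right mover is checkable by enumeration.

def compose(a1, a2):
    """All steps s -> s2 obtainable as an a1 step followed by an a2 step."""
    return {(s, s2) for (s, s1) in a1 for (t, s2) in a2 if t == s1}

def commutes_right(a1, a2):
    """Effect of a1;a2 is also achievable as a2;a1 (a1 moves right past a2)."""
    return compose(a1, a2) <= compose(a2, a1)

# Hypothetical actions on a single counter modulo 10.
N = 10
incr  = {(s, (s + 1) % N) for s in range(N)}   # x := x + 1
decr  = {(s, (s - 1) % N) for s in range(N)}   # x := x - 1
reset = {(s, 0) for s in range(N)}             # x := 0

print(commutes_right(incr, decr))   # True: the two updates commute
print(commutes_right(incr, reset))  # False: reset discards the increment
```

The preservation and nonblocking conditions would similarly quantify over failure states and enabledness, which we omit here for brevity.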
In the case \(\overline{P} \ne \varepsilon \) there is a preemption point under cooperative semantics between \(\overline{A}_1\) and \(\overline{A}_2\), the actions in \(\overline{A}_1\) are executed in order before the preemption, and the actions in \(\overline{A}_2\) are executed in order after the preemption. To ensure that the cooperative execution can simulate an arbitrarily interleaved preemptive execution of the pcall, we must be able to move actions in \(\overline{A}_1\) to the left and actions in \(\overline{A}_2\) to the right of the preemption point. We enforce this condition by requiring that \(\overline{A}_1\) is all left (or both) movers and \(\overline{A}_2\) all right (or both) movers, expressed by the leading \(\mathcal {E}(\mathrm {L})\) and trailing \(\mathcal {E}(\mathrm {R})\) in the edge composition.
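The sequencing discipline underlying these requirements can be illustrated by a small check (ours, a simplification of the edge-composition formulation in the paper): within a preemption-free fragment, the mover types of the executed actions must match the classical pattern of right movers, then at most one non-mover, then left movers, with both movers accepted in either phase.

```python
# A simplified rendering (ours) of the sequencing discipline behind
# MSafe: within a preemption-free fragment the mover types must match
# R* N? L* -- right movers, at most one non-mover, then left movers.
# Both movers (B) count as either R or L.

def fragment_atomic(movers):
    in_left_phase = False  # False: still extending the right-mover prefix
    for m in movers:
        if not in_left_phase:
            if m in ("R", "B"):
                continue              # stay in the right-mover prefix
            elif m in ("N", "L"):
                in_left_phase = True  # the pivot: commit to the suffix
            else:
                return False          # unknown mover type
        elif m not in ("L", "B"):
            return False              # only left/both movers after the pivot
    return True

print(fragment_atomic(["R", "B", "N", "L"]))  # True
print(fragment_atomic(["R", "N", "R"]))       # False: R after the pivot
```

Civl's actual condition is phrased over the control-flow graph via composition of mover-type edges, but every accepted path reduces to this pattern.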
4.2 Refinement Checker Programs
In this section, we describe the construction of checker programs that justify the formal connection between successive concurrent programs in a layered concurrent program. We describe the construction by example. In particular, we show the checker program \(\mathcal{C}^{lock}_1\) that establishes the connection between \(\mathcal{P}^{lock}_1\) and \(\mathcal{P}^{lock}_2\) (Fig. 3) of our running example.
Refinement Instrumentation. We illustrate the instrumentation of procedures Enter and Leave in Fig. 8. The core idea is to track updates by preemption-free execution fragments to the shared variables that continue to exist at layer 2. There are two such variables—lock and slots. We capture snapshots of lock and slots in local snapshot variables and use these snapshots to check that the updates to lock and slots behave according to the refined atomic action. In general, any path from the start to the end of the body of a procedure may comprise many preemption-free execution fragments. The checker program must ensure that exactly one of these fragments behaves like the specified atomic action; all other fragments must leave lock and slots unchanged. To track whether the atomic action has already happened, we use two local Boolean variables—pc and done. Both variables are initialized to \( false \), get updated to \( true \) during the execution, and remain at \( true \) thereafter. The variable pc is set to \( true \) at the end of the first preemption-free execution fragment that modifies the tracked state, which is expressed by the macro *CHANGED* on line 1. The variable done is set to \( true \) at the end of the first preemption-free execution fragment that behaves like the refined atomic action. For that, the macros *RELEASE* and *ACQUIRE* on lines 2 and 3 express the transition relations of RELEASE and ACQUIRE, respectively. Observe that we have the invariant pc ==> done. The reason we need both pc and done is to handle the case where the refined atomic action may stutter (i.e., leave the state unchanged).
Instrumenting Leave. We first look at the instrumentation of Leave. Line 8 initializes the snapshot variables. Recall that a preemption inside the code of a procedure is introduced only at a pcall containing a procedure arm. Consequently, the body of Leave is preemption-free and we need to check refinement across a single execution fragment. This checking is done by lines 14–16. The assertion on line 14 checks that if any tracked variable has changed since the last snapshot, (1) such a change happens for the first time (!pc), and (2) the current value is related to the snapshot value according to the specification of RELEASE. Line 15 updates pc to track whether any change to the tracked variables has happened so far. Line 16 updates done to track whether RELEASE has happened so far. The assertion at line 18 checks that RELEASE has indeed happened before Leave returns. The assumption at line 9 blocks those executions which can be simulated by the failure of RELEASE. It achieves this effect by assuming the gate of RELEASE in states where pc is still \( false \) (i.e., RELEASE has not yet happened). The assumption yields the constraint lock != 0 which together with the invariant InvLock (Fig. 6) proves that the gate of RESET does not fail.
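The per-fragment check in Leave can be sketched abstractly. The following model (ours, not the code Civl generates) compares the tracked state against its snapshot, requires that a change happens at most once (!pc) and looks like one step of the refined atomic action, and then updates pc and done; the one-variable encoding of RELEASE is a hypothetical simplification.

```python
# A sketch (ours) of the per-fragment refinement check in Leave:
# a change to the tracked state must happen at most once and must be
# one step of the refined atomic action's transition relation.

def check_fragment(snapshot, state, pc, done, action_rel):
    changed = snapshot != state                     # *CHANGED*
    # Mirrors the assertion on line 14 of the checker.
    assert not changed or (not pc and (snapshot, state) in action_rel)
    pc = pc or changed                              # line 15
    done = done or (snapshot, state) in action_rel  # line 16
    return pc, done

# RELEASE over a one-variable state: the lock goes from held (1) to free (0).
RELEASE = {(1, 0)}
pc, done = check_fragment(snapshot=1, state=0, pc=False, done=False,
                          action_rel=RELEASE)
print(pc, done)  # True True -- note the invariant pc ==> done holds
```

A stuttering fragment (state equal to the snapshot) leaves pc untouched, which is exactly why pc and done must be tracked separately.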
The verification of Leave illustrates an important principle of our approach to refinement. The gates of atomic actions invoked by a procedure \(P\) disappearing at layer \(\ell \) are verified using a combination of invariants established on \(\mathcal {C}_\ell \) and pending assertions at layer \(\ell + 1\), encoded as the gate of the atomic action refined by \(P\). For Leave specifically, the assertion assert b in RESET is propagated to a corresponding assertion in RELEASE. The latter assertion is verified in the checker program \(\mathcal{C}^{lock}_2\) when Worker, the caller of RELEASE, is shown to refine the action SKIP, which is guaranteed not to fail since its gate is \( true \).
Instrumenting Enter. The most sophisticated feature in a concurrent program is a pcall. The instrumentation of Leave explains the instrumentation of the simplest kind of pcall with only atomic-action arms. We now illustrate the instrumentation of a pcall containing a procedure arm using the procedure Enter, which refines the atomic action ACQUIRE and contains a pcall to Enter itself. The instrumentation of this pcall is contained in lines 30–43.
A pcall with a procedure arm is challenging for two reasons. First, the callee disappears at the same layer as the caller, so the checker program must reason about refinement for both the caller and the callee. This challenge is addressed by the code in lines 34–40. At line 34, we introduce a nondeterministic choice between two code paths—the then branch to check refinement of the caller and the else branch to check refinement of the callee. An explanation for this nondeterministic choice is given in the next two paragraphs. Second, a pcall with a procedure arm introduces a preemption, creating multiple preemption-free execution fragments. This challenge is addressed by two pieces of code. First, we check that lock and slots are updated correctly (lines 30–32) by the preemption-free execution fragment ending before the pcall. Second, we update the snapshot variables (line 42) to enable the verification of the preemption-free execution fragment beginning after the pcall.
Lines 35–37 in the then branch check refinement against the atomic action specification of the caller, exploiting the atomic action specification of the callee. The actual verification is performed in a fresh checker procedure invoked on line 35. Notice that this procedure depends on both the caller and the callee (indicated in colors), and that it preserves a necessary preemption point. The procedure has input parameters to receive the input of the caller (for refinement checking) and the input of the callee (to generate the behavior of the callee). Furthermore, pc may be updated in this procedure and is thus passed as both an input and an output parameter. In the body of the procedure, the invocation of action ACQUIRE on line 56 overapproximates the behavior of the callee. In the layered concurrent program (Fig. 6), the (recursive) pcall to Enter in the body of Enter is annotated with 1. This annotation indicates that for any execution passing through this pcall, ACQUIRE is deemed to occur during the execution of its unique arm. This is reflected in the checker program by updating done to \( true \) on line 37; the update is justified by the assertion at line 58. If the pcall being translated were instead unannotated, line 37 would be omitted.
Lines 39–40 in the else branch ensure that using the atomic action specification of the callee on line 56 is justified. Allowing the execution to continue to the callee ensures that the called procedure is invoked in all states allowed by \(\mathcal {P}_1\). However, the execution is blocked once the call returns to ensure that downstream code sees the side effect on pc and the snapshot variables.
To summarize, the crux of our instrumentation of procedure arms is to combine refinement checking of caller and callee. We explore the behaviors of the callee to check its refinement. At the same time, we exploit the atomic action specification of the callee to check refinement of the caller.
Instrumenting Unannotated Procedure Arms. Procedure Enter illustrates the instrumentation of an annotated procedure arm. The instrumentation of an unannotated procedure arm (both in an annotated or unannotated pcall) is simpler, because we only need to check that the tracked state is not modified. For such an arm to a procedure refining atomic action Action, we introduce a checker procedure (which is independent of the caller) comprising three instructions: take snapshots, pcall Action, and assert !*CHANGED*.
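These three instructions can be sketched directly. The following model (ours; the dictionary state and the passing arm are hypothetical) snapshots the tracked state, runs an overapproximation of the arm, and asserts that the tracked state did not change.

```python
# A sketch (ours) of the three-instruction checker for an unannotated
# procedure arm: snapshot, run the arm's action, require no change.

def check_unannotated_arm(state, action):
    snapshot = dict(state)      # take snapshots
    action(state)               # stands in for "pcall Action"
    assert state == snapshot, "unannotated arm modified tracked state"

# A hypothetical arm that leaves the tracked state untouched passes:
check_unannotated_arm({"lock": 0, "slots": []}, lambda s: None)
print("ok")
```

An arm whose action changed lock or slots would fail the final assertion, which is exactly the stuttering requirement for unannotated arms.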
Pcalls with Multiple Arms. Our examples show the instrumentation of pcalls with a single arm. Handling multiple arms is straightforward, since each arm is translated independently. Atomic action arms stay unmodified, annotated procedure arms are replaced with their corresponding caller-dependent checker procedures, and unannotated procedure arms are replaced with their corresponding caller-independent checker procedures.
Output Parameters. Our examples illustrate refinement checking for atomic actions that have no output parameters. In general, a procedure and its atomic action specification may return values in output parameters. We handle this generalization, but lack of space does not allow us to present the technical details.
5 Conclusion
In this paper, we presented layered concurrent programs, a programming notation to succinctly capture a multi-layered refinement proof capable of connecting a deeply-detailed implementation to a highly-abstract specification. We presented an algorithm to extract from the layered concurrent program the individual concurrent programs, from the most concrete to the most abstract. We also presented an algorithm to extract a collection of refinement checker programs that establish the connection among the sequence of concurrent programs encoded by the layered concurrent program. The cooperative safety of the checker programs and the preemptive safety of the most abstract concurrent program suffice to prove the preemptive safety of the most concrete concurrent program.
Layered programs have been implemented in Civl, a deductive verifier for concurrent programs, implemented as a conservative extension to the Boogie verifier [3]. Civl has been used to verify a complex concurrent garbage collector [6] and a state-of-the-art data-race detection algorithm [15]. In addition to these two large benchmarks, around fifty smaller programs (including a ticket lock and a lock-free stack) are available at https://github.com/boogie-org/boogie.
There are several directions for future work. We did not discuss how to verify an individual checker program. Civl uses the Owicki-Gries method [13] and rely-guarantee reasoning [8] to verify checker programs. But researchers are exploring many different techniques for the verification of concurrent programs. It would be interesting to investigate whether heterogeneous techniques could be brought to bear on checker programs at different layers.
In this paper, we focused exclusively on verification and did not discuss code generation, an essential aspect of any programming system targeting the construction of verified programs. There is a lot of work to be done in connecting the most concrete program in a concurrent layered program to executable code. Most likely, different execution platforms will impose different obligations on the most concrete program and the general idea of layered concurrent programs would be specialized for different target platforms.
Scalable verification is a challenge as the size of programs being verified increases. Traditionally, scalability has been addressed using modular verification techniques, but only for single-layer programs. It would be interesting to explore modularity techniques for layered concurrent programs in the context of a refinement-oriented proof system.
Layered concurrent programs bring new challenges and opportunities to the design of programming languages and development environments. Integrating layers into a programming language requires intuitive syntax to specify layer information and atomic actions. For example, ordered layer names can be more readable and easier to refactor than layer numbers. An integrated development environment could provide different views of the layered concurrent program. For example, it could show the concurrent program, the checker program, and the introduced code at a particular layer. Any updates made in these views should be automatically reflected back into the layered concurrent program.
Acknowledgements
We thank Hana Chockler, Stephen Freund, Thomas A. Henzinger, Viktor Toman, and James R. Wilcox for comments that improved this paper. This research was supported in part by the Austrian Science Fund (FWF) under grants S11402-N23 (RiSE/SHiNE) and Z211-N23 (Wittgenstein Award).
References
 1. Abrial, J.-R.: The B-Book - Assigning Programs to Meanings. Cambridge University Press, Cambridge (2005)
 2. Abrial, J.-R., Butler, M.J., Hallerstede, S., Hoang, T.S., Mehta, F., Voisin, L.: Rodin: an open toolset for modelling and reasoning in Event-B. STTT 12(6), 447–466 (2010). https://doi.org/10.1007/s10009-010-0145-y
 3. Barnett, M., Chang, B.-Y.E., DeLine, R., Jacobs, B., Leino, K.R.M.: Boogie: a modular reusable verifier for object-oriented programs. In: de Boer, F.S., Bonsangue, M.M., Graf, S., de Roever, W.-P. (eds.) FMCO 2005. LNCS, vol. 4111, pp. 364–387. Springer, Heidelberg (2006). https://doi.org/10.1007/11804192_17
 4. Elmas, T., Qadeer, S., Tasiran, S.: A calculus of atomic actions. In: Shao, Z., Pierce, B.C. (eds.) POPL 2009, pp. 2–15. ACM (2009). https://doi.org/10.1145/1480881.1480885
 5. Flanagan, C., Qadeer, S.: A type and effect system for atomicity. In: Cytron, R., Gupta, R. (eds.) PLDI 2003, pp. 338–349. ACM (2003). https://doi.org/10.1145/781131.781169
 6. Hawblitzel, C., Petrank, E., Qadeer, S., Tasiran, S.: Automated and modular refinement reasoning for concurrent programs. In: Kroening, D., Păsăreanu, C.S. (eds.) CAV 2015. LNCS, vol. 9207, pp. 449–465. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21668-3_26
 7. Hawblitzel, C., Petrank, E., Qadeer, S., Tasiran, S.: Automated and modular refinement reasoning for concurrent programs. Technical report MSR-TR-2015-8, Microsoft Research, February 2015. https://www.microsoft.com/en-us/research/publication/automated-and-modular-refinement-reasoning-for-concurrent-programs/
 8. Jones, C.B.: Specification and design of (parallel) programs. In: IFIP Congress (1983)
 9. Lamport, L.: Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers. Addison-Wesley, Boston (2002)
 10. Lipton, R.J.: Reduction: a method of proving properties of parallel programs. Commun. ACM 18(12), 717–721 (1975). https://doi.org/10.1145/361227.361234
 11. McMillan, K.L.: A compositional rule for hardware design refinement. In: Grumberg, O. (ed.) CAV 1997. LNCS, vol. 1254, pp. 24–35. Springer, Heidelberg (1997). https://doi.org/10.1007/3-540-63166-6_6
 12. McMillan, K.L.: Verification of an implementation of Tomasulo's algorithm by compositional model checking. In: Hu, A.J., Vardi, M.Y. (eds.) CAV 1998. LNCS, vol. 1427, pp. 110–121. Springer, Heidelberg (1998). https://doi.org/10.1007/BFb0028738
 13. Owicki, S.S., Gries, D.: Verifying properties of parallel programs: an axiomatic approach. Commun. ACM 19(5), 279–285 (1976). https://doi.org/10.1145/360051.360224
 14. Wadler, P.: Linear types can change the world! In: Programming Concepts and Methods (1990)
 15. Wilcox, J.R., Flanagan, C., Freund, S.N.: VerifiedFT: a verified, high-performance precise dynamic race detector. In: Krall, A., Gross, T.R. (eds.) PPoPP 2018, pp. 354–367. ACM (2018). https://doi.org/10.1145/3178487.3178514
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.