Safe Deferred Memory Reclamation with Types

Memory management in lock-free data structures remains a major challenge in concurrent programming. Design techniques including read-copy-update (RCU) and hazard pointers provide workable solutions, and are widely used to great effect. These techniques rely on the concept of a grace period: nodes that should be freed are placed on a \emph{deferred} free list, and all threads obey a protocol to ensure that the deallocating thread can detect when all possible readers have completed their use of the object. This provides an approach to safe deallocation, but only when these subtle protocols are implemented correctly. We present a static type system to ensure correct use of RCU memory management: that nodes removed from a data structure are always scheduled for subsequent deallocation, and that nodes are scheduled for deallocation at most once. As part of our soundness proof, we give an abstract semantics for RCU memory management primitives which captures the fundamental properties of RCU. Our type system allows us to give the first proofs of memory safety for RCU linked list and binary search tree implementations without requiring full verification.


Introduction
For many workloads, lock-based synchronization, even fine-grained locking, has unsatisfactory performance. Often lock-free algorithms yield better performance, at the cost of more complex implementations that are harder to reason about. Much of this complexity is due to memory management: developers must reason not only about other threads violating local assumptions, but also about whether other threads have finished accessing nodes before deallocating them. At the time a node is unlinked from a data structure, an unknown number of additional threads may already be using the node, having read a pointer to it before it was unlinked in the heap.
A key insight for manageable solutions to this challenge is to recognize that just as in traditional garbage collection, the unlinked nodes need not be reclaimed immediately, but can instead be reclaimed later after some protocol finishes running. Hazard pointers [29] are the classic example: all threads actively collaborate on bookkeeping data structures to track who is using a certain reference. For structures with read-biased workloads, Read-Copy-Update (RCU) [24] provides an appealing alternative. The programming style resembles a combination of reader-writer locks and lock-free programming. Multiple concurrent readers perform minimal bookkeeping -often nothing they wouldn't already do. A single writer at a time runs in parallel with readers, performing additional work to track which readers may have observed a node they wish to deallocate. There are now RCU implementations of many common tree data structures [10,34,6,25,3,21], and RCU plays a key role in Linux kernel memory management [28].
However, RCU primitives remain non-trivial to use correctly: developers must ensure they release each node exactly once, from exactly one thread, after ensuring other threads are finished with the node in question. Model checking can be used to validate correctness of implementations for a mock client [22,9,20,1], but this does not guarantee correctness of arbitrary client code. Sophisticated verification logics can prove correctness of the RCU primitives and clients [18,14,33,23]. But these techniques require significant verification expertise to apply, and are specialized to individual data structures or implementations. Much of the sophistication in these logics stems from the complexity of the underlying memory reclamation model. However, Meyer and Wolff [?] show that a suitable abstraction enables separating the verification of a concurrent data structure from that of its underlying reclamation model under the assumption of memory safety, and they study proofs of correctness under that assumption.
We propose a type system to ensure that RCU client code uses the RCU primitives safely, ensuring memory safety for concurrent data structures using RCU memory management. We do this in a general way, not assuming the client implements any specific data structure, only that it satisfies some basic properties (such as having a tree memory footprint) common to RCU data structures. To do so, we must also give a formal operational model of the RCU primitives that abstracts over many implementations without assuming any particular one. We describe our RCU semantics and type system, prove our type system sound against the model (which ensures memory is reclaimed correctly), and show the type system in action on two important RCU data structures.
Our contributions include: -A general (abstract) operational model for RCU-based memory management -A type system that ensures code uses RCU memory management correctly, and which is significantly simpler than full-blown verification logics -Demonstration of the type system on two examples: a linked-list-based bag and a binary search tree -A proof that the type system guarantees memory safety when using RCU primitives.

Background & Motivation
In this section, we recall the general concepts of read-copy-update concurrency. We use the RCU linked-list-based bag [26] from Figure 1 as a running example. It includes annotations for our type system, which will be explained in Section 4.2. As with concrete RCU implementations, we assume threads operating on a structure are either performing read-only traversals of the structure (reader threads) or performing an update (writer threads), similar to the use of many-reader single-writer reader-writer locks. It differs, however, in that readers may execute concurrently with the (single) writer.
This distinction, and some runtime bookkeeping associated with the read- and write-side critical sections, allow this model to determine at modest cost when a node unlinked by the writer can safely be reclaimed. Figure 1 gives the code for adding and removing nodes from a bag. Type checking for all code, including the bag's membership query, can be found in Appendix C. Algorithmically, this code is nearly the same as any sequential implementation. There are only two differences. First, the read-side critical section in member is indicated by the use of ReadBegin and ReadEnd; the write-side critical section is between WriteBegin and WriteEnd. Second, rather than immediately reclaiming the memory for the unlinked node, remove calls SyncStart to begin a grace period, a wait for reader threads that may still hold references to unlinked nodes to finish their critical sections. SyncStop blocks execution of the writer thread until these readers exit their read critical section (via ReadEnd). These are the essential primitives for the implementation of an RCU data structure.
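As a concrete illustration, the writer-side protocol just described can be modelled in a few lines. The following is a minimal, single-threaded Python sketch of the bag's add and remove; the primitive names mirror the paper's, but their bodies here are stand-in no-ops (our simplification): in a real implementation, WriteBegin/WriteEnd acquire and release the writer lock, and SyncStart/SyncStop wait out a grace period.

```python
class Node:
    def __init__(self, data, nxt=None):
        self.data, self.next = data, nxt

class RCUBag:
    def __init__(self):
        self.head = Node(None)   # sentinel head; its next is the first element

    # Stand-in primitives (no-ops in this single-threaded sketch).
    def WriteBegin(self): pass
    def WriteEnd(self): pass
    def SyncStart(self): pass
    def SyncStop(self): pass

    def add(self, data):
        self.WriteBegin()
        nw = Node(data, self.head.next)   # fresh node, fields set before linking
        self.head.next = nw               # a single mutation links it in
        self.WriteEnd()

    def remove(self, data):
        self.WriteBegin()
        par, cur = self.head, self.head.next
        while cur is not None and cur.data != data:
            par, cur = cur, cur.next
        found = cur is not None
        if found:
            par.next = cur.next   # physical unlink: exactly one node removed
            self.SyncStart()      # begin grace period for readers holding cur
            self.SyncStop()       # block until those readers finish
            cur = None            # only now is reclamation safe (GC'd here)
        self.WriteEnd()
        return found
```

Note how remove performs exactly one heap mutation (par.next = cur.next) before waiting out the grace period, matching the one-node-at-a-time discipline the type system of Section 4 enforces.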
These six primitives together track a critical piece of information: which reader threads' critical sections overlapped the writer's. Implementing them efficiently is challenging [10], but possible. The Linux kernel, for example, finds ways to reuse existing task-switch mechanisms for this tracking, so readers incur no additional overhead. The reader primitives are semantically straightforward: they atomically record the start, or completion, of a read-side critical section.
The more interesting primitives are the write-side primitives and memory reclamation. WriteBegin performs a (semantically) standard mutual exclusion with regard to other writers, so only one writer thread may modify the structure or the writer structures used for grace periods. SyncStart and SyncStop implement grace periods [31]: a mechanism to wait for readers to finish with any nodes the writer may have unlinked. A grace period begins when a writer requests one, and finishes when all reader threads active at the start of the grace period have finished their current critical section. Any nodes a writer unlinks before a grace period are physically unlinked, but not logically unlinked until after one grace period.
An attentive reader might already have noticed that our usage of logical/physical unlinking differs from the one used in the data-structures literature, where typically a logical deletion (marking/unlinking) is followed by a physical deletion (free). Because all threads are forbidden from holding an interior reference into the data structure after leaving their critical sections, waiting for active readers to finish their critical sections ensures they are no longer using any nodes the writer unlinked prior to the grace period. This makes actually freeing an unlinked node after a grace period safe.
SyncStart conceptually takes a snapshot of all readers active when it is run. SyncStop then blocks until all those threads in the snapshot have finished at least one critical section. SyncStop does not wait for all readers to finish, and does not wait for all overlapping readers to simultaneously be out of critical sections.
To date, every description of RCU semantics, most centered around the notion of a grace period, has been given algorithmically, as a specific (efficient) implementation. While the implementation aspects are essential to real use, the lack of an abstract characterization makes judging the correctness of these implementations, or their clients, difficult in general. In Section 3 we give formal abstract, operational semantics for RCU implementations: inefficient if implemented directly, but correct from a memory-safety and programming-model perspective, and not tied to low-level RCU implementation details. To use these semantics or a concrete implementation correctly, client code must ensure: -Reader threads never modify the structure -No thread holds an interior pointer into the RCU structure across critical sections -Unlinked nodes are always freed by the unlinking thread after the unlinking, after a grace period, and inside the critical section -Nodes are freed at most once In practice, RCU data structures typically ensure additional invariants to simplify the above, e.g.: -The data structure is always a tree -A writer thread unlinks or replaces only one node at a time. Our type system in Section 4 guarantees these invariants.

Semantics
In this section, we outline the details of an abstract semantics for RCU implementations. It captures the core client-visible semantics of most RCU primitives, but not the implementation details required for efficiency [28]. In our semantics, shown in Figure 2, an abstract machine state, MState, contains: -A stack s, of type Var × TID ⇀ Loc -A heap, h, of type Loc × FName ⇀ Val -A lock, l, of type TID ⊎ {unlocked} -A root location rt of type Loc -A read set, R, of type P(TID) and -A bounding set, B, of type P(TID) The lock l enforces mutual exclusion between write-side critical sections. The root location rt is the root of an RCU data structure. We model only a single global RCU data structure; the generalization to multiple structures is straightforward but complicates the formal development later in the paper. The reader set R tracks the thread IDs (TIDs) of all threads currently executing a read block. The bounding set B tracks which threads the writer is actively waiting for during a grace period; it is empty if the writer is not waiting. Figure 2 gives operational semantics for atomic actions; conditionals, loops, and sequencing all have standard semantics, and parallel composition uses sequentially consistent interleaving semantics.
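For readers who prefer code, the machine state transcribes directly into a record; the following Python sketch mirrors the components listed above, and the write_begin helper is our illustrative addition modelling the lock-acquisition side condition of WriteBegin (not the paper's formal rule).

```python
from dataclasses import dataclass, field

UNLOCKED = "unlocked"

@dataclass
class MState:
    s: dict = field(default_factory=dict)   # stack: (var, tid) -> loc
    h: dict = field(default_factory=dict)   # heap: (loc, fname) -> val
    l: object = UNLOCKED                    # lock: a TID, or "unlocked"
    rt: int = 0                             # root location of the structure
    R: set = field(default_factory=set)     # read set: TIDs inside read blocks
    B: set = field(default_factory=set)     # bounding set: TIDs being waited on

def write_begin(m, tid):
    # WriteBegin only reduces (acquires) when the lock is unlocked.
    if m.l == UNLOCKED:
        m.l = tid
        return True
    return False

m = MState()
assert write_begin(m, 1)       # writer 1 acquires the lock
assert not write_begin(m, 2)   # writer 2 is excluded until WriteEnd
```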
The first few atomic actions, for writing and reading fields, assigning among local variables, and allocating new objects, are typical of formal semantics for heaps and mutable local variables. Free is similarly standard. A writer thread's critical section is bounded by WriteBegin and WriteEnd, which acquire and release the lock that enforces mutual exclusion between writers. WriteBegin only reduces (acquires) if the lock is unlocked.
Standard RCU APIs include a primitive synchronize rcu() to wait for a grace period for the current readers. We decompose this here into two actions, SyncStart and SyncStop. SyncStart initializes the blocking set to the current set of readers -the threads that may have already observed any nodes the writer has unlinked. SyncStop blocks until the blocking set is emptied by completing reader threads. However, it does not wait for all readers to finish, and does not wait for all overlapping readers to simultaneously be out of critical sections. If two reader threads A and B overlap some SyncStart-SyncStop's critical section, it is possible that A may exit and re-enter a read-side critical section before B exits, and vice versa. Implementations must distinguish subsequent read-side critical sections from earlier ones that overlapped the writer's initial request to wait: since SyncStart is used after a node is physically removed from the data structure and readers may not retain RCU references across critical sections, A re-entering a fresh read-side critical section will not permit it to re-observe the node to be freed.
Reader thread critical sections are bounded by ReadBegin and ReadEnd. ReadBegin simply records the current thread's presence as an active reader. ReadEnd removes the current thread from the set of active readers, and also removes it (if present) from the blocking set -if a writer was waiting for a certain reader to finish its critical section, this ensures the writer no longer waits once that reader has finished its current read-side critical section.
Grace periods are implemented by the combination of ReadBegin, ReadEnd, SyncStart, and SyncStop. ReadBegin ensures the set of active readers is known. When a grace period is required, SyncStart;SyncStop will store (in B) the active readers (which may have observed nodes before they were unlinked), and wait for reader threads to record when they have completed their critical sections (and implicitly, dropped any references to nodes the writer wants to free) via ReadEnd.
These semantics do permit a reader in the blocking set to finish its read-side critical section and enter a new read-side critical section before the writer wakes. In this case, the writer waits only for the first critical section of that reader to complete, since entering the new critical section adds the thread's ID back to R, but not B.
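The following Python sketch models just the R and B components of these transitions, and checks exactly the scenario described above: a reader in the blocking set that exits and re-enters a read-side critical section is not waited on again.

```python
# Executable sketch of the reader/grace-period transitions (sets R and B
# as in Figure 2). SyncStop blocks until B is empty; we model that as a
# predicate rather than actual blocking.

R, B = set(), set()

def ReadBegin(tid):   R.add(tid)
def ReadEnd(tid):     R.discard(tid); B.discard(tid)
def SyncStart():      B.clear(); B.update(R)   # snapshot the active readers
def SyncStop_ready(): return not B             # writer may proceed iff B is empty

# Scenario: readers 1 and 2 are active when the writer starts a grace period.
ReadBegin(1); ReadBegin(2)
SyncStart()
assert not SyncStop_ready()      # writer must wait for both readers
ReadEnd(1); ReadBegin(1)         # reader 1 exits and re-enters ...
assert 1 in R and 1 not in B     # ... it is active again, but no longer waited on
ReadEnd(2)
assert SyncStop_ready()          # B is empty: grace period over, despite reader 1
```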

Type System & Programming Language
In this section, we present a simple imperative programming language with two block constructs for modelling RCU, and a type system that ensures proper (memory-safe) use of the language. The type system ensures memory safety by enforcing these sufficient conditions: -A heap node can only be freed if it is no longer accessible from an RCU data structure or from the local variables of other threads. To achieve this, the types track reachability from the root and restrict access accordingly; we explain how our types support a delayed ownership transfer for deallocation. -Local variables may not point inside an RCU data structure unless they are inside an RCU read or write block. -Heap mutations are local: each unlinks or replaces exactly one node.
-The RCU data structure remains a tree. While not a fundamental constraint of RCU, it is a common constraint across known RCU data structures because it simplifies reasoning (by developers or a type system) about when a node has become unreachable in the heap.
We also demonstrate that the type system is not only sound, but useful: we show how it types Figure 1's list-based bag implementation [26]. We also give type-checked fragments of a binary search tree to motivate advanced features of the type system; the full typing derivation can be found in Appendix B. The BST requires type narrowing operations that refine a type based on dynamic checks (e.g., determining which of several fields links to a node). In our system, we presume all objects contain all fields, but the number of fields is finite (and in our examples, small). This avoids additional overhead from tracking well-established aspects of the type system (class and field types and presence, for example) and lets us focus on checking correct use of RCU primitives. Essentially, we assume the code our type system applies to is already type-correct for a system like C or Java's type system.

RCU Type System for Write Critical Section
Section 4.1 introduces RCU types and the need for subtyping. Section 4.2 shows how types describe program states, using the code for Figure 1's list-based bag example. Section 4.3 introduces the type system itself.
RCU Types There are six types used in write-side critical sections. rcuItr is the type given to references pointing into a shared RCU data structure. An rcuItr type can be used in either a write region or a read region (without the additional components). It indicates both that the reference points into the shared RCU data structure and that the heap location referenced by the rcuItr reference is reachable by following the path ρ from the root. The component N is a field map, taking field names to local variable names. Field maps are extended when the referent's fields are read. The field map and path components track reachability from the root, and local reachability between nodes. These are used to ensure the structure remains acyclic, and for the type system to recognize exactly when unlinking can occur. Read-side critical sections use rcuItr without path or field map components. These components are both unnecessary for readers (who perform no updates) and would be invalidated by writer threads anyway. Under the assumption that reader threads do not hold references across critical sections, the read-side rules essentially only ensure the reader performs no writes, so we omit the reader critical section type rules. They may be found in Appendix E.
unlinked is the type given to references to unlinked heap locations: objects previously part of the structure, but now unreachable via the heap. A heap location referenced by an unlinked reference may still be accessed by reader threads, which may have acquired their own references before the node became unreachable. Newly-arrived readers, however, will be unable to gain access to these referents.
freeable is the type given to references to an unlinked heap location that is safe to reclaim because it is known that no concurrent readers hold references to it. Unlinked references become freeable after a writer has waited for a full grace period.
undef is the type given to references where the content of the referenced location is inaccessible. A local variable of type freeable becomes undef after reclaiming that variable's referent.
rcuFresh is the type given to references to freshly allocated heap locations. Like the rcuItr type, it carries a field map N. When we replace the heap node referenced by an rcuItr reference with the node referenced by an existing rcuFresh reference, we require the fresh node's field map to match the rcuItr reference's field map, which makes the replacement memory safe.
rcuRoot is the type given to the fixed reference to the root of the RCU data structure. It may not be overwritten.
Subtyping It is sometimes necessary to use imprecise types, mostly for control flow joins. Our type system performs these abstractions via subtyping on individual types and full contexts, as in Figure 3, which includes four judgments for subtyping. The first two, ⊢ N ≺: N′ and ⊢ ρ ≺: ρ′, describe relaxations of field maps and paths respectively. ⊢ N ≺: N′ is read as "the field map N is more precise than N′", and similarly for paths. The third judgment ⊢ T ≺: T′ uses path and field map subtyping to give subtyping among rcuItr types (one rcuItr is a subtype of another if its paths and field maps are correspondingly more precise) and to allow rcuItr references to be "forgotten", which is occasionally needed to satisfy non-interference checks in the type rules. The final judgment ⊢ Γ ≺: Γ′ extends subtyping to all assumptions in a type context.
It is often necessary to abstract the contents of field maps or paths, without simply forgetting the contents entirely. In a binary search tree, for example, it may be the case that one node is a child of another, but which parent field points to the child depends on which branch was followed in an earlier conditional (consider the lookup in a BST, which alternates between following left and right children). In Figure 5, we see that cur aliases different fields of par, either Left or Right, in different branches of the conditional. The types after the conditional must overapproximate this, here as Left|Right → cur in par's field map, and a similar path disjunction in cur's path. This is reflected in Figure 3's T-NSub1-5 and T-PSub1-2: within each branch, each type is coerced to a supertype to validate the control flow join.
Another type of control flow join is handling loop invariants -where paths entering the loop meet the back-edge from the end of a loop back to the start for repetition. Because our types include paths describing how they are reachable from the root, some abstraction is required to give loop invariants that work for any number of iterations -in a loop traversing a linked list, the iterator pointer would naïvely have different paths from the root on each iteration, so the exact path is not loop invariant. However, the paths explored by a loop are regular, so we can abstract the paths by permitting (implicitly) existentially quantified indexes on path fragments, which express the existence of some path, without saying which path. The use of an explicit abstract repetition allows the type system to preserve the fact that different references have common path prefixes, even after a loop.
Assertions for the add function in lines 19 and 20 of Figure 1 show the loop's effects on the paths of the iterator references used inside the loop, cur and par. On line 20, par's path contains (Next)^k. The k in (Next)^k abstracts the number of loop iterations run, implicitly assumed to be non-negative. The trailing Next in cur's path on line 19, (Next)^k.Next, expresses the relationship between cur and par: par is reachable from the root by following Next k times, and cur is reachable via one additional Next. The types of lines 19 and 20, however, are not the same as those of lines 23 and 24, so an additional adjustment is needed for the types to become loop-invariant. Reindexing (T-ReIndex in Figure 4) effectively increments an abstract loop counter, contracting (Next)^k.Next to (Next)^k everywhere in a type environment. This expresses the same relationship between par and cur as before the loop, but the choice of k that makes these paths accurate after each iteration would be one larger than the choice before. Reindexing the type environment of lines 23-24 yields the type environment of lines 19-20, making the types loop invariant. The reindexing essentially chooses a new value for the abstract k. This is sound because the uses of framing in the type system (T-UnlinkH, T-Replace and T-Insert) ensure uses of any indexing variable are never separated: either all are reindexed, or none are.
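Reindexing can be illustrated with a small sketch. Here paths are plain strings (our simplification of the formal path syntax), and reindex contracts (Next)^k.Next to (Next)^k throughout an environment, as T-ReIndex does. The environment below corresponds to the end of a list-traversal loop body.

```python
def reindex(env):
    # Contract (Next)^k.Next to (Next)^k in every path of the environment.
    # Applying this to all paths at once mirrors the framing discipline:
    # all uses of the index k are reindexed together, or none are.
    return {v: p.replace("(Next)^k.Next", "(Next)^k") for v, p in env.items()}

# Paths at the end of one loop iteration (par advanced by one Next, cur two).
end_of_body = {"par": "(Next)^k.Next", "cur": "(Next)^k.Next.Next"}

# Reindexing recovers the loop-head invariant: par at (Next)^k, cur one further.
assert reindex(end_of_body) == {"par": "(Next)^k", "cur": "(Next)^k.Next"}
```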
While abstraction is required to deal with control flow joins, reasoning about whether and which nodes are unlinked or replaced, and whether cycles are created, requires precision. Thus the type system also includes means (Figure 4) to refine imprecise paths and field maps. In Figure 5, we see a conditional with the condition par.Left == cur. The type system matches this condition to the imprecise types in line 1's typing assertion, and refines the initial type assumptions in each branch accordingly (lines 2 and 7) based on whether execution reflects the truth or falsity of that check. Similarly, it is sometimes necessary to check, and later remember, whether a field is null, and the type system supports this.

Types in Action
The system has three forms of typing judgement: Γ ⊢ C for standard typing outside RCU critical sections; Γ ⊢ R C ⊣ Γ ′ for reader critical sections, and Γ ⊢ M C ⊣ Γ ′ for writer critical sections. The first two are straightforward, essentially preventing mutation of the data structure, and preventing nesting of a writer critical section inside a reader critical section. The last, for writer critical sections, is flow sensitive: the types of variables may differ before and after program statements. This is required in order to reason about local assumptions at different points in the program, such as recognizing that a certain action may unlink a node. Our presentation here focuses exclusively on the judgment for the write-side critical sections.
Below, we explain our types through the list-based bag implementation [26] from Figure 1, highlighting how the type rules handle different parts of the code. Figure 1 is annotated with "assertions" -local type environments -in the style of a Hoare logic proof outline. As with Hoare proof outlines, these annotations can be used to construct a proper typing derivation.
Reading a Global RCU Root All RCU data structures have fixed roots, which we characterize with the rcuRoot type. Each operation in Figure 1 begins by reading the root into a new rcuItr reference used to begin traversing the structure. After each initial read (line 12 of add and line 4 of remove), the path of the cur reference is the empty path (ε) and its field map is empty ({}), because cur is an alias of the root and none of its field contents are known yet.
Reading an Object Field and a Variable As expected, we traverse the data structure by reading object fields. Consider line 6 of remove and its corresponding pre- and post-type environments. Initially par's field map is empty. After the field read, its field map is updated to reflect that its Next field is aliased in the local variable cur. Likewise, after the update, cur's path is Next (= ε · Next), extending the par node's path by the field read. This introduces field aliasing information that can subsequently be used to reason about unlinking.
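As a sketch of this bookkeeping, the following models a type environment as a map from variables to (path, field map) pairs and performs the update for a field read z = x.f. The helper name read_field and this representation are ours, for illustration, not the formal rule.

```python
def read_field(env, z, x, f):
    # Models typing the statement z = x.f:
    #  - z's path becomes x's path extended by f,
    #  - x's field map records that field f is aliased by local variable z.
    path, fmap = env[x]
    env[z] = (path + "." + f if path else f, {})   # fresh reference, empty map
    fmap[f] = z
    return env

# Line 6 of remove: par aliases the root (empty path, empty field map),
# then cur = par.Next is read.
env = {"par": ("", {})}
read_field(env, "cur", "par", "Next")
assert env["cur"][0] == "Next"              # cur's path is eps . Next
assert env["par"][1] == {"Next": "cur"}     # par's field map: Next -> cur
```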
Unlinking Nodes Line 24 of remove in Figure 1 unlinks a node. The type annotations show that before that line cur is in the structure (rcuItr), while afterwards its type is unlinked. The type system checks that this unlink disconnects only one node: note how the types of par, cur, and curl just before line 24 completely describe a section of the list.
Grace and Reclamation After the referent of cur is unlinked, concurrent readers traversing the list may still hold references. So it is not safe to actually reclaim the memory until after a grace period. Lines 28-29 of remove initiate a grace period and wait for its completion. At the type level, this is reflected by the change of cur's type from unlinked to freeable, reflecting the fact that the grace period extends until any reader critical sections that might have observed the node in the structure have completed. This matches the precondition required by our rules for calling Free, which further changes the type of cur to undef, reflecting that cur is no longer a valid reference. The type system also ensures no local (writer) aliases exist to the freed node; this enforcement is twofold. First, the type system requires that only unlinked heap nodes can be freed. Second, framing relations in the related rules (T-UnlinkH, T-Replace and T-Insert) ensure no local aliases still consider the node linked.
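The sequence of types a removed node's reference passes through behaves like a small type-state automaton. The following Python sketch is our illustration of that progression, not the formal rules; it rejects, for instance, a free attempted before the grace period.

```python
# Type-state progression for a write-side reference, per the rules above:
#   rcuItr --unlink--> unlinked --grace period--> freeable --Free--> undef
TRANSITIONS = {
    ("rcuItr",   "unlink"): "unlinked",
    ("unlinked", "grace"):  "freeable",
    ("freeable", "free"):   "undef",
}

def step(state, action):
    key = (state, action)
    if key not in TRANSITIONS:
        raise TypeError(f"illegal {action!r} on a {state!r} reference")
    return TRANSITIONS[key]

t = step("rcuItr", "unlink")   # line 24 of remove
t = step(t, "grace")           # lines 28-29: SyncStart;SyncStop
t = step(t, "free")            # Free(cur)
assert t == "undef"

# Freeing before waiting out a grace period is rejected:
try:
    step("unlinked", "free")
    raise AssertionError("premature free should have been rejected")
except TypeError:
    pass
```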
Fresh Nodes Some code must also allocate new nodes, and the type system must reason about how they are incorporated into the shared data structure. Line 8 of the add method allocates a new node nw, and lines 10 and 29 initialize its fields. The type system gives it a fresh type while tracking its field contents, until line 32 inserts it into the data structure. The type system checks that nodes previously reachable from cur remain reachable: note the field maps of cur and nw in lines 30-31 are equal (trivially, though in general the field need not be null). Figure 6 gives the primary type rules used in checking write-side critical section code as in Figure 1.

Type Rules
T-Root reads a root pointer into an rcuItr reference, and T-ReadS copies a local variable into another. In both cases, the free variable condition ensures that updating the modified variable does not invalidate field maps of other variables in Γ . These free variable conditions recur throughout the type system, and we will not comment on them further. T-Alloc and T-Free allocate and reclaim objects. These rules are relatively straightforward. T-ReadH reads a field into a local variable. As suggested earlier, this rule updates the post-environment to reflect that the overwritten variable z holds the same value as x.f . T-WriteFH updates a field of a fresh (thread-local) object, similarly tracking the update in the fresh object's field map at the type level. The remaining rules are a bit more involved, and form the heart of the type system.
Grace Periods T-Sync gives pre- and post-environments to the compound statement SyncStart;SyncStop implementing grace periods. As mentioned earlier, it updates the environment to reflect that any nodes unlinked before the wait become freeable afterwards.
Unlinking T-UnlinkH type checks heap updates that remove a node from the data structure. The rule assumes three objects x, z, and r, whose identities we will conflate with the local variable names in the type rule. The rule checks the case where x.f1 == z and z.f2 == r initially (reflected in the path and field map components), and a write x.f1 = r removes z from the data structure (we assume, and ensure, the structure is a tree).
The rule must also avoid unlinking multiple nodes: this is the purpose of the first (smaller) implication, which ensures that beyond the reference from z to r, all other fields of z contain null, so the write disconnects only z. Finally, the rule must ensure that no types in Γ are invalidated. This could happen in one of two ways: either a field map in Γ for an alias of x duplicates the assumption that x.f1 == z (which is changed by this write), or Γ contains a descendant of r, whose path from the root will change when its ancestor is modified. The final assumption of T-UnlinkH (the implication) checks that for every rcuItr reference n in Γ, it is not a path alias of x, z, or r; no entry of its field map (m) refers to r or z (which would imply n aliased x or z initially); and its path is not an extension of r's (i.e., it is not a descendant).
MayAlias is a predicate on two paths (or a path and set of paths) which is true if it is possible that any concrete paths the arguments may abstract (e.g., via adding non-determinism through | or abstracting iteration with indexing) could be the same. The negation of a MayAlias use is true only when the paths are guaranteed to refer to different locations in the heap.
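A simplified MayAlias over such paths can be sketched as follows. This version (our illustration) handles field disjunction with |, but for brevity ignores indexed repetition like (Next)^k; it relies on the tree invariant, under which paths of different lengths denote different nodes.

```python
def may_alias(p1, p2):
    # Paths are '.'-separated fields; 'Left|Right' abstracts either field.
    # Two abstract paths may alias iff they have the same length and, at
    # every position, their sets of possible fields intersect.
    f1, f2 = p1.split("."), p2.split(".")
    if len(f1) != len(f2):
        return False   # in a tree, paths of different depths never alias
    return all(set(a.split("|")) & set(b.split("|")) for a, b in zip(f1, f2))

assert may_alias("Left|Right.Left", "Right.Left")   # Right.Left is one instance
assert not may_alias("Left.Left", "Right.Left")     # first steps must differ
assert not may_alias("Left", "Left.Left")           # different depths
```

The negation of may_alias is then the guarantee the type rules need: the two references definitely denote different heap locations.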
Replacing with a Fresh Node Replacing with an rcuFresh reference faces the same aliasing complications as direct unlinking. We illustrate these challenges in Figures 7a and 7b (Figures 31a and 31b in Appendix D illustrate the corresponding complexities in unlinking). The square R nodes are root nodes, and H nodes are general heap nodes. All resources in red and green form the memory footprint of the mutation. The hollow red circular nodes, pr and cr, point to the nodes involved in replacing H1 (referenced by cr) with Hf (referenced by cf) in the structure. We may have a0 and a1, which are aliases of pr and cr respectively. They are path aliases, as they share the same path from the root to the node they reference. Edge labels l and r are abbreviations for the Left and Right fields of a binary search tree. The dashed green Hf denotes the freshly allocated heap node referenced by the green dashed cf. The dashed green field l is set to point to the referent of cl, and the green dashed field r is set to point to the referent of the heap node referenced by lm.
Hf is initially (Figure 7a) not part of the shared structure. If it were, it would violate the tree shape requirement imposed by the type system. This is why we highlight it separately in green: its static type would be rcuFresh. Note that we cannot duplicate an rcuFresh variable, nor read a field of an object it points to. This restriction localizes our reasoning about the effects of replacing with a fresh node to just one fresh reference and the object it points to. Otherwise another mechanism would be required to ensure that once a fresh reference was linked into the heap, there were no aliases still typed as fresh, since that would have risked linking the same reference into the heap in two locations.
The transition from Figure 7a to 7b illustrates the effects of the heap mutation (replacing with a fresh node). The reasoning in the type system for replacing with a fresh node is nearly the same as for unlinking an existing node, with one exception: in replacing with a fresh node, there is no need to consider the paths of nodes deeper in the tree than the point of mutation. In the unlinking case, those nodes' static paths would become invalid; in the replacement case, those descendants' paths are preserved. Our type rule for ensuring safe replacement (T-Replace) prevents path aliasing (nonexistence of a0 and a1) by negating a MayAlias query, and prevents field mapping aliasing (nonexistence of any object field from any other context pointing to cr) by asserting (y = o). It is important to note that the objects (H4, H2) in the field mappings of cr, whose referent is to be unlinked, are captured by the field mappings of the heap node referenced by cf in rcuFresh. This enforces locality on the heap mutation, and is captured by the assertion N = N' in the type rule (T-Replace).
Inserting a Fresh Node T-Insert type checks heap updates that link a fresh node into a linked data structure. Inserting a rcuFresh reference also faces some of the aliasing complications that we have already discussed for direct unlinking and replacing a node. Unlike the replacement case, the path to the last heap node (the referent of o) from the root is extended by f, which risks falsifying the paths for aliases and descendants of o. We prevent this inconsistency with the same style of aliasing checks. There is also another rule, T-LinkF-Null, not shown in Figure 6, which handles the case where the fields of the fresh node are not object references, but instead all contain null (e.g., for appending to the end of a linked list or inserting a leaf node in a tree).
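The T-LinkF-Null case can be pictured operationally. The following Python sketch (class and function names are ours, not the paper's language) shows the heap effect being checked: a fresh node with all-null fields is published by a single field write into an empty field, so no descendant paths exist that the insertion could falsify.

```python
# A sketch of the T-LinkF-Null pattern: a fresh node whose fields are
# all null is linked into the structure by one publishing field write.

class Node:
    def __init__(self, val):
        self.val = val
        self.Left = None   # fresh node: all fields null
        self.Right = None

def insert_leaf(parent, field, fresh):
    # The fresh node is unreachable from the structure, and the target
    # field is empty, so publishing it cannot create a second path.
    assert getattr(parent, field) is None
    setattr(parent, field, fresh)  # the single publishing write

root = Node(10)
insert_leaf(root, "Left", Node(5))
insert_leaf(root, "Right", Node(20))
assert root.Left.val == 5 and root.Right.val == 20
```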
Entering a Critical Section (Referencing inside RCU Blocks) There are two rules (ToRCURead in Figure 32, Appendix E, and ToRCUWrite in Figure 6) for moving to RCU typing: one for entering a write region, and one for a read region.

Evaluation
We have used our type system to check correct use of RCU primitives in two RCU data structures representative of the broader space. Figure 1 gives the type-annotated code for add and remove operations on a linked list implementation of a bag data structure, following McKenney's example [26]. Appendix C contains the code for membership checking.
We have also type checked the most challenging part of an RCU binary search tree, the deletion (which also contains the code for a lookup). Our implementation is a slightly simplified version of the Citrus BST [3]: their code supports fine-grained locking for multiple writers, while ours supports only one writer by virtue of using our single-writer primitives. For lack of space the annotated code is only in Appendix B, but it motivates some of the conditional-related flexibility discussed in Section 4.2. The use of disjunction (Left|Right) in field maps and paths is required to capture traversals which follow different fields at different times, such as the lookup in a binary search tree.
The most subtle aspect of the deletion is the final step when the node R to remove has both children. In this case, the value V of the left-most node of R's right child, the next element in the collection order, is copied into a new node, which is then used to replace node R: the replacement's fields exactly match R's except for the data (T-Replace via N1 = N2), and the parent is updated to reference the replacement, unlinking R. At this point, there are two nodes with value V in the tree (the weak BST property of Citrus [3]): the replacement node, and what was the left-most node under R's right child. This latter (original) node for V must be unlinked, which is simplified because, being left-most, its left child is null, avoiding another round of replacement. The complexity in checking safety here is that once R is found, another loop is used to find V and its parent (since that node will later be removed as well). After V is found, there are two local unlinking operations, at different depths of the tree. This is why the type system must keep separate abstract iteration counts for traversals in loops: these indices act like multiple cursors into the data structure, and allow the types to carry enough information to keep those changes separate and ensure neither introduces a cycle.
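The deletion step described above can be sketched operationally. This Python model (names are ours; the real Citrus BST is concurrent code using RCU primitives, which we elide) shows the replace-then-unlink sequence: a fresh replacement node carries the successor's value V, the parent is redirected to it, and then the original successor node, whose left child is null, is unlinked.

```python
# A sequential sketch of two-child deletion in the Citrus-style BST.

class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.Left, self.Right = val, left, right

def delete_two_child(parent, field, r):
    """Delete node r (which has both children): copy the successor's
    value into a fresh replacement node, publish it, then unlink the
    original successor node."""
    # 1. Find the successor: left-most node of r's right subtree.
    sp, s = r, r.Right
    while s.Left is not None:
        sp, s = s, s.Left
    # 2. Fresh replacement: same children as r, successor's value.
    repl = Node(s.val, r.Left, r.Right)
    # 3. Publish the replacement, unlinking r (the T-Replace step).
    setattr(parent, field, repl)
    # 4. Unlink the original successor; being left-most, its Left is
    #    null, so no second replacement is needed.
    if sp is r:
        repl.Right = s.Right
    else:
        sp.Left = s.Right

def inorder(n):
    return [] if n is None else inorder(n.Left) + [n.val] + inorder(n.Right)

holder = Node(None)              # dummy parent holding the root in Left
holder.Left = Node(10, Node(5), Node(20, Node(15), Node(25)))
delete_two_child(holder, "Left", holder.Left)
assert inorder(holder.Left) == [5, 15, 20, 25]   # 10 replaced by 15
```

Between steps 3 and 4 there are briefly two nodes carrying V in the tree, exactly the transient state the weak BST property permits.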
To the best of our knowledge, we are the first to check such code for memory-safe use of RCU primitives modularly, without appeal to the specific implementation of RCU primitives.

Soundness
This section outlines the proof of type soundness; our full proof appears in Appendices A.1, A.2, A.3 and A.4. We prove type soundness by embedding the type system into an abstract concurrent separation logic called the Views Framework [11], which when given certain information about proofs for a specific language (primitives and primitive typing) gives back a full program logic including choice and iteration. As with other work taking this approach [17,16], this consists of several key steps:
1. Define runtime states and semantics for the atomic actions of the language. These are exactly the semantics from Figure 2 in Section 3.
2. Define a denotational interpretation ⟦−⟧ of the types (Figure 8) in terms of an instrumented execution state: a runtime state (Section 3) with additional bookkeeping to simplify proofs. The denotation encodes invariants specific to each type, like the fact that unlinked references are unreachable from the heap. The instrumented execution states are also constrained by additional global WellFormedness invariants (Appendix A.2) the type system is intended to maintain, such as the tree structure of the data structure.
3. Prove a lemma, called Axiom Soundness (Lemma 2), that the type rules for atomic actions are sound: given a state in the denotation of the pre-type-environment of a primitive type rule, the operational semantics produce a state in the denotation of the post-type-environment. This includes preservation of global invariants.
4. Give a desugaring ↓−↓ of non-trivial control constructs (Figure 10) into the simpler non-deterministic versions provided by Views.
The top-level soundness claim then requires proving that every valid source typing derivation corresponds to a valid derivation in the Views logic. Because the parameters given to the Views framework ensure the Views logic's Hoare triples {−}C{−} are sound, this proves soundness of the type rules with respect to type denotations. Because our denotation of types encodes the property that the post-environment of any type rule accurately characterizes which memory is linked vs. unlinked, etc., and the global invariants ensure all allocated heap memory is reachable from the root or from some thread's stack, this entails that our type system prevents memory leaks.

Invariants of RCU Views and Denotations of Types
In this section we aim to convey the intuition behind the predicate WellFormed, which enforces global invariants on logical states, and how it interacts with the denotations of types in key ways. WellFormed is the conjunction of a number of more specific invariants, which we outline here; for full details, see Appendix A.2. The Invariant for Read Traversal Reader threads access only valid heap locations, even during the grace period. The validity of their heap accesses is ensured by the observations they make of heap locations, which can only be iterator, as readers can only use local rcuItr references. To this end, a Readers-Iterators-Only invariant asserts that a heap location can only be observed as iterator by the reader threads.
Invariants on Grace-Period Our logical state (Section 6.2) includes some "free list" auxiliary state tracking which readers are still accessing each unlinked node during grace periods. This must be consistent with the bounding thread set B in the machine state. The Readers-In-Free-List invariant asserts that all reader threads with observations of unlinked locations are in the to-free lists for those locations; this essentially tracks which readers are being "shown grace" for each location. The Iterators-Free-List invariant complements this by asserting that all readers with such observations are in the bounding thread set.
The writer thread can refer to a heap location in the free list with a local reference typed either freeable or unlinked. Once the writer unlinks a heap node, it first observes the heap node as unlinked, then freeable. The denotation of freeable is only valid following a grace period: it asserts no readers hold aliases of the freeable reference. The denotation of unlinked permits either the same (perhaps no readers overlapped) or that the location is still in the to-free list.
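The unlinked-to-freeable discipline can be modeled as a small state machine. In this Python sketch (all names are ours, and it is a sequential model of inherently concurrent behavior), the to-free list records which readers overlapped each unlink, and a location becomes freeable only when that set drains.

```python
# A sketch of the writer-side grace-period protocol: the logical type
# of an unlinked location moves unlinked -> freeable, gated by the
# per-location to-free list of overlapping readers.

class GracePeriod:
    def __init__(self):
        self.readers = set()   # readers currently in critical sections
        self.to_free = {}      # location -> readers that may still see it

    def read_begin(self, tid):
        self.readers.add(tid)

    def read_end(self, tid):
        self.readers.discard(tid)
        for waiting in self.to_free.values():
            waiting.discard(tid)   # tid can no longer observe anything

    def unlink(self, loc):
        # Snapshot the readers that may have seen loc before unlinking.
        self.to_free[loc] = set(self.readers)

    def try_free(self, loc):
        # freeable only once no overlapping reader remains.
        if self.to_free.get(loc):
            return False           # still in the grace period
        self.to_free.pop(loc, None)
        return True

gp = GracePeriod()
gp.read_begin("r1")
gp.unlink("n")                 # r1 may still hold a reference to n
assert not gp.try_free("n")    # unlinked, but not yet freeable
gp.read_begin("r2")            # a later reader cannot see n at all
gp.read_end("r1")
assert gp.try_free("n")        # grace period over: freeable
```

Note that reader r2, which began after the unlink, never delays the writer, which is the point of snapshotting the reader set at unlink time.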
Invariants on Safe Traversal against Unlinking The write-side critical section must guarantee that no updates to the heap cause invalid memory accesses. The Writer-Unlink invariant asserts that a heap location observed as iterator by the writer thread cannot be observed differently by other threads. The denotation of the writer thread's rcuItr reference, rcuItr ρ N tid, asserts that following a path from the root compatible with ρ reaches the referent, and that all nodes along the way are observed as iterator.
Only a bounding thread may view an (unlinked) heap location in the free list as iterator. The denotation of the reader thread's rcuItr reference, rcuItr tid, requires the referent to be either reachable from the root or an unlinked reference in the to-free list. At the same time, it is essential that reader threads arriving after a node is unlinked cannot access it. The invariants Unlinked-Reachability and Free-List-Reachability ensure that any unlinked nodes are reachable only from other unlinked nodes, and never from the root.
Invariants on Safe Traversal against Inserting/Replacing A writer replacing an existing node with a fresh one, or inserting a single fresh node, relies on the fresh node being unreachable by readers before it is published/linked. The Fresh-Writes invariant asserts that a fresh heap location can only be allocated and referenced by the writer thread. The relation between a freshly allocated heap node and the rest of the heap is established by the Fresh-Reachable invariant, which requires that no heap node points to the freshly allocated one; this invariant supports the preservation of the tree structure. The Fresh-Not-Reader invariant supports safe traversal by the reader threads by asserting that they cannot observe a heap location as fresh. Moreover, the denotation of the rcuFresh type, rcuFresh N tid, enforces that fields in N point to valid heap locations (observed as iterator by the writer thread).
Invariants on Tree Structure Our invariants enforce tree-structured heap layouts for data structures. The Unique-Reachable invariant asserts that every heap location reachable from the root is reached by a unique path. To preserve the tree structure, Unique-Root enforces that the root is unreachable from any heap location that is itself reachable from the root.
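A direct way to see what Unique-Reachable and Unique-Root demand is a reachability check over an abstract heap. This Python sketch (the heap representation and helper names are ours) rejects heaps where some node is reachable by two paths or where the root is the target of some field.

```python
# A small checker for the Unique-Reachable and Unique-Root invariants
# on a heap modeled as {node: {field: node_or_None}}.

def check_tree_shape(heap, root):
    """Every reachable node is reached by exactly one path, and no
    reachable node points back to the root."""
    seen = set()
    stack = [root]
    while stack:
        n = stack.pop()
        if n in seen:
            return False          # two paths reach n (or a cycle)
        seen.add(n)
        for child in heap[n].values():
            if child is not None:
                if child == root:
                    return False  # Unique-Root violated
                stack.append(child)
    return True

heap = {"R": {"l": "A", "r": "B"},
        "A": {"l": None, "r": None},
        "B": {"l": None, "r": None}}
assert check_tree_shape(heap, "R")
heap["B"]["l"] = "A"              # A now reachable via two paths
assert not check_tree_shape(heap, "R")
```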

Proof
This section provides more details on how the Views Framework [11] is used to prove soundness, giving the major parameters to the framework and outlining key lemmas. Section 3 defined what Views calls atomic actions (the primitive operations) and their semantics on runtime machine states. The Views Framework uses a separate notion of instrumented (logical) state over which the logic is built, related by a concretization function ⌊−⌋ taking an instrumented state to the machine states of Section 3. Most often, including in our proof, the logical state adds useful auxiliary state to the machine state, and the concretization is simply projection. Thus our logical states LState extend machine states with an observation map O, an undefined variable map U, a thread ID set T, and a free map F. The thread ID set T includes the thread ID of all running threads. The free map F tracks which reader threads may hold references to each location. It is not required for execution of code, and for validating an implementation it could be ignored, but we use it later with our type system to help prove that memory deallocation is safe. The (per-thread) variables in the undefined variable map U are those that should not be accessed (e.g., dangling pointers).
The remaining component, the observation map O, requires some further explanation. Each memory allocation / object can be observed in one of the following states by a variety of threads, depending on how it was used.
An object can be observed as part of the structure (iterator), removed but possibly accessible to other threads (unlinked), freshly allocated (fresh), safe to deallocate (freeable), or the root of the structure (root).
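These observation states, and the writer's legal transitions between them, can be summarized as a small table. In this Python sketch the state names mirror the text, while the transition set is our informal summary of the protocol described in Sections 4 and 6, not a definition from the paper.

```python
# An illustrative model of the observation states and the writer's
# legal transitions between them.

OBS = {"iterator", "unlinked", "fresh", "freeable", "root"}

WRITER_STEPS = {
    ("fresh", "iterator"),     # publishing a fresh node
    ("iterator", "unlinked"),  # unlinking from the structure
    ("unlinked", "freeable"),  # the grace period has elapsed
}

def writer_step(frm, to):
    """True if a writer may change its observation frm -> to."""
    assert frm in OBS and to in OBS
    return (frm, to) in WRITER_STEPS

assert writer_step("fresh", "iterator")
assert writer_step("unlinked", "freeable")
assert not writer_step("freeable", "iterator")  # freed memory never returns
```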
Assertions in the Views logic are (almost) sets of the logical states that satisfy a validity predicate WellFormed, outlined in Section 6.1. Every type environment represents a set of possible views (WellFormed logical states) consistent with the types in the environment. We make this precise with a denotation function that yields the set of states corresponding to a given type environment, defined as the intersection of the individual variables' types as in Figure 8.
Individual variables' denotations are extended to context denotations slightly differently depending on whether the environment is a reader or writer thread context: writer threads own the global lock, while readers do not:

Fig. 8: Type Environments
To support framing (weakening), the Views Framework requires that views form a partial commutative monoid under an operation • : M −→ M −→ M, provided as a parameter to the framework. The framework also requires an interference relation R ⊆ M × M between views to reason about local updates to one view preserving validity of adjacent views (akin to the small-footprint property of separation logic). Figure 9 defines our composition operator and the core interference relation R0; the actual interference between views (between threads, or between a local action and framed-away state) is the reflexive transitive closure of R0. Composition is mostly straightforward point-wise union (threads' views may overlap) of each component. Interference bounds the interference writers and readers may inflict on each other. Notably, if a view contains the writer thread, other threads may not modify the shared portion of the heap, or release the writer lock. Other aspects of interference are natural restrictions, like that threads may not modify each other's local variables. WellFormed states are closed under both composition (with another WellFormed state) and interference (R relates WellFormed states only to other WellFormed states).
The framing/weakening type rule is translated to a use of the frame rule in the Views Framework's logic, where separating conjunction is simply the existence of two composable instrumented states. In order to validate the frame rule, the assertions in the Views logic, sets of well-formed instrumented states, must be restricted to sets of logical states that are stable with respect to expected interference from other threads or contexts, and interference must be compatible in some way with separating conjunction; a View, the actual base assertion of the Views logic, is then such a stable set. Additionally, interference must distribute over composition. Because we use this induced Views logic to prove soundness of our type system by translation, we must ensure any type environment denotes a valid view:

Lemma 1 (Stable Environment Denotation-M). For any Γ, the denotation ⟦Γ⟧M is closed under R: R(⟦Γ⟧M) ⊆ ⟦Γ⟧M.
Alternatively, we say that environment denotation is stable (closed under R).
We elide the statement of the analogous result for the read-side critical section, available in Appendix A.1.
With this setup done, we can state the connection between the Views Framework logic induced by the earlier parameters and the type system from Section 4. The induced Views logic has a familiar notion of Hoare triple, {p}C{q} where p and q are elements of View M, with the usual rules for non-deterministic choice, non-deterministic iteration, sequential composition, and parallel composition, sound given the proof obligations just described above. It is parameterized by a rule for atomic commands that requires a specification of the triples for primitive operations, and their soundness (an obligation we must prove). This can then be used to prove that every typing derivation embeds to a valid derivation in the Views logic: roughly, for all Γ, C, Γ′, if Γ ⊢ C ⊣ Γ′ then {⟦Γ⟧}↓C↓{⟦Γ′⟧}, proved once for the writer type system, once for the readers.
There are two remaining subtleties to address. First, commands C also require translation: the Views Framework has only non-deterministic branches and loops, so the standard versions from our core language must be encoded. The approach to this is based on a standard idea in verification, which we show for conditionals in Figure 10. assume(b) is a standard construct in verification semantics [4,30], which "does nothing" (freezes) if the condition b is false, so its postcondition in the Views logic can reflect the truth of b. assume in Figure 10 adapts this for the Views Framework as in other Views-based proofs [17,16], specifying sets of machine states as a predicate. We write boolean expressions as shorthand for the set of machine states making that expression true.
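The assume-based desugaring can be made concrete by interpreting commands as functions on sets of states. This Python sketch (a toy state-set semantics of our own, with integers standing in for machine states) shows how if b then C1 else C2 becomes a non-deterministic choice guarded by assume.

```python
# A toy state-set semantics for the desugaring of conditionals into
# non-deterministic choice plus assume.

def assume(pred):
    # Keeps only states satisfying pred; "freezes" (drops) the rest.
    return lambda states: {s for s in states if pred(s)}

def seq(f, g):
    return lambda states: g(f(states))

def choice(f, g):
    # Non-deterministic choice: union of both branches' outcomes.
    return lambda states: f(states) | g(states)

def if_then_else(pred, c1, c2):
    # if b then C1 else C2  ~>  (assume(b); C1) + (assume(!b); C2)
    return choice(seq(assume(pred), c1),
                  seq(assume(lambda s: not pred(s)), c2))

# States are integers; branches increment or negate.
inc = lambda states: {s + 1 for s in states}
neg = lambda states: {-s for s in states}
prog = if_then_else(lambda s: s > 0, inc, neg)
assert prog({3, -2}) == {4, 2}
```

Each branch of the choice contributes only the states where its guard holds, so the union recovers exactly the deterministic conditional's behavior.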
Second, we have not addressed a way to encode subtyping. One might hope this corresponds to a kind of implication, and therefore that subtyping corresponds to consequence. Indeed, this is how we (and prior work [17,16]) address subtyping in a Views-based proof. Views defines the notion of view shift (⊑) as a way to reinterpret a set of instrumented states as a new (compatible) set of instrumented states, offering a kind of logical consequence, used in a rule of consequence in the Views logic. We are now ready to prove the key lemmas of the soundness proof: relating subtyping to view shifts, proving soundness of the primitive actions, and finally soundness of the full type system. These proofs occur once for the writer type system, and once for the reader; we show here only the (more complex) writer obligations:

Lemma 2 (Axiom of Soundness for Atomic Commands). For each axiom (primitive type rule), executing the primitive from any state in the denotation of the pre-type-environment yields a state in the denotation of the post-type-environment.
Proof. By case analysis on α. Details in Appendix A.1.

Lemma 3 (Context-SubTyping-M). Γ <: Γ′ =⇒ ⟦Γ⟧M ⊑ ⟦Γ′⟧M.
Proof. Induction on the subtyping derivation, then inducting on the single-type subtype relation for the first variable in the non-empty context case.

Lemma 4 (Views Embedding for Write-Side).
∀Γ, C, Γ′. Γ ⊢M C ⊣ Γ′ =⇒ {⟦Γ⟧M}↓C↓{⟦Γ′⟧M}. Proof. By induction on the typing derivation, appealing to Lemma 2 for primitives, to Lemma 3 and consequence for subtyping, and otherwise appealing to structural rules of the Views logic and inductive hypotheses. Full details in Appendix A.1.
The corresponding obligations and proofs for the read-side critical section type system are similar in statement and proof approach, just for the read-side type judgments and environment denotations.

Related Work
Our type system builds on a great deal of related work on RCU implementations and models; and general concurrent program verification (via program logics, model checking, and type systems).
Modeling RCU and Memory Models Alglave et al. [2] propose a memory model to be assumed by the platform-independent parts of the Linux kernel, regardless of the underlying hardware's memory model. As part of this, they give the first formalization of what it means for an RCU implementation to be correct (previously this was difficult to state, as the guarantees in principle could vary by underlying CPU architecture): essentially, that reader critical sections do not span grace periods. They prove by hand that the Linux kernel RCU implementation [1] satisfies this property. According to the fundamental requirements of RCU [27], our model in Section 3 can be considered a valid RCU implementation (assuming sequential consistency), satisfying all requirements aside from one performance optimization, Read-to-Write Upgrade, which is important in practice but not memory-safety centric:
- Grace-Period and Memory-Barrier Guarantee: To reclaim a heap location, a mutator thread must synchronize with all of the reader threads with overlapping read-side critical sections to guarantee that none of the updates to the memory cause invalid memory accesses. The operational semantics enforce a protocol on the mutator thread's actions. First it unlinks a node from the data structure; the local type for that reference becomes unlinked. Then it waits for current reader threads to exit, after which the local type is freeable. Finally, it may safely reclaim the memory, after which the local type is undef. The semantics prevent the writer from reclaiming too soon by adding the heap location to the free list of the state, which is checked dynamically by the actual free operation. We discuss the grace period and unlinking invariants in our system in Section 6.1.
- Publish-Subscribe Guarantee: Fresh heap nodes cannot be observed by the reader threads until they are published. As we see in the operational semantics, once a new heap location is allocated it can only be referenced by a local variable of type fresh. Once published, the local type for that reference becomes rcuItr, indicating it is now safe for reader threads to access it with local references of rcuItr type. We discuss the related type assertions for inserting/replacing a fresh node (Figures 7a-7b) in Section 4.3 and the related invariants in Section 6.1.
- RCU Primitives Guaranteed to Execute Unconditionally: Unconditional execution of the RCU primitives is provided by the definitions in our operational semantics (e.g., ReadBegin, ReadEnd, WriteBegin and WriteEnd), as their executions do not consider failure/retry.
- Guaranteed Read-to-Write Upgrade: This is a performance optimization which allows a reader thread to upgrade the read-side critical section to a write-side critical section by acquiring the lock after a traversal for a data element, and ensures that the upgrading reader exits the read-side critical section before calling RCU synchronization. This optimization also allows sharing the traversal code between the critical sections. Read-to-Write Upgrade is important in practice but largely orthogonal to memory-safety. The current version of our system provides a strict separation of traverse-and-update and traverse-only intentions through the type system (e.g., different iterator types and rules for the RCU write/read critical sections) and the programming primitives. As future work, we want to extend our system to support this performance optimization.
To the best of our knowledge, ours is the first abstract operational model for RCU. Tassarotti et al. verify a user-level implementation of RCU with a singly linked list client under release-acquire semantics, which is a weaker memory model than sequential consistency. They require release-writes and acquire-reads to the QSBR counters for proper synchronization between the mutator and the reader threads. This protocol is exactly what we enforce over the logical observations of the mutator thread: from unlinked to freeable. Tassarotti et al.'s synchronization for linking/publishing new nodes occurs in a similar way to ours, so we anticipate it would be possible to extend our type system in the future for similar weak memory models.
Their proofs use abstract predicates, e.g. WriterSafe, that are specialized to the singly-linked structure in their evaluation. This means reusing their ideas for another structure, such as a binary search tree, would require revising many of their invariants. By contrast, our types carry similar information (our denotations are similar to their definitions), but are reusable across at least singly-linked and tree data structures (Section 5). Their proofs for a linked list also require managing assertions about RCU implementation resources, while these are effectively hidden in the type denotations in our system. On the other hand, their proofs ensure full functional correctness. Meyer and Wolff [?] make a compelling argument that separating memory safety from correctness is profitable, and we provide such a decoupled memory safety argument.
Realizing our RCU Model A direct implementation of our semantics would yield unacceptable performance, since both entering (ReadBegin) and exiting (ReadEnd) a read-side critical section modify shared data structures for the bounding-threads and readers sets. A slight variation on our semantics would instead use a bounding set that tracked a snapshot of per-thread counts, and a vector of per-thread counts in place of the reader set. Blocking grace period completion until the snapshot was strictly older than all current reader counts would be more clearly equivalent to real implementations. Our current semantics are simpler than this alternative, while also equivalent.
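The counter-based variation can be sketched as follows. In this Python model (all names are ours, and it is a sequential rendering of a concurrent protocol), each reader keeps a counter that is odd while inside a critical section; the writer snapshots the counters at unlink time, and the grace period ends once every reader is either quiescent or has advanced past its snapshot.

```python
# A sketch of the counter-based realization: per-thread counters
# replace the shared reader/bounding sets of the abstract semantics.

class CounterRCU:
    def __init__(self, nthreads):
        self.count = [0] * nthreads  # even = quiescent, odd = reading

    def read_begin(self, t):
        self.count[t] += 1           # now odd: in a critical section

    def read_end(self, t):
        self.count[t] += 1           # now even: quiescent again

    def snapshot(self):
        return list(self.count)      # taken by the writer at unlink

    def grace_period_over(self, snap):
        # Reader t can still hold pre-unlink references only if it was
        # in a critical section at the snapshot (odd count) and its
        # count has not advanced since.
        return all(c % 2 == 0 or c != snap[t]
                   for t, c in enumerate(self.count))

rcu = CounterRCU(2)
rcu.read_begin(0)                    # thread 0 enters a critical section
snap = rcu.snapshot()                # writer unlinks and snapshots counts
assert not rcu.grace_period_over(snap)
rcu.read_end(0)                      # thread 0 exits
assert rcu.grace_period_over(snap)
rcu.read_begin(1)                    # later readers never block the writer
assert rcu.grace_period_over(snap)
```

This is the quiescent-state-counter structure used by practical RCU implementations; only the snapshot comparison, not a shared reader set, is consulted when completing a grace period.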
Model Checking Prior model checking work has also only checked a linked list, and claims handling trees (look-up followed by update) as future work [19]. Thus our work is the first type system for ensuring correct use of RCU primitives that is known to handle more complex structures than linked lists.

Conclusions
We presented the first type system that ensures code uses RCU memory management safely, and which is significantly simpler than full-blown verification logics. To this end, we gave the first general operational model for RCU-based memory management. Based on our abstractions for RCU in the operational semantics, we are the first to show that decoupling the memory-safety proofs of RCU clients from the underlying reclamation model is possible. Meyer et al. [?] took a similar approach to decoupling the correctness verification of data structures from the underlying reclamation model, under the assumption of memory-safety for the data structures. We demonstrated the applicability/reusability of our types on two examples: a linked-list based bag [26] and a binary search tree [3]. To the best of our knowledge, we are the first to present a memory-safety proof for a tree client of RCU. We proved type soundness by embedding the type system into an abstract concurrent separation logic called the Views Framework [11], encoding many RCU properties as either type denotations or global invariants over abstract RCU state. By doing so, we discharge these invariants once, as part of the soundness proof, and need not prove them for each different client.

A.1 Complete Constructions for Views
To prove soundness we use the Views Framework [11]. The Views Framework takes a set of parameters satisfying some properties, and produces a soundness proof for a static reasoning system for a larger programming language. Among the parameters, the most notable are the choice of machine state, semantics for atomic actions (e.g., field writes, or WriteBegin), and proofs that the reasoning (in our case, type rules) for the atomic actions is sound (in a way chosen by the framework). The other critical pieces are a choice for a partial view of machine states, usually an extended machine state with meta-information, and a relation constraining how other parts of the program can interfere with a view (e.g., modifying a value in the heap, but not changing its type). Our type system is related to the views by giving a denotation of type environments in terms of views, and then proving that for each atomic action shown in Figure 2 in Section 3, and each type rule, the action preserves the denotation. The free map F tracks which reader threads may hold references to each location. It is not required for execution of code, and for validating an implementation it could be ignored, but we use it later with our type system to help prove that memory deallocation is safe. Each memory region can be observed in one of the type states within a snapshot taken at any time; we are interested in the RCU-typed portion of the heap domain. A thread's (or scope's) view of memory is a subset of the instrumented (logical) states which satisfy certain well-formedness criteria relating the physical state and the additional meta-data (O, U, T and F). We do our reasoning for soundness over instrumented states, and define an erasure relation ⌊−⌋ : LState ⇒ MState that projects instrumented states to the common components with MState; the latter is given in Figure 11. To define the former, we first need to state what it means to combine logical machine states.
Composition of instrumented states is an operation that is commutative and associative, and defined component-wise in terms of composing physical states, observation maps, undefined sets, and thread sets, as shown in Figure 12. An important property of composition is that it preserves validity of logical states (Lemma 5). Fig. 12: Composition (•) and Thread Interference Relation (R0)

Lemma 5 (Well Formed Composition). Any successful composition of two well-formed logical states is well-formed:
Proof. By assumption, we know that Wellformed(x) and Wellformed(y) hold. We need to show that composition of two well-formed states preserves wellformedness, i.e., that for all z such that x • y = z, Wellformed(z) holds. x and y have components ((sx, hx, lx, rtx, Rx, Bx), Ox, Ux, Tx, Fx) and ((sy, hy, ly, rty, Ry, By), Oy, Uy, Ty, Fy), respectively. The •s operator over stacks sx and sy enforces dom(sx) ∩ dom(sy) = ∅, which ensures that wellformed mappings in sx do not violate wellformed mappings in sy when we union these mappings for sz. The same argument applies to the •F operator over Fx and Fy. Disjoint unions of wellformed Rx with Ry, and of wellformed Bx with By, preserve wellformedness in the composition, as they are disjoint unions of wellformed sets. Wellformed unions of Ox with Oy, Ux with Uy and Tx with Ty likewise preserve wellformedness. When we compose hx and hy, wellformedness is preserved if both threads agree on each heap location; if a heap location is undefined for one thread but has a value for the other, the composition takes the value; and if a heap location is undefined for both threads then it is also undefined in the composition. All cases of heap composition preserve wellformedness, from the assumption that x and y are wellformed.
We define separation on elements of type contexts; partial separating conjunction then simply requires the existence of two states that compose. Different threads' views of the state may overlap (e.g., on shared heap locations, or the reader thread set), but one thread may modify that shared state. The Views Framework restricts its reasoning to subsets of the logical views that are stable with respect to expected interference from other threads or contexts. We define the interference as (the transitive reflexive closure of) a binary relation R on M, and a View in the formal framework is then a set of instrumented states stable under R, which defines permissible interference on an instrumented state. The relation must distribute over composition, where R is the transitive-reflexive closure of R0 shown in Figure 12. R0 (and therefore R) also "preserves" validity. Moreover, from the semantics we know that l and h can only be changed by the writer thread, and from R0 and the assumptions of the lemma (WellFormed(m).RINFL) we can conclude that F, l and h do not affect wellformedness of m.
Proof. By induction on the structure of Γ. The empty case holds trivially. In the other case, where Γ = Γ′, x : T, we have by the inductive hypothesis that the denotation of Γ′ is stable, and must show that the denotation of Γ′, x : T is as well. This latter case proceeds by case analysis on T. We know that O, U, T, R, s and rt are preserved by R0. By unfolding the type environment in the assumption we know that tid = l, so we can derive the conclusion from the preservation of F, h and l to prove this case.
Case root: All the facts we know so far from R0, tid = l, and the additional facts we know from R0 prove this case.
Proof. The proof is similar to the one for Lemma 7, except there is only one simple case, x : rcuItr.
The Views Framework defines a program logic (Hoare logic) with judgments of the form {p}C{q} for views p and q and commands C. Commands include atomic actions, and soundness of such judgments for atomic actions is a parameter to the framework. The framework itself provides soundness of the rules for sequencing, fork-join parallelism, and other general rules. To prove type soundness for our system, we define a denotation of type judgments in terms of the Views logic, and show that every valid typing derivation translates to a valid derivation in the Views logic. The antecedent of the implication is a type judgment (shown in Figure 6 of Section 4.3, Figure 4 of Section 4.1 and Figure 32 of Appendix E) and the conclusion is a judgment in the Views logic. The environments are translated to views (View_M) as previously described. Commands C also require translation, because the Views logic is defined for a language with non-deterministic branches and loops, so the standard versions from our core language must be encoded. The approach to this is based on a standard idea in verification, which we show for conditionals in Figure 13. assume(b) is a standard construct in verification semantics [4,30], which "does nothing" (freezes) if the condition b is false, so its postcondition in the Views logic can reflect the truth of b. This is also the approach used in previous applications of the Views Framework [17,16].
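The encoding of conditionals via assume can be sketched concretely. In the following Python model (our own names and representation, not the paper's mechanization), commands are functions from sets of states to sets of states, and assume(b) filters out states where b fails:

```python
# Commands are modeled as functions on sets of states; for brevity a
# state here is just an integer.
def assume(pred):
    # "Does nothing" (drops the state) when pred is false, so the
    # postcondition may safely reflect the truth of pred.
    return lambda states: {s for s in states if pred(s)}

def seq(c1, c2):
    return lambda states: c2(c1(states))

def branch(c1, c2):
    # Nondeterministic choice between two commands.
    return lambda states: c1(states) | c2(states)

def encode_if(b, c_then, c_else):
    # [[ if b then C1 else C2 ]] = (assume(b); C1) + (assume(not b); C2)
    return branch(seq(assume(b), c_then),
                  seq(assume(lambda s: not b(s)), c_else))

# The guarded branches recover deterministic behavior:
cmd = encode_if(lambda s: s > 0,
                lambda S: {s + 1 for s in S},   # then-branch
                lambda S: {s - 1 for s in S})   # else-branch
assert cmd({-1, 2}) == {-2, 3}
```

Each branch is only reachable by states satisfying its guard, which is why the nondeterministic encoding is sound for the deterministic source conditional.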
The framework also provides a useful concept called the view shift operator ⊑, which describes a way to reinterpret a set of instrumented states as a new set of instrumented states. This operator lets us define an abstract notion of executing a small step of the program. We express the step from p to q with action α, ensuring that the operational interpretation of the action satisfies the specification: p ⊑ q def= ∀m ∈ M. ⌊p ∗ {m}⌋ ⊆ ⌊q ∗ R({m})⌋. Because the Views Framework handles soundness for the structural rules (sequencing, parallel composition, etc.), there are really only three kinds of proof obligations left for us. First, we must prove that the non-trivial command translations (i.e., for conditionals and while loops) embed correctly in the Views logic, which is straightforward. Second, we must show for our environment subtyping that if Γ <: Γ′, then Γ ⊑ Γ′. And finally, we must prove that each atomic action's type rule corresponds to a valid semantic judgment in the Views Framework. The use of ∗ validates the frame rule and makes this obligation akin to an interference-tolerant version of the small footprint property of traditional separation logics [32,5].

Lemma 9 (Axiom of Soundness for Atoms). For each axiom,
Proof. By case analysis on the atomic action α, followed by inversion on the typing derivation. All cases are proved as separate lemmas in Section A.3.
Type soundness proceeds according to the requirements of the Views Framework, primarily embedding each type judgment into the Views logic:
The proof is similar to the one for Lemma 11, except that the denotation for the read-side type system is R_t = {((s, h, l, rt, R, B), O, U, T, F) | t ∈ R}, which restricts the set of all logical states to those definable by the types (rcuItr) of the read-side type system.

Lemma 11 (Views Embedding for Write-Side)
Because the intersection of the environment denotation with the denotations for the different critical sections remains a valid view, the Views Framework provides most of this proof for free, given corresponding lemmas for the atomic actions α; here α ranges over any atomic command, such as a field access or variable assignment. The denotation of a type environment Γ_{M,tid}, unfolding the definition one step, is simply Γ_tid ∩ M_tid. In the type system for write-side critical sections, this introduces extra boilerplate reasoning to prove that each action preserves lock ownership. To simplify later cases of the proof, we first prove a convenient lemma.

Lemma 12 (Write-Side Critical Section Lifting).
For each α whose semantics does not affect the write lock, the lifted judgment holds. Proof. Each of these shared actions α preserves the lock component of the physical state, the only component constrained by ⌈−⌉_{M,tid} beyond ⌈−⌉_tid. For the read case, we must prove the containment from the assumed subset relationship for an arbitrary m. By assumption, transitivity of ⊆, and the semantics of the possible αs, the left side of this containment is already a subset of the right; what remains is to show that the intersection with M_tid is preserved by the atomic action. This follows from the fact that none of the possible αs modifies the global lock.

The invariant in Figure 14 asserts that no heap node can be observed as undefined by any thread. Figure 15 asserts that if a heap location is not undefined, then all reader threads and the writer thread can observe it as iterator, or the writer thread can observe it as fresh, unlinked or freeable. Figure 16 asserts that the unique root location can only be aliased by thread-local references through which the root is observed as iterator. Figure 17 asserts that if a heap location is observed as iterator and it is in the free list, then the observing thread is in the set of bounding threads. Figure 18 asserts that if a heap node is observed as unlinked, then all heap locations from which the unlinked node is reachable are also unlinked or in the free list. Figure 19 asserts that if a heap location is in the free list, then all heap locations from which it is reachable are also in the free list. Figure 20 asserts that the writer thread cannot observe a heap location as unlinked. Figure 23 (Fresh-Not-Reader) asserts that a freshly allocated heap location cannot be observed as unlinked or as iterator. The Fresh-Points-Iterator invariant in Figure 24 states that any field of a freshly allocated object can only be set to point to a heap node observable as iterator (not unlinked or freeable).
This invariant captures the fact N = N′ asserted in the type rule for fresh node linking (T-Replace). Figure 27 asserts that, for any mapping from a location to a set of threads in the free list, this set of threads is a subset of the bounding threads (which is itself a subset of the reader threads). Figure 29 asserts that a heap location observed as root has no incoming edges from any node in the domain of the heap, and that every node accessible from root is observed as iterator; this invariant is part of the enforcement of acyclicity. Figure 30 asserts that every node is reachable from the root node by a unique path; this invariant is part of the acyclicity (tree structure) enforcement on the heap layout of the data structure.
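To make the shape of these reachability invariants concrete, the following Python sketch (illustrative only; the toy heap representation, observation map O and free list F encodings are ours) checks the Figure 18 and Figure 19 invariants on a heap given as an adjacency map:

```python
# Illustrative checks for two invariants over a toy instrumented state.
def predecessors(heap, target):
    # All locations from which `target` is reachable.
    preds, frontier = set(), {target}
    while frontier:
        new = {src for src, dsts in heap.items()
               for d in dsts if d in frontier} - preds
        preds |= new
        frontier = new
    return preds

def unlinked_reachability_ok(heap, O, F):
    # Fig. 18: every predecessor of an unlinked node is itself
    # unlinked or in the free list.
    return all(p in F or O.get(p) == 'unlinked'
               for n, obs in O.items() if obs == 'unlinked'
               for p in predecessors(heap, n))

def freelist_closed_ok(heap, F):
    # Fig. 19: every predecessor of a free-list node is in the free list.
    return all(p in F for n in F for p in predecessors(heap, n))
```

Both invariants say the same thing from different angles: once a node is logically detached, everything upstream of it must already be detached, so no reader path through live nodes can lead into reclaimed memory.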

A.3 Soundness Proof of Atoms
In this section, we prove the soundness of the type rule for each atomic action.

Lemma 13 (Unlink).
From the assumptions in the type rule of T-UnlinkH (the unlink through x.f1) we assume the premises. We split the composition in 1 accordingly. We must show the existence of a suitable σ′. We also know from the operational semantics how the machine state has changed, and 26 is determined by the operational semantics. The only change in the observation map is on s(y, tid), from iterator_tid to unlinked.

In the rest of the proof, we prove 24 and 25 and show the composition. To prove 24, we need to show that each of the memory axioms in Section A.2 holds for the resulting state. Let o_x be σ.s(x, tid), o_y be σ.s(y, tid) and o_z be σ.s(z, tid).
- UNQR: 29 and 30 follow from the framing assumptions (3), (4), (5), the denotations of the precondition (6) and 13.UNQR, where o_x, o_y and o_z are equal to σ.h*(σ.rt, ρ), σ.h*(σ.rt, ρ.f_1) and σ.h*(σ.rt, ρ.f_1.f_2) respectively, and they (o_x, o_y, o_z and ρ, ρ′) are unique. We must show that uniqueness is preserved. We know from the operational semantics that the root has not changed. From the denotations (15) we know that all heap locations reached by following ρ and ρ.f_1 are observed as iterator_tid, including the final heap location reached, from which 31 follows.
To prove 18 we need to show the interference relation, which by definition means we must show relations (35-40). To prove them we assume 28, i.e., that T_2 is a subset of the reader threads.
We also know the relevant facts from 27; these together with 9 and 15 prove 20.
To show 19 we consider two cases. From the assumptions in the type rule of T-Replace we assume the premises. We split the composition in 43 accordingly. We must show the existence of a suitable σ′. We also know from the operational semantics how the machine state has changed; 59 is determined directly by the operational semantics. We know the changes in the observation map, and 71 follows from 43. Let T′_1 be T_1, F′_1 be F_1 and σ′_1 be determined by the operational semantics. The undefined map need not change, so we can pick U′_1 as U_1. Assuming 49 and these choices of maps makes the state compose as required. In the rest of the proof, we prove 66 and 67 and show the composition. To prove 66, we need to show that each of the memory axioms in Section A.2 holds for the resulting state, where o_p and o_r are σ.h*(σ.rt, ρ) and σ.h*(σ.rt, ρ.f) respectively, and they (the heap locations in 73 and the paths in 72) are unique (from 56.FR we assume there exists no field alias or path alias to the freshly allocated heap location o_n). We must show that uniqueness is preserved. We know from the operational semantics that the root has not changed. From the denotations (58) we know that all heap locations reached by following ρ and ρ.f are observed as iterator_tid, including the final heap locations reached. The preservation of uniqueness follows from 69, 70, 68 and 56.FR.
From this we conclude 75 and 76. Case 29 (-ULKR): We must prove the claim, which follows from 58, 56.OW, and the facts determined by the operational semantics (68), 69 and 70. If o′ were observed as iterator, that would contradict 66.UNQR.
To prove 60, we need to show the interference relation, which by definition means we must show relations (78-83). To prove them we assume 71, i.e., that T_2 is a subset of the reader threads. To prove 61 we consider two cases: the first is trivial; in the second we consider 86 and 87. For 86, from 69 we know the relevant fact, which together with 51 and 58 proves 61. For case 87, from 70 we know the relevant fact, which together with 51 and 58 proves 61.
From the assumptions in the type rule of T-Insert we assume the premises. We split the composition in 88 accordingly. We must show the existence of a suitable σ′. We also know from the operational semantics how the machine state has changed, and we know the changes in the observation map; 114 follows from 88. Let T′_1 be T_1, F′_1 be F_1 and σ′_1 be determined by the operational semantics. The undefined map need not change, so we can pick U′_1 as U_1. Assuming 94 and these choices of maps makes the state compose as required. In the rest of the proof, we prove 110 and 111 and show the composition. To prove 110, we need to show that each of the memory axioms in Section A.2 holds for the state (σ′, O′_1, U′_1, T′_1, F′_1). The proofs for OW, RWOW, AWRT, IFL, WULK, FLR, FPI, WF, FR, HD, WNR, RINFL and ULKR follow as in the previous lemmas. The proof of UNQR is similar to the ones we did for Lemma 14 and Lemma 13, with a simpler fact to prove: we assume the framing conditions 90-93 together with 101.UNQR and 101.FR, which makes 110.UNQR trivial.
To prove 104, we need to show the interference relation, which by definition means we must show the required relations. To prove them we assume 114, i.e., that T_2 is a subset of the reader threads. The first case is trivial; the second follows from 110.OW-HD and 111.OW-HD. 107, 109 and 108 are trivial by the choices of the related maps and the semantics of the composition operators for these maps. The compositions shown let us derive the conclusion.

Lemma 16 (ReadStack).
From the assumption in the type rule of T-ReadS we assume the premises. We split the composition in 123 accordingly. We must show the required judgment. Let s(x, tid) be o_x. We also know from the operational semantics how the machine state has changed. There is no change in the observation of heap locations, and 146 follows from 123. σ′_1 is determined by the operational semantics. The undefined map, T_1 and the free list need not change, so we can pick U′_1 as U_1, T′_1 as T_1 and F′_1 as F_1. Assuming 126 and these choices of maps makes the state compose as required. In the rest of the proof, we prove 142 and 143 and show the composition. 142 follows trivially from 133.
To prove 136, we need to show the interference relation, which by definition means we must show relations (147-152). To prove them we assume 146, i.e., that T_2 is a subset of the reader threads. 138-141 are trivial by the choices of the related maps and the semantics of the composition operators for these maps. 137 follows trivially from 128. The compositions shown let us derive the conclusion for the resulting state. From the assumption in the type rule of T-ReadH we assume the premises. We split the composition in 155 accordingly. Let h(s(z, tid), f) be o_z. We also know from the operational semantics how the machine state has changed; 170 is determined directly by the operational semantics. There is no change in the observation of heap locations, and 181 follows from 155. σ′_1 is determined by the operational semantics. The undefined map, the free list and T_1 need not change, so we can pick U′_1 as U_1, F′_1 as F_1 and T′_1 as T_1. Assuming 160 and these choices of maps makes the state compose as required. In the rest of the proof, we prove 177 and 178 and show the composition. To prove 177, we need to show that each of the memory axioms in Section A.2 holds for the resulting state. To prove 178, we need to show the interference relation, which by definition means we must show the required relations. 173-176 are trivial by the choices of the related maps and the semantics of the composition operators for these maps. 172 follows trivially from 162. The compositions shown let us derive the conclusion for the resulting state.
From the assumption in the type rule of T-WriteFH we assume the premises. We split the composition in 190 accordingly. We must show the existence of a suitable σ′. We also know from the operational semantics how the machine state has changed. There is no change in the observation of heap locations. In the rest of the proof, we prove 210 and 211 and show the composition. To prove 210, we need to show that each of the memory axioms in Section A.2 holds. To prove 204, we need to show the interference relation, which by definition means we must show relations (215-220). To prove them we assume 214, i.e., that T_2 is a subset of the reader threads, and 201.

Lemma 19 (Sync).
We split the composition in 223 accordingly. We also know from the operational semantics that SyncStart changes B; SyncStop then changes it to ∅, so there is no net change in B after SyncStart;SyncStop. Hence there is no change in the machine state.
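The grace-period behavior that SyncStart;SyncStop reasons about can be sketched operationally. In this single-threaded Python simulation (a schematic of the semantics with our own names, not the paper's mechanization), SyncStart copies the current reader set R into the bounding set B, readers leave B when they exit their critical sections, and reclamation is safe once B is empty:

```python
class RCUState:
    def __init__(self):
        self.R = set()  # threads currently in read-side critical sections
        self.B = set()  # bounding threads the writer must wait out

    def read_lock(self, t):
        self.R.add(t)

    def read_unlock(self, t):
        self.R.discard(t)
        self.B.discard(t)  # an exiting reader can hold no stale references

    def sync_start(self):
        # Record every current reader as a bounding thread (B := R).
        self.B = set(self.R)

    def sync_stop_ready(self):
        # Reclamation is safe once every bounded reader has exited.
        return not self.B

st = RCUState()
st.read_lock('r1'); st.read_lock('r2')
st.sync_start()        # B = {r1, r2}
st.read_lock('r3')     # a later reader is NOT bounded: no need to wait for it
st.read_unlock('r1')
st.read_unlock('r2')
```

Note that r3 entered after SyncStart, so it can only have observed the post-unlink heap; this is why the grace period ends even though r3 is still reading.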
We split the composition in 257 accordingly. From the operational semantics we know that s(y, tid) is ℓ. We also know from the operational semantics how the machine state has changed. There is no change in the observation of heap locations. To prove all the relations (280-287), we assume 279, i.e., that T_2 is a subset of the reader threads, and 267.
We split the composition in 288 accordingly. We must show the required judgment. From the operational semantics we know the resulting state, and 312 follows from 288. In the rest of the proof, we prove 306 and 307 and show the composition. To prove 306, we need to show that each of the memory axioms in Section A.2 holds for the state (σ′, O′_1, U′_1, T′_1, F′_1), which is trivial by 308-311 and 299.
To prove 300, we need to show the interference relation, which by definition means we must show relations (313-320). To prove them we assume 312, i.e., that T_2 is a subset of the reader threads, and 267. Let σ′_2 be σ_2.

Lemma 22 (RReadStack).
We split the composition in 321 accordingly. We must show the required judgment. We also know from the operational semantics how the machine state has changed. There is no change in the observation of heap locations, and 343 follows from 321. Let T′_1 be T_1 and σ′_1 be determined by the operational semantics as σ_1. The undefined map and free list need not change, so we can pick U′_1 as U_1 and F′_1 as F_1. Assuming 323 and these choices of maps makes the state compose as required. In the rest of the proof, we prove 339 and 340 and show the composition. From 376 and 24 we know the required facts. By using 379, 380 and 377 as antecedents of the Views Logic's consequence rule, we conclude 378.
Case 72 (-M): where C is a sequence statement of the form C_1; C_2. Our goal is to prove 381. We know 384 and 385; by using them as the antecedents of the Views sequencing rule, we can derive the conclusion 381.
Our goal is to prove 389. We prove 389 from the consequence rule, based on the proofs of 390 and 391. The proof of 390 follows from the Views Logic's proof rule for the assume construct. Our goal is then to prove 396, where the desugared form includes a fresh variable y. We use fresh variables only for desugaring; they are not included in any type context. We prove 396 from the consequence rule of the Views Logic based on the proofs of 397 and 398; 398 is trivial from the fact that y is a fresh variable that is not included in any type context and is used only for desugaring.
We prove 397 from the branch rule of the Views Logic based on the proofs of 399 and 400. We show 399 from the Views Logic's proof rule for the assume construct. Proof. By induction on the subtyping derivation, then by induction on the first entry in the non-empty context (the empty case is trivial), which follows from 26.
Proof. By induction on the subtyping derivation, then by induction on the first entry in the non-empty context (the empty case is trivial), which follows from 27.

Lemma 26 (Singleton-SubTyping-M).
x : T ≺: x : T′. Proof. By case analysis on the structure of T′ and T. The important case is the one where the subtyping relation is defined over the components of the rcuItr type: for T′, approximation on the path component together with approximation on the field map leads to subset inclusion between the set of states defined by the denotation of x : T′ and the set of states defined by the denotation of x : T (which is also immediate for T-Sub). The reflexive relations and the relations capturing the base cases of subtyping are trivial to show.

Lemma 27 (Singleton-SubTyping-R).
x : T ≺: x : T′.

Aliasing to heap nodes can occur through object fields (via field mappings) or stack pointers (via path components). We see path aliases a_1, a_2 and a_3 illustrated with dashed nodes and arrows to the heap nodes in Figures 31a and 31b. They are depicted as dashed because they are not safe resources to use when unlinking, so they are framed out by the type system via (¬MayAlias(ρ_3, {ρ, ρ_1, ρ_2})), which ensures the non-existence of path aliases to any of x, z and r in the rule, corresponding to pr, cr and crl respectively.
Any heap node reached from the root by following a path ρ_3 deeper than the path reaching the last heap node (crl) in the footprint cannot be pointed to by any of the heap nodes (pr, cr and crl) in the footprint. We require this restriction to prevent inconsistency in the path components of references ρ_3 referring to heap nodes deeper than the memory footprint (∀ρ_4 ≠ ε. ¬MayAlias(ρ_3, ρ_2.ρ_4)). The reason for framing out these dashed path aliases becomes obvious when we look at the changes from Figure 31a to Figure 31b. For example, a_1 points to H_1, whose object field Left (l) points to H_2, which is also pointed to by current, as depicted in Figure 31a. In Figure 31b, l of H_1 points to H_3, but a_1 still points to H_1. This change invalidates the field mapping Left → current of a_1 in the rcuItr type.
Another safety issue addressed by framing arises when current and a_2 are aliases. In Figure 31a, both current and a_2 have the rcuItr type and point to H_2. After the unlinking, the type of current becomes unlinked although a_2 is still in the rcuItr type. Framing out a_2 prevents this inconsistency in its type under the unlinking operation.
One interesting and less obvious inconsistency issue arises from the aliasing between a_3 and currentL (crl). Before the unlinking, both currentL and a_3 have the same path components. After the unlinking, the path of currentL (crl) gets shorter, as the path to the heap node it points to, H_3, changes to (Left)^k.Left. However, the path component of a_3 does not change, so the path component of a_3 in the rcuItr type becomes inconsistent with the actual path reaching H_3.
In addition to path aliasing, there can also be aliasing via field mappings, which we call field aliasing. We see field-aliasing examples in Figures 31a and 31b: pr and a_1 are field aliases, with Left (l) from H_0 pointing to H_1; cr and a_2 are field aliases, with Left (l) from H_4 pointing to H_2; and crl and a_3 are field aliases, with Left (l) from H_5 pointing to H_3. We do not discuss the problems that can occur due to field aliasing, as they are the same as those due to path aliasing. What we focus on is how the type rule prevents field aliases. The type rule asserts ∧(m ∈ {z, r}) to ensure that no object field from any other context points either to the variable pointing to the heap node under mutation (unlinking), current (cr), or to the variable pointing to the new Left of the parent after unlinking, currentL (crl). We should also note that object fields in other contexts are allowed to point to pr, as pr is not in the effect zone of the unlinking. For example, we see the object field l pointing from H_0 to H_1 in both Figures 31a and 31b.
Once we unlink the heap node, it cannot be accessed by newly arriving reader threads, and the readers currently accessing this node cannot reach the rest of the heap through it. We illustrate this with the dashed red cr, H_2 and object fields in Figure 31b.
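The effect of the unlink on reachability can be replayed concretely. This Python sketch follows the node names of Figures 31a/31b (H_1 is the parent, H_2 is current, H_3 is currentL); the code itself is our own illustration, not the paper's operational semantics:

```python
# Heap from Figure 31a: H1 --l--> H2 --l--> H3.
heap = {'H1': {'l': 'H2'}, 'H2': {'l': 'H3'}, 'H3': {'l': None}}
a2 = 'H2'  # framed-out path alias to current

def reachable(h, root):
    seen, frontier = set(), {root}
    while frontier:
        n = frontier.pop()
        seen.add(n)
        frontier |= {v for v in h[n].values()
                     if v is not None and v not in seen}
    return seen

assert 'H2' in reachable(heap, 'H1')   # before unlinking, H2 is reachable

heap['H1']['l'] = 'H3'  # the single atomic field update that unlinks H2

assert 'H2' not in reachable(heap, 'H1')  # new readers cannot reach H2
assert a2 == 'H2'  # the stale alias still names H2: its rcuItr type would lie
```

After the update, only readers that captured a pointer to H_2 before the unlink can still dereference it, which is exactly the population the grace period waits out before H_2 is freed.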
Being aware of how much of the heap is under mutation is important, e.g., whether it is a whole subtree or a single node. Our type system ensures that only one heap node can be unlinked at a time by an atomic field update action. To ensure this, in addition to the proper-linkage enforcement, the rule also asserts that all other object fields which are not under mutation must either not exist or point to null via