Local Reasoning for Global Graph Properties

Separation logics are widely used for verifying programs that manipulate complex heap-based data structures. These logics build on so-called separation algebras, which allow expressing properties of heap regions such that modifications to a region do not invalidate properties stated about the remainder of the heap. This concept is key to enabling modular reasoning and also extends to concurrency. While heaps are naturally related to mathematical graphs, many ubiquitous graph properties are non-local in character, such as reachability between nodes, path lengths, acyclicity and other structural invariants, as well as data invariants which combine with these notions. Reasoning modularly about such graph properties remains notoriously difficult, since a local modification can have side-effects on a global property that cannot be easily confined to a small region. In this paper, we address the question: What separation algebra can be used to avoid proof arguments reverting back to tedious global reasoning in such cases? To this end, we consider a general class of global graph properties expressed as fixpoints of algebraic equations over graphs. We present mathematical foundations for reasoning about this class of properties, imposing minimal requirements on the underlying theory that allow us to define a suitable separation algebra. Building on this theory, we develop a general proof technique for modular reasoning about global graph properties expressed over program heaps, in a way which can be directly integrated with existing separation logics. To demonstrate our approach, we present local proofs for two challenging examples: a priority inheritance protocol and the non-blocking concurrent Harris list.


Introduction
Separation logic (SL) [31,37] provides the basis of many successful verification tools that can verify programs manipulating complex data structures [1,4,17,29]. This success is due to the logic's support for reasoning modularly about modifications to heap-based data. For simple inductive data structures such as lists and trees, much of this reasoning can be automated [2,11,20,33]. However, these techniques often fail when data structures are less regular (e.g. multiple overlaid data structures) or provide multiple traversal patterns (e.g. threaded trees). Such idioms are prevalent in real-world implementations such as the fine-grained concurrent data structures found in operating systems and databases. Solutions to these problems have been proposed [14] but remain difficult to automate. For proofs of general graph algorithms, the situation is even more dire. Despite substantial improvements in the verification methodology for such algorithms [35,38], significant parts of the proof argument still typically need to be carried out using nonlocal reasoning [7,8,13,25]. This paper presents a general technique for local reasoning 1 method acquire(p: Node, r: Node) { about global graph properties that can be used within off-the-shelf separation logics. We demonstrate our technique using two challenging examples for which no fully local proof existed before, respectively, whose proof required a tailor-made logic. As a motivating example, we consider an idealized priority inheritance protocol (PIP), a technique used in process scheduling [39]. The purpose of the protocol is to avoid priority inversion, i.e. a situation where a low-priority process causes a high-priority process to be blocked. The protocol maintains a bipartite graph with nodes representing processes and resources. An example graph is shown in Fig. 1. An edge from a process p to a resource r indicates that p is waiting for r to be available whereas an edge in the other direction means that r is currently held by p. Every node has an associated default priority and current; these are natural numbers. The current priority is used for scheduling processes. When a process attempts to acquire a resource currently held by another process, the graph is updated to avoid priority inversion. For example, when process p 1 with current priority 3 attempts to acquire the resource r 1 held by process p 2 of priority 1, p 1 's higher priority is propagated to p 2 and, transitively, to any other process that p 2 is waiting for (p 3 in this case). As a result, all nodes on the created cycle 3 will get current priority 3. The protocol maintains the following invariant: the current priority of each node is the maximum of its default priority and the current priorities of all its predecessors. Priority propagation is implemented by the method update shown in Fig 1. The implementation represents graph edges by next pointers and handles both adding an edge (acquire) and removing one (release -code omitted). To recalculate the current priority of a node (line 12), each node maintains its default priority def_prio and a multiset prios which contains the priorities of all its immediate predecessors.
Verifying that the PIP maintains its invariant using established separation logic (SL) techniques is challenging. In general, SL assertions describe resources and express the fact that the program has permission to access and manipulate these resources. In what follows, we stick to the standard model of SL where resources are memory regions represented as partial heaps. We sometimes view partial heaps more abstractly as partial graphs (hereafter, simply graphs). Assertions describing larger regions are built from smaller ones using separating conjunction, φ 1 * φ 2 . Semantically, the * operator is tied to a notion of resource composition defined by an underlying separation algebra [5,6]. In the standard model, composition enforces that φ 1 and φ 2 must describe disjoint regions. The logic and algebra are set up so that changes to the region φ 1 do not affect φ 2 (and vice versa). That is, if φ 1 * φ 2 holds before the modification and φ 1 is changed to φ 1 , then φ 1 * φ 2 holds afterwards. This so-called frame rule enables modular reasoning about modifications to the heap and extends well to the concurrent setting when threads operate on disjoint portions of memory [3,9,10,36]. However, the mere fact that φ 2 is preserved by modifications to φ 1 does not guarantee that if a global property such as the PIP invariant holds for φ 1 * φ 2 , it also still holds for φ 1 * φ 2 .
For example, consider the PIP scenario depicted in Fig. 1. If φ 1 describes the subgraph containing only node p 1 , φ 2 the remainder of the graph, and φ 1 the graph obtained from φ 1 by adding the edge from p 1 to r 1 , then the PIP invariant will no longer hold for the new composed graph described by φ 1 * φ 2 . On the other hand, if φ 1 captures p 1 and the nodes reachable from r 1 (i.e., the set of nodes modified by update), φ 2 the remainder of the graph, and we reestablish the PIP invariant locally in φ 1 obtaining φ 1 (i.e., run update to completion), then φ 1 * φ 2 will also globally satisfy the PIP invariant. The separating conjunction * is not sufficient to differentiate these two cases; both describe valid partitions of a possible program heap. As a consequence, prior techniques have to revert back to non-local reasoning to prove that the invariant is maintained.
A first helpful idea towards a solution to this problem is that of iterated separating conjunction [30,44], which describes a graph G consisting of a set of nodes X by a formula Ψ = * x∈X N(x) where N(x) is some predicate that holds locally for every node x ∈ X. Using such node-local conditions one can naturally express non-inductive properties of graphs (e.g. "G has no outgoing edges" or "G is bipartite"). The advantages of this style of specification are two-fold. First, one can arbitrarily decompose and recompose Ψ by splitting X into disjoint subsets. For example, if X is partitioned into X 1 and X 2 , then Ψ is equivalent to * x∈X 1 N(x) * * x∈X 2 N(x). Moreover, it is very easy to prove that Ψ is preserved under modifications of subgraphs. For instance, if a program modifies the subgraph induced by X 1 such that * x∈X 1 N(x) is preserved locally, then the frame rule guarantees that Ψ will be preserved in the new larger graph. Iterated separating conjunction thus yields a simple proof technique for local reasoning about graph properties that can be described in terms of node-local conditions. However, this idea alone does not actually solve our problem because general global graph properties such as "G is a direct acyclic graph", "G is an overlay of multiple trees", or "G satisfies the PIP invariant" cannot be directly described via node-local conditions.

Solution.
The key ingredient of our approach is the concept of a flow of a graph: a function fl from the nodes of the graph to flow values. For the PIP, the flow maps each node to the multiset of its incoming priorities. In general, a flow is a fixpoint of a set of algebraic equations induced by the graph. These equations are defined over a flow domain, which determines how flow values are propagated along the edges of the graph and how they are aggregated at each node. In the PIP example, an edge between nodes (n, n ) propagates the multiset containing max(fl (n), n.def_prio) from n to n . The multisets arriving at n are aggregated with multiset union to obtain fl (n ). Flows enable capturing global graph properties in terms of node-local conditions. For example, the PIP invariant can be expressed by the following node-local condition: n.curr_prio = max(fl (n), n.def_prio). To enable compositional reasoning about such properties we need an appropriate separation algebra allowing us to prove locally that modifications to a subgraph do not affect the flow of the remainder of the graph.
To this end, we make the useful observation that a separation algebra induces a notion of an interface of a resource: we say that two resources a and a are equivalent if they compose with the same resources. The interface of a resource a could then be defined as a's equivalence class, but more-succinct and simpler representations may be possible. In the standard model of SL where resources are graphs and composition is disjoint graph union, the interface of a graph G is the set of all graphs G that have the same domain as G; in this model, a graph's domain could be defined to be its interface.
The interfaces of resources described by assertions capture the information that is implicitly communicated when these assertions are conjoined by separating conjunction. As we discussed earlier, in the standard model of SL, this information is too weak to enable local reasoning about global properties of the composed graphs because some additional information about the subgraphs' structure other than which nodes they contain must be communicated. For instance, if the goal is to verify the PIP invariant, the interfaces must capture information about the multisets of priorities propagated between the subgraphs. We define a separation algebra achieving exactly this: the induced flow interface of a graph G in this separation algebra captures how values of the flow domain must enter and leave G such that, when composed with a compatible graph G , the imposed local conditions on the flow of each node are satisfied in the composite graph. This is the key to enabling SL-style framing for global graph properties. Using iterated separating conjunctions over the new separation algebra, we obtain a compositional proof technique that yields succinct proofs of programs such as the PIP, whose proofs with existing techniques would involve non-trivial global reasoning steps.
Contributions. In §2, we present mathematical foundations for flow domains, imposing the minimal requirements on the underlying algebra that allow us to capture a broad range of data structure invariants and graph properties and reason locally about them in a suitable separation algebra. Building on this theory we develop a general proof technique for modular reasoning about global graph properties that can be integrated with existing separation logics ( §3). We further identify general mathematical conditions that can be used when desired to guarantee unique flows, and provide local proof arguments to check the preservation of these conditions ( §4). We demonstrate the versatility of our approach by presenting local proofs for two challenging examples: the PIP and the concurrent non-blocking list due to Harris [12].
Flows Redesigned. Our work is inspired by the recent flow framework explored by some of the authors [22], but was redesigned from the ground up. We revisit the core algebra behind flow reasoning, and derive a different algebraic foundation by analysing the minimal requirements for general local reasoning; we call our newly-designed reasoning framework the foundational flow framework. Our new framework makes several significant improvements over [22] and eliminates its most stark limitations. We provide a detailed technical comparison with [22] and discuss other related work in §5.

The Foundational Flow Framework
In this section, we introduce the foundational flow framework, explaining the motivation for its design with respect to local reasoning principles. We aim for a general technique for modularly proving the preservation of recursively-defined invariants over (partial) graphs, with well-defined decomposition and composition operations.

Preliminaries and Notation
The term (b ? t 1 : t 2 ) denotes t 1 if condition b holds and t 2 otherwise. We write f : A → B for a function from A to B, and f : A B for a partial function from A to B. For a partial function f , we write f (x) = ⊥ if f is undefined at x. We use lambda notation (λx. E) to denote a function that maps x to the expression E (typically containing x). If f is a function from A to B, we write f [x y] to denote the function from A ∪ {x} defined by f [x y](z) := (z = x ? y : f (z)). We use {x 1 y 1 , . . . , x n y n } for pairwise different x i to denote the function [x 1 y 1 ] · · · [x n y n ], where is the function on an empty domain. Given functions f 1 : A 1 → B and f 2 : We write δ n=n : M → M for the function defined by δ n=n (m) := m if n = n else 0. We also write λ 0 := (λm. 0) for the identically zero function, λ id := (λm. m) for the identity function, and use e ≡ e to denote function equality. For e : M → M and m ∈ M we write m e to denote the function application e(m). We write e • e to denote function composition, i.e. (e • e )(m) = e(e (m)) for m ∈ M , and use superscript notation e p to denote the function composition of e with itself p times.
For multisets S, we use standard set notation when clear from the context. We write S(x) to denote the number of occurrences of x in S. We write {x 1 i 1 , . . . , x n i n } for the multiset containing i 1 occurrences of x 1 , i 2 occurrences of x 2 , etc.
A partial monoid is a set M , along with a partial binary operation + : M × M M , and a special zero element 0 ∈ M , such that (1) + is associative, i.e., (m 1 + m 2 ) + m 3 = m 1 + (m 2 + m 3 ); and (2) 0 is an identity, i.e., m + 0 = 0 + m = m. Here, = means either both sides are defined and equal, or both are undefined. We identify a partial monoid with its support set M . If + is a total function, then we call M a monoid. Let m 1 , m 2 , m 3 ∈ M be arbitrary elements of the (partial) monoid in the following. We call a (partial) monoid M commutative if + is commutative, i.e., m 1 + m 2 = m 2 + m 1 . Similarly, a commutative monoid M is cancellative if + is cancellative, i.e., if m 1 + m 2 = m 1 + m 3 is defined, then m 2 = m 3 .

Flows
Recursive properties of graphs naturally depend on non-local information; e.g. we cannot express that a graph is acyclic directly as a conjunction of per-node invariants. Our foundational flow framework defines flow values at each node that capture non-local graph properties, and enables local specification and reasoning about such properties. Flow values are drawn from a flow domain, an algebraic structure which also specifies the operations used to define a flow via recursive computations over the graph. Our entire theory is parametric with the choice of a flow domain, whose components will be explained and motivated in the rest of this section.  Figure 1). This consists of the monoid of multisets of natural numbers under multiset union and two kinds of edge functions: λ 0 and functions mapping a multiset m to the singleton multiset containing the maximum value between m and a fixed value p (used to represent a node's default priority). This can define a flow which locally captures the appropriate current node priorities as the graph is modified.
Further definitions in this section assume a fixed flow domain (M, +, 0, E) and a (potentially infinite) set of nodes N. For this section, we abstract heaps using directed partial graphs; integration of our graph reasoning with direct proofs over program heaps is explained in §3. Flow Values and Flows. Flow values (taken from M ; the first element of a flow domain) are used to capture sufficient information to express desired non-local properties of a graph. In Example 1, flow values are non-negative integers; for the PIP (Example 2) we instead use multisets of integers, representing relevant non-local information: the priorities of nodes currently referencing a given node in the graph. Given such flow values, a node's correct priority can be defined locally per node in the graph. This definition requires only the maximum value of these multisets, but as we will see shortly these multisets enable local recomputation of a correct priority when the graph is changed.
For a graph G = (N, e) we express properties of G in terms of node-local conditions that may depend on the nodes' flow. We let dom(H) = N , and sometimes identify H and dom(H) to ease notational burden. For n ∈ H we write H n for the singleton flow subgraph of H induced by n.
Edge Functions. In any flow graph, the flow value assigned to a node n by a flow is propagated to its neighbours n (and transitively) according to the edge function e(n, n ) labelling the edge (n, n ). The edge function maps the flow value at the source node n to one propagated on this edge to the target node n . Note that we require such a labelling for all pairs consisting of a source node n inside the graph and a target node n ∈ N (i.e., possibly outside the graph). The 0 flow value (the third element of our flow domains) is used to represent no flow; the corresponding (constant) zero function λ 0 = (λm. 0) is used as edge function to model the absence of an edge in the graph. A set of edge functions E from which this labelling is chosen can, other than the requirement λ 0 ∈ E, be chosen as desired. As we will see in §4.4, restrictions to particular sets of edge functions E can be exploited to further strengthen our overall technique. Edge functions can depend on the local state of the source node (as in the following example); dependencies from elsewhere in the graph must be represented by the node's flow. Since the flow value at r 3 is {1, 2, 2}, the edge (r 3 , p 1 ) propagates the value {2} to p 1 , correctly representing the current priority of r 3 .
Flow Aggregation and Inflows. The flow value at a node is defined by those propagated to it from each node in a graph via edge functions, along with an additional inflow value explained here. Since multiple non-zero flow values can be propagated to a node, we require an aggregation of these values via a binary + operator on flow values : the second element of our flow domains. The edges from which the aggregated values originate are unordered. Thus, we require + to be commutative and associative, making this aggregation order-independent. The 0 flow value must act as a unit for +. For example, in the path-counting flow domain + means addition on natural numbers, while for the multisets employed for the PIP it means multiset union.
Each node in a flow graph has an inflow, modelling contributions to its flow value which do not come from inside the graph. Inflows play two important roles: first, since our graphs are partial, they model contributions from nodes outside of the graph. Second, inflow can be artificially added as a means of specialising the computation of flow values to characterise specific graph properties. For example, in the path-counting domain, we give an inflow of 1 to the node from which we are counting paths, and 0 to all others.
Example 4. Let the edges in the graph in Figure 1 be labelled as described in Example 3. If the inflow function in assigns the empty multiset to every node n and we let fl (n) be the multiset labelling every node in the figure, then FlowEqn(in, e, fl ) holds.
The flow equation (FlowEqn) defines the flow of a node n to be the aggregation of flow values coming from other nodes n inside the graph (as given by the respective edge function e(n , n)) as well as the inflow in(n). Preserving solutions to this equation across updates to the graph structure is a fundamental goal of our technique. The following lemma (which relies on the fact that + is required to be cancellative) states that any correct flow values uniquely determine appropriate inflow values: We now turn to how solutions of the flow equation can be preserved or appropriately updated under changes to the underlying graph.
Graph Updates and Cancellativity. Given a flow graph with known flow and inflow values, suppose we remove an edge from n 1 to n 2 (replacing the edge function with λ 0 ). For the same inflow, such an update will potentially affect the flow at n 2 and nodes to which n 2 (transitively) propagates flow. Starting from the simple case that n 2 has no outgoing edges, we need to recompute a suitable flow at n 2 . Knowing the old flow value (say, m) and the contribution m = fl (n 1 ) e(n 1 , n 2 ) previously provided along the removed edge, we know that the correct new flow value is some m such that m + m = m. This constraint has a unique solution (and thus, we can unambiguously recompute a new flow value) exactly when the aggregation + is cancellative; we therefore make cancellativity a requirement on the + of any flow domain.
Cancellativity intuitively enforces that the flow domain carries enough information to enable adaptation to local updates (in particular, removal of edges 6 ). Returning to the PIP example, cancellativity requires us to carry multisets as flow values rather than only the maximum priority value: + cannot be the maximum operation, as this would not be cancellative. The resulting multisets (like the prio fields in the actual code) provide the information necessary to recompute corrected priority values locally.
For example, in the PIP graph shown in Figure 1, removing the edge from p 6 to r 4 would not affect the current priority of r 4 whereas if p 7 had current priority 1 instead of 2, then the current priority of r 4 would have to decrease. In either case, recomputing the flow value for r 4 is simply a matter of subtraction (removing {2} from the multiset at r 4 ); cancellativity guarantees that our flow domains will always provide the information needed for this recomputation. Without this property, the recomputation of a flow value for the target node n 2 would, in general, entail recomputing the incoming flow values from all remaining edges from scratch. Cancellativity is also crucial for Lemma 1 above, forcing uniqueness of inflows, given known flow values in a flow graph. This allows us to define natural but powerful notions of flow graph decomposition and recomposition.

Flow Graph Composition and Abstraction
Building towards the core of our reasoning technique, we now turn to the question of decomposition and recomposition of flow graphs. Two flow graphs with disjoint domains always compose to a graph, but this will be a flow graph only if their flows are chosen consistently to admit a solution to the resulting flow equation (i.e. the flow graph composition operator defined below is partial).
where e ∅ and fl ∅ are the edge functions and flow on the empty set of nodes N = ∅.
Intuitively, two flow graphs compose to a flow graph if their contributions to each others' flow (along edges from one to the other) are reflected in the corresponding inflow of the other graph. For example, consider the subgraph from Figure 1 consisting of the single node p 7 (with 0 inflow). This will compose with the remainder of the graph depicted only if this remainder subgraph has an inflow which, at node r 4 , includes at least the multiset {2}, reflecting the propagated value from p 7 .
We use this intuition to extract an abstraction of flow graphs which we call flow interfaces. Given a flow (sub)graph, its flow interface consists of the node-wise inflow and outflow (the flow contributions its nodes make to all nodes outside of the graph, defined below). It is thus an abstraction that hides the flow values and edges that are wholly inside the flow graph. Flow graphs that have the same flow interface "look the same" to the external graph, as the same values are propagated inwards and outwards. We write I.in, I.out for the two components of the interface I = (in, out). We will again sometimes identify I and dom(I.in) to ease notational burden.
Given a flow graph H ∈ FG, we can compute its interface as follows. Recall that Lemma 1 implies that any flow graph has a unique inflow. Thus This abstraction, while simple, turns out to be powerful enough to build a separation algebra over our flow graphs, allowing them to be decomposed, locally modified and recomposed in ways yielding all the local reasoning benefits of separation logics. In particular, for graph operations within a subgraph with a certain interface, we need to prove: (a) that the modified subgraph is still a flow graph (by checking that the flow equation still has a solution locally in the subgraph) and (b) that it satisfies the same interface (in other words, the effect of the modification on the flow is contained within the subgraph); the meta-level results for our technique then justify that we can recompose the modified subgraph with any graph that the original could be composed with.
We define the corresponding flow interface algebra as follows: Definition 6 (Flow Interface Algebra). For a given flow domain M , the flow interface algebra over M is defined to be (FI, ⊕, I ∅ ), where: Flow interface composition is well-defined because of cancellativity of the underlying flow domain (it is also, exactly as flow graph composition, partial). We next show the key result for this abstraction: the ability for two flow graphs to compose depends only on their interfaces; flow interfaces implicitly define a congruence relation on flow graphs.
Crucially, the following result shows that we can use our flow interfaces as an abstraction directly compatible with existing separation logics.
This result forms the core of our reasoning technique; it enables us to make modifications within a chosen subgraph and, by proving preservation of its interface, know that the result composes with any context exactly as the original did. Flow interfaces capture precisely the information relevant about a flow graph, with respect to composition with other flow graphs. In Appendix B of the accompanying technical report (hereafter, TR) [23] we provide additional examples of flow domains that demonstrate the range of data structures and graph properties that can be expressed using flows, including a notion of universal flow that in a sense provides a completeness result for the expressivity of the framework. We now turn to constructing proofs atop these new reasoning principles.

Proof Technique
This section shows how to integrate flow reasoning into a standard separation logic, using the priority inheritance protocol (PIP) algorithm to illustrate our proof techniques.
Since flow graphs and flow interfaces form separation algebras, it is possible in principle to define a separation logic (SL) using these notions as a custom semantic model (indeed, this is the proof approach taken in [22]). By contrast, we integrate flow interfaces with a standard separation logic without modifying its semantics. This has the important technical advantage that our proof technique can be naturally integrated with existing separation logics and verification tools supporting SL-style reasoning. We consider a standard sequential SL in this section, but our technique can also be directly integrated with a concurrent SL such as RGSep (as we show in §4.5) or frameworks such as Iris [18] supporting (ghost) resources ranging over user-defined separation algebras.

Encoding Flow-based Proofs in SL
Proofs using our flow framework can employ a combination of specifications enforced at the node level and in terms of the flow graphs and interfaces corresponding to larger heap regions such as entire data structures (henceforth, composite graphs and composite interfaces). At the node level, we write invariants that every node is intended to satisfy, typically relating the node's flow value to its local state (fields). For example, in the PIP, we use node-local invariants to express that a node's current priority is the maximum of the node's default priority and those in its current flow value. We typically express such specifications in terms of singleton (flow) graphs, and their singleton interfaces.
Specification in terms of composite interfaces has several important purposes. One is to define custom inflows: e.g. in the path-counting flow domain, specifying that the inflow of a composite interface is 1 at some designated node r and 0 elsewhere enforces in any underlying flow graph that each node n's flow value will be the number of paths from r to n. 7 Composite interfaces can also be used to express that, in two states of execution, a portion of the heap "looks the same" with respect to composition (it has the same interface, and so can be composed with the same flow graphs), or to capture by how much there is an observable difference in inflow or outflow; we employ this idea in the PIP proof below.
We now define an assertion syntax convenient for capturing both node-level and composite-level constraints, defined within an SL-style proof system. We assume an intuitionistic, garbage-collected SL [6] with standard syntax and semantics: 8 see Appendix A of the TR [23] for more details.  (x, fs, y)), fl ) * γ(x, fs, fl (x)) N is implicitly parameterised by fs, edge and γ; these are explained next and are typically fixed across any given flow-based proof. The N predicate expresses that we have a heap cell at location x containing fields fs (a list of field-name/value mappings). 9 It also says that H is a singleton flow graph with domain {x} with some flow fl , whose edge functions are defined by a user-defined abstraction function edge(x, fs, y); this function allows us to define edges in terms of x's field values. Finally, the node, its fields, and its flow in this flow graph satisfy the custom predicate γ, used to encode node-local properties such as constraints in terms of the flow values of nodes.
Graph Predicates. The analogous predicate for composite graphs is Gr. It carries ownership to the nodes making up a potentially unbounded graph, using iterated separating conjunction over a set of nodes X as mentioned in §1: Gr is also implicitly parameterised by fs, edge and γ. The existentially-quantified H is a logical variable representing a function from nodes in X to corresponding singleton flow graphs. Gr(X, H) describes a set of nodes X, such that each x ∈ X is an N (in particular, it satisfies γ), whose singleton flow graphs compose back to H. As well as carrying ownership of the underlying heap locations, Gr's definition allows us to connect a node-level view of the region X (each H(x)) with a composite-level view defined by H, on which we can impose appropriate graph-level properties such as constraints on the region's inflow.
Lifting to Interfaces. Flow based proofs can often be expressed more elegantly and abstractly using predicates in terms of node and composite-level interfaces rather than flow graphs. To this end, we overload both our node and graph predicates with analogues whose second parameter is a flow interface, defined as follows: We will use these versions in the PIP proof below; interfaces capture all relevant properties for decomposition and composition of these flow graphs.  2). Conceptually, it expresses that after decomposing any flow graph into two parts H 1 and H 2 , we can replace H 1 with a new flow graph H 1 with the same interface; when recomposing, the overall graph will be a flow graph with the same overall interface.
Note the connection between rules (COMP)/(DECOMP) and the algebraic laws of standard inductive predicates such as ls describing a segment of a linked list [2]. For instance by combining the definition of Gr with these rules and (SING) we can prove the following graph analogue of the rule to separate a list into the head node and the tail: However, crucially (and unlike when using general inductive predicates [32]), this rule is symmetrical for any node x in X; it works analogously for any desired order of decomposition of the graph, and for any data structure specified using flows.
When working with our overloaded N and Gr predicates, similar steps to those described by the above lemmas are useful. Given these overloaded predicates, we simply apply the lemmas above to the existentially quantified flow-graphs in their definitions and then lift the consequence of the lemma back to the interface level using the congruence between our flow graph and interface composition notions (Lemma 2).

Proof of the PIP
We now have all the tools necessary to verify the priority inheritance protocol (PIP). Figure 3 gives the full algorithm with flow-based specifications; we also include some intermediate assertions to illustrate the reasoning steps for the acquire method, which  we explain in more detail below. 10 We instantiate our framework in order to capture the PIP invariants as follows: fs := next : y, curr_prio : q, def_prio : q 0 , prios : Q Each node has the four fields listed in fs. fs also defines variables such as y to denote field values that are used in the definitions of edge and γ; these variables are bound to the heap by N. edge abstracts the heap into a flow graph by letting each node have an edge to its next successor labelled by a function that passes to it the maximum incoming priority or the node's default priority: whichever is larger. With this definition, one can see that the flow of every node will be the multiset containing exactly the priorities of its predecessors. The node-local invariant γ says that all priorities are non-negative, the flow m of each node is stored in the prios field, and its current priority is the maximum of its default and incoming priorities. Finally, the constraint ϕ on the global interface expresses that the graph is closed -it has no inflow or outflow.
Flows Specifications for the PIP. Our specifications of acquire and release guarantee that if we start with a valid flow graph (closed, according to ϕ), we are guaranteed to return a valid flow graph with the same interface (i.e. the graph remains closed). For clarity of the exposition, we focus here on how we prove that being a flow graph that satisfies the PIP invariant is preserved (as is the composite flow graph's interface).
Extending this specification to one which proves, e.g., that acquire adds the expected edge is straightforward (see Appendix C of the TR [23]). 11 The specification for update is somewhat subtle, and exploits the full flexibility of flow interfaces as a specification medium. The preconditions of update describe an update to the graph which is not yet completed. There are three complementary aspects to this specification. Firstly, (as for acquire and release), node-local invariants (γ) hold for all nodes in the graph (enforced via N and Gr predicates). Secondly, we employ flow interfaces to express a decomposition of the original top-level interface I into compatible (primed) sub-interfaces. The key to understanding this specification is that I n is in some sense a fake interface; it does not abstract the current state of the heap node n. Instead, I n expresses the way in which the node n's current inflow hasn't yet been accounted for in the heap: that if n could adjust its inflow according to the propagated priority change without changing its outflow, then it would compose back with the rest of the graph, and restore the graph's overall interface. The shorthand δ defines the required change to n's inflow.
In general (except when n's next field is null, or n's flow value is unchanged), it is not even possible for n's fields to be updated to satisfy I n ; by updating n's inflow, we will necessarily update its outflow. However, we can then construct a corresponding "fake" interface for the next node in the graph, reflecting the update yet to be accounted for, and establishing the precondition for the recursive call to update.
The third specification aspect is the connection between heap-level nodes and interfaces. The N(n, I n ) predicate connects n with a different interface; I n is the actual current abstraction of n's state. Conceptually, the key property which is broken at this point is this connection between the interface-level specification and the heap at node n, reflected by the decomposition in the specification between X \ {n} and {n}.
We note that the same specification ideas and proof style can be easily adapted to other data structure implementations with an update-notify style, including well-known designs such as Subject-Observer patterns, or the Composite pattern [27].
Proof Outline. To illustrate the application of flows reasoning to our PIP specification ideas more clearly, we examine in detail the first if-branch in the proof of acquire. Our intermediate proof steps are shown as purple annotations surrounded by braces. The first step, as shown in the first line inside the method body, is to apply ((UN)FOLD) twice (on the flow graphs represented by these predicates) and peel off N predicates for each of r and p. The update to r's next field (line 27) causes the correct singleton interface of r to change to I r : its outflow (previously none, since the next field was null) now propagates flow to p. We summarise this state in the assertion on line 29 (we omit e.g. repetition of properties from the function's precondition, focusing on the flow-related steps of the argument). We now rewrite this state; using the definition of interface composition (Definition 6) we deduce that although I r and I p do not compose (since the former has outflow that the latter does not account for as inflow), the alternative "fake" interface I p for p (which artificially accounts for the missing inflow) would do so (cf. line 30). Essentially, we show I r ⊕ I p = I r ⊕ I p , that the interface of {r, p} would be unchanged if p could somehow have interface I p . Now by setting I 2 = I r ⊕ I 1 and using algebraic properties of interfaces, we assemble the precondition expected by update. After the call, update's postcondition gives us the desired postcondition.
We focused here on the details of acquire's proof, but very similar manipulations are required for reasoning about the recursive call in update's implementation. 12 The main difference there is that if the if-condition wrapping the recursive call is false then either the last-modified node has no successor (and so there is no outstanding inflow change needed), or we have from = to which implies that the "fake" interface is actually the same as the currently correct one.
Despite the property proved for the PIP example being a rather delicate recursive invariant over the (potentially cyclic) graph, the power of our framework enables extremely succinct specifications for the example, and proofs which require the application of relatively few generic lemmas. The integration with standard separation logic reasoning, and the complementary separation algebras provided by flow interfaces allow decomposition and recomposition to be simple proof steps. For this proof, we integrated with standard sequential separation logic, but in the next section we will show that compatibility with concurrent SL techniques is similarly straightforward.

Advanced Flow Reasoning and the Harris List
This section introduces some advanced foundational flow framework theory and demonstrates its use in the proof of the Harris list. We note that [22] presented a proof of this data structure in the original flow framework. The proof given here shows that the new framework eliminates the need for the customized concurrent separation logic defined in [22]. We start with a recap of Harris' algorithm adapted from [22].

The Harris List Algorithm
The power of flow-based reasoning is exhibited in the proof of overlaid data structures such as the Harris list, a concurrent non-blocking linked list algorithm [12]. This algorithm implements a set data structure as a sorted list, and uses atomic compare-and-swap (CAS) operations to allow a high degree of parallelism. As with the sequential linked list, Harris' algorithm inserts a new key k into the list by finding nodes k 1 , k 2 such that k 1 < k < k 2 , setting k to point to k 2 , and using a CAS to change k 1 to point to k only if it was still pointing to k 2 . However, a similar approach fails for the delete operation. If we had consecutive nodes k 1 , k 2 , k 3 and we wanted to delete k 2 from the list (say by setting k 1 to point to k 3 ), there is no way to ensure with one CAS that k 2 and k 3 are also still adjacent (another thread could have inserted/deleted in between them). Harris' solution is a two step deletion: first atomically mark k 2 as deleted (by setting a mark bit on its successor field) and then later remove it from the list using a single CAS. After a node is marked, no thread can insert or delete to its right, hence a thread that wanted to insert k to the right of k 2 would first remove k 2 from the list and then insert k as the successor of k 1 .
In a non-garbage-collected environment, unlinked nodes cannot be immediately freed as suspended threads might continue to hold a reference to them. A common solution is to maintain a second "free list" to which marked nodes are added before they are unlinked from the main list (this is the so-called drain technique). These nodes are then labelled with a timestamp, which is used by a maintenance thread to free them when it is safe to do so. This leads to the kind of data structure shown in Figure 4, where each node has two pointer fields: a next field for the main list and an fnext field for the free list (the list from fh to ft via dashed edges). Threads that have been suspended while holding  Fig. 5: Examples of graphs that motivate effective acyclicity. All graphs use the pathcounting flow domain, the flow is displayed inside each node, and the inflow is displayed as curved arrows to the top-left of nodes. (a) shows a graph and inflow that has no solution to (FlowEqn); (b) has many solutions. (c) shows a modification that preserves the interface of the modified nodes, yet goes from a graph that has a unique flow to one that has many solutions to (FlowEqn).
a reference to a node that was added to the free list can simply continue traversing the next pointers to find their way back to the unmarked nodes of the main list.
Even for seemingly simple properties such as that the Harris list is memory safe and not leaking memory, the proof will rely on the following non-trivial invariants: (a) The data structure consists of two (potentially overlapping) lists: a list on next edges beginning at mh and one on fnext edges beginning at fh. (b) The two lists are null terminated and next edges from nodes in the free list point to nodes in the free list or main list. (c) All nodes in the free list are marked. (d) ft is an element in the free list (due to concurrency, it's not always the tail).
Challenges. To prove that Harris' algorithm maintains the invariants listed above we must tackle a number of challenges. First, we must construct flow domains that allow us to describe overlaid data structures, such as the overlapping main and free lists ( §4.2). Second, the flow-based proofs we have seen so far work by showing that the interface of some modified region is unchanged. However, if we consider a program that allocates and inserts a new node into a data structure (like the insert method of Harris), then the interface cannot be the same since the domain has changed (it has increased by the newly allocated node). We must thus have a means to reason about preservation of flows by modifications that allocate new nodes ( §4.3). The third issue is that in some flow domains, there exist graphs G and inflows in for which no solutions to the flow equation (FlowEqn) exist. For instance, consider the path-counting flow domain and the graph in Figure 5(a). Since we would need to use the path-counting flow in the proof of the Harris list to encode its structural invariants, this presents a challenge ( §4.4).
We will next see how to overcome these three challenges in turn, and then apply those solution to the proof of the Harris list in §4.5.

Product Flows for Reasoning about Overlays
An important fact about flows is that any flow of a graph over a product of two flow domains is the product of the flows on each flow domain component. This lemma greatly simplifies reasoning about overlaid graph structures; we will use the product of two path-counting flows to describe a structure consisting of two overlaid lists that make up the Harris list.

Contextual Extensions and the Replacement Theorem
In general, when modifying a flow graph H to another flow graph H , requiring that H satisfies precisely the same interface int(H) can be too strong a condition as it does not permit allocating new nodes. Instead, we want to allow int(H ) to differ from int(H) in that the new interface could have a larger domain, as long as the edges from the new nodes do not change the outflow of the modified region.
The following theorem states that contextual extension preserves composability and is itself preserved under interface composition.
Theorem 2 (Replacement Theorem). If I = I 1 ⊕ I 2 , and I 1 I 1 are all valid interfaces such that I 1 ∩ I 2 = ∅ and ∀n ∈ I 1 \ I 1 . I 2 .out(n) = 0, then there exists a valid I = I 1 ⊕ I 2 such that I I .
In terms of our flow predicates, this theorem gives rise to the following adaptation of the (REPL) rule: The rule (REPL+) is derived from the Replacement Theorem by instantiating with I = int(H), I 1 = int(H 1 ), I 2 = int(H 2 ) and I 1 = int(H 1 ). We know I 1 I 1 ; H = H 1 H 2 tells us (by Lemma 2) that I = I 1 ⊕ I 2 , and Gr(X 1 , H 1 ) * Gr(X 2 , H 2 ) gives us I 1 ∩ I 2 = ∅. The final condition of the Replacement Theorem is to prove that there is no outflow from X 2 to any newly allocated node in X 1 . While we can use additional ghost state to prove such constraints in our proofs, if we assume that the memory allocator only allocates fresh addresses and restrict the abstraction function edge to only propagate flow along an edge (n, n ) if n has a (non-ghost) field with a reference to n then this condition is always true. For simplicity, and to keep the focus of this paper on the flow reasoning, we make this assumption in the Harris list proof.

Existence and Uniqueness of Flows
We typically express global properties of a graph G = (N, e) by fixing a global inflow in : N → M and then constraining the flow of each node in N using node-local conditions. However, as we discussed at the beginning of this section, there is no general guarantee that a flow exists or is unique for a given in and G. The remainder of this section presents two complementary conditions under which we can prove that our flow fixpoint equation always has a unique solution. To this end, we say that a flow domain . This domain can be used to express the property that every node in a graph is reachable from the root via a single edge (by requiring the flow of every node to be (0, 1) under the inflow (λn. (n = r ? (1, 0) : (0, 0)))).
Before we prove that nilpotent endomorphisms lead to unique flows, we present a useful notion when dealing with endomorphic flow domains. For a flow graph H = (N, e, fl ), we write cap(H)(n, n ) = cap((N, e))(n, n ) for the capacity of the underlying graph. Intuitively, cap(G)(n, n ) is the function that summarizes how flow is routed from any source node n in G to any other node n , including those outside of G.
We can now show that if all edges of a flow graph are labelled with edges from a nilpotent set of endomorphisms, then the flow equation has a unique solution: Effectively Acyclic Flow Graphs. There are some flow domains that compute flows useful in practice, but which do not guarantee either existence or uniqueness of fixpoints a priori for all graphs. For example, the path-counting flow from Example 1 is one where for certain graphs, there exist no solutions to the flow equation (see Figure 5(a)), and for others, there can exist more than one (in Figure 5(b), the nodes marked with x can have any path count, as long as they both have the same value).
In such cases, we explore how to restrict the class of graphs we use in our flow-based proofs such that each graph has a unique fixpoint; the difficulty is that this restriction must be respected for composition of our graphs. Here, we study the class of flow domains (M, +, 0, E) such that M is a positive monoid and E is a set of reduced endomorphisms (defined below). In such domains we can decompose the flow computations into the various paths in the graph, and achieve unique fixpoints by restricting the kinds of cycles graphs can have. fl (n 1 ) e(n 1 , n 2 ) · · · e(n k−1 , n k ) e(n k , n 1 ) = 0.
The simplest example of an effectively acyclic graph is one where the edges with non-zero edge functions form an acyclic graph. However, our semantic condition is weaker: for example, when reasoning about two overlaid acyclic lists whose union happens to form a cycle, a product of two path-counting domains will satisfy effective acyclicity because the composition of different types of edges results in the zero function. While the restriction to effectively acyclic flow graphs guarantees us that the flow is the unique fixpoint of the flow equation, it is not easy to show that modifications to the graph preserve EA while reasoning locally. Even modifying a subgraph to another with the same flow interface (which we know guarantees that it will compose with any context) can inadvertently create a cycle in the larger composite graph. For instance, consider Figure 5(c), that shows a modification to nodes {n 3 , n 4 } (the boxed blue region). The interface of this region is ({n 3 1, n 4 1} , {n 5 1, n 2 1}), and so swapping the edges of n 3 and n 4 preserves this interface. However, the resulting graph, despite composing with the context to form a valid flow graph, is not EA (in this case, it has multiple solutions to the flow equation). This shows that flow interfaces are not powerful enough to preserve effective acyclicity. For a special class of endomorphisms, we show that a local property of the modified subgraph can be checked, which implies that the modified composite graph continues to be EA.
Note that if E is reduced, then no e ∈ E can be nilpotent. In that sense, this class of instantiations is complementary to the nilpotent class.
Example 6. Examples of flow domains that fall into this class include positive semirings of reduced rings (with the additive monoid of the semiring being the aggregation monoid of the flow domain and E being any set of functions that multiply their argument with a constant flow value). Note that any direct product of integral rings is a reduced ring. Hence, products of the path counting flow domain are a special case.
For reduced endomorphisms, it suffices to check that a modification preserves the flow routed between every pair of source and sink node in order to ensure that it does not create any new cycles in any composite graph. This pairwise check, apart from requiring the interface of the modified region to be unchanged, also permits allocating new nodes as long as no flow is routed via the new nodes (condition (3)). We now show that it is sufficient to check that a modification is a subflow-preserving extension to guarantee composition back to an effectively-acyclic composite graph: We define effectively acyclic versions of our flow graph predicates, N a (x, H) and Gr a (X, H), that additionally constrain H to be effectively acyclic. The above theorem yields the following variant of the (REPL) rule for EA graphs:

Proof of the Harris List
We use the techniques seen in this section in the proof of the Harris list. As the data structure consists of two potentially overlapping lists, we use Lemma 3 to construct a product flow domain of two path-counting flows: one tracks the path count from the head of the main list, and one from the head of the free list. We also work under the effectively acyclic restriction (i.e. we use the N a and Gr a predicates), both in order to obtain the desired interpretation of the flow as well as to ensure existence of flows in this flow domain. We instantiate the framework using the following definitions of parameters: fs := {key : k, next : y, fnext : z} edge(x, fs, v) := (v = null ? λ 0 : (v = y ∧ y = z ? λ (1,0) : (v = y ∧ y = z ? λ (0,1) : (v = y ∧ y = z ? λ id : λ 0 )))) Here, edge encodes the edge functions needed to compute the product of two path counting flows, the first component tracks path-counts from mh on next edges and the second tracks path-counts from fh on fnext edges 13 . The node-local invariant γ says: the flow is one of {(1, 0), (0, 1), (1, 1)} (meaning that the node is on one of the two lists, invariant (a)); if the flow is not (1, 0) (the node is not only on the main list, i.e. it is on the free list) then the node is marked (indicated by M (y), invariant (c)); and if the node is ft then it must be on the free list (invariant (d)). The constraint on the global interface, ϕ, says that the inflow picks out mh and fh as the roots of the lists, and there is no outgoing flow (thus, all non-null edges must stay within the graph, invariant (b)). Since the Harris list is a concurrent algortihm, we perform the proof in rely-guarantee separation logic (RGSep) [41]. Like in §3, we do not need to modify the semantics of RGSep in any way; our flow-based predicates can be defined and reasoning using our lemmas can be performed in the logic out-of-the-box. For space reasons, the full proof can be found in Appendix D of the TR [23].

Related Work
As mentioned in §1, the most closely related work is the flow framework developed by some of the authors in [22]. We here present a simplified and generalized meta theory of flows that makes the approach much more broadly applicable. There were a number of limitations of the prior framework that prevented its application to more general classes of examples.
First, [22] required flow domains to form a semiring; the analogue of edge functions are restricted to multiplication with a constant which must come from the same flow value set. This restriction made it complex to encode many graph properties of interest. For example, one could not easily encode the PIP flow, or a simple flow that counts the number of incoming edges to each node. Our foundational flow framework decouples the algebraic structure defining how flow is aggregated from the algebraic structure of the edge functions. In this way, we obtain a more general framework that applies to many more examples, and with simpler flow domains.
Second, in [22], a flow graph did not uniquely determine its inflow (cf. Lemma 1). Correspondingly, [22]'s notion of interface included an equivalence class of inflows (all those that induce the same flow values). Since, in [22], the interface also determines which modifications are permitted by the framework, [22] could only handle modifications that preserve the inflow equivalence class. For example, this prevents one from reasoning locally about the removal of a single edge from a graph in certain cases (in particular, like release does in the PIP). Our foundational flow framework solves this problem by requiring that the aggregation operation on flow values is cancellative, guaranteeing unique inflows.
Cancellativity is fundamentally incompatible with [22], which requires the flow domain to form an ω-CPO in order to guarantee the existence of unique flows. For example, in a graph with two nodes n and n with identity edges between them and all other edges zero (in [22], edges labelled with 1 and 0), if we have in(n) = 0 and in(n) = m for some non-zero m, a solution to the flow equation must satisfy fl (n) = m + fl (n). [22] forces such solutions to exist, ruling out cancellativity. To solve this problem, we present a new theory which can optionally guarantee unique flows when desired and show that requiring cancellativity does not limit expressivity.
Next, the proofs of programs shown in [22] depend on a bespoke program logic. This logic requires new reasoning primitives that are not supported by the logics implemented in existing SL-based verification tools. Our general proof technique eliminates the need for a dedicated program logic and can be implemented on top of standard separation logics and existing SL-based tools. Finally, the underlying separation algebra of the original framework makes it hard to use equational reasoning, which is a critical prerequisite for enabling proof automation.
An abundance of SL variants provide complementary mechanisms for modular reasoning about programs (e.g. [18,36,38]). Most are parameterized by the underlying separation algebra; our flow-based reasoning technique easily integrates with these existing logics.
The most common approach to reason about irregular graph structures in SL is to use iterated separating conjunction [30,44] and describe the graph as a set of nodes each of which satisfies some local invariant. This approach has the advantage of being able to naturally describe general graphs. However, it is hard to express non-local properties that involve some form of fixpoint computation over the graph structure. One approach is to abstract the program state as a mathematical graph using iterated separating conjunction and then express non-local invariants in terms of the abstract graph rather than the underlying program state [14,35,38]. However, a proof that a modification to the state maintains a global invariant of the abstract graph must then often revert back to non-local and manual reasoning, involving complex inductive arguments about paths, transitive closure, and so on. Our technique also exploits iterated separating conjunction for the underlying heap ownership, with the key benefit that flow interfaces exactly capture the necessary conditions on a modified subgraph in order to compose with any context and preserve desired non-local invariants.
In recent work, Wang et al. present a Coq-mechanised proof of graph algorithms in C, based on a substantial library of graph-related lemmas, both for mathematical and heap-based graphs [42]. They prove rich functional properties, integrated with the VST tool. In contrast to our work, a substantial suite of lemmas and background properties are necessary, since these specialise to particular properties such as reachability. We believe that our foundational flow framework could be used to simplify framing lemmas in a way which remains parameteric with the property in question.
Proofs of a number of graph algorithms have been mechanized in various verification tools and proof assistants, including Tarjan's SCC algorithm [8], union-find [7], Kruskal's minimum spanning tree algorithm [13], and network flow algorithms [25]. These proofs generally involve non-local reasoning arguments about mathematical graphs.
An alternative approach to using SL-style reasoning is to commit to global reasoning but remain within decidable logics to enable automation [16,21,24,28,43]. However, such logics are restricted to certain classes of graphs and certain types of properties. For instance, reasoning about reachability in unbounded graphs with two successors per node is undecidable [15]. Recent work by Ter-Gabrielyan et al. [40] shows how to deal with modular framing of pairwise reachability specifications in an imperative setting. Their framing notion has parallels to our notion of interface composition, but allows subgraphs to change the paths visible to their context. The work is specific to a reachability relation, and cannot express the rich variety of custom graph properties available in our technique.
Dynamic frames [19] (e.g. implemented in Dafny [26]), can be used to explicitly reason about framing of heap information in a first-order logic. However, by itself, this theory does not enable modular reasoning about global graph properties. We believe that the flow framework could in principle be adapted to the dynamic frames setting.

Conclusions and Future Work
We have presented the foundational flow framework, enabling local modular reasoning about recursively-defined properties over general graphs. The core reasoning technique has been designed to make minimal mathematical requirements, providing great flexibility in terms of potential instantiations and applications. We identified key classes of these instantiations for which we can provide existence and uniqueness guarantees for the fixpoint properties our technique addresses and demonstrate our proof technique on several challenging examples. As future work, we plan to automate flow-based proofs in our new framework using existing tools that support SL-style reasoning such as Viper [29] and GRASShopper [34].