Inferring Inductive Invariants from Phase Structures

Infinite-state systems such as distributed protocols are challenging to verify using interactive theorem provers or automatic verification tools. Of these techniques, deductive verification is highly expressive but requires the user to annotate the system with inductive invariants. To relieve the user from this labor-intensive and challenging task, invariant inference aims to find inductive invariants automatically. Unfortunately, when applied to infinite-state systems such as distributed protocols, existing inference techniques often diverge, which limits their applicability. This paper proposes user-guided invariant inference based on phase invariants, which capture the different logical phases of the protocol. Users convey their intuition by specifying a phase structure, an automaton with edges labeled by program transitions; the tool automatically infers assertions that hold in the automaton's states, resulting in a full safety proof. The additional structure from phases guides the inference procedure towards finding an invariant. Our results show that user guidance by phase structures facilitates successful inference beyond the state of the art. We find that phase structures are pleasantly well matched to the intuitive reasoning routinely used by domain experts to understand why distributed protocols are correct, so that providing a phase structure reuses this existing intuition.


Introduction
Infinite-state systems such as distributed protocols remain challenging to verify despite decades of work developing interactive and automated proof techniques. Such proofs rely on the fundamental notion of an inductive invariant. Unfortunately, specifying inductive invariants is difficult for users, who must often repeatedly iterate through candidate invariants before achieving an inductive invariant. For example, the Verdi project's proof of the Raft consensus protocol used an inductive invariant with 90 conjuncts and relied on significant manual proof effort [60,61].
The dream of invariant inference is that users would instead be assisted by automatic procedures that could infer the required invariants. While other domains have seen successful applications of invariant inference, using techniques such as abstract interpretation [18] and property-directed reachability [10,21], existing inference techniques fall short for interesting distributed protocols, and often diverge while searching for an invariant. These limitations have hindered adoption of invariant inference.
Our Approach. The idea of this paper is that invariant inference can be made drastically more effective by utilizing user guidance in the form of phase structures. We propose user-guided invariant inference, in which the user provides some additional information to guide the tool towards an invariant. An effective guidance method must (1) match users' high-level intuition of the proof, and (2) convey information in a way that an automatic inference tool can readily utilize to direct the search. In this setting invariant inference turns a partial, high-level argument accessible to the user into a full, formal correctness proof, overcoming scenarios where procuring the proof completely automatically is unsuccessful.
Our approach places phase invariants at the heart of both user interaction and algorithmic inference. Phase invariants have an automaton-based form that is well-suited to the domain of distributed protocols. They allow the user to convey a high-level temporal intuition of why the protocol is correct in the form of a phase structure. The phase structure provides hints that direct the search and allow a more targeted generalization of states to invariants, which can facilitate inference where it is otherwise impossible. This paper makes the following contributions: (1) We present phase invariants, an automaton-based form of safety proofs, based on the distinct logical phases of a certain view of the system. Phase invariants closely match the way domain experts already think about the correctness of distributed protocols by state-machine refinement à la Lamport [e.g. 42].
(2) We describe an algorithm for inferring inductive phase invariants from phase structures. The decomposition to phases through the phase structure guides inference towards finding an invariant. The algorithm finds a proof over the phase structure or explains why no such proof exists. In this way, phase invariants facilitate user interaction with the algorithm.
(3) Our algorithm reduces the problem of inferring inductive phase invariants from phase structures to the problem of solving a linear system of Constrained Horn Clauses (CHC), irrespective of the inference technique and the logic used. In the case of universally quantified inductive phase invariants for protocols modeled in EPR (motivated by previous deductive approaches [50,49,59]), we show how to solve the resulting CHC using a variant of PDR∀ [39].
(4) We apply this approach to the inference of invariants for several interesting distributed protocols. (This is the first time invariant inference is applied to distributed protocols modeled in EPR.) In the examples considered by our evaluation, transforming our high-level intuition about the protocol into a phase structure was relatively straightforward. In most cases, the phase structures allowed our algorithm to outperform an implementation of PDR∀ that does not exploit such structure, facilitating invariant inference on examples beyond the state of the art and attaining faster convergence.
It is surprising that invariant inference, operating in the realm of logical clauses and implications, can so effectively benefit from guidance by phase structures, which exhibit a much higher level of abstraction. While there remain significant challenges to applying invariant inference on complex distributed protocols (notably, inference of invariants with quantifier alternations, necessary, e.g., for Paxos [49]), our approach demonstrates that the seemingly inherent intractability of sifting through a vast space of candidate invariants can be mitigated by leveraging users' high-level intuition.

Preliminaries
In this section we provide background on modeling and verifying systems using first-order logic. We assume familiarity with first-order logic. We use many-sorted first-order logic, but omit sorts here to simplify the presentation. Although in this paper we will mostly deal with uninterpreted first-order logic, our definitions and results extend to logics with a background theory.
Notation. FV(ϕ) denotes the set of free variables of ϕ. F_Σ(V) denotes the set of first-order well-formed formulas ϕ over vocabulary Σ with FV(ϕ) ⊆ V. We extend the notation ⟹ to quantified formulas and write ∀V. ϕ ⟹ ψ to denote that the formula ∀V. ϕ → ψ is valid. We sometimes use f_a as a shorthand for f(a).
Transition systems. We represent transition systems symbolically, via formulas in first-order logic. The definitions are standard. A vocabulary Σ consisting of constant, function, and relation symbols is used to represent states (each function and relation symbol is associated with its arity). Post-states of transitions are represented by a copy of Σ denoted Σ' = {a' | a ∈ Σ} (where the arity of each function and relation symbol is inherited from Σ). A first-order transition system over vocabulary Σ is a tuple TS = (Init, TR), where Init ∈ F_Σ(∅) describes the initial states, and TR ∈ F_Σ̂(∅) with Σ̂ = Σ ⊎ Σ' describes the transition relation. The states of TS are first-order structures over Σ, denoted STRUCT[Σ]. Each state s ∈ STRUCT[Σ] is a pair s = (D, I) where D is the domain and I is the interpretation function mapping each symbol in Σ to its interpretation over D. We denote by STRUCT[Σ]|_D the set of structures with domain D. In this way, every closed formula over Σ represents the set of states (first-order structures) that satisfy it. In particular, a state s is initial if s ⊨ Init.
A transition of TS is a pair of states s_1 = (D, I_1), s_2 = (D, I_2) with a shared domain such that (s_1, s_2) ⊨ TR, where (s_1, s_2) is a shorthand for the structure s = (D, I) over Σ̂ obtained by defining I(a) = I_1(a) and I(a') = I_2(a) for every a ∈ Σ. s_1 is also called the pre-state and s_2 the post-state. Traces are finite sequences of states σ_1, σ_2, ... starting from an initial state such that there is a transition between each pair of consecutive states. The reachable states of TS are those that reside on traces starting from an initial state.
Safety. A safety property P is a formula in F_Σ(∅). We say that TS is safe if all the reachable states satisfy P, in which case we also say that P is an invariant of TS. A prominent way to prove safety is via inductive invariants. An inductive invariant Inv is a closed first-order formula over Σ such that the following requirements hold: (i) Init ⟹ Inv (initiation), and (ii) Inv ∧ TR ⟹ Inv' (consecution), where Inv' is obtained from Inv by replacing each symbol from Σ with its primed counterpart. Initiation and consecution ensure that all the reachable states satisfy Inv. If, in addition, Inv satisfies (iii) Inv ⟹ P (safety), it follows that all the reachable states satisfy P, and TS is safe.
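The three requirements can be checked by brute force when the state space is finite. The following is a minimal sketch under that assumption; the counter system, the property, and the function names are hypothetical and serve only to illustrate initiation, consecution, and safety. It also shows why a safety property usually has to be strengthened: P holds in all reachable states here, yet P itself is not inductive.

```python
from itertools import product

# Hypothetical toy system: a counter modulo 4 that advances by 2 in each step.
STATES = range(4)

def init(s):          # Init: the counter starts at 0
    return s == 0

def tr(s, t):         # TR: one step adds 2 modulo 4
    return t == (s + 2) % 4

def safe(s):          # P: the counter never equals 3
    return s != 3

def is_inductive(inv):
    """Check (i) initiation and (ii) consecution for a candidate invariant."""
    initiation = all(inv(s) for s in STATES if init(s))
    consecution = all(inv(t) for s, t in product(STATES, STATES)
                      if inv(s) and tr(s, t))
    return initiation and consecution

def proves_safety(inv):
    """Additionally check (iii): every state satisfying inv satisfies P."""
    return is_inductive(inv) and all(safe(s) for s in STATES if inv(s))

def even(s):          # candidate strengthening of P
    return s % 2 == 0

print(is_inductive(safe))    # False: the unreachable state 1 steps to 3
print(proves_safety(even))   # True: "s is even" is inductive and implies P
```

In the first-order setting of this paper the same three checks are validity queries discharged by a solver rather than enumerations.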

Running Example: Distributed Key-Value Store
We begin with a description of the running example we refer to throughout the paper.
The running example is a sharded key-value store with retransmissions (KV-R), in which keys can be dynamically transferred among nodes to balance load. The safety property ensures that each key is globally associated with one value, even in the presence of key transfers. Messages might be dropped by the network, and the protocol uses retransmissions and sequence numbers to maintain availability and safety. Fig. 1 shows code modeling the protocol in a relational first-order language akin to Ivy [44], which compiles to EPR transition systems. The state of nodes and the network is modeled by global relations. Lines 1 to 4 declare uninterpreted sorts for keys, values, clients, and sequence numbers. Lines 6 to 14 describe the state, consisting of: (i) local state of clients pertaining to the table (which nodes are owners of which keys, and the local shard of the table mapping keys to values); (ii) local state of clients pertaining to sent and received messages (seqnum_sent, unacked, seqnum_recvd); and (iii) the state of the network, consisting of two kinds of messages (transfer_msg, ack_msg). Each message kind is modeled as a relation whose first two arguments indicate the source and destination of the message, and the rest carry the message's payload. For example, ack_msg is a relation over two nodes and a sequence number, with the intended meaning that a tuple (c_1, c_2, s) is in ack_msg exactly when there is a message in the network from c_1 to c_2 acknowledging a message with sequence number s.
The initial states are specified in lines 17 to 18. Transitions are specified by the actions declared in lines 20 to 66. Actions can fire nondeterministically at any time when their preconditions (require statements) hold. Hence, the transition relation consists of the disjunction of the transition relations induced by the actions. The state is mutated by modifying the relations. For example, message sends are modeled by inserting a tuple into the corresponding relation (e.g. line 27), while message receives are modeled by requiring a tuple to be in the relation (e.g. line 32), and then removing it (e.g. line 33). The updates in lines 61 and 65 remove a set of tuples matching the pattern.
KV-R protocol. Transferring keys between nodes begins by sending a transfer_msg from the owner to a new node (line 20), which stores the key-value pair when it receives the message (line 39). Upon sending a transfer message the original node cedes ownership (line 26) and does not send new transfer messages. Transfer messages may be dropped (line 30). To ensure that the key-value pair is not lost, retransmissions are performed (line 35) with the same sequence number until the target node acknowledges (which occurs in line 47). Acknowledge messages themselves may be dropped (line 53). Sequence numbers protect from delayed transfer messages, which might contain old values (line 42).
KV-R safety property. Lines 68 to 71 specify the key safety property: at most one value is associated with any key, anywhere in the network. Intuitively, the protocol satisfies this because each key k is either currently (1) owned by a node, in which case this node is unique, or (2) it is in the process of transferring between nodes, in which case the careful use of sequence numbers ensures that the destination of the key is unique. As is typical, it is not straightforward to translate this intuition into a full correctness proof. In particular, it is necessary to relate all the different components of the state, including clients' local state and pending messages.
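The relational modeling style described above can be imitated with plain sets of tuples. The sketch below is a loose reconstruction from the prose, not the code of Fig. 1: the action bodies, the names drop_ack_msg and recv_ack_msg, the single global counter for fresh sequence numbers, and the merging of ownership into the table relation are all simplifying assumptions. A random walk then tests (but does not prove) a slightly stronger form of the safety property, namely that no key has two table entries.

```python
import random

def initial_state(nodes, key, value):
    """All relations are sets of tuples; the first node initially holds the key."""
    return {
        "table": {(nodes[0], key, value)},   # (node, key, value)
        "transfer_msg": set(),               # (src, dst, key, value, seq)
        "unacked": set(),                    # (src, dst, key, value, seq)
        "ack_msg": set(),                    # (src, dst, seq)
        "seqnum_recvd": set(),               # (receiver, sender, seq)
        "next_seq": 0,                       # assumption: one global fresh counter
    }

def enabled_actions(st, nodes):
    """Enumerate (name, argument) pairs whose precondition currently holds."""
    acts = []
    for (n, k, v) in st["table"]:
        acts += [("reshard", (n, dst, k, v)) for dst in nodes if dst != n]
    for m in st["transfer_msg"]:
        acts += [("drop_transfer_msg", m), ("recv_transfer_msg", m)]
    for m in st["unacked"]:
        acts.append(("retransmit", m))
    for r in st["seqnum_recvd"]:
        acts.append(("send_ack", r))
    for a in st["ack_msg"]:
        acts += [("drop_ack_msg", a), ("recv_ack_msg", a)]
    return sorted(acts)                      # sorted for reproducible runs

def step(st, name, arg):
    if name == "reshard":
        src, dst, k, v = arg
        st["table"].discard((src, k, v))     # the old node cedes the key
        m = (src, dst, k, v, st["next_seq"])
        st["next_seq"] += 1
        st["transfer_msg"].add(m)            # send = insert a tuple
        st["unacked"].add(m)
    elif name == "drop_transfer_msg":
        st["transfer_msg"].discard(arg)
    elif name == "retransmit":
        st["transfer_msg"].add(arg)          # same sequence number as before
    elif name == "recv_transfer_msg":
        src, dst, k, v, seq = arg
        st["transfer_msg"].discard(arg)      # receive = require, then remove
        if (dst, src, seq) not in st["seqnum_recvd"]:   # ignore stale duplicates
            st["seqnum_recvd"].add((dst, src, seq))
            st["table"].add((dst, k, v))
    elif name == "send_ack":
        st["ack_msg"].add(arg)               # (receiver, sender, seq)
    elif name == "drop_ack_msg":
        st["ack_msg"].discard(arg)
    elif name == "recv_ack_msg":
        src, dst, seq = arg                  # acknowledgement sent from src to dst
        st["ack_msg"].discard(arg)
        st["unacked"] = {m for m in st["unacked"]
                         if not (m[0] == dst and m[1] == src and m[4] == seq)}

def safe(st):
    """No key appears in two table entries."""
    keys = [k for (_, k, _) in st["table"]]
    return len(keys) == len(set(keys))

def random_walk(steps, seed):
    rng = random.Random(seed)
    nodes = ["n0", "n1", "n2"]
    st = initial_state(nodes, "k", "v")
    for _ in range(steps):
        name, arg = rng.choice(enabled_actions(st, nodes))
        step(st, name, arg)
        assert safe(st), (name, arg)
    return st
```

Removing the seqnum_recvd test in recv_transfer_msg lets a retransmitted message re-install a key that has already moved on, which is the delayed-message scenario that sequence numbers guard against.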
Invariant inference strives to automatically find an inductive invariant establishing safety. This example is challenging for existing inference techniques (§6). This paper proposes user-guided invariant inference based on phase invariants to overcome this challenge. The rest of the paper describes our approach, in which inference is provided with the phase structure in Fig. 2, matching the high-level intuitive explanation above. The algorithm then automatically infers facts about each phase to obtain an inductive invariant. §4 describes phase structures and inductive phase invariants, and §5 explains how these are used in user-guided invariant inference.

Phase Structures and Invariants
Phase invariants describe the protocol as transitioning between different logical stages. In this section we introduce phase structures and inductive phase invariants and explain their role in verifying safety properties. In §5 we explain how we use these in guiding automatic invariant inference. For brevity, proofs are deferred to Appendix E.

Phase Invariants
Definition 1 (Quantified Phase Automaton). A quantified phase automaton (phase automaton for short) over Σ is a tuple A = (Q, ι, V, δ, ϕ) where:
- Q is a finite set of phases.
- ι ∈ Q is the initial phase.
- V is a set of variables, called the automaton's quantifiers.
- δ : Q × Q → F_Σ̂(V) is a function labeling every pair of phases by a transition relation formula, such that FV(δ_(q,p)) ⊆ V for every (q, p) ∈ Q × Q.
- ϕ : Q → F_Σ(V) is a function labeling every phase by a phase characterization formula, such that FV(ϕ_q) ⊆ V for every phase q ∈ Q.
Intuitively, V should be understood as free variables that are implicitly universally quantified outside of the automaton's scope. For each assignment to these variables, the automaton represents the progress along the phases from the point of view of this assignment, and thus V is also called the view (or view quantifiers). We refer to (Q, ι, V, δ), where ϕ is omitted, as the phase structure (or the automaton structure) of A.
We refer to R = {(q, p) ∈ Q × Q | δ_(q,p) ≢ false} as the edges of A.
A trace of A is a sequence of phases q_0, ..., q_n such that q_0 = ι and (q_i, q_{i+1}) ∈ R for every 0 ≤ i < n.
Example 1 (Quantified Phase Automaton and Structure). Fig. 2 shows a phase automaton for the running example, with the view of a single key k. It describes the protocol as transitioning between two distinct (logical) phases of k: owned (O[k]) and transferring (T[k]). The edges are labeled by actions of the system. A wildcard * means that the action is executed with an arbitrary argument. The two central actions are (i) reshard, which transitions from O[k] to T[k], but cannot execute in T[k], and (ii) recv_transfer_message, which does the opposite. The rest of the actions do not cause a phase change and appear on a self-loop in each phase. Actions related to keys other than k are considered as self-loops, and omitted here for brevity. Intuitively, the automaton transitions between phases when the protocol executes an action that matches the automaton's edge between the phases. Some actions are disallowed in certain phases, namely, they do not label any outgoing edge from the phase. Characterizations for each phase are depicted in Fig. 2 (bottom). Without them, Fig. 2 represents a phase structure, which serves as the input to our inference algorithm.
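A phase structure is a small data object, which the following sketch makes concrete. It is an illustration only: edges carry sets of action names instead of transition relation formulas, the class and method names are hypothetical, the self-loop action names are guesses based on the prose of §3, and the edge labels reflect our reading of Example 1 rather than a transcription of Fig. 2.

```python
from dataclasses import dataclass

@dataclass
class PhaseStructure:
    phases: set
    initial: str
    view: tuple      # names of the view quantifiers, e.g. ("k",)
    edges: dict      # (q, p) -> set of action labels allowed on that edge

    def successors(self, q, action):
        """Phases reachable from q by an edge labeled with `action`."""
        return {p for (src, p), acts in self.edges.items()
                if src == q and action in acts}

    def run(self, actions):
        """Follow a sequence of action labels from the initial phase.
        An empty result means some action was disabled in every current phase."""
        current = {self.initial}
        for a in actions:
            current = {p for q in current for p in self.successors(q, a)}
        return current

# Hypothetical labels for the actions that do not change the phase of k.
SELF_LOOPS = {"drop_transfer_msg", "retransmit", "send_ack",
              "drop_ack_msg", "recv_ack_msg"}

kvr = PhaseStructure(
    phases={"O", "T"},
    initial="O",
    view=("k",),
    edges={
        ("O", "T"): {"reshard"},
        ("T", "O"): {"recv_transfer_msg"},
        ("O", "O"): set(SELF_LOOPS),
        ("T", "T"): set(SELF_LOOPS),
    },
)

print(kvr.run(["reshard", "retransmit", "recv_transfer_msg"]))  # {'O'}
print(kvr.run(["reshard", "reshard"]))                          # set()
```

The second call returns the empty set because reshard labels no edge leaving T: the action is disabled in the transferring phase.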
Remark 1. We remark that the choice of automaton aims to reflect the safety property of interest. In our example, one might instead imagine taking the view of a single node as it interacts with multiple keys, which might seem intuitive from the standpoint of implementing the system. However, it is not appropriate for the proof of value uniqueness, since keys pass in and out of the view of a single client over their lifetime. The phase automaton should not aim to capture the phase structure of the implementation but that arising from the correctness intuition.
We now formally define phase invariants as phase automata that overapproximate the behaviors of the original system.
Definition 2 (Language of Phase Automaton). Let A be a quantified phase automaton over Σ, and σ = σ_0, ..., σ_n a finite sequence of states (first-order structures) in STRUCT[Σ], all with domain D. Let v : V → D be a valuation of the automaton quantifiers. We say that:
Trace inclusion ensures that a phase invariant A can be soundly used to verify safety properties of TS.

Example 2 (Phase Invariant). The phase automaton of Fig. 2 is a phase invariant for the protocol: intuitively, whenever an execution of the protocol reaches a phase, its characterizations hold. This fact may not be straightforward to establish. To this end we develop the notion of inductive phase invariants.

Establishing Phase Invariants with Inductive Phase Invariants
To establish phase invariants, we use inductiveness:
Definition 4 (Inductive Phase Invariant). A is inductive w.r.t. TS = (Init, TR) if the following conditions hold:
Edge Covering: for every q ∈ Q, ∀V. ϕ_q ∧ TR ⟹ ⋁_{(q,p)∈R} δ_(q,p).
Example 3 (Inductive Phase Invariant). The phase automaton in Fig. 2 is an inductive phase invariant: the characterizations are such that in every possible phase, when the protocol executes a valid action starting from a state satisfying the current phase characterizations, there is an outgoing edge that matches this action (covering), and the resulting program state satisfies the characterization of the target phase (inductiveness).
Remark 2. The careful reader may notice that the inductiveness requirement is stronger than needed to ensure that the characterizations form a phase invariant. It could be weakened to require for every q ∈ Q: ∀V. ϕ_q ∧ TR ⟹ ⋁_{(q,p)∈R} (δ_(q,p) ∧ ϕ'_p). However, as we explain in §5, our notion of inductiveness is crucial for inferring inductive phase automata, which is the goal of this paper. Furthermore, for deterministic phase automata, the two requirements coincide.
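To make these checks concrete, the sketch below evaluates them by enumeration on a hypothetical two-node token system with no view quantifiers. Edge covering follows the formula of Definition 4. The forms used for the other two checks are assumptions read off the prose of Example 3 and the requirements of §2: initiation is taken to mean that every initial state satisfies the characterization of the initial phase, and inductiveness that for every edge (q, p), a step from a state satisfying ϕ_q along δ_(q,p) lands in a state satisfying ϕ_p.

```python
from itertools import product

# Hypothetical toy system: a key is either owned by node n or in flight to node n.
STATES = [(kind, n) for kind in ("own", "msg") for n in (0, 1)]
INIT = [("own", 0)]

def send(s, t):      # the owner n sends the key to the other node
    return s[0] == "own" and t == ("msg", 1 - s[1])

def recv(s, t):      # the destination n receives the key
    return s[0] == "msg" and t == ("own", s[1])

def tr(s, t):        # TR is the disjunction of the actions
    return send(s, t) or recv(s, t)

# Phase structure: each edge (q, p) is labeled by its transition relation.
DELTA = {("O", "T"): send, ("T", "O"): recv}
INITIAL_PHASE = "O"
PAIRS = list(product(STATES, STATES))

def check(phi):
    """phi maps each phase to a predicate over states (its characterization).
    Returns the triple (initiation, inductiveness, edge covering)."""
    initiation = all(phi[INITIAL_PHASE](s) for s in INIT)
    inductiveness = all(
        phi[p](t)
        for (q, p), delta in DELTA.items()
        for s, t in PAIRS
        if phi[q](s) and delta(s, t))
    covering = all(
        any(delta(s, t) for (src, _), delta in DELTA.items() if src == q)
        for q in phi
        for s, t in PAIRS
        if phi[q](s) and tr(s, t))
    return initiation, inductiveness, covering

good = {"O": lambda s: s[0] == "own", "T": lambda s: s[0] == "msg"}
weak = {"O": lambda s: True, "T": lambda s: True}

print(check(good))   # (True, True, True)
print(check(weak))   # (True, True, False)
```

The trivial characterizations in `weak` fail only edge covering: a state with a message in flight admits a recv step, which no edge leaving O permits, so the characterization of O must be strong enough to exclude such states.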
Inductive invariants vs. inductive phase invariants. Inductive invariants and inductive phase invariants are closely related: an inductive phase invariant induces a "standard" inductive invariant, and vice versa: if Inv is an inductive invariant for TS, then the phase automaton A_Inv = ({q}, {q}, ∅, δ, ϕ), where δ_(q,q) = TR and ϕ_q = Inv, is an inductive phase automaton w.r.t. TS.
In this sense, inductive phase invariants are as expressive as inductive invariants. However, as we show in this paper, their structure can be used by a user as an intuitive way to guide an automatic invariant inference algorithm.
Remark 3. It is straightforward to add more flexibility to a phase automaton by allowing a set of initial phases, Q_0. In this case, for the automaton to over-approximate all the reachable states of TS, it suffices that every initial state of TS corresponds to some initial phase, possibly depending on the valuation of V. Therefore, the initiation constraint in the definition of an inductive phase invariant for TS may be relaxed into:

Safe Inductive Phase Invariants
Next we show that an inductive phase invariant can be used to establish safety.
Definition 5 (Safe Phase Automaton). Let A be a phase automaton over Σ with quantifiers V, and let ∀V. P be a safety property. Then A is safe w.r.t. ∀V. P if ∀V. (ϕ_q ⟹ P) holds for every q ∈ Q.
Lemma 3. If A is inductive w.r.t. TS and safe w.r.t. ∀V. P then ∀V. P is an invariant of TS.

Inference of Inductive Phase Invariants
In this section we turn to the inference of safe inductive phase invariants over a given phase structure, which guides the search for an invariant. This section defines the problem, shows that it can be reduced to a set of Constrained Horn Clauses, discusses the aspects by which a phase structure guides inference, and considers witnesses for the case that no solution exists. Formally, the problem we target is:
Definition 6 (Inductive Phase Invariant Inference). Given a transition system TS = (Init, TR), a phase structure S = (Q, ι, V, δ) and a safety property ∀V. P, all over Σ, find a safe inductive phase invariant A for TS over the phase structure S, if one exists.
Example 4. Inference of an inductive phase invariant is provided with the phase structure in Fig. 2, which embodies an intuitive understanding of the different phases the protocol undergoes (see Example 1). The algorithm automatically finds phase characterizations forming a safe inductive phase invariant over the user-provided structure. We note that inference is valuable even after a phase structure is provided: in the running example, finding an inductive phase invariant is not easy; in particular, the characterizations in Fig. 2 relate different parts of the state and involve multiple quantifiers.

Reduction to Constrained Horn Clauses
We view each unknown phase characterization, ϕ q , which we aim to infer for every q ∈ Q, as a predicate I q . The definition of a safe inductive phase invariant induces a set of second-order Constrained Horn Clauses (CHC) over I q : Initiation.
All the constraints are linear, namely at most one unknown predicate appears on the left-hand side of each implication.
Constraint (4) captures the original safety requirement, whereas (3) can be understood as additional safety properties that are specified by the phase automaton (since no unknown predicates appear on the right-hand side of the implications).
A solution I to the CHC system associates each predicate I_q with a formula ψ_q over Σ (with FV(ψ_q) ⊆ V) such that when ψ_q is substituted for I_q, all the constraints are satisfied (i.e., the corresponding first-order formulas are valid). A solution to the system induces a safe inductive phase automaton through characterizing each phase q by the interpretation of I_q, and vice versa. Formally: Then A is a safe inductive phase invariant w.r.t. TS and ∀V. P if and only if I is a solution to the CHC system. Therefore, to infer a safe inductive phase invariant over a given phase structure, we need to solve the corresponding CHC system. In §6.1 we explain our approach for doing so for the class of universally quantified phase characterizations. Note that the weaker definition of inductiveness discussed in Remark 2 would prevent the reduction to CHC as it would result in clauses that are not Horn clauses.
Completeness of inductive phase invariants. There are cases where a given phase structure induces a safe phase invariant A, but not an inductive one, making the CHC system unsatisfiable. However, a strengthening into an inductive phase invariant can always be used to prove that A is an invariant if (i) the language of invariants is unrestricted, and (ii) the phase structure is deterministic, namely, does not cover the same transition in two outgoing edges. Determinism of the automaton does not lose generality in the context of safety verification since every inductive phase automaton can be converted to a deterministic one; non-determinism is in fact not beneficial, as it mandates the same state to be characterized by multiple phases (see also Remark 2). These topics are discussed in detail in Appendix A.
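On a finite instance, a linear CHC system of this shape can be solved by computing its least solution. The sketch below assumes that constraints (1) to (4) are, in order, initiation, inductiveness, edge covering, and safety, which is our reading of the surrounding text; the function name, the explicit-state encoding, and the token example are hypothetical. It propagates states along edges until a fixpoint is reached and then checks the two constraints in which the unknowns appear only on the left-hand side. If the least solution violates one of them, every larger solution does too, so no solution exists.

```python
def solve_chc(phases, initial_phase, init, delta, tr, safe):
    """Least solution of the phase CHC system over an explicit finite state space.
    delta maps each edge (q, p) to a set of (pre, post) pairs; tr is the full
    transition relation as a set of pairs. Returns (solution, None) on success
    or (None, witness) when no solution exists."""
    I = {q: set() for q in phases}
    I[initial_phase] |= set(init)                  # (1) initiation
    changed = True
    while changed:                                 # (2) inductiveness, to a fixpoint
        changed = False
        for (q, p), edge in delta.items():
            new = {t for (s, t) in edge if s in I[q]} - I[p]
            if new:
                I[p] |= new
                changed = True
    for q in phases:
        out = set().union(*(e for (src, _), e in delta.items() if src == q))
        for (s, t) in sorted(tr):                  # (3) edge covering
            if s in I[q] and (s, t) not in out:
                return None, ("edge covering fails", q, s, t)
        for s in sorted(I[q]):                     # (4) safety
            if not safe(s):
                return None, ("safety fails", q, s)
    return I, None

# Hypothetical token system: a key is owned by node 0/1 or in flight to it.
SEND = {("own0", "msg1"), ("own1", "msg0")}
RECV = {("msg0", "own0"), ("msg1", "own1")}
TR = SEND | RECV

full = {("O", "T"): SEND, ("T", "O"): RECV}
print(solve_chc(["O", "T"], "O", {"own0"}, full, TR, lambda s: True))

missing_edge = {("O", "T"): SEND}
print(solve_chc(["O", "T"], "O", {"own0"}, missing_edge, TR, lambda s: True))
# (None, ('edge covering fails', 'T', 'msg1', 'own1'))
```

The sets computed in the fixpoint loop are the states that reach each phase in the sense of Remark 4. For infinite-state systems this least solution is in general neither computable by enumeration nor expressible in the target logic, which is why the procedure of §6.1 searches for overapproximating characterizations instead.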
Remark 4. Each phase is associated with a set of states that can reach it, where a state σ can reach phase q if there is a sequence of program transitions that results in σ and can lead to q according to the automaton's transitions. This makes a phase structure different from a simple syntactical disjunctive template for inference, in which such semantic meaning is unavailable.
Remark 5. When the safety property is of the form ∀V. Grd → ψ where Grd, ψ ∈ F_Σ(V), we sometimes seek an inductive phase invariant where Grd guards the entire phase structure of the automaton. This may be represented by splitting the initial phase into two: one whose characterization includes Grd, and another "dummy" initial phase whose characterization is ¬Grd. The dummy initial phase has a single self-loop labeled TR, whereas the other maintains the actual phase structure. Moreover, Grd is added to the characterization of all other phases.

Phase Structures as a Means to Guide Inference
The search space of invariants over a phase structure is in fact larger than that of standard inductive invariants, because each phase can be associated with different characterizations. Sometimes the disjunctive structure of the phases (Lemma 2) uncovers a significantly simpler invariant than exists in the syntactical class of standard inductive invariants explored by the algorithm, but this is not always the case. Nonetheless, the search for an invariant over the structure is guided, through the following aspects: (1) Phase decomposition. Inference of an inductive phase invariant aims to find characterizations that overapproximate the set of states reachable in each phase (Remark 4). The distinction between phases is most beneficial when there is a considerable difference between the sets associated with different phases and their characterizations. Differences between phases have two consequences. First, since each phase corresponds to fewer states than all reachable states, generalization, the key ingredient in inference procedures, is more focused. The second consequence stems from the fact that inductive characterizations of different phases are correlated. It is expected that a certain property is more readily learnable in one phase, while related facts in other phases are more complex. For instance, the characterization in line 75 in Fig. 2 is more straightforward than the one in line 82. Simpler facts in one phase can help characterize an adjacent phase when the algorithm analyzes how that property evolves along the edge. Thus utilizing the phase structure can improve the gradual construction of overapproximations of the sets of states reachable in each phase.
(2) Disabled transitions. A phase automaton explicitly states which transitions of the system are enabled in each phase, while the rest are disabled. Such impossible transitions induce additional safety properties to be established by the inferred phase characterizations. For example, the phase invariant in Fig. 2 forbids a , a fact that can trigger the inference of the characterization in line 75. These additional safety properties direct the search for characterizations that are likely to be important for the proof. (3) Phase-awareness. Finally, while a phase structure can be encoded in several ways (such as ghost code), a key aspect of our approach is that the phase decomposition and disabled transitions are explicitly encoded in the CHC system in §5.1, ensuring that they guide the otherwise heuristic search. In §6.2 we demonstrate the effects of aspects (1)-(3) on guidance.

Implementation and Evaluation
In this section we apply invariant inference guided by phase structures to distributed protocols modeled in EPR, motivated by previous deductive approaches to safety of distributed protocols [50,49,59]. The workflow for our approach is illustrated in Fig. 3.

Phase-PDR∀ for Inferring Universally Quantified Characterizations
We now describe our procedure for solving the CHC system of §5.1. It either (i) returns universally quantified phase characterizations that induce a safe inductive phase invariant, (ii) returns an abstract counterexample trace demonstrating that this is not possible, or (iii) diverges.
EPR. Our procedure handles transition systems expressed using the extended Effectively PRopositional fragment (EPR) of first-order logic [51,50], and infers universally quantified phase characterizations. Satisfiability of (extended) EPR formulas is decidable; the fragment enjoys the finite-model property and is supported by existing SMT solvers such as Z3 [45] and first-order logic provers such as iProver [40].
Phase-PDR∀. Our procedure is based on PDR∀ [39], a variant of PDR [10,21] that infers universally quantified inductive invariants. PDR computes a sequence of frames F_0, ..., F_n such that F_i overapproximates the set of states reachable in at most i steps. In our case, each frame F_i is a mapping from a phase q to characterizations. The details of the algorithm are standard for PDR; we describe the gist of the procedure in Appendix D. We only stress the following: Counterexamples to safety take into account the safety property as well as disabled transitions. Search for predecessors is performed by going backwards on automaton edges, blocking counterexamples from preceding phases to prove an obligation in the current phase. Generalization is performed w.r.t. all incoming edges. As in PDR∀, proof obligations are constructed via diagrams [12]; in our setting these include the interpretation for the view quantifiers (see Appendix D for details).
Edge covering check in EPR. When the transition relation formula is in EPR and the phase characterizations are universally quantified, the checks induced by Equations (1), (2) and (4) translate to checking (un)satisfiability of EPR formulas, which is decidable.
Equation (3) is trickier, as checking implication between two EPR transition relations falls outside of EPR. To use a decidable edge covering check, we exploit the typical structure of transition relations in our setting, which is a disjunction of the transition relations of the exported actions (the different actions in Fig. 1).
In the phase automaton we label an edge (q, p) by a set of exported actions, each action a with a guard g^a_(q,p) which is an alternation-free formula (a Boolean combination of universal and existential closed formulas). The edge covering check (Equation (3)) can now be written as an implication whose right-hand side is alternation-free, and thus the check falls into the decidable EPR class. Such edge labeling is sufficiently expressive for the phase structures of all our examples. Alternatively, sound but incomplete bounded quantifier instantiation [23] could be used, potentially allowing more complex decompositions of TR.
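The diagrams used to construct proof obligations can be illustrated on a small finite structure. The sketch below is an assumption-laden rendering, not the construction of Appendix D: it ignores sorts, constants, and functions, prints the formula as a string in an ad hoc syntax, and treats the view quantifiers as free variables naming the elements they are interpreted as (assuming distinct view variables name distinct elements). The diagram existentially quantifies one variable per remaining element and conjoins every distinctness fact and every relational literal that holds in the structure.

```python
from itertools import combinations, product

def diagram(domain, relations, view=None):
    """Diagram of a finite relational structure, as a string.
    domain: list of element names; relations: name -> (arity, set of tuples);
    view: optional mapping from view variables to domain elements."""
    view = view or {}
    name = {elem: var for var, elem in view.items()}   # elements named by the view
    bound = []
    for elem in domain:
        if elem not in name:                           # fresh variable per element
            name[elem] = f"x{len(bound)}"
            bound.append(name[elem])
    lits = [f"{name[a]} != {name[b]}" for a, b in combinations(domain, 2)]
    for rel, (arity, tuples) in sorted(relations.items()):
        for args in product(domain, repeat=arity):
            atom = f"{rel}({', '.join(name[a] for a in args)})"
            lits.append(atom if args in tuples else f"~{atom}")
    body = " & ".join(lits)
    return f"exists {', '.join(bound)}. {body}" if bound else body

state = {"owner": (1, {("a",)})}
print(diagram(["a", "b"], state))
# exists x0, x1. x0 != x1 & owner(x0) & ~owner(x1)
print(diagram(["a", "b"], state, view={"k": "a"}))
# exists x0. k != x0 & owner(k) & ~owner(x0)
```

Negating a diagram yields a universally quantified clause, which is why blocking diagrams produces universally quantified characterizations.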
Absence of Inductive Phase Characterizations. What happens when the user gets the automaton wrong? One case is when there does not exist an inductive phase invariant with universal phase characterizations over the given structure. When this occurs, our tool can return an abstract counterexample trace-a sequence of program transitions and transitions of the automaton (inspired by [39,48])-which constitutes a proof of that fact (see Appendix B). The counterexample trace can assist the user in debugging the automaton or the program and modifying them. For instance, missing edges occurred frequently when we wrote the automata of §6, and we used the generated counterexample traces to correct them. Another type of failure is when an inductive phase invariant exists but the automaton does not direct the search well towards it. In this case the user may decide to terminate the analysis and articulate a different intuition via a different phase structure. In standard inference procedures, the only way to affect the search is by modifying the transition system; instead, phase structures equip the user with an ability to guide the search.

Evaluation
We evaluate our approach for user-guided invariant inference based on phase structures by comparing Phase-PDR∀ to standard PDR∀, its inductive invariant inference counterpart. We implemented PDR∀ and Phase-PDR∀ in MYPYVY, a system for invariant inference inspired by Ivy [44], over Z3 [45]. We study the following effects of guidance by phase structures:
1. Can Phase-PDR∀ converge to a proof when PDR∀ does not (in reasonable time)?
2. Is Phase-PDR∀ faster than PDR∀?
3. Which aspects of Phase-PDR∀ contribute to its performance benefits?
Table 1: Running times in seconds of PDR∀ and Phase-PDR∀, presented as the mean and standard deviation (in parentheses) over 16 different Z3 random seeds. "*" indicates that some runs did not converge after 1 hour and were not included in the summary statistics. "> 1 hour" means that no runs of the algorithm converged in 1 hour. #p refers to the number of phases and #v to the number of view quantifiers in the phase structure. #r refers to the number of relations and |a| to the maximal arity. The remaining columns describe the inductive invariant/phase invariant obtained in inference. |f| is the maximal frame reached. #c, #q, #l are the mean number of clauses, quantifiers (excluding view quantifiers) and literals per phase, ranging across the different phases.
Table 1 (excerpt). PDR∀: > 1 hour; Phase-PDR∀: 90.1 (0.82); #p: 10; #v: 1; #r: 11; |a|: 2; inductive invariant |f|, #c, #q, #l: n/a, n/a, n/a, n/a; phase invariant |f|: 13, #c: 10-15, #q: 12-27, #l: 14-39.
Protocols. We applied PDR ∀ and Phase-PDR ∀ to the most challenging examples admitting universally-quantified invariants, which previous works verified using deductive techniques. The protocols we analyzed are listed below and in Table 1. The full models appear in [1]. The KV-R protocol analyzed is taken from one of the two realistic systems studied by the IronFleet paper [32] using deductive verification.
Phase structures. The phase structures we used appear in [1]. In all our examples, it was straightforward to translate the existing high-level intuition of important and relevant distinctions between phases in the protocol into the phase structures we report. For example, it took us less than an hour to finalize an automaton for KV-R. We emphasize that phase structures do not include phase characterizations; the user need not supply them, nor understand the inference procedure. Our exposition of the phase structures below refers to an intuitive meaning of each phase, but this is not part of the phase structure provided to the tool.
(1) Achieving Convergence Through Phases
In this section we consider the effect of phases on inference for examples on which standard PDR ∀ does not converge in 1 hour.
Examples. Sharded key-value store with retransmissions (KV-R): see §3 and Example 1. This protocol has not been modeled in decidable logic before.
Cache coherence. This example implements the classic MESI protocol for maintaining cache coherence in a shared-memory multiprocessor [35], modeled in decidable logic for the first time. Cores perform reads and writes to memory, and caches snoop on each other's requests using a shared bus and maintain the invariant that there is at most one writer of a particular cache line. For simplicity, we consider only a single cache line, and yet the example is still challenging for PDR ∀ . Standard explanations of this protocol in the literature already use automata to describe this invariant, and we directly exploit this structure in our phase automaton. Phase structure: There are 10 phases in total, grouped into three parts corresponding to the modified, exclusive, and shared states in the classical description. Within each group, there are additional phases for when a request is being processed by the bus. For example, in the shared group, there are phases for handling reads by cores without a copy of the cache line, writes by such cores, and also writes by cores that do have a copy. Overall, the phase structure is directly derived from textbook descriptions, taking into account that use of the shared bus is not atomic.
Results and discussion. Measurements for these examples appear in Table 1. Standard PDR ∀ fails to converge in less than an hour on 13 out of 16 seeds for KV-R and all 16 seeds for the cache coherence protocol. In contrast, Phase-PDR ∀ converges to a proof in a few minutes in all cases. These results demonstrate that phase structures can effectively guide the search and obtain an invariant quickly where standard inductive invariant inference does not.
(2) Enhancing Performance Through Phases
In this section we consider the use of phase structures to improve the speed of convergence to a proof.
Examples. Distributed lock service, adapted from [60], allows clients to acquire and release locks by sending requests to a central server, which guarantees that only one client holds each lock at a time. Phase structure: for each lock, the phases follow the 4 steps by which a client completes a cycle of acquire and release. We also consider a simpler variant with only a single lock, reducing the arity of all relations and removing the need for an automaton view. Its phase structure is the same, only for a single lock.
Simple quorum-based consensus, based on the example in [59]. In this protocol, nodes propose themselves and then receive votes from other nodes. When a quorum of votes for a node is obtained, it becomes the leader and decides on a value. Safety requires that decided values are unique. The phase structure distinguishes between the phases before any node is elected leader, once a node is elected, and when values are decided. Note that the automaton structure is unquantified.
Leader election in a ring [13,50], in which nodes are organized in a directional ring topology with unique IDs, and the safety property is that an elected leader is a node with the highest ID. Phase structure: for a view of two nodes n 1 , n 2 , in the first phase, messages with the ID of n 1 are yet to advance in the ring past n 2 , while in the second phase, a message advertising n 1 has advanced past n 2 . The inferred characterizations include another quantifier on nodes, constraining interference (see §7).
Sharded key-value store (KV) is a simplified version of KV-R above, without message drops and the retransmission mechanism. The phase structure is exactly as in KV-R, omitting transitions related to sequence numbers and acknowledgment. This protocol has not been modeled in decidable logic before.
Results and discussion. We compare the performance of standard PDR ∀ and Phase-PDR ∀ on the above examples, with results shown in Table 1. For each example, we ran the two algorithms on 16 different Z3 random seeds. Measurements were performed on a 3.4GHz AMD Ryzen Threadripper 1950X with 16 physical cores, running Linux 4.15.0, using Z3 version 4.7.1. By disabling hyperthreading and frequency scaling and pinning tasks to dedicated cores, variability across runs of a single seed was negligible.
In all but one example, Phase-PDR ∀ improves performance, sometimes drastically; for example, performance for leader election in a ring is improved by a factor of 60. Phase-PDR ∀ also improves the robustness of inference [26] on this example, as the standard deviation falls from 39 in PDR ∀ to 0.04 in Phase-PDR ∀ .
The only example in which a phase structure actually diminishes inference effectiveness is simple consensus. We attribute this to an automaton structure that does not capture the essence of the correctness argument very well, overlooking votes and quorums. This demonstrates that a phase structure might guide the search towards counterproductive directions if the user guidance is "misleading". It also suggests that better resilience of an interactive inference framework could be achieved by combining phase-based inference with standard inductive invariant-based reasoning. We are not aware of a single "good" automaton for this example. The correctness argument of this example is better captured by the conjunction of two automata (one for votes and one for accumulating a quorum) with different views, but the problem of inferring phase invariants for mutually-dependent automata is a subject for future work.
(3) Anatomy of the Benefit of Phases
We now demonstrate that each of the beneficial aspects of phases discussed in §5.2 is important for the benefits reported above.
Phase decomposition. Is there a benefit from a phase structure even without disabled transitions? Leader election in a ring answers this question positively: it demonstrates a huge performance benefit even without disabled transitions.
Disabled transitions. Is there a substantial gain from exploiting disabled transitions? We compare Phase-PDR ∀ on the structure with disabled transitions and on a structure obtained by (artificially) adding self loops labeled with the originally impossible transitions. We use the example of lock service with multiple locks (§6.2), since it demonstrates a performance benefit using Phase-PDR ∀ and has several disabled transitions in each phase. Without disabled transitions, the mean running time of Phase-PDR ∀ on this example jumps from 2.73 seconds to 6.24 seconds. This demonstrates the utility of the additional safety properties encompassed in disabled transitions.
Phase-awareness. Is it important to treat phases explicitly in the inference algorithm, as we do in Phase-PDR ∀ (§6.1)? We compare our result on convergence of KV-R with an alternative in which standard PDR ∀ is applied to an encoding of the phase decomposition and disabled transitions by ghost state: each phase is modeled by a relation over possible view assignments, and the model is augmented with update code mimicking phase changes; the additional safety properties derived from disabled transitions are provided; and the view and the appropriate modification of the safety property are introduced. This translation expresses all information present in the phase structure, but does not explicitly guide the inference algorithm to use this information.
The result is that with this ghost-based modeling the phase-oblivious PDR ∀ does not converge in 1 hour on KV-R in any of the 16 runs, whereas it converges when Phase-PDR ∀ explicitly directs the search using the phase structure.
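The notion of disabled transitions, and the self-loop construction used in the comparison above, can be illustrated on an explicit edge-labeled automaton. The following is only a minimal sketch: the phase names, action names, and edges are invented for illustration and are not the automaton used in the evaluation, and the helper functions are hypothetical rather than part of MYPYVY.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (source phase, target phase)


def disabled_transitions(phases: List[str], actions: Set[str],
                         edges: Dict[Edge, Set[str]]) -> Dict[str, Set[str]]:
    """For each phase, the actions that label no outgoing edge of that phase."""
    enabled: Dict[str, Set[str]] = {q: set() for q in phases}
    for (src, _dst), labels in edges.items():
        enabled[src] |= labels
    return {q: set(actions) - enabled[q] for q in phases}


def add_self_loops(phases: List[str], actions: Set[str],
                   edges: Dict[Edge, Set[str]]) -> Dict[Edge, Set[str]]:
    """Copy of `edges` in which every disabled action is re-enabled by a self-loop."""
    disabled = disabled_transitions(phases, actions, edges)
    result = {edge: set(labels) for edge, labels in edges.items()}
    for q in phases:
        if disabled[q]:
            result.setdefault((q, q), set()).update(disabled[q])
    return result


# Hypothetical four-phase cycle (names are assumptions, not taken from the paper).
phases = ["free", "requested", "held", "releasing"]
actions = {"send_lock", "recv_lock", "send_unlock", "recv_unlock"}
edges = {
    ("free", "requested"): {"send_lock"},
    ("requested", "held"): {"recv_lock"},
    ("held", "releasing"): {"send_unlock"},
    ("releasing", "free"): {"recv_unlock"},
}

disabled = disabled_transitions(phases, actions, edges)
looped = add_self_loops(phases, actions, edges)
```

In this toy structure each phase disables three of the four actions; after adding self-loops no action is disabled anywhere, so the automaton no longer contributes the extra safety properties.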

Related Work
Phases in distributed protocols. Distributed protocols are frequently described in informal descriptions as transitioning between different phases. Recently, PSync [19] used the Heard-Of model [14], which describes protocols as operating in rounds, as a basis for the implementation and verification of fault-tolerant distributed protocols. Typestates [e.g. 58,24] also bear some similarity to the temporal aspect of phases. State machine refinement [3,27] is used extensively in the design and verification of distributed systems (see e.g. [46,32]). The automaton structure of a phase invariant is also a form of state machine; our focus is on inference of characterizations establishing this.
Interaction in verification. Interactive proof assistants such as Coq [8] and Isabelle/HOL [47] interact with users to aid them as they attempt to prove candidate inductive invariants. This differs from interaction through phase structures and counterexample traces. Ivy uses interaction for invariant inference by interactive generalization from counterexamples [50]. This approach is less automatic as it requires interaction for every clause of the inductive invariant. In terminology from synthesis [29], the use of counterexamples is synthesizer-driven interaction with the tool, while interaction via phase structures is mainly user-driven. Abstract counterexample traces returned by the tool augment this kind of interaction. As [37] has shown, interactive invariant inference, when considered as a synthesis problem (see also [26,54]), is related to inductive learning.
Template-based invariant inference. Many works employ syntactical templates for invariants, used to constrain the search [e.g. 16,53,56,57,7]. The different phases in a phase structure induce a disjunctive form, but crucially each disjunct also has a distinct semantic meaning, which inference overapproximates, as explained in §5.2.
Automata in safety verification.
Safety verification through an automaton-like refinement of the program's control has been studied in a number of works. We focus on related techniques for proof automation. The Automizer approach to the verification of sequential programs [33,34] is founded on the notion of a Floyd-Hoare automaton, which is an unquantified inductive phase automaton; an extension to parallel programs [22] uses thread identifiers closed under the symmetry rule, which are related to view quantifiers. Their focus is on the automatic, incremental construction of such automata as a union of simpler automata, where each automaton is obtained from generalizing the proof/infeasibility of a single trace. In our approach the structure of the automaton is provided by the user as a means of conveying their intuition of the proof, while the annotations are computed automatically. A notable difference is that in Automizer, the generation of characterizations in an automaton constructed from a single trace does not utilize the phase structure (beyond that of the trace), whereas in our approach the phase structure is central in generalization from states to characterizations.
In trace partitioning [52,43], abstract domains based on transition systems partitioning the program's control are introduced. The observation is that recording historical information forms a basis for case-splitting, as an alternative to fully-disjunctive abstractions. This differs from our motivation of distinguishing between different protocol phases. The phase structure of the domain is determined by the analyser, and can also be dynamic. In our work the phase structure is provided by the user as guidance. We use a variant of PDR ∀ , rather than abstract interpretation [17], to compute universally quantified phase characterizations.
Techniques such as predicate abstraction [28,25] and existential abstraction [15], as well as the safety part of predicate diagrams [11], use finite languages for the set of possible characterizations and lack the notion of views, both essential for handling unbounded numbers of processes and resources.
Finally, phase splitter predicates [55] share our motivation of simplifying invariant inference by exposing the different phases the loop undergoes. Splitter predicates correspond to inductive phase characterizations [55, Theorem 1], and are automatically constructed according to program conditionals. In our approach, decomposition is performed by the user using potentially non-inductive conditions, and the inductive phase characterizations are computed by invariant inference. Successive loop splitting results in a sequence of phases, whereas our approach utilizes arbitrary automaton structures. Borralleras et al. [9] also refine the control-flow graph throughout the analysis by splitting on conditions, which here are discovered as preconditions for termination (the motivation is to expose termination proof goals to be established): in a sense, the phase structure is grown from candidate characterizations implying termination. This differs from our approach in which the phase structure is used to guide the inference of characterizations.
Quantified invariant inference. We focus here on the works on quantifiers in automatic verification most closely related to our work. In predicate abstraction, quantifiers can be used internally as part of the definitions of predicates, and also externally through predicates with free variables [25,41]. Our work uses quantifiers both internally in phase characterizations and externally in view quantifiers. The view is also related to the bounded number of quantifiers used in view abstraction [6,5]. In this work we observe that it is useful to consider views of entities beyond processes or threads, such as a single key in the store.
Quantifiers are often used to their full extent in verification conditions, namely checking implication between two quantified formulas, but they are sometimes employed in weaker checks as part of thread-modular proofs [4,38]. This amounts to searching for invariants provable using specific instantiations of the quantifiers in the verification conditions [30,36]. In our verification conditions, the view quantifiers are localized, in effect performing a single instantiation. This is essential for exploiting the disjunctive structure under the quantifiers, allowing inference to consider a single automaton edge in each step, and reflecting an intuition of correctness. When necessary to constrain interference, quantifiers in phase characterizations can be used to establish necessary facts about interfering views. Finally, there exist algorithms other than PDR ∀ for solving CHC by predicates with universal invariants [e.g. 31, 20].

Conclusion
Invariant inference techniques aiming to verify intricate distributed protocols must adjust to the diverse correctness arguments on which protocols are based. In this paper we have proposed to use phase structures as a means of conveying users' intuition of the proof, to be used by an automatic inference tool as a basis for a full formal proof. We found that inference guided by a phase structure can infer proofs for distributed protocols that are beyond reach for state-of-the-art inductive invariant inference methods, and can also improve the speed of convergence. The phase decomposition induced by the automaton, the use of disabled transitions, and the explicit treatment of phases in inference all combine to direct the search for the invariant. We are encouraged by our experience of specifying phase structures for different protocols. It would be interesting to integrate the interaction via phase structures with other verification methods and proof logics, as well as interaction schemes based on different, complementary concepts. Another important direction for future work is inference beyond universal invariants, required for example for the proof of Paxos [49].

A Completeness of Inductive Phase Invariants
There are cases where a phase automaton A is a phase invariant for TS, but this cannot be established via an inductive phase invariant since there is no strengthening of its phase characterizations that leads to an inductive phase invariant for TS. This may happen for two reasons.
First, as with standard inductive invariants, it is possible that the strengthening necessary to ensure inductiveness is not expressible in the logic available to us.
Second, even if we assume an unrestricted language of phase characterizations, it is possible that the edge labeling is too permissive, thus adding transitions that are not necessary for the edge covering requirement. Such "redundant" transitions may sometimes be harmless, but they may also violate preservation along some edge. Namely, if no state that has such an outgoing transition can reach the corresponding phase q from previous phases, such violations can be overcome by strengthening ϕ q to exclude all states that have such an outgoing transition (assuming an unrestricted language of phase characterizations), thus disabling the problematic transition along the edge. In these cases, an inductive automaton can be obtained. However, in other cases, strengthening the phase characterizations in this way would exclude states that can reach phase q and as such would damage the inductiveness property along incoming edges of q. In such cases, the only way to disable problematic transitions along automaton edges is by strengthening the transition relation formulas (i.e., updating the automaton structure). Hence, no inductive automaton exists for the given phase structure.
The second reason has no counterpart in standard inductive invariants; it reflects the additional structure expressed by a phase automaton, which is enforced by our stronger definition of inductiveness (as opposed to the weaker definition mentioned in Remark 2). Fortunately, this reason can be avoided by considering deterministic phase automata: nondeterminism is generally of no benefit, as it only forces some states to be characterized by multiple phases in the inductive phase invariant (see also Remark 2). We point out that restricting our attention to deterministic phase automata does not lose generality in the context of safety verification, since every inductive phase invariant (the kind of invariant we seek) can be translated into a deterministic one, as the following lemma shows.
Thus a structure admitting an inductive phase invariant can be converted to a deterministic one with the same property.
Lemma 6. Let $A = (Q, \iota, V, \delta, \varphi)$ be an inductive phase invariant w.r.t. TS. Define an arbitrary total order, $<$, on $Q$, and define $A' = (Q, \iota, V, \delta', \varphi)$ where
$$\delta'_{(q,p)} = \delta_{(q,p)} \land \bigwedge_{p' < p} \neg\, \delta_{(q,p')}.$$
Then $A'$ is a deterministic inductive phase invariant w.r.t. TS.
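To make the determinization idea concrete, the sketch below works on a finite, explicit encoding in which each edge label is a set of transitions (pairs of states) and the view valuation is ignored. It assumes one natural construction: given a total order on phases, each edge keeps only the transitions that are not already allowed by an edge from the same source to an earlier target. The encoding, function names, and example data are illustrative assumptions.

```python
from typing import Dict, List, Set, Tuple

Transition = Tuple[int, int]   # (pre-state, post-state)
Edge = Tuple[str, str]         # (source phase, target phase)
Labeling = Dict[Edge, Set[Transition]]


def determinize(order: List[str], delta: Labeling) -> Labeling:
    """Strengthen each label by removing transitions allowed towards earlier targets."""
    rank = {q: i for i, q in enumerate(order)}
    result: Labeling = {}
    for (q, p), trans in delta.items():
        earlier: Set[Transition] = set()
        for (q2, p2), trans2 in delta.items():
            if q2 == q and rank[p2] < rank[p]:
                earlier |= trans2
        result[(q, p)] = set(trans) - earlier
    return result


def is_deterministic(delta: Labeling) -> bool:
    """No transition is allowed from the same source phase to two different targets."""
    seen: Dict[Tuple[str, Transition], str] = {}
    for (q, p), trans in delta.items():
        for t in trans:
            if seen.setdefault((q, t), p) != p:
                return False
    return True


def covered(delta: Labeling, q: str) -> Set[Transition]:
    """All transitions covered by some outgoing edge of phase q."""
    out: Set[Transition] = set()
    for (src, _dst), trans in delta.items():
        if src == q:
            out |= trans
    return out


# Toy nondeterministic labeling over phases a < b < c (invented for illustration).
order = ["a", "b", "c"]
delta = {
    ("a", "b"): {(0, 1), (0, 2)},
    ("a", "c"): {(0, 2), (1, 2)},
    ("a", "a"): {(0, 1)},
}
det = determinize(order, delta)
```

The labels only shrink, so preservation along edges cannot break, while the set of transitions covered from each phase stays the same; this mirrors the argument in the proof of Lemma 6.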
We note that when the language of phase characterizations is restricted and an inductive phase automaton does not exist for this reason, it may be possible to overcome the limitations of the language and obtain one by changing the automaton structure.

B Abstract Counterexample Traces
Phase structures may not admit a safe inductive phase invariant. In this section we discuss causes for this, and notions of concrete and abstract counterexample traces constituting a proof that inductive phase characterizations cannot be found over the given structure in the given language of candidate characterizations.

B.1 Concrete Counterexample Traces
We first consider the case where no safe inductive phase invariant exists, regardless of the language of phase characterizations. Such a case may be witnessed by a counterexample trace that exhibits a violation of one of the safety properties induced by S.
Definition 7 (Counterexample Trace). A trace $\sigma_1, \ldots, \sigma_n$ of TS with a valuation $v$ for $V$ is a counterexample trace for $S$, TS and $\forall V.\ P$ if there exists a trace $q_1, \ldots, q_n$ of $S$ such that $(\sigma_i, \sigma_{i+1}), v \models \delta_{(q_i, q_{i+1})}$ for every $1 \le i < n$, but one of the following holds:
1. A state in the trace is not safe: $\sigma_i, v \not\models P$ for some $i$.
2. A state in the trace allows a transition that is not covered by any outgoing edge: there exists $\sigma'$ s.t. $(\sigma_i, \sigma') \models TR$ for some $i$, but no edge can cover this transition, i.e. $(\sigma_i, \sigma'), v \not\models \delta_{(q_i, q')}$ for all $q' \in Q$.
Lemma 7. If a counterexample trace exists then there is no safe inductive phase invariant with structure S, even if the language of phase characterizations is unrestricted.
In case 1, the trace violates the original safety property (enforced by Equation (4)). This means that the transition system at hand is not safe and no safe inductive phase invariant can be expected, for any phase structure (by Lemma 3).
In case 2, the trace violates the "additional" safety property specified by the phase structure (Equation (3)). This does not indicate that the transition system is not safe, but rather that the choices of the phase structure are intrinsically incorrect: it includes a trace such that a corresponding trace of the transition system reaches a state that allows a transition that is not covered by any outgoing edge in the automaton. If the phase structure is nondeterministic, this may indicate that some of the edges are redundant and prevent obtaining phase inductive characterizations even if it is a phase invariant. If the structure is deterministic (where every trace of TS corresponds to at most one trace of A), this means that the automaton excludes true traces of the transition system and is therefore not a phase invariant, hence no corresponding inductive phase invariant exists. In both cases, TS may be safe but the user has to modify the phase structure in order to be able to verify it.
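The two kinds of violation can be told apart mechanically on a finite, explicit model. The sketch below assumes a fixed view valuation (omitted), a transition relation and edge labels given as sets of state pairs, and a Python predicate `safe` standing in for P; all names and the toy data are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Transition = Tuple[int, int]
Edge = Tuple[str, str]


def check_counterexample(states: List[int], phases: List[str],
                         TR: Set[Transition],
                         delta: Dict[Edge, Set[Transition]],
                         safe: Callable[[int], bool]) -> Optional[Tuple[str, object]]:
    """Return ("unsafe", state) or ("uncovered", transition) for the first
    violation along the trace, or None if the trace exhibits no violation."""
    # The state trace must follow the phase trace along automaton edges.
    for i in range(len(states) - 1):
        step = (states[i], states[i + 1])
        if step not in delta.get((phases[i], phases[i + 1]), set()):
            raise ValueError("state trace does not match the phase trace")
    for s, q in zip(states, phases):
        if not safe(s):                      # case 1: safety violated
            return ("unsafe", s)
        for (a, b) in TR:                    # case 2: some transition is uncovered
            if a == s and not any((a, b) in lab
                                  for (src, _dst), lab in delta.items() if src == q):
                return ("uncovered", (a, b))
    return None


# Toy system: the automaton forgets the transition 1 -> 0 in phase "q".
TR = {(0, 1), (1, 2), (1, 0)}
delta = {("p", "q"): {(0, 1)}, ("q", "q"): {(1, 2)}}

missing_edge = check_counterexample([0, 1], ["p", "q"], TR, delta, lambda s: True)
unsafe_state = check_counterexample([0, 1], ["p", "q"], TR, delta, lambda s: s != 1)
```

A result of the first kind points at the program, while a result of the second kind points at the phase structure, matching the discussion of cases 1 and 2.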

B.2 Abstract Counterexample Traces
In practice, phase characterizations are restricted since actual inference algorithms restrict their search space to a certain class of formulas. This may be viewed as a form of abstraction, and may be one of the potential causes of absence of an inductive phase invariant. In this case, we may obtain an abstract counterexample trace that may indicate any of the violations discussed in Appendix B.1, but may also reflect the limitations of the language of phase characterizations.
Language of Phase Characterization. We denote by $L$ the class of formulas used to represent phase characterizations. We denote by $L_\Sigma(V)$ the set of formulas in $L$ over vocabulary $\Sigma$ with free variables from $V$. We assume an implication relation over $L$: for every $\psi_1, \psi_2 \in L_\Sigma(V)$, a structure $\sigma$ over $\Sigma$ and a valuation $v$ we have the relation $\sigma, v \models \psi_1 \Longrightarrow \sigma, v \models \psi_2$. We use the implication relation to define a preorder on structures (accompanied by valuations) that captures which structure satisfies more formulas from $L_\Sigma(V)$: $(\sigma_1, v) \sqsubseteq_{L_\Sigma(V)} (\sigma_2, v)$ if $v$ is defined in both $\sigma_1$ and $\sigma_2$ and for all $\psi \in L_\Sigma(V)$, $\sigma_2, v \models \psi \Rightarrow \sigma_1, v \models \psi$.
Intuitively, $(\sigma_1, v) \sqsubseteq_{L_\Sigma(V)} (\sigma_2, v)$ means that $(\sigma_2, v)$ is more abstract than $(\sigma_1, v)$: any formula in $L_\Sigma(V)$ that is satisfied by $(\sigma_2, v)$ is also satisfied by $(\sigma_1, v)$. In particular, no formula that is satisfied by $(\sigma_2, v)$ can distinguish it from $(\sigma_1, v)$. Note that $\sqsubseteq_{L_\Sigma(V)}$ is defined for structures paired with the same valuation, which means that they interpret $V$ in the same way. We often omit $\Sigma$ and $V$ from the notation and write $(\sigma_1, v) \sqsubseteq_L (\sigma_2, v)$.
Example 5 (Universal Characterizations). Consider $L = \forall^*$, i.e., the class of universally quantified formulas. In this case, $(\sigma_1, v) \sqsubseteq_{\forall^*} (\sigma_2, v)$ if $v$ is defined in both $\sigma_1, \sigma_2$ and $\sigma_1$ is a substructure of $\sigma_2$ (up to isomorphism). (The structure $\sigma_1 = (D_1, I_1)$ is a substructure of the structure $\sigma_2 = (D_2, I_2)$ if $D_1 \subseteq D_2$ and $I_2$ agrees with $I_1$ on $D_1$. That is, for every constant symbol $c \in \Sigma$, $I_2(c) = I_1(c)$; for every function symbol $f \in \Sigma$ with arity $k$, $I_2(f)(d_1, \ldots, d_k) = I_1(f)(d_1, \ldots, d_k)$ for every $d_1, \ldots, d_k \in D_1$; and for every relation symbol $r \in \Sigma$ with arity $k$, $I_2(r) \cap D_1^k = I_1(r)$.)
Abstract Traces. We view the preorder $\sqsubseteq_L$ as an abstraction relation, and use it to define a notion of an abstract trace, where each transition consists of an "abstraction step" followed by a concrete transition of the system. An abstraction step transitions to a "less abstract" state (that cannot be distinguished by any formula satisfied by the source of the transition, which is the more abstract state).
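The substructure condition of Example 5 can be checked directly on finite structures. The sketch below is a simplification under stated assumptions: it tests only the identity embedding (not substructure up to isomorphism), handles constants and relations but not function symbols, and uses a hypothetical dictionary encoding of structures.

```python
from typing import Dict, Hashable, Set, Tuple

Structure = Dict[str, object]  # keys: "domain", "consts", "rels" (assumed encoding)


def is_substructure(s1: Structure, s2: Structure, v: Dict[str, Hashable]) -> bool:
    """True if s1 is a substructure of s2 under the identity embedding and the
    valuation v is defined in both (all its values lie in the smaller domain)."""
    d1: Set[Hashable] = s1["domain"]
    d2: Set[Hashable] = s2["domain"]
    if not d1 <= d2:
        return False
    if any(e not in d1 for e in v.values()):
        return False
    if s1["consts"] != s2["consts"]:          # I2(c) = I1(c) for every constant
        return False
    rels1: Dict[str, Set[Tuple]] = s1["rels"]
    rels2: Dict[str, Set[Tuple]] = s2["rels"]
    for r in set(rels1) | set(rels2):
        # I2(r) restricted to tuples over D1 must coincide with I1(r).
        restricted = {t for t in rels2.get(r, set()) if all(e in d1 for e in t)}
        if restricted != rels1.get(r, set()):
            return False
    return True


big = {"domain": {1, 2, 3}, "consts": {"zero": 1},
       "rels": {"le": {(1, 2), (2, 3), (1, 3)}}}
small = {"domain": {1, 3}, "consts": {"zero": 1}, "rels": {"le": {(1, 3)}}}
broken = {"domain": {1, 3}, "consts": {"zero": 1}, "rels": {"le": set()}}
```

Here `small` is obtained from `big` by dropping element 2, so every universally quantified formula true in `big` is also true in `small`, whereas `broken` changes the interpretation of `le` and is not a substructure.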
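Definition 9 (Abstract Trace). Given a transition system TS = (Init, TR) over vocabulary $\Sigma$, an abstract trace is a finite sequence of states $\sigma_1, \ldots, \sigma_n$ over $\Sigma$ with a valuation $v$ over $V$ which is defined in all $\sigma_i$ such that for every $1 \le i < n$, there exists $\tilde{\sigma}_i$ such that $(\tilde{\sigma}_i, v) \sqsubseteq_L (\sigma_i, v)$ and $(\tilde{\sigma}_i, \sigma_{i+1}) \models TR$.
Note that since $\sqsubseteq_L$ is reflexive, $\tilde{\sigma}_i$ may be equal to $\sigma_i$, in which case the abstract trace is concrete.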
An abstract counterexample trace is similar to a counterexample trace, except that it consists of an abstract trace. While the violation exhibited by an abstract counterexample trace may not be real, it indicates that no safe inductive phase invariant exists in L.

Lemma 8. If an abstract counterexample trace exists then there is no safe inductive phase invariant with structure S and phase characterizations in L.
Diagnosis of Abstract Counterexample Traces. When a user is presented with an abstract counterexample trace, diagnosing the cause of the trace assists the user in understanding whether (i) the program is faulty, (ii) the phase structure needs to be modified (and how), or (iii) the language of phase characterizations is not expressive enough to capture the required characterizations. These cases can be differentiated by performing bounded model checking along the given abstract counterexample trace. If a concrete trace is found, it demonstrates whether the system is not safe or the automaton needs to be changed. Otherwise, the counterexample is attributed to the limited logical language used for candidate characterizations. In this case, the user can proceed by extending the logical language, or modify the program and/or automaton so that they admit an inductive phase invariant in the given logical language.

D Overview of Phase-PDR ∀
Our procedure is based on PDR ∀ [39], a variant of PDR [10,21] that infers universally quantified inductive invariants. PDR computes a sequence of frames, F 0 , . . . , F n such that F i overapproximates the set of states reachable in i steps. In our case, each frame F i is a mapping from a phase q to a characterization.
We describe the gist of the procedure using the terminology of phase automata. An inductive trace is a sequence of frames such that $\forall V.\ F_i(q) \Longrightarrow F_{i+1}(q)$ for all $i$ and $q \in Q$, the first frame satisfies $F_0(\iota) = \mathit{Init}$ where $\iota$ is the initial phase and $F_0(q) = \mathit{false}$ for other phases (in accordance with Equation (1)), all frames satisfy Equations (3) and (4), and the constraint of Equation (2) is interpreted between successive frames, namely for all $0 \le i < n$ and for all $(q, p) \in R$,
$$\forall V.\ F_i(q) \land \delta_{(q,p)} \Longrightarrow (F_{i+1}(p))'.$$
These properties ensure that $F_i$ induces phase characterizations such that the language of the induced phase automaton includes all traces of TS of length at most $i$.
The procedure gradually constructs the inductive trace by generating and blocking proof obligations, where a proof obligation $(m, q, i)$ consists of system state(s) $m$ that need to be proved unreachable at phase $q$ of the automaton at frame $i$ (i.e., with traces of length bounded by $i$). When a proof obligation is shown to hold, it is generalized into a new lemma that excludes the corresponding states and is added as a conjunct to the characterization of phase $q$ at frame $i$, where the phase characterizations of each frame are initially set to true.
Proof obligations are generated by a backward traversal. First, whenever a new frame $F_n$ is added, proof obligations are generated from counterexamples to the safety properties of Equations (3) and (4) in some phase $q$ based on $F_n(q)$. Then, to check whether a proof obligation $\psi$ at phase $q'$ can be blocked in frame $F_i$, our procedure checks whether the phase characterizations of its pre-phases in the previous frame suffice to show that $\psi$ holds in $q'$ in accordance with constraint (2), i.e. $\forall V.\ F_{i-1}(q) \land \delta_{(q,q')} \Longrightarrow \psi'$ for all $q$ such that $(q, q') \in R$. Otherwise, there is a pre-phase $q$, a valuation $v$ and a transition $(\sigma, \sigma'), v \models \delta_{(q,q')}$ such that $\sigma, v \models F_{i-1}(q)$ but $\sigma', v \not\models \psi$. This generates a proof obligation $\theta$ for phase $q$ in frame $i - 1$.
Proof obligations are represented by diagrams [12]. For a finite structure $\sigma = (D, I)$ with $D = \{e_1, \ldots, e_{|D|}\}$ and a valuation $v$ of $V$, the diagram $\mathit{Diag}(\sigma, v, V)$ existentially quantifies variables $x_1, \ldots, x_{|D|}$, one per element, over the conjunction of the following formulas:
$\psi_{\mathrm{distinct}}$ is a conjunction of all inequalities $x_i \neq x_j$ for $i \neq j$.
$\psi_{\mathrm{identity}}$ is a conjunction of the equalities $x_i = c$ if $c$ is a constant symbol and $I(c) = e_i$, and of the equalities $x_i = y$ if $y \in V$ and $v(y) = e_i$.
$\psi_{\mathrm{rels}}$ is a conjunction of atomic formulas $r(x_{i_1}, \ldots, x_{i_a})$ for every relation symbol $r$ of arity $a$ and elements $\bar{e} = e_{i_1}, \ldots, e_{i_a}$ such that $\bar{e} \in I(r)$, and $\neg r(x_{i_1}, \ldots, x_{i_a})$ if $\bar{e} \notin I(r)$. Function symbols are treated similarly.
Note that Diag(σ, v, V) has V as free variables. A proof obligation (Diag(σ, v, V), q, i) is blocked by adding to the characterization of phase q at frame i a universally quantified clause (possibly with free variables in V) that implies ¬Diag(σ, v, V).
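As an illustration of the three conjunctions, the sketch below builds a textual diagram for a small finite structure. It assumes the standard form of a diagram, in which one variable per domain element is existentially quantified over the conjunction; it covers constants, view variables, and relations only (no function symbols), and the string syntax and function name are hypothetical.

```python
from itertools import combinations, product
from typing import Dict, Hashable, Set, Tuple


def diagram(domain: Set[Hashable], consts: Dict[str, Hashable],
            rels: Dict[str, Set[Tuple]], arities: Dict[str, int],
            v: Dict[str, Hashable]) -> str:
    """Render Diag(sigma, v, V) as a string; view variables stay free."""
    elems = sorted(domain)
    var = {e: f"x{i}" for i, e in enumerate(elems, 1)}   # one variable per element
    # psi_distinct: all variables denote different elements.
    distinct = [f"{var[a]} != {var[b]}" for a, b in combinations(elems, 2)]
    # psi_identity: tie variables to constants and to view variables.
    identity = [f"{var[e]} = {c}" for c, e in sorted(consts.items())]
    identity += [f"{var[e]} = {y}" for y, e in sorted(v.items())]
    # psi_rels: a positive or negative literal for every tuple of elements.
    rel_lits = []
    for r in sorted(arities):
        for tup in product(elems, repeat=arities[r]):
            atom = f"{r}({', '.join(var[e] for e in tup)})"
            rel_lits.append(atom if tup in rels.get(r, set()) else f"~{atom}")
    body = " & ".join(distinct + identity + rel_lits)
    return f"exists {', '.join(var[e] for e in elems)}. {body}"


# Two-element structure with one binary relation r and one view variable N.
diag = diagram(domain={"a", "b"}, consts={}, rels={"r": {("a", "b")}},
               arities={"r": 2}, v={"N": "a"})
```

Blocking the corresponding proof obligation then amounts to finding a universally quantified clause, with N free, that implies the negation of this formula.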
The procedure continues to generate (and block) proof obligations across automaton edges, going backwards in the frames, until it reaches the initial frame. If a proof obligation does not hold there, the procedure has found an abstract counterexample trace, and returns it as evidence of absence of a safe inductive phase invariant. The reason is that the backward traversal is performed over diagrams, where $(\sigma, v) \models \mathit{Diag}(\tilde{\sigma}, v, V)$ if and only if $\tilde{\sigma}$ is a substructure of $\sigma$ (up to isomorphism) [12], and hence if and only if $(\tilde{\sigma}, v) \sqsubseteq_{\forall^*} (\sigma, v)$ (Example 5). Otherwise, the procedure terminates if one of the frames constitutes a solution to the CHC system, namely an inductive phase invariant.

E Proofs
Proof of Lemma 1. Given a domain D and a valuation v, the phase characterizations of A induce the following simulation relation from TS to A: H = {(σ, q) | σ, v |= ϕ q } in the sense that:
Proof of Lemma 4. Assume that I is a solution to the CHC system. Then A is a safe inductive phase invariant for TS: A satisfies initiation due to constraint 1, satisfies inductiveness due to constraint 2, and edge covering due to constraint 3, and thus A is inductive w.r.t. TS (Definition 4). Finally, A is safe (Definition 5) due to constraint 4.
Conversely, assume that A is an inductive phase invariant. Then constraint 1 is satisfied because A satisfies initiation, constraint 2 is satisfied because A satisfies inductiveness, and constraint 3 because A satisfies edge covering, all from the definition of an inductive phase invariant (Definition 4). Finally, constraint 4 is satisfied because A is safe (Definition 5).
Proof of Lemma 5. The implication from right to left is a consequence of Lemma 1. Consider the other direction. We construct an inductive phase invariant as follows: for a given valuation $v$, characterize each phase by the set of states that can reach this phase. Reachability of $\sigma$ to phase $q$ means that there exist a trace of program states $\sigma_0, \ldots, \sigma_n = \sigma$ and a matching trace of phases $q_0, \ldots, q_n = q$ per Definition 2. The result is indeed an inductive phase invariant: initiation follows from A being a phase invariant. Inductiveness follows from taking the characterizations to be sets of reachable states (recall that the language of characterizations is assumed to be unrestricted). It remains to argue that edge covering holds. Let $\sigma$ be reachable in phase $q$ and $(\sigma, \sigma') \models TR$. Thus there is a sequence of program states $\sigma_0, \ldots, \sigma_n = \sigma$ and a matching trace of phases $q_0, \ldots, q_n = q$. Now, $\sigma'$ is also reachable, and since A is a phase invariant there exists a trace of phases $q'_0, \ldots, q'_n, q'_{n+1}$ matching $\sigma_0, \ldots, \sigma_n, \sigma'$. But A is deterministic, so necessarily $q'_n = q_n$ (by induction over $n$), which is $q$. In particular this gives $(\sigma, \sigma') \models \delta_{(q,p)}$ where $p = q'_{n+1}$, as required.
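The construction in this proof, which characterizes each phase by the states that can reach it, can be carried out explicitly when the system is finite. The sketch below assumes a fixed view valuation (omitted) and an explicit encoding of TR and of the edge labels as sets of state pairs; the function names and the toy system are hypothetical.

```python
from typing import Dict, Set, Tuple

Transition = Tuple[int, int]
Edge = Tuple[str, str]


def reachable_characterizations(init: Set[int], initial_phase: str,
                                TR: Set[Transition],
                                delta: Dict[Edge, Set[Transition]]) -> Dict[str, Set[int]]:
    """Map each phase to the set of states that can reach it along matching traces."""
    phi: Dict[str, Set[int]] = {}
    work = [(s, initial_phase) for s in init]
    while work:
        s, q = work.pop()
        if s in phi.setdefault(q, set()):
            continue                      # (state, phase) pair already explored
        phi[q].add(s)
        for (src, dst), lab in delta.items():
            if src == q:
                work.extend((b, dst) for (a, b) in lab if a == s and (a, b) in TR)
    return phi


def edge_covering_holds(phi: Dict[str, Set[int]], TR: Set[Transition],
                        delta: Dict[Edge, Set[Transition]]) -> bool:
    """Every transition from a state characterized by q is covered by an edge from q."""
    for q, states in phi.items():
        for (a, b) in TR:
            if a in states and not any((a, b) in lab
                                       for (src, _dst), lab in delta.items() if src == q):
                return False
    return True


# Toy two-state system alternating between phases A and B.
TR = {(0, 1), (1, 0)}
delta = {("A", "B"): {(0, 1)}, ("B", "A"): {(1, 0)}}
phi = reachable_characterizations({0}, "A", TR, delta)

# Dropping the edge back to A leaves the transition 1 -> 0 uncovered in phase B.
partial = {("A", "B"): {(0, 1)}}
phi_partial = reachable_characterizations({0}, "A", TR, partial)
```

The second structure is not a phase invariant of the toy system, which is why edge covering fails for the reachability-based characterizations computed from it.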
Proof of Lemma 6. The definition of $\delta'$ ensures that it is deterministic, and that it inherits the edge covering property from $\delta$. Initiation is not affected by the edge labeling, and inductiveness cannot be damaged by strengthening $\delta_{(q,p)}$.
Proof of Lemma 7. Follows from Lemma 8 with the identity preorder (or simply by reiterating the proof while ignoring abstraction steps).
Proof of Lemma 8. For the sake of the proof, we explicitly split each transition in an abstract trace to an abstraction step followed by a transition. Let $((\sigma_1, \tilde{\sigma}_1, \ldots, \sigma_n, \tilde{\sigma}_n), v)$ be such a trace and $q_1, \ldots, q_n$ be as in the definition. Let $A'$ be a strengthening of $A$ with phase characterizations $\varphi'$. Assume that $A'$ is safe; we show that $A'$ is not inductive. Assume for the sake of contradiction that it is. We claim by induction on $i$ that $\sigma_i, v \models \varphi'_{q_i}$. The base claim follows from initiation. For the induction step, assume that $\sigma_i, v \models \varphi'_{q_i}$. Since $(\tilde{\sigma}_i, v) \sqsubseteq_L (\sigma_i, v)$ and $\varphi'_{q_i} \in L$, $\tilde{\sigma}_i, v \models \varphi'_{q_i}$. Now $(\tilde{\sigma}_i, \sigma_{i+1}), v \models \delta_{(q_i, q_{i+1})}$, and from the assumption that $A'$ is inductive necessarily $\sigma_{i+1}, v \models \varphi'_{q_{i+1}}$, as required.
Let us consider the cause of the abstract counterexample trace. In case 1, $\sigma_i, v \not\models P$ is a contradiction to the safety of $A'$ (Definition 5). In case 2, if $(\sigma_i, \sigma') \models TR$, since we have shown $\sigma_i, v \models \varphi'_{q_i}$ it must follow from edge covering that $(\sigma_i, \sigma'), v \models \delta_{(q_i, q')}$ for some $q' \in Q$, which is a contradiction. The claim follows.