Causality-Based Game Solving

We present a causality-based algorithm for solving two-player reachability games represented by logical constraints. These games are a useful formalism to model a wide array of problems arising, e.g., in program synthesis. Our technique for solving these games is based on the notion of subgoals, which are slices of the game that the reachability player necessarily needs to pass through in order to reach the goal. We use Craig interpolation to identify these necessary sets of moves and recursively slice the game along these subgoals. Our approach allows us to infer winning strategies that are structured along the subgoals. If the game is won by the reachability player, this is a strategy that progresses through the subgoals towards the final goal; if the game is won by the safety player, it is a permissive strategy that completely avoids a single subgoal. We evaluate our prototype implementation on a range of different games. On multiple benchmark families, our prototype scales dramatically better than previously available tools.


Introduction
Two-player games are a fundamental model in logic and verification due to their connection to a wide range of topics such as decision procedures, synthesis and control [1,2,5,6,10,20]. Algorithmic techniques for finite-state two-player games have been studied extensively for many acceptance conditions [19]. For infinite-state games, most problems are directly undecidable. However, infinite state spaces occur naturally in domains like software synthesis [33] and cyber-physical systems [22], and hence handling such games is of great interest. An elegant classification of infinite-state games that can be algorithmically handled, depending on the acceptance condition of the game, was given in [13]. The authors assume a symbolic encoding of the game in a very general form. More recently, incomplete procedures for solving infinite-state two-player games specified using logical constraints were studied [3,17]. While [3] is based on automated theorem-proving for Horn formulas and handles a wide class of acceptance conditions, the work in [17] focuses on reachability games specified in the theory of linear arithmetic, and uses sophisticated decision procedures for that theory.
In this paper, we present a novel technique for solving logically represented reachability games based on the notion of subgoals. A necessary subgoal is a transition predicate that is satisfied at least once on every play that reaches the overall goal. It represents an intermediate target that the reachability player must reach in order to win. Subgoals open up game solving to the study of cause-effect relationships in the form of counterfactual reasoning [27]: If a cause (the subgoal) had not occurred, then the effect (reaching the goal) would not have happened. Thus for the safety player, a necessary subgoal provides a chance to win the game based on local information: If they control all states satisfying the pre-condition of the subgoal, then any strategy that in these states picks a transition outside of the subgoal is winning. Finding such a necessary subgoal may let us conclude that the safety player wins without ever having to unroll the transition relation.
On the other hand, passing through a necessary subgoal is in general not enough for the reachability player to win. We call a subgoal sufficient if indeed the reachability player has a winning strategy from every state satisfying the post-condition of the subgoal. Dual to the description in the preceding paragraph, sufficient subgoals provide a chance for the reachability player to win the global game as they must merely reach this intermediate target. The two properties differ in one key aspect: While necessity of a subgoal only considers the paths of the game arena, for sufficiency the game structure is crucial.
We show how Craig interpolants can be used to compute necessary subgoals, making our methods applicable to games represented by any logic that supports interpolation. In contrast, determining whether a subgoal is sufficient requires a partial solution of the given game. This motivates the following recursive approach. We slice the game along a necessary subgoal into two parts, the pre-game and the post-game. In order to guarantee these games to be smaller, we solve the post-game under the assumption that the considered subgoal was bridged for the last time. We conclude that the safety player wins the overall game if they can avoid all initial states of the post-game that are winning for the reachability player. Otherwise, the pre-game is solved subject to the winning condition given by the sufficient subgoal consisting of these states. This approach does not only determine which player wins from each initial state, but also computes symbolically represented winning strategies with a causal structure. Winning safety player strategies induce necessary subgoals that the reachability player cannot pass, which constitutes a cause for their loss. Winning reachability player strategies represent a sequence of sufficient subgoals that will be passed, providing an explanation for the win.
The Python-based implementation CabPy of our approach was used to compare its performance to SimSynth [17], which is, to the best of our knowledge, the only other available tool for solving linear arithmetic reachability games. Our experiments demonstrate that our algorithm is competitive in many case studies. We can also confirm the expectation that our approach heavily benefits from qualitatively expressive Craig interpolants. It is noteworthy that, like SimSynth, our approach is fully automated and does not require any input in the form of hints or templates. Our contributions are summarized as follows:
- We introduce the concept of necessary and sufficient subgoals and show how Craig interpolation can be used to compute necessary subgoals (Section 4).
- We describe an algorithm for solving logically represented two-player reachability games using these concepts. We also discuss how to compute representations of winning strategies in our approach (Section 5).
- We evaluate our approach experimentally through our Python-based tool CabPy, demonstrating a competitive performance compared to the previously available tool SimSynth on various case studies (Section 6).
Related Work. The problem of solving linear arithmetic games is addressed in [17] using an approach that relies on a dedicated decision procedure for quantified linear arithmetic formulas, together with a method to generalize safety strategies from truncated versions of the game that end after a prescribed number of rounds. Other approaches for solving infinite-state games include deductive methods that compute the winning regions of both players using proof rules [3], predicate abstraction where an abstract controlled predecessor operation is used on the abstract game representation [37], and symbolic BDD-based exploration of the state space [14]. Additional techniques are available for finite-state games, e.g., generalizing winning runs into a winning strategy for one of the players [30].
Our notion of subgoal is related to the concept of landmarks as used in planning [21]. Landmarks are milestones that must be true on every successful plan, and they can be used to decompose a planning task into smaller sub-tasks. Landmarks have also been used in a game setting to prevent the opponent from reaching their goal using counter-planning [31]. Whenever a planning task is unsolvable, one method to find out why is checking hierarchical abstractions for solvability and finding the components causing the problem [35].
Causality-based approaches have also been used for model checking of multi-threaded concurrent programs [23,24]. In our approach, we use Craig interpolation to compute the subgoals. Interpolation has already been used in similar contexts before, for example to extract winning strategies from game trees [15] or to compute new predicates to refine game abstractions [9]. In [17], interpolation is used to synthesize concrete winning strategies from so-called winning strategy skeletons, which describe a set of strategies of which at least one is winning.

Motivating Example
Consider the scenario that an expensive painting is displayed in a large exhibition room of a museum. It is secured with an alarm system that is controlled via a control panel on the opposite side of the room. A security guard is sleeping at the control panel and occasionally wakes up to check whether the alarm is still armed. To steal the painting, a thief first needs to disable the alarm and then reach the painting before the alarm has been reactivated. We model this scenario as a two-player game between a safety player (the guard) and a reachability player (the thief) in the theory of linear arithmetic. The moves of both players, their initial positions, and the goal condition are described by formulas of this theory, such as the guard's moves (sleep) and (wake up), the thief's move (steal), and the goal condition Goal ≡ ¬r ∧ p = 1.
The thief's position in the room is modeled by two coordinates x, y ∈ R with initial value (0, 0), and with every transition the thief can move some bounded distance. Note that we use primed variables to represent the value of variables after taking a transition. The control panel is located at (0, 10) and the painting at (10, 5). The status of the alarm and the painting are described by two Boolean variables a, p ∈ {0, 1}. The guard wakes up every two time units, modeled by the variable t ∈ R. The variables x, y are bounded to the interval [0, 10] and t to [0, 2]. The Boolean variable r encodes who makes the next move. In the presented configuration, the thief needs more time to move from the control panel to the painting than the guard will sleep. It follows that there is a winning strategy for the guard, namely, to always reactivate the alarm upon waking up.
Although it is intuitively fairly easy to come up with this strategy for the guard, it is surprisingly hard for game solving tools to find it. The main obstacle is the infinite state space of this game. Our approach for solving games represented in this logical way imitates causal reasoning: Humans observe that in order for the thief to steal the painting (i.e., the effect p = 1), a transition must have been taken whose source state does not satisfy the pre-condition of (steal) while the target state does. Part of this cause is the condition a = 0, i.e., the alarm is off. Recursively, in order for the effect a = 0 to happen, a transition setting a from 1 to 0 must have occurred, and so on.
Our approach captures these cause-effect relationships through the notion of necessary subgoals, which are essential milestones that the reachability player has to transition through in order to achieve their goal. The first necessary subgoal corresponding to the intuitive description above is the set C 1 of transitions that set p from 0 to 1, i.e., the moves in which the thief steals the painting. In this case, it is easy to see that C 1 is also a sufficient subgoal, meaning that all successor states of C 1 are winning for the thief. Therefore, it is enough to solve the game with the modified objective to reach those predecessor states of C 1 from which the thief can enforce C 1 being the next move (even if it is not their turn). Doing so recursively produces the necessary subgoal C 2 , consisting of the transitions that set a from 1 to 0, meaning that some transition must have caused the effect that the alarm is disabled. However, C 2 is not sufficient, which can be seen by recursively solving the game spanning from successor states of C 2 to C 1 . This computation has an important caveat: After passing through C 2 , it may happen that a is reset to 1 at a later point (in this particular case, this constitutes precisely the winning strategy of the safety player), which means that there is no canonical way to slice the game along this subgoal into smaller parts. Hence the recursive call solves the game from C 2 to C 1 subject to the bold assumption that any move from a = 0 to a ′ = 1 is winning for the guard. This generally underapproximates the winning states of the thief. Remarkably, we show that this approximation is enough to build winning strategies for both players from their respective winning regions. In this case, it allows us to infer that moving through C 2 is always a losing move for the thief. However, at the same time, any play reaching Goal has to move through C 2 . It follows that the thief loses the global game.
We evaluated our method on several configurations of this game, which we call Mona Lisa. The results in Section 6 support our conjecture that the room size has little influence on the time our technique needs to solve the game.

Preliminaries
We consider two-player reachability games defined by formulas in a given logic L. We let L(V) be the L-formulas over a finite set of variables V, also called state predicates in the following. We call V ′ = {v ′ | v ∈ V} the set of primed variables, which are used to represent the value of variables after taking a transition. Transitions are expressed by formulas in the set L(V ∪ V ′ ), called transition predicates. For some formula ϕ ∈ L(V), we denote the substitution of all variables by their primed variants by ϕ[V/V ′ ] ∈ L(V ′ ). For our algorithm, we require the satisfiability problem of L-formulas to be decidable and Craig interpolants [12] to exist for any two mutually unsatisfiable formulas. Formally, we assume there is a function Sat : L(V) → B that checks the satisfiability of some formula ϕ ∈ L(V) and an unsatisfiability check Unsat : L(V) → B. For interpolation, we assume that there is a function Interpolate : L(V) × L(V) → L(V) computing a Craig interpolant for mutually unsatisfiable formulas: If ϕ, ψ ∈ L(V) are such that Unsat(ϕ ∧ ψ) holds, then ψ =⇒ Interpolate(ϕ, ψ) is valid, Interpolate(ϕ, ψ) ∧ ϕ is unsatisfiable, and Interpolate(ϕ, ψ) only contains variables shared by ϕ and ψ.
These functions are provided by many modern Satisfiability Modulo Theories (SMT) solvers, in particular for the theories of linear integer arithmetic and linear real arithmetic, which we will use for all our examples. Note that interpolation is usually only supported for the quantifier-free fragments of these logics, while our algorithm will introduce existential quantifiers. Therefore, we resort to quantifier elimination wherever necessary, for which there are known procedures for both linear integer arithmetic and linear real arithmetic formulas [28,32].
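To make the assumed interface concrete, the following toy sketch implements a Sat-style check and an Interpolate function for a deliberately tiny fragment: single-variable bound predicates instead of full linear arithmetic. This fragment, and the halfway-cut interpolant, are our own illustrative choices and not the paper's solver backend; the point is only that the two Craig interpolant properties can be checked mechanically.

```python
from dataclasses import dataclass

# Toy fragment of linear real arithmetic over a single variable x:
# a state predicate is a half-line, either x >= bound or x <= bound.
@dataclass(frozen=True)
class GE:   # x >= bound
    bound: float

@dataclass(frozen=True)
class LE:   # x <= bound
    bound: float

def sat_conj(p, q):
    """Satisfiability check for the conjunction p ∧ q."""
    if isinstance(p, GE) and isinstance(q, LE):
        return p.bound <= q.bound
    if isinstance(p, LE) and isinstance(q, GE):
        return q.bound <= p.bound
    return True  # two half-lines facing the same way always intersect

def interpolate(phi, psi):
    """Craig interpolant I for mutually unsatisfiable phi, psi:
    psi implies I, and I ∧ phi is unsatisfiable (halfway cut)."""
    assert not sat_conj(phi, psi)
    cut = (phi.bound + psi.bound) / 2
    return LE(cut) if isinstance(phi, GE) else GE(cut)

phi, psi = GE(5.0), LE(3.0)      # think: Init-like and Goal-like predicates
itp = interpolate(phi, psi)      # here: x <= 4.0
assert not sat_conj(phi, itp)    # I ∧ phi is unsatisfiable
assert psi.bound <= itp.bound    # psi (x <= 3) implies I (x <= 4)
```

In the actual implementation these roles are played by an SMT solver's satisfiability and interpolation procedures, as described above.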
In order to distinguish the two players, we will assume that a Boolean variable called r ∈ V exists, which holds exactly in the states controlled by the reachability player. For all other variables v ∈ V, we let D(v) be the domain of v, and we define D = {D(v) | v ∈ V}. In the remainder of the paper, we consider the variables V and their domains to be fixed.
Definition 1 (Reachability Game). A reachability game is defined by a tuple G = Init , Safe, Reach, Goal , where Init ∈ L(V) is the initial condition, Safe ∈ L(V ∪ V ′ ) defines the transitions of player SAFE, Reach ∈ L(V ∪ V ′ ) defines the transitions of player REACH and Goal ∈ L(V) is the goal condition.
We require the formulas (Safe =⇒ ¬r) and (Reach =⇒ r) to be valid.
A state s of G is a valuation of the variables V, i.e., a function s : V → D that satisfies s(v) ∈ D(v) for all v ∈ V. We denote the set of states by S, and we let S SAFE be the states s such that s(r) = false, and S REACH be the states s such that s(r) = true. The variable r determines whether REACH or SAFE makes the move out of the current state, and in particular Safe ∧ Reach is unsatisfiable.
Given a state predicate ϕ ∈ L(V), we denote by ϕ(s) the closed formula we get by replacing each occurrence of variable v ∈ V in ϕ by s(v). Similarly, given a transition predicate τ ∈ L(V ∪V ′ ) and states s, s ′ , we let τ (s, s ′ ) be the formula we obtain by replacing all occurrences of v ∈ V in τ by s(v), and all occurrences of v ′ ∈ V ′ in τ by s ′ (v). For replacing only v ∈ V by s(v), we define τ (s) ∈ L(V ′ ). A trap state of G is a state s such that (Safe ∨ Reach)(s) ∈ L(V ′ ) is unsatisfiable (i.e., s has no outgoing transitions).
A play of G starting in state s 0 is a finite or infinite sequence of states ρ = s 0 s 1 s 2 . . . ∈ S + ∪ S ω such that for all i < len(ρ) either Safe(s i , s i+1 ) or Reach(s i , s i+1 ) is valid, and if ρ is a finite play, then s len(ρ) is required to be a trap state. Here, len(s 0 . . . s n ) = n for finite plays, and len(ρ) = ∞ if ρ is an infinite play. The set of plays of some game G = Init , Safe, Reach, Goal is defined as Plays(G) = {ρ = s 0 s 1 s 2 . . . | ρ is a play in G s.t. Init(s 0 ) holds}. REACH wins some play ρ = s 0 s 1 . . . if the play reaches a goal state, i.e., if there exists some integer 0 ≤ k ≤ len(ρ) such that Goal (s k ) is valid. Otherwise, SAFE wins play ρ. A reachability strategy σ R is a function σ R : S * S REACH → S such that if σ R (ωs) = s ′ and s is not a trap state, then Reach(s, s ′ ) is valid. We say that a play ρ = s 0 s 1 s 2 . . . is consistent with σ R if for all i such that s i (r) = true we have s i+1 = σ R (s 0 . . . s i ). A reachability strategy σ R is winning from some state s if REACH wins every play consistent with σ R starting in s. We define safety strategies σ S for SAFE analogously. We say that a player wins in or from a state s if they have a winning strategy from s. Lastly, REACH wins the game G if they win from some initial state. Otherwise, SAFE wins.
We often project a transition predicate T onto the source or target states of transitions satisfying T , which is taken care of by the formulas Pre(T ) = ∃V ′ . T and Post(T ) = ∃V. T . The notation ∃V (resp. ∃V ′ ) represents the existential quantification over all variables in the corresponding set. Given ϕ ∈ L(V), we call the set of transitions in G that move from states not satisfying ϕ to states satisfying ϕ the instantiation of ϕ, formally: Instantiate(ϕ, G) = ¬ϕ ∧ ϕ[V/V ′ ] ∧ (Safe ∨ Reach).
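On an explicit finite graph (a simplification we use only for illustration; the paper works with symbolic predicates), these projections and the instantiation operator look as follows:

```python
# Explicit-state analogue: a transition predicate is a set of
# (source, target) pairs; a state predicate is a set of states.
def pre(T):
    """Sources of the transitions in T (Pre(T) = ∃V'. T)."""
    return {s for (s, _) in T}

def post(T):
    """Targets of the transitions in T (Post(T) = ∃V. T)."""
    return {t for (_, t) in T}

def instantiate(phi, transitions):
    """Transitions moving from a state outside phi into phi."""
    return {(s, t) for (s, t) in transitions if s not in phi and t in phi}

edges = {(0, 1), (1, 2), (2, 3)}
goal = {2, 3}
assert instantiate(goal, edges) == {(1, 2)}   # only (1, 2) crosses into goal
assert pre({(1, 2)}) == {1} and post({(1, 2)}) == {2}
```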

Subgoals
We formally define the notion of subgoals. Let G = Init , Safe, Reach, Goal be a fixed reachability game throughout this section, where we assume that Init ∧Goal is unsatisfiable. Whenever this assumption is not satisfied in our algorithm, we will instead consider the game G ′ = Init ∧ ¬Goal , Safe, Reach, Goal which does satisfy it. As states in Init ∧ Goal are immediately winning for REACH, this is not a real restriction.
Definition 2 (Enforceable transitions). The set of enforceable transitions relative to a transition predicate T ∈ L(V ∪ V ′ ) is defined by the formula Enf(T, G) = (Reach ∧ T ) ∨ (Safe ∧ T ∧ ¬Pre(Safe ∧ ¬T )).

The enforceable transitions operator serves a purpose similar to the controlled predecessors operator commonly known in the literature, which is often used in a backwards fixed point computation, called attractor construction [36]. For both operations, the idea is to determine controllability by REACH. The main difference is that we do not consider the whole transition relation, but only a predetermined set of transitions, and check from which predecessor states the post-condition of the set can be enforced by REACH. These include all transitions in T controlled by REACH, and additionally transitions in T controlled by SAFE such that all other transitions in the origin state of the transition also satisfy T . The similarity with the controlled predecessor is exemplified by the following lemma:

Lemma 3. Let T ∈ L(V ∪ V ′ ) be a transition predicate such that every state satisfying Post(T )[V ′ /V] is winning for REACH in G. Then every state satisfying Pre(Enf(T, G)) is winning for REACH in G.

Proof. Clearly, all states in Pre(Enf(T, G)) that are under the control of REACH are winning for REACH, as in any such state they have a transition satisfying T (observe that Enf(T, G) =⇒ T is valid), which leads to a winning state by assumption.

So let s be a state satisfying Pre(Enf(T, G)) that is under the control of SAFE. As Pre(Enf(T, G))(s) is valid, s has a transition that satisfies T (in particular, s is not a trap state). Furthermore, we know that there is no s ′ ∈ S such that Safe(s, s ′ ) ∧ ¬T (s, s ′ ) holds, and hence there is no transition satisfying ¬T from s. Since every state satisfying Post(T )[V ′ /V] is winning for REACH, it follows that from s player SAFE cannot avoid playing into a winning state of REACH.

⊓ ⊔
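A finite-graph analogue of the enforceable transitions operator (our own explicit-state sketch, with transition predicates as sets of edges) may help to see the definition: REACH transitions in T are always enforceable, and a SAFE transition in T is enforceable exactly when its source leaves SAFE no way out of T.

```python
def enf(T, safe_edges, reach_edges):
    """Explicit-state Enf(T, G): every REACH transition in T, plus each
    SAFE transition in T whose source has no SAFE transition outside T."""
    result = set(T & reach_edges)
    for (s, t) in T & safe_edges:
        if all(e in T for e in safe_edges if e[0] == s):
            result.add((s, t))
    return result

safe_edges = {(0, 1), (0, 2)}
reach_edges = {(3, 1)}
# SAFE can escape T via (0, 2), so only the REACH transition is enforceable.
assert enf({(0, 1), (3, 1)}, safe_edges, reach_edges) == {(3, 1)}
# If every SAFE move from state 0 lies in T, both transitions are enforceable.
assert enf({(0, 1), (0, 2)}, safe_edges, reach_edges) == {(0, 1), (0, 2)}
```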
We now turn to a formal definition of necessary subgoals, which intuitively are sets of transitions that appear on every play that is winning for REACH.

Definition 4 (Necessary subgoal).
A necessary subgoal C ∈ L(V ∪V ′ ) for G is a transition predicate such that for every play ρ = s 0 s 1 . . . of G and n ∈ N such that Goal (s n ) is valid, there exists some k < n such that C(s k , s k+1 ) is valid.
Necessary subgoals provide a means by which winning safety player strategies can be identified, as formalized in the following lemma.
Lemma 5. A safety strategy σ S is winning in G if and only if there exists a necessary subgoal C for G such that for all plays ρ = s 0 s 1 . . . of G consistent with σ S there is no n ∈ N such that C(s n , s n+1 ) holds.
Proof. " =⇒ ". The transition predicate Goal [V/V ′ ] (i.e., transitions with endpoints satisfying Goal ) is clearly a necessary subgoal. If σ S is winning for SAFE, then no play consistent with σ S contains a transition in this necessary subgoal. "⇐=". Let C be a necessary subgoal such that no play consistent with σ S contains a transition of C. Then by Definition 4 no play consistent with σ S contains a state satisfying Goal . Hence σ S is a winning strategy for SAFE.

⊓ ⊔
Of course, the question remains how to compute non-trivial subgoals. Indeed, using Goal as outlined in the proof above provides no further benefit over a simple backwards exploration (see Remark 15 in the following section).
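Since necessity only depends on the paths of the arena and not on which player controls a state, it can be checked on a finite arena by a plain graph search; a sketch under that explicit-state simplification (our own toy example):

```python
from collections import deque

def necessary(C, edges, init, goal):
    """C is a necessary subgoal iff no path from an initial state
    reaches the goal using only transitions outside C (BFS check)."""
    seen, queue = set(init), deque(init)
    while queue:
        s = queue.popleft()
        if s in goal:
            return False  # goal reachable while avoiding C
        for (u, t) in edges:
            if u == s and (u, t) not in C and t not in seen:
                seen.add(t)
                queue.append(t)
    return True

edges = {(0, 1), (1, 2), (0, 2)}
assert necessary({(1, 2), (0, 2)}, edges, {0}, {2})   # every path is cut
assert not necessary({(1, 2)}, edges, {0}, {2})       # path 0 -> 2 avoids C
```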
Ideally, a subgoal should represent an interesting key decision to focus the strategy search. As we show next, Craig interpolation allows us to extract partial causes for the mutual unsatisfiability of Init and Goal and can in this way provide necessary subgoals. Recall that a Craig interpolant ϕ between Init and Goal is a state predicate that is implied by Goal , and unsatisfiable in conjunction with Init . In this sense, ϕ describes an observable effect that must occur if REACH wins, and the concrete transition that instantiates the interpolant causes this effect.
Proposition 6. Let ϕ be a Craig interpolant for Init and Goal . Then the transition predicate Instantiate(ϕ, G) is a necessary subgoal.
Proof. As ϕ is an interpolant, it holds that Goal =⇒ ϕ is valid and Init ∧ ϕ is unsatisfiable. Consider any play ρ = s 0 s 1 . . . of G such that Goal (s n ) is valid for some n ∈ N. It follows that ¬ϕ(s 0 ) and ϕ(s n ) are both valid. Consequently, there is some 0 ≤ i < n such that ¬ϕ(s i ) and ϕ(s i+1 ) are both valid. As all pairs (s k , s k+1 ) satisfy either Safe or Reach, it follows that Instantiate(ϕ, G)(s i , s i+1 ) is valid, and hence Instantiate(ϕ, G) is a necessary subgoal.

⊓ ⊔

While avoiding a necessary subgoal is a winning strategy for SAFE, reaching a necessary subgoal is in general not sufficient to guarantee a win for REACH. This is because there might be some transitions in the necessary subgoal that produce the desired effect described by the Craig interpolant, but that trap REACH in a region of the state space where they cannot enforce some other necessary effect to reach the goal. For the purpose of describing a set of transitions that is guaranteed to be winning for the reachability player, we introduce sufficient subgoals.

Definition 7 (Sufficient subgoal). A transition predicate F ∈ L(V ∪ V ′ ) is a sufficient subgoal for G if REACH wins from every state satisfying Post(F )[V ′ /V].
Example 8. Consider the Mona Lisa game G described in Section 2. The subgoal C 1 , consisting of the transitions that set p from 0 to 1, qualifies as a sufficient subgoal, because REACH wins from every successor state, as all those states satisfy Goal . Also, every play reaching Goal eventually passes C 1 , and hence C 1 is also necessary. On the other hand, C 2 , consisting of the transitions that set a from 1 to 0, is only a necessary subgoal in G, because SAFE wins from some (in fact all) states satisfying Post(C 2 ).
If the set of transitions in the necessary subgoal C that lead to winning states of REACH is definable in L, then we call the transition predicate F that defines it the largest sufficient subgoal included in C. It is characterized by the properties that F =⇒ C is valid, that REACH wins from every state satisfying Post(F )[V ′ /V], and that every transition satisfying C whose target state is winning for REACH also satisfies F . Since C is a necessary subgoal and F is maximal with the properties above, REACH needs to see a transition in F eventually in order to win. This balance of necessity and sufficiency allows us to partition the game along F into a game that happens after the subgoal and one that happens before.
Proposition 9. Let C be a necessary subgoal, and let F be the largest sufficient subgoal included in C. Then REACH wins from an initial state s in G if and only if REACH wins from s in the pre-game G pre = Init , Safe ∧ ¬F, Reach ∧ ¬F, Pre(Enf(F, G)) .

Proof. " =⇒ ". Suppose that REACH wins in G from s using strategy σ R . Assume for a contradiction that SAFE wins in G pre from s using strategy σ S . Consider the safety strategy σ ′ S in G that plays like σ S for as long as this is possible, and like σ ′′ S afterwards, where σ ′′ S is an arbitrary safety player strategy in G. Let ρ = s 0 s 1 . . . be the (unique) play of G consistent with both σ R and σ ′ S , where s 0 = s. Since σ R is winning in G and C is a necessary subgoal in G, there must exist some m ∈ N such that C(s m , s m+1 ) is valid. Let m be the smallest such index. Since F =⇒ C, we know for all 0 ≤ k < m that ¬F (s k , s k+1 ) holds. Hence, there is the play ρ ′ = s 0 s 1 . . . s m . . . in G pre consistent with σ S . The state s m+1 is winning for REACH in G, as it is reached on a play consistent with the winning strategy σ R . Hence, we know that F (s m , s m+1 ) holds, because F is the largest sufficient subgoal included in C. If (Reach ∧ F )(s m , s m+1 ) held, we would have that Pre(Enf(F, G))(s m ) holds: a contradiction with ρ ′ being consistent with σ S , which we assumed to be winning in G pre . It follows that (Safe ∧ F )(s m , s m+1 ) holds. We can conclude that (Safe ∧ ¬F )(s m ) is unsatisfiable (i.e., s m is a trap state in G pre ), because in all other cases SAFE plays according to σ S , which cannot choose a transition satisfying F . However, this implies that Pre(Enf(F, G))(s m ) holds, again a contradiction with ρ ′ being consistent with the winning strategy σ S . "⇐=". If REACH wins in G pre , they have a strategy σ R such that every play consistent with σ R reaches the set Pre(Enf(F, G)). As F is a sufficient subgoal, the states Post(F ) are winning for REACH by definition. It follows by Lemma 3 that all states satisfying Pre(Enf(F, G)) are winning in G. Combining σ R with a strategy that wins in all these states yields a winning strategy for REACH in G.

⊓ ⊔
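On a finite graph, Proposition 9 can be cross-checked against the classical attractor construction. The following self-contained toy example (our own, not one of the paper's benchmarks) slices a four-state game along the subgoal formed by the transitions into the goal:

```python
def attractor(states, reach_states, edges, target):
    """Backwards fixed point on a finite graph: all states from which
    REACH can force a visit to `target` (trap states never qualify)."""
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for s in states - attr:
            succ = {t for (u, t) in edges if u == s}
            if not succ:
                continue
            wins = bool(succ & attr) if s in reach_states else succ <= attr
            if wins:
                attr.add(s)
                changed = True
    return attr

# Toy game: REACH controls 0 and 2, SAFE controls 1, goal is {3}.
# SAFE can shuttle the play between 0 and 1 forever, so SAFE wins from 0.
states, reach_states = {0, 1, 2, 3}, {0, 2}
edges = {(0, 1), (1, 0), (1, 2), (2, 3)}
full = attractor(states, reach_states, edges, {3})

# The transitions into the goal form a necessary subgoal C = {(2, 3)};
# here it is also the largest sufficient subgoal F, and the source set
# Pre(Enf(F, G)) is {2}, since state 2 belongs to REACH.
pre_game = attractor(states, reach_states, edges - {(2, 3)}, {2})

# Proposition 9: initial state 0 is winning in G iff it wins the pre-game.
assert (0 in full) == (0 in pre_game)
assert full == {2, 3} and pre_game == {2}
```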

Causality-Based Game Solving
Proposition 9 in the preceding section foreshadows how subgoals can be employed in building a recursive approach for the solution of reachability games. Before turning to our actual algorithm, we describe a way to symbolically represent nondeterministic memoryless strategies. As discussed in [17], there is no ideal strategy description language for the class of games we consider. Our approach allows us to describe sets of concrete strategies as defined in Section 3 with linear arithmetic formulas. This framework will prove convenient for strategy synthesis, i.e., the computation of winning strategies instead of simply determining the winner of the game.

Symbolically Represented Strategies
We will represent strategies for both players using transition predicates S ∈ L(V ∪ V ′ ), henceforth called symbolic strategies, where we only require that (S =⇒ (Safe ∨ Reach)) is valid. A sequence s 0 . . . s n ∈ S + is called a play prefix if it is a prefix of some play in G, (¬Goal )(s j ) holds for all 0 ≤ j ≤ n, and s n is not a trap state. We say that a play prefix ρ = s 0 . . . s n conforms to a symbolic reachability strategy S if for all j < n we have that S(s j , s j+1 ) holds whenever s j ∈ S REACH (and analogously for safety strategies). A play conforms to S if all its play prefixes conform to S. We say that S is winning for REACH in s if all plays from s that conform to S are winning for REACH and all play prefixes s 0 . . . s n ∈ S * S REACH from s that conform to S are such that (S ∧ Reach)(s n ) is satisfiable (and analogously for SAFE). The second condition ensures that the player cannot be forced to play a transition outside of S by their opponent while the play has not reached a trap state or Goal , and in particular guarantees the existence of a concrete strategy (as defined in Section 3) conforming to S.
Lemma 10. If REACH (SAFE) has a winning symbolic strategy in s, then REACH (SAFE) has a concrete winning strategy in s.
Proof. Let S be a symbolic winning strategy for REACH. Let σ R be any reachability strategy such that for all play prefixes ωs ∈ S * S REACH that conform to S the formula S(s, σ R (ωs)) is valid. Such a function is guaranteed to exist, as (S ∧ Reach)(s) is satisfiable for all such play prefixes by definition. Furthermore, σ R is winning as all play prefixes of plays consistent with σ R conform to S, and hence all these plays are winning by assumption. The proof for SAFE is analogous.

⊓ ⊔
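A finite-arena sketch of symbolic strategies as transition sets, together with the conformance check used above (an explicit-state simplification of the predicate representation; the states and edges are our own toy example):

```python
# Explicit-state sketch: a symbolic strategy is a set of transitions S
# with S ⊆ Reach; a play conforms to S if every REACH move lies in S.
reach_edges = {(0, 1), (0, 2), (2, 3)}
strategy = {(0, 2), (2, 3)}           # nondeterminism would be extra pairs
assert strategy <= reach_edges        # S =⇒ Reach

def conforms(play, strategy, reach_states):
    """Check that every move made from a REACH state is a strategy move."""
    return all((s, t) in strategy
               for s, t in zip(play, play[1:]) if s in reach_states)

assert conforms([0, 2, 3], strategy, {0, 2})
assert not conforms([0, 1], strategy, {0, 2})  # (0, 1) is not in S
```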
This representation allows us to specify nondeterministic strategies, but classical memoryless strategies on finite arenas (specified as a function σ : S REACH → S or σ : S SAFE → S) can also be represented in this form, using a disjunction over formulas that each fix a single source state together with the successor chosen by σ. The following lemma shows that a necessary subgoal directly yields a symbolic strategy for SAFE if the subgoal is, in a certain sense, locally avoidable by SAFE. It will be our main tool for synthesizing safety player strategies.
Lemma 11. Let C be a necessary subgoal for G and suppose that Unsat(Enf(C, G)) holds. Then, Safe ∧ ¬C is a winning symbolic strategy for SAFE in G.
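In explicit-state terms, the side condition Unsat(Enf(C, G)) of Lemma 11 amounts to the following check (a set-based analogue of the symbolic condition; the edges are our own toy example): C may contain no REACH transition, and every SAFE source of a C-transition must have an escape outside C.

```python
def enf_is_empty(C, safe_edges, reach_edges):
    """Unsat(Enf(C, G)), explicit-state analogue: C contains no REACH
    transition, and every SAFE source of C has a move outside C."""
    if C & reach_edges:
        return False
    return all(any(e not in C for e in safe_edges if e[0] == s)
               for (s, _) in C)

# SAFE controls state 0 and can pick (0, 2) instead of the subgoal edge,
# so playing Safe ∧ ¬C avoids the subgoal forever.
assert enf_is_empty({(0, 1)}, {(0, 1), (0, 2)}, set())
# With no alternative, SAFE is forced through the subgoal.
assert not enf_is_empty({(0, 1)}, {(0, 1)}, set())
# A REACH-controlled subgoal transition is always enforceable.
assert not enf_is_empty({(0, 1)}, set(), {(0, 1)})
```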

A Recursive Algorithm
We now describe our algorithm which utilizes necessary subgoals to decompose and solve two-player reachability games (Algorithm 1). It is incomplete in the sense that it does not return on every input (Section 5.3 discusses special cases with guaranteed termination). If the algorithm returns on input G, it returns a triple (R, S REACH , S SAFE ), where (1) R is a state predicate characterizing the initial states that are winning for REACH in G, (2) S REACH is a symbolic strategy for REACH that wins in all initial states satisfying R, and (3) S SAFE is a symbolic strategy for SAFE that wins in all initial states satisfying Init ∧ ¬R. The returned safety strategy S SAFE is such that ¬S SAFE is a necessary subgoal that SAFE can avoid locally in the game G restricted to initial states Init ∧ ¬R (see Lemma 11).
Algorithm 1 works as follows. States satisfying Init and Goal are immediately winning for REACH and thus always part of the returned formula R. Following the discussion at the beginning of Section 4, further analysis considers the game starting in the remaining initial states I = Init ∧ ¬Goal . If there is no such state, we may return that all initial states are winning (line 5). Here, REACH wins from R without playing any move, and hence S REACH = false is a valid winning symbolic strategy (winning symbolic strategies are only required to provide moves in prefixes that have not seen Goal so far). We may choose S SAFE arbitrarily as there is no initial state winning for SAFE.
If the algorithm does not return in line 5, a necessary subgoal C between I and Goal is computed by instantiating a Craig interpolant ϕ for the two predicates (lines 6 and 7, see also Proposition 6). We break up the remaining description of the algorithm into three parts, which correspond to the main cases that occur when splitting the game along the subgoal C.
Case 1: SAFE can avoid the subgoal C. If the necessary subgoal C qualifies for Lemma 11, we can immediately conclude that SAFE wins from all states satisfying I (lines 8 and 9). An instance of this case occurs if the interpolant describes a bottleneck in the game which is fully controlled by SAFE. The winning symbolic safety strategy is Safe ∧ ¬C in this case (line 9), and we will assume that safety strategies returned by recursive calls of the algorithm are essentially negations of necessary subgoals that can be avoided by SAFE.
Algorithm 1 returns a triple (R, S REACH , S SAFE ) where:
- R ∈ L(V) represents the set of initial states winning for REACH;
- S REACH is a winning symbolic reachability strategy for states in R;
- S SAFE is a winning symbolic safety strategy for states in Init ∧ ¬R.
If Lemma 11 is not applicable, we next find those transitions in C that move into a winning state for the safety player. This is achieved by analyzing the post-game (line 10): G post = Post(C)[V ′ /V], Safe ∧ ϕ, Reach ∧ ϕ, Goal . Its initial states are exactly the states one sees after bridging the subgoal C. In order to make sure that G post is, in some sense, easier to solve than G, we restrict both Safe and Reach to ϕ, which is the interpolant used to compute the subgoal C. This has the effect of removing all transitions in states not satisfying ϕ, making them trap states. For the safety player this makes G post easier to win than G, as all plays ending in such a trap state without seeing Goal before are winning for SAFE in G post . Hence we formally have:

Lemma 12. If S is a winning symbolic reachability strategy from s in G post , then S is also winning from s in G.
Due to the restriction to ϕ, intuitively REACH wins from a state s in G post if they can win from s in G while staying inside the interpolant ϕ. In other words, REACH must guarantee that the necessary subgoal C is not visited again in the play. Still, the set R post , as returned in line 11 by the recursive call to Algorithm 1 on G post , is a sufficient subgoal in G, by the above lemma. Furthermore, if SAFE can avoid all states satisfying R post (see line 13), then this also implies a winning strategy for SAFE from all initial states in I. The reason is that REACH can only win by eventually visiting a state from which they can win without leaving ϕ again, as (Goal =⇒ ϕ) is valid. This is not possible if SAFE can avoid all states in R post .
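The restriction to ϕ can be sketched on an explicit finite game; the dict/set encoding and all concrete values below are assumptions for illustration, not the paper's symbolic representation.

```python
# Sketch of the post-game restriction (line 10 of Algorithm 1) on an
# explicit finite game: keep only transitions whose source satisfies
# phi, so states violating phi become trap states, which favours SAFE.

def restrict_to(phi, edges):
    return {(s, t) for (s, t) in edges if phi(s)}

edges = {(0, 1), (1, 2), (2, 3), (3, 1), (3, 4)}
phi = lambda s: s >= 2          # toy interpolant

post_edges = restrict_to(phi, edges)
assert post_edges == {(2, 3), (3, 1), (3, 4)}
# The move (3, 1) back into ¬phi survives, but state 1 now has no
# outgoing moves in G_post: a play stuck there is winning for SAFE.
assert not any(s == 1 for (s, t) in post_edges)
```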
In this case we construct S SAFE as follows. We assume that ¬S post SAFE is a necessary subgoal that can be locally avoided in G post from all states satisfying Post(C)[V ′ /V] ∧ ¬R post , and furthermore, we know that F := C ∧ R post [V/V ′ ] can be locally avoided in G (line 13). Intuitively, playing according to S post SAFE in G post yields a strategy for SAFE which avoids Goal and may move back into a state satisfying ¬ϕ, which forces REACH to bridge the subgoal C again in order to win. It follows that F ∨ (ϕ ∧ ¬S post SAFE ) is a necessary subgoal from I that can be locally avoided by SAFE in G, and the corresponding symbolic strategy is Safe ∧ ¬F ∧ (ϕ =⇒ S post SAFE ) (we additionally intersect the negated necessary subgoal with Safe to ensure that the symbolic strategy only includes legal transitions).
So far, the subgoal was such that SAFE could avoid it entirely, or at least avoid all states from which REACH would win when forced to remain inside the post-game. If this is not the case, then we also need to consider the pre-game (line 18), which intuitively describes the game before bridging the subgoal C for the last time. The exact definition of F will depend on whether C perfectly partitions the game or not. In both cases F will be the largest sufficient subgoal contained in a necessary subgoal, which lets us apply Proposition 9 to conclude that the initial winning regions of G and G pre coincide.
Case 2: The subgoal perfectly partitions G. We say that ϕ perfectly partitions G if (Reach ∨ Safe) ∧ ϕ ∧ ¬ϕ ′ ∧ ¬Goal is unsatisfiable (cf. line 15). Intuitively, this means that there is no transition that "undoes" the effect of the subgoal C. If this holds, then the restriction of G post to states satisfying ϕ is de facto no longer a restriction, as no play can reach such a state anyway after passing through the subgoal. This intuition is formalized by a lemma proved in the appendix. It follows that F = C ∧ R post [V/V ′ ] is the largest sufficient subgoal included in C. By Proposition 9, the same initial states are winning for REACH in G pre and in G. In this case, we construct the desired safety strategy (line 22) by combining S pre SAFE and S post SAFE , where ¬S pre/post SAFE are assumed to be necessary subgoals avoidable by SAFE in the corresponding subgames. Intuitively, the combined strategy consists of following S pre SAFE as long as one remains in the pre-game, which, by induction hypothesis, allows SAFE to avoid all transitions from F if starting in R pre . If the play crosses C ∧ ¬F , the strategy is to play according to the winning strategy of the post-game.
A symbolic strategy for REACH can be given by combining pre- and post-strategies (line 21): it is a nested conditional strategy that prefers the strategies of the subgames in the priority order S post REACH , then F , and finally S pre REACH . The reason for this order is that the winning condition in the post-game coincides with the global winning objective (to reach Goal ), while in the pre-game REACH tries to reach a winning state of the post-game. The set F is exactly the bridge between these two. The last condition makes sure that the strategy only includes transitions of states in which it is winning.
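The priority order can be sketched as follows, with strategies modelled as partial functions from states to successors; this toy encoding and all concrete values are assumptions for illustration, not the paper's symbolic strategies.

```python
# Nested conditional strategy for REACH (line 21): prefer the
# post-game strategy, then a bridging move through F, then the
# pre-game strategy.

def combine(s_post, bridge_f, s_pre):
    def strategy(s):
        if s in s_post:          # already winning inside the post-game
            return s_post[s]
        if s in bridge_f:        # cross the sufficient subgoal F
            return bridge_f[s]
        return s_pre[s]          # otherwise head for the bridge
    return strategy

s_pre = {0: 1, 1: 2}            # toy: reach the pre-game goal
bridge_f = {2: 3}               # toy: the bridging transition in F
s_post = {3: 4}                 # toy: reach Goal inside the post-game

sigma = combine(s_post, bridge_f, s_pre)
play = [0]
while play[-1] != 4:
    play.append(sigma(play[-1]))
assert play == [0, 1, 2, 3, 4]  # the play progresses through the subgoal
```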
Case 3: The subgoal does not perfectly partition G. If Sat((Reach ∨ Safe) ∧ ϕ ∧ ¬ϕ ′ ∧ ¬Goal ) is true in line 15, we can no longer assume that F is the largest sufficient subgoal in C. The reason is that SAFE may win in G post by moving out of the subgame, but if this move leads to a winning state for REACH in G, then such a strategy is winning in G post , but not in G. So we can only assume that F is sufficient (this follows by Lemma 12). In order to apply Proposition 9 we extend F by all transitions that move directly into Goal (line 16). This immediately yields a necessary and sufficient subgoal, and so again Proposition 9 applies to G pre (line 18). We could have also added Goal -states to F in Case 2, but we have observed that not doing so improves the performance of our procedure considerably.
The reachability strategy is composed of S pre REACH , F , and S post REACH exactly as in Case 2 (line 21). As all transitions in F are losing for SAFE, and these are the only ones removed in G pre , SAFE can essentially play the same strategies in G and in G pre . We implement this by setting ϕ to false (line 17), in which case the combined safety strategy collapses to S SAFE = S pre SAFE . Finally, we formally state the partial correctness of the algorithm, using the ideas outlined above. The proof can be found in the appendix.
Theorem 14 (Partial correctness). If Reach(G) returns (R, S REACH , S SAFE ), then
- R characterizes the set of initial states that are winning for REACH in G,
- S REACH is a winning symbolic reachability strategy from R,
- S SAFE is a winning symbolic safety strategy from Init ∧ ¬R.
Remark 15 (Simulating the attractor). Note that Craig interpolants are by no means unique. If we choose the interpolation function so that Interpolate(I, Goal ) always returns Goal (this is a valid interpolant), Algorithm 1 essentially simulates the attractor. In this case the subgoal C (line 7) contains exactly the transitions that move directly into Goal . All states in Post(C)[V ′ /V] are then winning for REACH and hence R post would be equivalent to Post(C)[V ′ /V], which implies that C ≡ F holds in this case. The new goal states in G pre are set to Pre(Enf(F, G)), which are exactly the states in Pre(C) that either are controlled by REACH, or such that all their transitions are included in F . Hence the set Pre(Enf(F, G)) is exactly the classical controlled predecessor.
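On an explicit finite game, the controlled-predecessor fixpoint that Remark 15 refers to can be sketched as follows; the set-based encoding is a toy stand-in for the paper's symbolic computation.

```python
# Classical attractor computation for REACH: least fixpoint of the
# controlled predecessor on an explicit finite game.

def attractor(states, edges, reach_states, goal):
    win = set(goal)
    changed = True
    while changed:
        changed = False
        for s in states:
            if s in win:
                continue
            succs = [t for (u, t) in edges if u == s]
            if s in reach_states:           # REACH picks one good move
                ok = any(t in win for t in succs)
            else:                           # SAFE must be forced into win
                ok = bool(succs) and all(t in win for t in succs)
            if ok:
                win.add(s)
                changed = True
    return win

# Hypothetical game: state 2 is controlled by SAFE and has the
# self-loop (2, 2) as an escape, so it never enters the attractor.
states = {0, 1, 2, 3}
edges = {(0, 1), (0, 2), (1, 3), (2, 2), (2, 3)}
reach_states = {0, 1}
assert attractor(states, edges, reach_states, {3}) == {0, 1, 3}
```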
One effect of slicing the game along general subgoals is that the initial predicate of the post-game (which describes all states satisfying the post-condition of the subgoal) may be satisfied by many states that do not necessarily need to be considered in order to decide who wins from the initial states of G (for example, because they are not reachable from any initial state, or cannot reach Goal ). This can be a drawback if the (superfluous) size of the subgames makes them hard to solve. Notably, this is in general less of an issue for approaches based on unrolling of the transition relation: The method of solving increasingly large step-bounded games [17] will only consider states that are reachable from Init, while backwards fixpoint computations will not explore states that do not reach Goal . A way of coping with this is to provide additional information on the domains of variables, whenever this is available (we discuss the effect of bounding variable domains in Section 6). Indeed, in the case where all variable domains are finite, Algorithm 1 is guaranteed to terminate, as shown in the next subsection.

Special Cases with Guaranteed Termination
Deciding the winner in the types of games we consider is generally undecidable (see [17] for the case that L is linear real arithmetic). Since Algorithm 1 returns a correct result whenever it terminates, this implies that it cannot always terminate. In this section, we give two important cases in which we can prove termination. The proofs can be found in the appendix.
Remark 17 (Time complexity). The termination argument given in the appendix yields a single-exponential upper bound on the runtime of the algorithm, where the input size is measured in the number of concrete transitions of the game. This is because in both recursive calls the subgames may be "almost" as large as the input: they are only guaranteed to have at least one concrete transition less.
We now show that, under certain assumptions, our algorithm also terminates for games that have a finite bisimulation quotient. To this end, we first clarify what bisimilarity means in our setting. A relation R ⊆ S × S over the states of G is called a bisimulation on G if
- for all (s 1 , s 2 ) ∈ R the formulas Goal (s 1 ) ⇐⇒ Goal (s 2 ), Init(s 1 ) ⇐⇒ Init(s 2 ) and r(s 1 ) ⇐⇒ r(s 2 ) are valid (recall that r holds exactly in states controlled by REACH);
- for all (s 1 , s 2 ) ∈ R and s ′ 1 ∈ S such that (Safe ∨ Reach)(s 1 , s ′ 1 ) holds, there exists s ′ 2 ∈ S such that (Safe ∨ Reach)(s 2 , s ′ 2 ) holds and (s ′ 1 , s ′ 2 ) ∈ R;
- for all (s 1 , s 2 ) ∈ R and s ′ 2 ∈ S such that (Safe ∨ Reach)(s 2 , s ′ 2 ) holds, there exists s ′ 1 ∈ S such that (Safe ∨ Reach)(s 1 , s ′ 1 ) holds and (s ′ 1 , s ′ 2 ) ∈ R.
We say that s 1 and s 2 are bisimilar (denoted s 1 ∼ s 2 ) if there exists a bisimulation R such that (s 1 , s 2 ) ∈ R. Bisimilarity is an equivalence relation, and it is the coarsest bisimulation on G. Its equivalence classes are called bisimulation classes. As the winning region of either player can be expressed in the µ-calculus [38] and the µ-calculus is invariant under bisimulation [8], it follows that bisimilar states are won by the same player.
Lemma 18. Let R be a bisimulation on G. If (s 1 , s 2 ) ∈ R, then REACH wins from s 1 in G if and only if REACH wins from s 2 in G.
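For intuition, the bisimulation quotient can be computed by partition refinement on an explicit finite game; the sketch below uses a toy encoding in which a single label function stands in for the Goal, Init and r predicates, and all concrete values are assumptions for illustration.

```python
# Partition-refinement sketch of the bisimulation quotient: refine the
# label-based partition by successor-block signatures until stable.

def bisim_classes(states, edges, label):
    part = {s: label(s) for s in states}        # initial partition by labels
    while True:
        sig = {}
        for s in states:
            succ_blocks = frozenset(part[t] for (u, t) in edges if u == s)
            sig[s] = (part[s], succ_blocks)
        # rename signatures to canonical block ids
        ids = {v: i for i, v in enumerate(sorted(set(sig.values()), key=repr))}
        new_part = {s: ids[sig[s]] for s in states}
        # refinement only splits blocks, so an equal count means stability
        if len(set(new_part.values())) == len(set(part.values())):
            return new_part
        part = new_part

states = {0, 1, 2, 3}
edges = {(0, 2), (1, 3), (2, 2), (3, 3)}
label = lambda s: s >= 2        # toy: states 2 and 3 are goal states
part = bisim_classes(states, edges, label)
assert part[0] == part[1] and part[2] == part[3]   # 0 ~ 1 and 2 ~ 3
assert part[0] != part[2]
```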
We will assume that for each bisimulation class S i there exists a formula ψ i ∈ L(V) that defines S i , formally: For all s ∈ S, ψ i (s) holds if and only if s ∈ S i . Furthermore, we will assume that the interpolation procedure respects ∼, formally: Interpolate(ϕ, ψ) is equivalent to a disjunction of formulas ψ i . Such an interpolant exists if ψ or ϕ already satisfy this assumption.
Theorem 19. Let G be a reachability game with finite bisimulation quotient under ∼ and assume that all bisimulation classes of G are definable in L. Furthermore, assume that Interpolate respects ∼. Then, Reach(G) terminates.

Case Studies
In this section we evaluate our approach on a number of case studies. Our prototype CabPy † is written in Python and implements the game solving part of the presented algorithm. Extending it to return a symbolic strategy using the ideas outlined above is straightforward. We compared our prototype with SimSynth [17], the only other readily available tool for solving linear arithmetic games. The evaluation was carried out on Ubuntu 20.04 with a 4-core Intel ® Core™ i5 2.30GHz processor and 8GB of memory. CabPy uses the PySMT [18] library as an interface to the MathSAT5 [11] and Z3 [29] SMT solvers. On all benchmarks, the timeout was set to 10 minutes. In addition to the winner, we report the runtime and the number of subgames our algorithm visits. Both may vary with different SMT solvers or in different environments.

Game of Nim
Game of Nim is a classic game from the literature [7], played on a number of heaps of stones. Both players take turns choosing a single heap and removing at least one stone from it. We consider the version in which the player who removes the last stone wins. Our results are shown in Figure 1. In instances with three or more heaps we bounded the domains of the variables in the instance description, specifying that each heap never exceeds its initial size and never goes below zero.
Following the discussion in Section 5.3, we need to bound the domains to ensure the termination of our tool on these instances. Remarkably, bounding the variables was not necessary for instances with only two heaps, where our tool CabPy scales to considerably larger instances than SimSynth. We did not add the same constraints to the input of SimSynth, as for SimSynth this resulted in longer runtimes rather than shorter. In Game of Nim, there are no natural necessary subgoals that the safety player can locally control.
The results (see Figure 1) demonstrate that our approach does not depend entirely on finding the right interpolants and is, in particular, also competitive when the reachability player wins the game. We suspect that SimSynth performs worse in these cases because the safety player has a large range of possible moves in most states, and inferring the win of the reachability player requires the tool to backtrack and try out all of them.
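For reference, the winner of a small Game of Nim instance can be checked by brute force against the classical XOR criterion (the player to move wins iff the XOR of the heap sizes is non-zero); this toy script is independent of CabPy.

```python
# Brute-force check of Game of Nim (last stone wins) against the
# classical XOR criterion, on small two-heap instances.

from functools import lru_cache

@lru_cache(maxsize=None)
def mover_wins(heaps):
    """True iff the player to move wins from this position."""
    if all(h == 0 for h in heaps):
        return False                 # the previous player took the last stone
    for i, h in enumerate(heaps):
        for take in range(1, h + 1):
            child = list(heaps)
            child[i] -= take
            if not mover_wins(tuple(sorted(child))):
                return True          # some move hands the opponent a loss
    return False

for a in range(6):
    for b in range(6):
        assert mover_wins(tuple(sorted((a, b)))) == ((a ^ b) != 0)
```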

Corridor
We now consider an example that demonstrates the potential of our method when the game structure contains natural bottlenecks. Consider a corridor of 100 rooms arranged in sequence, i.e., each room i with 0 ≤ i < 100 is connected to room i + 1 by a door. The objective of the reachability player is to reach room 100, and they are free to choose valid values from R 2 for the position in each room at every other turn. The safety player controls some door to a room r ≤ 100. Naturally, a winning strategy is to prevent the reachability player from passing through that door, which is a natural bottleneck and a necessary subgoal on the way to the last room.
The experimental results are summarized in Figure 2. We evaluated several versions of this game, increasing the length from the start to the controlled door. The results confirm that our causal synthesis algorithm finds the trivial strategy of closing the door quickly. This is because Craig interpolation focuses the subgoals on the room number variable while ignoring the movement in the rooms in between, as can be seen by the number of considered subgames. SimSynth, which tries to generalize a strategy obtained from a step-bounded game, struggles because the tool solves the games that happen between each of the doors before reaching the controlled one.
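The door bottleneck can be illustrated on a stripped-down model that abstracts away the continuous positions inside the rooms; the explicit encoding below and the check it performs (no subgoal transition leaves a REACH-controlled state, and SAFE is never forced through the subgoal) are a toy rendering of the condition of Lemma 11.

```python
# Toy corridor: REACH moves from room i to i+1, except that the
# transition through door r is controlled by SAFE, who may keep it
# closed (modelled as the self-loop (r, r)).

def safe_can_avoid_subgoal(subgoal, edges, reach_states):
    """Unsat(Enf(C, G)) on an explicit game: no subgoal edge leaves a
    REACH state, and every SAFE state on the subgoal has an escape."""
    for (s, t) in subgoal:
        if s in reach_states:
            return False             # REACH can enforce this move
        escapes = [(x, v) for (x, v) in edges
                   if x == s and (x, v) not in subgoal]
        if not escapes:
            return False             # SAFE is forced through the subgoal
    return True

n, r = 100, 50
edges = {(i, i + 1) for i in range(n)} | {(r, r)}
reach_states = set(range(n)) - {r}   # SAFE only controls door r
subgoal = {(r, r + 1)}               # the necessary subgoal: cross door r
assert safe_can_avoid_subgoal(subgoal, edges, reach_states)
# A door controlled by REACH is not avoidable:
assert not safe_can_avoid_subgoal({(0, 1)}, edges, reach_states)
```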

Mona Lisa
The game described in Section 2 between a thief and a security guard is very well suited to further assess the strength and limitations of both our approach as well as of SimSynth. We ran several experiments with this scenario, scaling the size of the room and the sleep time of the guard, as well as trying a scenario where the guard does not sleep at all. Scaling the size of the room makes it harder for SimSynth to solve this game with a forward unrolling approach, while our approach extracts the necessary subgoals irrespective of the room size. However, scaling the guard's sleep time makes it harder to solve the subgame between the two necessary subgoals, while it only has a minor effect on the length of the unrolling needed to stabilize the play in a safe region, as done by SimSynth.
The results in Figure 3 support this conjecture. The size of the room has almost no effect at all on both the runtime of CabPy and the number of considered subgames. However, as the results for a sleep value of 4 show, the employed combination of quantifier elimination and interpolation introduces some instability in the produced formulas. This means we may get different Craig interpolants and slice the game with more or less subgoals. Therefore, we see a lot of potential in optimizing the interplay between the employed tools for quantifier elimination and interpolation. The phenomenon of the runtime being sensitive to these small changes in values is also seen with SimSynth, where a longer sleep time sometimes means a faster execution.

Program Synthesis
Lastly, we study two benchmarks that are directly related to program synthesis. The first problem is to synthesize a controller for a thermostat by filling out an incomplete program, as described in [3]. A range of possible initial values of the room temperature c is given, e.g., 20.8 ≤ c ≤ 23.5, together with the temperature dynamics, which depend on whether the heater is on (variable o ∈ B). The objective for SAFE is to control the value of o in every round such that c stays between 20 and 25. This is a common benchmark for program synthesis tools, and both CabPy and SimSynth solve it quickly. The other problem relates to Lamport's bakery algorithm [25]. We consider two processes using this protocol to ensure mutually exclusive access to a shared resource. The game describes the task of synthesizing a scheduler that violates mutual exclusion. This is essentially a model checking problem, and we study it to see how well the tools can infer a safety invariant that is out of the control of the safety player. For our approach, this makes no difference, as both players may play through a subgoal and the framework is well suited to find a safety invariant. The forward unrolling approach of SimSynth, however, seems to explore the whole state space before inferring safety, and fails to find an invariant before the timeout.
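The shape of the thermostat benchmark can be sketched as follows; the dynamics (±0.5 degrees per step) and the switching threshold are assumptions for illustration and differ from the actual benchmark, but the winning SAFE strategy has the same threshold form.

```python
# Toy thermostat: SAFE chooses o (heater on/off) each round so that
# the temperature c stays in the safe interval [20, 25].
# The dynamics below are ASSUMED for illustration only.

def heater_controller(c):
    """SAFE's choice of o: switch the heater on below a threshold."""
    return c < 22.5

def step(c, heater_on):
    return c + 0.5 if heater_on else c - 0.5    # assumed dynamics

c = 20.8                                  # an admissible initial value
for _ in range(1000):
    c = step(c, heater_controller(c))
    assert 20.0 <= c <= 25.0              # the safety objective holds
```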

Conclusion
Our work is a step towards the fully automated synthesis of software. It targets symbolically represented reachability games, which are expressive enough to model a variety of problems, from common game benchmarks to program synthesis problems. The presented approach exploits causal information in the form of subgoals, which are parts of the game that the reachability player needs to pass through in order to win. Having computed a subgoal, which can be done using Craig interpolation, the game is split along the subgoal and solved recursively. At the same time, the algorithm infers a structured symbolic strategy for the winning player. The evaluation of our prototype implementation CabPy shows that our approach is practically applicable and scales much better than previously available tools on several benchmarks. While termination is only guaranteed for games with finite bisimulation quotient, the experiments demonstrate that several infinite games can be solved as well.

This work opens up several interesting questions for further research. One concerns the quality of the returned strategies. Due to its compositional nature, at first sight our approach does not seem well-suited to handle global optimization criteria, such as reaching the goal in the fewest possible steps. On the other hand, the returned strategies often involve only a few key decisions, and we believe that they are therefore often very sparse, although this has to be investigated further. We also plan to automatically extract deterministic strategies from the symbolic ones [4,16] we currently consider.
Another question regards the computation of subgoals. The performance of our algorithm is highly influenced by which interpolant is returned by the solver. In particular this affects the number of subgames that have to be solved, and how complex they are. We believe that template-based interpolation [26] could be a promising candidate to explore with the goal to compute good interpolants. This could be combined with the possibility for the user to provide templates or expressive interpolants directly, thereby benefiting from the user's domain knowledge.
Hence, it is enough to prove the backward direction. Let us assume that there is a reachability player strategy σ R that is winning from s in G, and, for contradiction, a safety player strategy σ S that is winning from s in G post . Consider the strategy σ ′ S such that σ ′ S (ωs ′ ) = σ S (ωs ′ ) if (Safe ∧ ϕ)(s ′ ) is satisfiable, and σ ′ S (ωs ′ ) = σ ′′ S (ωs ′ ) otherwise, where σ ′′ S is an arbitrary safety player strategy in G. There exists a unique play ρ = s 0 s 1 . . . in G consistent with both σ R and σ ′ S with s 0 = s. Since s satisfies Post(C)[V ′ /V], we know that ϕ(s 0 ) is valid (i.e., ρ starts in G post ), and since ϕ perfectly partitions G, we know that there is no i ≥ 0 such that ((Reach ∨ Safe) ∧ ϕ ∧ ¬ϕ ′ )(s i , s i+1 ) holds. It follows that ϕ(s j ) holds for all j ≥ 0 (i.e., ρ stays in G post ). Hence, ρ is a play in G post consistent with σ S and σ R (considered as a strategy in G post ). Since σ R is winning in G, there must exist some n ∈ N such that Goal (s n ) is valid. This contradicts the assumption that σ S is a winning strategy in G post .
⊓⊔
Lemma 20. Let T, F be transition predicates such that Unsat(Pre(T ) ∧ Pre(F )) holds. Suppose that G T and G F are games such that Unsat(Enf(T, G T )) and Unsat(Enf(F, G F )) both evaluate to true, and that T =⇒ (Safe T ∨ Reach T ), F =⇒ (Safe F ∨ Reach F ), (Safe T ∨ Reach T ) =⇒ (Safe ∨ Reach) and (Safe F ∨ Reach F ) =⇒ (Safe ∨ Reach) are all valid. Then it follows that Unsat(Enf(T ∨ F, G)) is also true.
Proof. We proceed by contradiction. Suppose there were s, v ∈ S such that Enf(T ∨ F, G)(s, v) holds. It follows that (T ∨ F )(s, v) must hold. Since we have Unsat(Pre(T ) ∧ Pre(F )), two cases may occur: either (1) Pre(T )(s) holds, or (2) Pre(F )(s) holds, but not both. Since the two cases are completely symmetric, we give the argument only for case (2). So we know that F (s, v) holds and T (s, v) does not. As F =⇒ (Safe F ∨ Reach F ), we know that (Safe F ∨ Reach F )(s, v) holds. The only way we can still have Unsat(Enf(F, G F )) is if there is some s ′ such that Safe F (s, s ′ ) ∧ ¬F (s, s ′ ) holds. However, as (Safe F ∨ Reach F ) =⇒ (Safe ∨ Reach), then also Safe(s, s ′ ) ∧ ¬F (s, s ′ ) holds, and since Pre(T )(s) does not hold, neither does T (s, s ′ ). This transition is a contradiction to the fact that Enf(T ∨ F, G)(s, v) holds. ⊓⊔
Theorem 14 (Partial correctness). If Reach(G) returns (R, S REACH , S SAFE ), then
- R characterizes the set of initial states that are winning for REACH in G,
- S REACH is a winning symbolic reachability strategy from R,
- S SAFE is a winning symbolic safety strategy from Init ∧ ¬R.
Proof. We show by induction on the recursion depth that if Reach(G) returns (R, S REACH , S SAFE ), then R ∈ L(V), S REACH and S SAFE are symbolic strategies for REACH and SAFE respectively, and
(a) S REACH is a winning strategy for REACH from all states satisfying R ∨ Pre(S REACH ),
(b) ¬S SAFE is a necessary subgoal in G I = Init ∧ ¬R, Safe, Reach, Goal , and
(c) if Init ∧ ¬R is satisfiable, then Unsat(Enf(¬S SAFE , G)) holds.
Note that the last condition is equivalent to Unsat(Enf(¬S SAFE , G I )) as Enf does not take initial states into account. Properties (b) and (c) ensure, in view of Lemma 11, that S SAFE is indeed a winning symbolic strategy from all initial states of G I (provided that S SAFE =⇒ (Safe ∨ Reach) holds, which we will show is the case). Together, these properties ensure that R characterizes the set of initial states that are winning for REACH in G. We distinguish the five cases that can occur when the algorithm terminates (there are four return statements, the last of which depends on the if statement in line 15).
Case 1: Reach(G) returns in line 5. Then the formula (Init =⇒ Goal ) is valid, which means that all initial states are goal states. The algorithm returns R = Init ∧ Goal , which in this case is equivalent to Init . Trivially, any strategy is winning for REACH for all initial states. Then, S REACH = false is a winning symbolic strategy for any state satisfying R. This is because clearly any play starting in R is winning for REACH, and we only need to show that REACH has a move satisfying S REACH in a play prefix, which by definition has not seen Goal already. In this case, there is no such play prefix that we need to cover. Observe also that Pre(false) ≡ false, which is required to show (a).
As G I has no initial states, one can choose S SAFE arbitrarily, as anything qualifies as necessary subgoal if I ≡ false, in which case (c) is also directly satisfied. So the properties (a)-(c) above are satisfied.
Case 2: Reach(G) returns in line 9. Then Enf(C, G) is unsatisfiable, where C is the instantiation of the interpolant ϕ (lines 6 and 7). Proposition 6 states that C is a necessary subgoal in G I . Since Enf(C, G) is unsatisfiable, Lemma 11 states that S SAFE = Safe ∧ ¬C is a winning symbolic strategy for SAFE in G I . For REACH, we have the same argument as in Case 1. We conclude that the properties (a)-(c) above are satisfied.
Case 3: Reach(G) returns in line 14. As in the previous cases, property (a) is satisfied. In order to prove (b), let ρ = s 0 s 1 . . . be a play such that (Init ∧ ¬R)(s 0 ) and Goal (s n ) hold for some n ∈ N. As C is a necessary subgoal in G, there exists m ∈ N with m < n such that C(s m , s m+1 ) holds. Take m to be the last index with this property. If F (s m , s m+1 ) holds, then also (F ∨ (ϕ ∧ ¬S post SAFE ))(s m , s m+1 ), which implies ¬S SAFE (s m , s m+1 ). If ¬F (s m , s m+1 ) holds, then (Post(C)[V ′ /V] ∧ ¬R post )(s m+1 ) holds. By induction hypothesis, all winning plays for REACH in G post starting in ¬R post have some j with m < j < n such that ¬S post SAFE (s j , s j+1 ) holds. Furthermore, by construction of G post , all states s j with m < j < n satisfy ϕ. Hence (ϕ ∧ ¬S post SAFE )(s j , s j+1 ) holds, which implies that ¬S SAFE (s j , s j+1 ) holds. We conclude that ¬S SAFE is a necessary subgoal in G I .
We now show (c). Since we return in line 14, we have Unsat(Enf(F, G)) and, by induction hypothesis, Unsat(Enf(¬S post SAFE , G post )). As the transition relation of G post is restricted to ϕ, this implies Unsat(Enf(ϕ ∧ ¬S post SAFE , G post )). We also have F =⇒ ¬ϕ and (ϕ ∧ ¬S post SAFE ) =⇒ ϕ. As (Safe post ∨ Reach post ) =⇒ (Safe ∨ Reach) holds, we can apply Lemma 20 to conclude Unsat(Enf(F ∨ (ϕ ∧ ¬S post SAFE ), G)), which implies Unsat(Enf(¬Safe ∨ F ∨ (ϕ ∧ ¬S post SAFE ), G)) = Unsat(Enf(¬S SAFE , G)).
Case 4: Reach(G) returns in line 22, and ϕ perfectly partitions G. By induction hypothesis we may assume that the recursive calls in lines 11 and 19 returned tuples (R post , S post REACH , S post SAFE ) and (R pre , S pre REACH , S pre SAFE ) satisfying properties (a)-(c) above for G post and G pre . We now show these properties in G for R ∨ R pre and the combined strategies S REACH and S SAFE . We first show that (a) S REACH is winning for REACH from states satisfying R ∨ R pre ∨ Pre(S REACH ). For states in R this is trivial, so let ρ = s 0 s 1 . . . be a play in G conforming to S REACH such that R pre (s 0 ) holds. Our first claim is that if there exists k ∈ N such that Pre(S post REACH )(s k ) holds, then ρ must be winning for REACH. This is due to the fact that S post REACH is winning in G post from all states satisfying Pre(S post REACH ), which allows us to use Lemma 12. To argue that S REACH keeps playing according to S post REACH once such a state is reached, we observe that if a symbolic reachability strategy S wins from s, then Pre(S) holds in any state reachable from s via a play prefix conforming to S, by definition. Now we show that such a position k must exist. First, for j ∈ N such that (¬ Pre(S post REACH ) ∧ Pre(Enf(F, G)))(s j ) holds, the transition (s j , s j+1 ) must satisfy F . This is because if s j is controlled by SAFE, then all outgoing transitions of s j satisfy F ; otherwise, it follows from the fact that ρ conforms to S REACH .
As Post(F ) ≡ R post [V/V ′ ] and S post REACH wins from all states satisfying R post by assumption, it follows that s j+1 satisfies Pre(S post REACH ). As long as ρ visits only states satisfying ¬ Pre(Enf(F, G)) ∧ ¬ Pre(S post REACH ), the strategy S REACH prescribes playing according to S pre REACH . By assumption, this strategy is winning for REACH in G pre , and hence the play ρ eventually visits a state in Pre(Enf(F, G)). As above, the play is guaranteed to stay in Pre(S pre REACH ) until that position.
The above argument also shows that S REACH is winning for all states satisfying Pre(F ) ∨ Pre(S post REACH ) ∨ Pre(S pre REACH ), which is implied by Pre(S REACH ). Also, S REACH =⇒ (Reach ∨ Safe) is valid, as the corresponding statements hold for the pre-and post-strategies, and F =⇒ (Safe ∨ Reach) is valid.
Next we show that (b) ¬S SAFE is a necessary subgoal in G I . No player can play back from G post to G pre without REACH having already won in G post . We first show that under this condition, (¬ϕ ∧ ¬S pre SAFE ) ∨ (ϕ ∧ ¬S post SAFE ) qualifies as a necessary subgoal in G I . For this, consider the necessary subgoal C. For any play ρ = s 0 s 1 . . . with n ∈ N such that Goal (s n ) holds there is some k ∈ N with k < n and C(s k , s k+1 ). As F characterizes a subset of C, we check two cases: either (1) ¬F (s k , s k+1 ) or (2) F (s k , s k+1 ). In case (1), we have ¬R post (s k+1 ), and because of our assumption that no transition of the game satisfies ϕ ∧ ¬ϕ ′ , we have ϕ(s j ) for all j ∈ N with k < j < n. It follows by induction hypothesis that there is some l ∈ N such that ¬S post SAFE (s l , s l+1 ) holds. In case (2), we use that S pre SAFE plays only moves available in G pre , and hence S pre SAFE =⇒ ¬F is valid. Furthermore, F =⇒ ¬ϕ, as F characterizes a subset of the subgoal C. Hence we can conclude that F =⇒ (¬ϕ ∧ ¬S pre SAFE ) is valid. It follows that ¬S SAFE qualifies as a necessary subgoal.
Finally we show (c), i.e., that Unsat(Enf(¬S SAFE , G)) holds. We have (Safe pre ∨ Reach pre ) =⇒ (Safe ∨ Reach) and (Safe post ∨ Reach post ) =⇒ (Safe ∨ Reach). As Pre(¬ϕ ∧ ¬S pre SAFE ) ∧ Pre(ϕ ∧ ¬S post SAFE ) is clearly unsatisfiable, we can again apply Lemma 20 to infer Unsat(Enf(¬S SAFE , G)). This uses that any transition reachable in G post has to satisfy ϕ in this case.
Case 5: Reach(G) returns in line 22, and ϕ does not perfectly partition G. For (b) we first observe that by setting ϕ to false (see line 17) in this case we get S SAFE = S pre SAFE . We show that ¬S SAFE = ¬S pre SAFE is a necessary subgoal in G I . The transition predicate F in line 11 is a sufficient subgoal by induction hypothesis, but due to the restriction on the post-game, we cannot conclude that states in Post(C) that are not in Post(F ) are winning for SAFE. By adding all transitions to Goal (line 16) we get that F in line 19 is a necessary and sufficient subgoal (clearly, any winning play must go through Goal [V/V ′ ]). As we have ensured that F is necessary, we know for all plays ρ = s 0 s 1 . . . with some n ∈ N such that Goal (s n ) holds, there is some k ∈ N with k < n and F (s k , s k+1 ). As in Case 4 we may conclude that F =⇒ ¬S pre SAFE . It follows that ¬S SAFE is a necessary subgoal in G I .
For (c) we observe that Unsat(Pre(Enf(¬S pre SAFE , G pre ))) holds by induction hypothesis, which directly implies Unsat(Pre(Enf(¬S pre SAFE , G))). This concludes the argument for the final case, and the proof is complete.
⊓ ⊔ Theorem 16. If the domains of all variables in G are finite, then Reach(G) terminates.
Proof. We denote by size(G) the number of concrete transitions of G, formally size(G) = |{(s, s ′ ) ∈ S × S | (Safe ∨ Reach)(s, s ′ ) holds}|. If the domains of all variables are finite, then so is size(G). We assume that this is the case and show that the subgames on which Reach(G) recurses are strictly smaller in this measure. This is enough to guarantee termination.
The first subgame is constructed in line 10 and takes the form G post = Post(C)[V ′ /V], Safe ∧ ϕ, Reach ∧ ϕ, Goal . The important restriction of this game is that both safety and reachability player transitions have the additional precondition ϕ. We may assume that Enf(C, G) is satisfiable, as otherwise the algorithm does not reach line 10. Then, in particular, C is satisfiable, by the definition of Enf(C, G). But C = Instantiate(ϕ, G) = (Safe ∨ Reach) ∧ ¬ϕ ∧ ϕ ′ , which means that there exist states s, s ′ such that (Safe ∨ Reach)(s, s ′ ), ¬ϕ(s), and ϕ(s ′ ) are all valid. This transition from s to s ′ in G is excluded in G post , and as no new transitions are included, it follows that size(G post ) < size(G). The second subgame is constructed in line 18 and takes the form G pre = I, Safe ∧ ¬F, Reach ∧ ¬F, Pre(Enf(F, G)) .
We may assume that F ∧ (Safe ∨ Reach) is satisfiable, as otherwise the algorithm would not have moved past line 13. Observe that if F is changed in line 16 then it is only extended and hence satisfiability is preserved. As no transition satisfying F exists in G pre it follows that size(G pre ) < size(G). This concludes the proof. ⊓ ⊔ Theorem 19. Let G be a reachability game with finite bisimulation quotient under ∼ and assume that all bisimulation classes of G are definable in L. Furthermore, assume that Interpolate respects ∼. Then, Reach(G) terminates.
Proof. Let S 1 , . . . , S n be the bisimulation classes of G, and ψ 1 , . . . , ψ n ∈ L(V) be the formulas that define them. We define size(G) = |{(S i , S j ) | (Safe ∨ Reach) ∧ ψ i ∧ ψ ′ j is satisfiable}|, which equals the number of transitions in the bisimulation quotient of G under ∼.
Our aim is to show that Reach(·) terminates for all subgames that are considered in any recursive call of Reach(G).
To this end, we show that Reach(G) terminates for all reachability games G = ⟨Init, Safe, Reach, Goal⟩ such that
- size(G) is finite,
- the relation ∼ is a bisimulation on G, and
- Goal is equivalent to a disjunction of formulas ψ_i.
We show this by induction on size(G).
Let G = ⟨Init, Safe, Reach, Goal⟩ satisfy these conditions, and assume that size(G) = 0. Then Safe ∨ Reach is unsatisfiable: if some pair (s_1, s_2) with s_1 ∈ S_i and s_2 ∈ S_j satisfied Safe ∨ Reach, then (s_1, s_2) would in particular satisfy (Safe ∨ Reach) ∧ ψ_i ∧ ψ_j′, contradicting size(G) = 0. It follows that Unsat(Enf(C, G)) in line 8 is true, as Enf(C, G) =⇒ (Safe ∨ Reach) is valid for any C. But then Algorithm 1 terminates on input G.

Now suppose that we have G with size(G) > 0. If the algorithm does not return in lines 5 or 9, we have to consider the first subgame, which is constructed in line 10. We may assume that ϕ ≡ ⋁_{i∈I} ψ_i for some I ⊆ {1, …, n}, due to our assumption on the function Interpolate. Hence the effect of restricting all transitions to ϕ is to remove all transitions in states not in {S_i | i ∈ I}, which are exactly the states in {S_i | i ∈ {1, …, n} \ I}. It is clear that ∼ is still a bisimulation in the resulting game, and that the goal states are preserved. To see that size(G_post) < size(G), we may assume that Unsat(Enf(C, G)) is false, as otherwise we would have returned in line 9. Then, in particular, there is a transition in G satisfying ¬ϕ, which means that there is a pair S_i, S_j such that (Safe ∨ Reach) ∧ ¬ϕ ∧ ψ_i ∧ ψ_j′ is satisfiable. This formula clearly becomes unsatisfiable when (Safe ∨ Reach) is replaced by (Safe ∨ Reach) ∧ ϕ, so the pair (S_i, S_j) contributes to size(G) but not to size(G_post). Hence, size(G_post) < size(G). As a result, we can apply the induction hypothesis to conclude that the recursive call Reach(G_post) in line 11 terminates.

Now let us consider the second subgame G_pre, as constructed in line 18. First, we observe that F ≡ (Safe ∨ Reach) ∧ ¬ϕ ∧ ϕ′ ∧ R_post[V/V′], where R_post is a state predicate characterizing the initial winning states of G_post (this uses Theorem 14). As ∼ is a bisimulation on G_post, it follows by Lemma 18 that R_post is equivalent to a disjunction of formulas ψ_i.
As a consequence, we can equivalently write F as φ_1 ∧ (φ_2[V/V′]) for two formulas φ_1, φ_2 ∈ L(V) that are both equivalent to disjunctions of ψ_i. By Lemma 21 it follows that Pre(Enf(F, G)) is also equivalent to a disjunction of ψ_i.
Restricting transitions to ¬F in G_pre has the effect of removing all transitions from states in {S_i | ψ_i =⇒ φ_1 is valid} to states in {S_j | ψ_j =⇒ φ_2 is valid}. It is clear that ∼ is still a bisimulation in the resulting game. Furthermore, as Enf(F, G) is satisfiable, there is at least one such transition in G. It follows that size(G_pre) < size(G), and hence the algorithm terminates by the induction hypothesis. ⊓⊔
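The overall argument is ordinary well-founded recursion: each of the two recursive calls of Reach receives a game of strictly smaller size, so the recursion tree is finite. The following toy skeleton (which mirrors only the call structure, not the solver itself) makes this shape explicit:

```python
def reach(size_g, calls=None):
    """Schematic recursion skeleton: every recursive call strictly
    decreases the finite, non-negative measure size(G), so the
    recursion tree is finite and the procedure terminates."""
    if calls is not None:
        calls.append(size_g)   # record the measure at each call
    if size_g == 0:
        return                 # base case: Safe ∨ Reach unsatisfiable
    # stands in for Reach(G_post), with size(G_post) < size(G)
    reach(size_g - 1, calls)
    # stands in for Reach(G_pre), with size(G_pre) < size(G)
    reach(size_g - 1, calls)
```

In the worst case the skeleton makes 2^(size(G)+1) − 1 calls, which is finite whenever size(G) is; the actual algorithm typically returns early in lines 5 or 9 and explores far less.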