Environmentally-friendly GR(1) Synthesis

Many problems in reactive synthesis are stated using two formulas ---an environment assumption and a system guarantee--- and ask for an implementation that satisfies the guarantee in environments that satisfy their assumption. Reactive synthesis tools often produce strategies that formally satisfy such specifications by actively preventing an environment assumption from holding. While formally correct, such strategies do not capture the intention of the designer. We introduce an additional requirement in reactive synthesis, non-conflictingness, which asks that a system strategy should always allow the environment to fulfill its liveness requirements. We give an algorithm for solving GR(1) synthesis that produces non-conflicting strategies. Our algorithm is given by a 4-nested fixed point in the $\mu$-calculus, in contrast to the usual 3-nested fixed point for GR(1). Our algorithm ensures that, in every environment that satisfies its assumptions on its own, traces of the resulting implementation satisfy both the assumptions and the guarantees. In addition, the asymptotic complexity of our algorithm is the same as that of the usual GR(1) solution. We have implemented our algorithm and show how its performance compares to the usual GR(1) synthesis algorithm.


Introduction
Reactive synthesis from temporal logic specifications provides a methodology to automatically construct a system implementation from a declarative specification of correctness. Typically, reactive synthesis starts with a set of requirements on the system and a set of assumptions about the environment. The objective of the synthesis tool is to construct an implementation that ensures all guarantees are met in every environment that satisfies all the assumptions; formally, the synthesis objective is an implication A ⇒ G. In many synthesis problems, the system can actively influence whether an environment satisfies its assumptions. In such cases, an implementation that prevents the environment from satisfying its assumptions is considered correct for the specification: since the antecedent of the implication A ⇒ G does not hold, the property is satisfied.

Figure 1. Pictorial representation of a desired strategy for a robot (square) moving in a maze in presence of a moving obstacle (circle). Obstacle and robot start in the lower left and right corner, can move at most one step at a time (to non-occupied cells), and cells that they should visit infinitely often are indicated in light and dark gray (see q0), respectively. Nodes with self-loops (q1, q3, q6, q8) can be repeated finitely often with the obstacle located at one of the dotted positions.
Such implementations satisfy the letter of the specification but not its intent. Moreover, assumption-violating implementations are not a theoretical curiosity but are regularly produced by synthesis tools such as slugs [13]. In recent years, a lot of research has thus focused on how to model environment assumptions [18,11,2,5,4], so that assumption-violating implementations are ruled out. Existing research relaxes the "zero-sum" assumption on the game by introducing different levels of cooperation [5], equilibrium notions inspired by non-zero-sum games [7,20,15], or richer quantitative objectives on top of the temporal specifications [3,1].
Contribution In this paper, we take an alternative approach. We consider the setting of GR(1) specifications, where assumptions and guarantees are both conjunctions of safety and Büchi properties [6]. GR(1) has emerged as an expressive specification formalism [23,27,17] and, unlike full linear temporal logic, synthesis for GR(1) can be implemented in time quadratic in the state/transition space. In our approach, the environment is assumed to satisfy its assumptions provided the system does not prevent this. Conversely, the system is required to pick a strategy that ensures the guarantees whenever the assumptions are satisfied, but additionally ensures non-conflictingness: along each finite prefix of a play according to the strategy, there exists the persistent possibility for the environment to play such that its liveness assumptions will be met. Note that non-conflictingness is not a trace property; we cannot "compile away" this additional requirement into a different GR(1) or even ω-regular objective.
Our main contribution is to show a µ-calculus characterization of winning states (and winning strategies) that rules out system strategies that are winning by preventing the environment from fulfilling its assumptions. Specifically, we provide a 4-nested fixed point that characterizes winning states and strategies that are non-conflicting and ensure all guarantees are met if all the assumptions are satisfied. Thus, if the environment promises to satisfy its assumption if allowed, the resulting strategy ensures both the assumption and the guarantee.
Our algorithm does not introduce new notions of winning, or new logics or winning conditions. Moreover, since µ-calculus formulas with d alternations can be computed in O(n^⌈d/2⌉) time [25,8], the O(n^2) asymptotic complexity of the new symbolic algorithm is the same as that of the standard GR(1) algorithm.
Motivating Example Consider a small two-dimensional maze with 3x2 cells as depicted in Figure 1, state q0. A robot (square) and an obstacle (circle) are located in this maze and can move at most one step at a time to non-occupied cells. There is a wall between the lower and upper left cell and between the lower and upper right cell. The interaction between the robot and the obstacle is as follows: first the environment chooses where to move the obstacle to, and, after observing the new location of the obstacle, the robot chooses where to move.

Figure 2. Pictorial representation of the GR(1) winning strategy synthesized by slugs for the robot (square) in the game described in Figure 1.
Our objective is to synthesize a strategy for the robot s.t. it visits both the upper left and the lower right corner of the maze (indicated in dark gray in Figure 1, state q0) infinitely often. Due to the walls in the maze, the robot needs to cross the two white middle cells infinitely often to fulfill this task. If we assume an arbitrary, adversarial behavior of the environment (e.g., placing the obstacle in one white cell and never moving it again), this desired robot behavior cannot be enforced. We therefore assume that the obstacle is actually another robot that is required to visit the lower left and the upper right corner of the maze (indicated in light gray in Figure 1, state q0) infinitely often. While we do not know the precise strategy of the other robot (i.e., the obstacle), its liveness assumption is enough to infer that the obstacle will always eventually free the white cells. Under this assumption, the considered synthesis problem has a solution.
Let us first discuss one intuitive strategy for the robot in this scenario, as depicted in Figure 1. We start in q0 with the obstacle (circle) located in the lower left corner and the robot (square) located in the lower right corner. Recall that the obstacle will eventually move towards the upper right corner. The robot can therefore wait until it does so, indicated by q1. Here, the dotted circles denote possible locations of the obstacle during the (finitely many) repetitions of q1 by following its self-loop. Whenever the obstacle moves to the upper part of the maze, the robot moves into the middle part (q2). Now it waits until the obstacle reaches its goal in the upper right, which is ensured to happen after a finite number of visits to q3. When the obstacle reaches the upper right, the robot moves up as well (q4). Now the robot can freely move to its goal in the upper left (q5). This process symmetrically repeats for moving back to the respective goals in the lower part of the maze (q6 to q9 and then back to q0). With this strategy, the interaction between environment and system goes on for infinitely many cycles and the robot fulfills its specification.
The outlined synthesis problem can be formalized as a two-player game with GR(1) winning condition. When solving this synthesis problem using the tool slugs [13], we obtain the strategy depicted in Figure 2 (not the desired one in Figure 1). The initial state, denoted by q0, is the same as in Figure 1, and if the environment moves the obstacle into the middle passage (q1), the robot reacts as before; it waits until the obstacle eventually proceeds to the upper part of the maze (q2). However, after this happens, the robot takes the chance to simply move to the lower left cell of the maze and stays there forever (q3). By this, the robot prevents the environment from fulfilling its objective. Similarly, if the obstacle does not immediately start moving in q0, the robot takes the chance to place itself in the middle passage and stays there forever (q4). This obviously prevents the environment from fulfilling its liveness properties.
In contrast, when using our new algorithm to solve the given synthesis problem, we obtain the strategy given in Figure 1, which satisfies the guarantees while allowing the environment assumptions to be satisfied.
Related Work Our algorithm is inspired by supervisory controller synthesis for non-terminating processes [22,26], resulting in a fixed-point algorithm over a Rabin-Büchi automaton. This algorithm has been simplified for two interacting Büchi automata in [21] without proof. We adapt this algorithm to GR(1) games and provide a new, self-contained proof in the framework of two-player games, which is distinct from the supervisory controller synthesis setting (see [12,24] for a recent comparison of both frameworks).
The problem of correctly handling assumptions in synthesis has recently gained attention in the reactive synthesis community [4]. As our work does not assume precise knowledge about the environment strategy (or the ability to impose the latter), it is distinct from cooperative approaches such as assume-guarantee [9] or rational synthesis [16]. It is most closely related to obliging games [10], cooperative reactive synthesis [5], and assume-admissible synthesis [7]. Obliging games [10] incorporate a similar notion of non-conflictingness as our work, but do not condition winning of the system on the environment fulfilling the assumptions. This makes obliging games harder to win. Cooperative reactive synthesis [5] tries to find a winning strategy enforcing A ∩ G. If this specification is not realizable, it is relaxed and the obtained system strategy enforces the guarantees if the environment cooperates "in the right way". Instead, our work always assumes the same form of cooperation, coinciding with just one cooperation level in [5]. Assume-admissible synthesis [7] for two players results in two individual synthesis problems. Given that both have a solution, implementing only the system strategy ensures that the game will be won if the environment plays admissibly. This is comparable to the view taken in this paper; however, assuming that the environment plays admissibly is stronger than our assumption of an environment attaining its liveness properties if not prevented from doing so. Moreover, we only need to solve one synthesis problem, instead of two. However, it should be noted that [10,5,7] handle ω-regular assumptions and guarantees. We focus on the practically important GR(1) fragment, and our method better leverages the computational benefits of this fragment.

Winning Conditions
We consider winning conditions defined over sets of states of a given game graph H. Given F ⊆ Q, we say a play π satisfies the Büchi condition F if Inf(π) ∩ F ≠ ∅, where Inf(π) = {q ∈ Q | π(k) = q for infinitely many k ∈ N}. Given a set F = {F_1, ..., F_m}, where each F_i ⊆ Q, we say a play π satisfies the generalized Büchi condition F if Inf(π) ∩ F_i ≠ ∅ for each i ∈ [1; m]. We additionally consider generalized reactivity winning conditions with rank 1 (GR(1) winning conditions in short). Given two generalized Büchi conditions F_0 and F_1, we say a play π satisfies the GR(1) condition (F_0, F_1) if π satisfies F_1 whenever it satisfies F_0. That is, whenever the play satisfies F_0, it also satisfies F_1. We use the tuples (H, F), (H, F) and (H, F_0, F_1) to denote a Büchi, generalized Büchi and GR(1) game over H, respectively, and collect all winning plays in these games in the sets L(H, F), L(H, F) and L(H, F_0, F_1). A strategy f_l is winning for player l in a Büchi, generalized Büchi, or GR(1) game if L(H, f_l) is contained in the respective set of winning plays.

Set Transformers on Games Given a game graph H, we define the existential, universal, player 0-, and player 1-controllable pre-operators, as well as the conditional pre-operators. Let P ⊆ Q.
Intuitively, CondPre(P, P′) computes the set of states from which P is reachable in one step and player 1 can force a visit to P ∪ P′ in one step. Likewise, its dual CondPre computes the set of states from which either player 0 can force a visit to P ∩ P′ in one step or neither player can force the game to leave P in one step. We see that Q \ CondPre(P, P′) = CondPre(Q \ P, Q \ P′), i.e., the two conditional pre-operators are duals of each other.
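To make the operators concrete, the following is a minimal explicit-state sketch in Python. It assumes, purely for illustration, a turn-based encoding where the environment picks an input u before the system picks an output y, with a transition function delta(q, u, y); the function and variable names are hypothetical, not the paper's actual symbolic encoding.

```python
def make_pre_operators(states, inputs, outputs, delta):
    """Build the pre-operators over an explicit game graph (illustrative sketch)."""
    def pre_exists(P):
        # some joint choice of input and output reaches P
        return {q for q in states
                if any(delta(q, u, y) in P for u in inputs for y in outputs)}

    def pre_forall(P):
        # every joint choice of input and output reaches P
        return {q for q in states
                if all(delta(q, u, y) in P for u in inputs for y in outputs)}

    def pre0(P):
        # player 0 (environment) can force P: some input defeats all outputs
        return {q for q in states
                if any(all(delta(q, u, y) in P for y in outputs) for u in inputs)}

    def pre1(P):
        # player 1 (system) can force P: for every input some output works
        return {q for q in states
                if all(any(delta(q, u, y) in P for y in outputs) for u in inputs)}

    def cond_pre(P, P2):
        # P reachable in one step AND player 1 can force P ∪ P2 in one step
        return pre_exists(P) & pre1(P | P2)

    return pre_exists, pre_forall, pre0, pre1, cond_pre
```

The dual operator would be obtained from the stated duality, i.e., as the complement of cond_pre applied to complemented arguments.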

µ-Calculus
We use the µ-calculus as a convenient logical notation to define a symbolic algorithm (i.e., an algorithm that manipulates sets of states rather than individual states) for computing a set of states with a particular property over a given game graph H. The formulas of the µ-calculus, interpreted over a two-player game graph H, are given by the grammar

ϕ ::= p | X | ϕ ∪ ϕ | ϕ ∩ ϕ | pre(ϕ) | µX. ϕ | νX. ϕ

where p ranges over subsets of Q, X ranges over a set of formal variables, pre ∈ {Pre∃, Pre∀, Pre0, Pre1, CondPre, CondPre} ranges over set transformers, and µ and ν denote, respectively, the least and greatest fixpoint of the functional defined as X → ϕ(X). Since the operations ∪, ∩, and the set transformers pre are all monotonic, the fixpoints are guaranteed to exist. A µ-calculus formula evaluates to a set of states over H, and this set can be computed by induction over the structure of the formula, where the fixpoints are evaluated by iteration. We omit the (standard) semantics of formulas [19].
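To make the iterative evaluation concrete, the two fixpoint constructors can be sketched in Python over a finite state set: a least fixpoint is iterated upward from the empty set, a greatest fixpoint downward from Q. The functional is any monotonic set-to-set map built from ∪, ∩ and the pre-operators; the helper names are illustrative assumptions.

```python
def lfp(functional):
    """Least fixpoint mu X. functional(X): iterate upward from the empty set."""
    X = set()
    while True:
        nxt = functional(X)
        if nxt == X:
            return X
        X = nxt

def gfp(states, functional):
    """Greatest fixpoint nu X. functional(X): iterate downward from the full set."""
    X = set(states)
    while True:
        nxt = functional(X)
        if nxt == X:
            return X
        X = nxt
```

For instance, with an existential pre-operator, lfp of X → F ∪ Pre∃(X) computes backward reachability of F, while gfp of X → S ∩ Pre∃(X) computes the states with an infinite path staying in S.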

The Considered Synthesis Problem
The GR(1) synthesis problem asks to synthesize a winning strategy for the system player (player 1) for a given GR(1) game (H, F_A, F_G), or to determine that no such strategy exists. This can be equivalently represented in terms of ω-languages, by asking for a system strategy f_1 over H s.t. L(H, f_1) ∩ L(H, F_A) ⊆ L(H, F_G).
That is, the system wins on plays π ∈ L(H, f_1) if either π ∉ L(H, F_A) or π ∈ L(H, F_A) ∩ L(H, F_G). The only mechanism to ensure that sufficiently many computations will result from f_1 is the usage of the environment input, which enforces a minimal branching structure. However, the system could still win this game by falsifying the assumptions, i.e., by generating plays π ∉ L(H, F_A) that prevent the environment from fulfilling its liveness properties.
We suggest an alternative view on the usage of the assumptions on the environment F_A in a GR(1) game. The condition F_A can be interpreted abstractly as modeling an underlying mechanism that ensures that the environment player (player 0) generates only inputs (possibly in response to observed outputs) that conform with the given assumption. In this context, we would like to ensure that the system (player 1) allows the environment, as much as possible, to fulfill its liveness and only restricts the environment behavior if needed to enforce the guarantees. We achieve this by forcing the system player to ensure that the environment is always able to play such that it fulfills its liveness, i.e., pfx(L(H, f_1)) = pfx(L(H, f_1) ∩ L(H, F_A)).
As the ⊇-inclusion trivially holds, the constraint is given by the ⊆-inclusion. Intuitively, the latter holds if every finite play α compliant with f_1 over H can be extended (by a suitable environment strategy) to an infinite play π compliant with f_1 that fulfills the environment liveness assumptions. It is easy to see that not every solution to the GR(1) game (H, F_A, F_G) (in the classical sense) satisfies this additional requirement. We therefore propose to synthesize a system strategy f_1 with the above properties, as summarized in the following problem statement.
Problem 1. Given a GR(1) game (H, F_A, F_G), synthesize a system strategy f_1 s.t. (7a) L(H, f_1) ∩ L(H, F_A) ⊆ L(H, F_G) and (7b) pfx(L(H, f_1)) = pfx(L(H, f_1) ∩ L(H, F_A)) both hold, or verify that no such system strategy exists. Problem 1 asks for a strategy f_1 s.t. every play π compliant with f_1 over H fulfills the system guarantees, i.e., π ∈ L(H, F_G), if the environment fulfills its liveness properties, i.e., if π ∈ L(H, F_A) (from (7a)), while the latter always remains possible (by a suitably playing environment) due to (7b). Inspired by algorithms solving the supervisory controller synthesis problem for non-terminating processes [22,26], we propose a solution to Problem 1 in terms of a vectorized 4-nested fixed-point in the remaining part of this paper. We show that Problem 1 can be solved by a finite-memory strategy, if a solution exists.
We note that (7b) is not a linear-time but a branching-time property and can therefore not be "compiled away" into a different GR(1) or even ω-regular objective. Satisfaction of (7b) requires checking whether the set F_A remains reachable from any reachable state in the game graph realizing L(H, f_1). This is made clear by the example in Figure 3. The game graph H′ (Figure 3, left) realizes a language L(H, f_1) which is non-conflicting for F_A = {q5}, as q5 is reachable from all states in H′. However, reducing this language to the single trace q0 q1 (q2 q3 q4)^ω, realized by the game graph H′′ (Figure 3, right), shows that the property no longer holds. Hence, non-conflictingness is not a trace property.
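The reachability check behind this observation can be sketched on an explicit graph. The encoding below (successor dictionaries, integer states standing in for q0 to q5) is an illustrative assumption; a symbolic implementation would perform the same two fixpoint computations on state sets.

```python
def non_conflicting(edges, init, F_A):
    """Check that F_A stays reachable from every state reachable from init."""
    # states reachable from init (forward search)
    reachable, frontier = {init}, [init]
    while frontier:
        q = frontier.pop()
        for s in edges.get(q, ()):
            if s not in reachable:
                reachable.add(s)
                frontier.append(s)
    # states that can reach F_A (backward fixpoint)
    can_reach = set(F_A)
    changed = True
    while changed:
        changed = False
        for q, succs in edges.items():
            if q not in can_reach and succs & can_reach:
                can_reach.add(q)
                changed = True
    return reachable <= can_reach
```

On a Figure-3-like pair of graphs, the first (where q5 is reachable from every state) passes the check, while the single lasso q0 q1 (q2 q3 q4)^ω fails it, matching the discussion above.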

Algorithmic Solution for Singleton Winning Conditions
We first consider the GR(1) game (H, F_A, F_G) with singleton winning conditions F_A = {F_A} and F_G = {F_G}, i.e., n = m = 1. It is well known that a system winning strategy f_1 for this game can be synthesized by solving a three-color parity game over H. This can be expressed by the µ-calculus formula (see [14])

ϕ_3 = νZ. µY. νX. ( (F_G ∩ Pre_1(Z)) ∪ Pre_1(Y) ∪ ((Q \ F_A) ∩ Pre_1(X)) ),    (8)

where q0 ∈ [[ϕ_3]] if and only if the synthesis problem has a solution, and the winning strategy f_1 is obtained from a ranking argument over the sets computed during the evaluation of (8). To obtain a system strategy f_1 solving Problem 1 instead, we propose to extend (8) to a 4-nested fixed-point ϕ_4, given by the µ-calculus formula (9). Compared to (8), this adds an inner-most largest fixed-point over a new variable W and substitutes the last controllable pre-operator by the conditional one. Intuitively, this distinguishes between states from which player 1 can force visiting F_G and states from which player 1 can force avoiding F_A. This is in contrast to (8) and allows to exclude strategies that let player 1 win by falsifying the assumptions. This is further explained when discussing the example in Figure 5.
The remainder of this section shows that q0 ∈ [[ϕ_4]] if and only if Problem 1 has a solution, and that the winning strategy f_1 fulfilling (7) can be obtained from a ranking argument over the sets computed during the evaluation of (9).

Soundness
We prove soundness of (9) by showing that every state q ∈ [[ϕ_4]] is winning for the system player. In view of Problem 1, this requires showing that there exists a system strategy f_1 s.t. all plays starting in a state q ∈ [[ϕ_4]] and evolving in accordance with f_1 result in an infinite play that fulfills (7a) and (7b).
We start by defining f_1 from a ranking argument over the iterations of (9). Consider the last iteration of the fixed-point in (9) over Z; as (9) terminates after this iteration, the computed sets do not change when re-evaluated over the result Z^∞. Using these sets, we define a ranking rank(q) = (i, j) for every state q ∈ Z^∞, as given in (10). We order ranks lexicographically. It further holds (see Appendix A.1) that the ranking can be expressed via the sets D, E_i and R_i^j added to the winning state set by the first, second and third term of (9), respectively, in the corresponding iteration (see (11)).

Figure 4. Schematic representation of the ranking construction in (10) (left) and in (16) (right). Diamond, ellipses and rectangles represent the sets D, E_i and R_i^j, while blue, green and red indicate the sets Y_1, Y_2 \ Y_1 and Y_3 \ Y_2 (annotated by a / ab for the right figure). Labels (i, j) and (a, i, b, j) indicate that all states q associated with this set fulfill rank(q) = (i, j) and ab rank(q) = (i, j), respectively. Solid, colored arcs indicate system-enforceable moves, dotted arcs indicate the existence of environment or system transitions, and dashed arcs indicate the possible existence of environment transitions.

Figure 4 (left) shows a schematic representation of this construction for an example with k = 3, l_1 = 4, l_2 = 2 and l_3 = 3. The set D = F_G ∩ Z^∞ is represented by the diamond at the top, where the label (1, 1) denotes the associated rank (see (11a)). The ellipses represent the sets E_i, where the corresponding i > 1 is indicated by the associated rank (i, 1). Due to the use of the controllable pre-operator in the first and second term of (9), it is ensured that progress out of D and E_i can be enforced by the system, indicated by the solid arrows. This is in contrast to all states in the sets R_i^j, represented by the rectangular shapes in Figure 4 (left). These states allow the environment to increase the ranking (dashed lines) as long as Z^∞ \ F_A \ F_G is not left and there exists a possible move to decrease the j-rank (dotted lines). While this does not strictly enforce progress, we see that whenever the environment plays such that states in F_A (i.e., the ellipses) are visited infinitely often (i.e., the environment fulfills its assumptions), the system can enforce progress w.r.t. the defined ranking, and states in F_G (i.e., the diamond shape) are eventually visited.
The system is restricted to take the existing solid or dotted transitions in Figure 4 (left). With this, it is easy to see that the constructed strategy is winning if the environment fulfills its assumptions, i.e., (7a) holds. However, to ensure that (7b) also holds, we need an additional requirement. This is necessary as the used construction also allows plays to cycle through the blue region of Figure 4 (left) only, and thereby not surely visiting states in F_A infinitely often. However, if L(H, F_G) ⊆ L(H, F_A), we see that (7b) holds as well. It should be noted that the latter is a sufficient condition, which can be easily checked symbolically on the problem instance, but not a necessary one. Based on the ranking in (10), we define a memoryless system strategy f_1 that decreases the rank whenever possible. The next theorem shows that this strategy indeed solves Problem 1.
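The rank-decreasing choice underlying this memoryless strategy can be sketched as follows. The explicit rank map and successor encoding below are hypothetical illustrations of the ranking argument, not the paper's symbolic construction.

```python
def rank_strategy(successors, rank):
    """Memoryless strategy: at each state, move to a successor of minimal rank.

    successors: state -> set of states the system may move to
    rank:       state -> (i, j) tuple, compared lexicographically
    """
    strategy = {}
    for q, succs in successors.items():
        if succs:
            # Python compares (i, j) tuples lexicographically, matching the
            # lexicographic order on ranks used in the soundness argument
            strategy[q] = min(succs, key=lambda s: rank[s])
    return strategy
```

Since tuples compare lexicographically in Python, picking the min-rank successor mirrors the "decrease (i, j) if possible" policy described above.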

Completeness
We show completeness of (9) by proving that every state q ∉ [[ϕ_4]] is not winning for the system player, i.e., that from every such state no system strategy can fulfill both (7a) and (7b).

Example
Consider the game graph H_1 in Figure 5. Running the fixed-point in (9) for H_1 induces the ranking defined in (10), as indicated in the top right of every winning state. Here, the evaluation of the fixed-point is particularly simple, as the fixed-points over X and Z never remove states; we therefore concentrate on the fixed-points over W and Y. In the first iteration over W, we start with F_G (i.e., q4) and successively enlarge this set by states that can reach W (i.e., have a path to F_G) and can be forced by player 1 to stay within Q \ F_A. This is true for all states with rank (1, ·), i.e., q1 to q4. The environment can increase the rank during a play by going from q2 to q3 (i.e., moving from R_1^2 to R_1^3). It is easy to see that q8 and q9 are never added to the winning region, as they do not have a path to any W constructed during the iteration over (9), i.e., they do not allow reaching F_G. By this, the strategy induced by this ranking via (12) always transitions from q1 to q2 and from q6 to q0, thereby avoiding winning by falsifying the assumptions. Now consider the fixed-point in (8), which induces a ranking over Y as indicated in the bottom right of every winning state in Figure 5 (see [6] for a definition of the used ranking). Due to the missing inner fixed-point over W, the first iteration over X is initialized directly with the full candidate state set. While the remaining iterations over Y result in an equivalent i-rank as in the new 4-nested fixed-point (9), we see that q8 and q9 are now part of the winning region. Even worse, due to the structure of (8), q8 and q2 have the same rank; i.e., the rank does not allow distinguishing between states from which player 1 can force a visit to F_G and states from which player 1 can force the play to stay inside Q \ F_A. Therefore, it is not possible to construct a strategy via this ranking that avoids winning by falsifying the assumptions.

A Solution for Problem 1
We note that the additional assumption in Theorem 1 is required only to ensure that the resulting strategy fulfills (7b). Suppose that this assumption holds for the initial state q0 of H. That is, consider a GR(1) game (H, F_A, F_G) s.t. L(H, F_G) ⊆ L(H, F_A). Then it follows from Theorem 2 that Problem 1 has a solution iff q0 ∈ [[ϕ_4]]; in this case, using the ranking in (10) and the strategy in (12), we can construct f_1 that wins the GR(1) condition in (7a) and is non-conflicting, as in (7b).
We can check symbolically whether L(H, F_G) ⊆ L(H, F_A). For this we construct a game graph H′ from H by removing all states in F_A, and then check whether L(H′, F_G) is empty. The latter is decidable in logarithmic space and polynomial time. If this check fails, then L(H, F_G) ⊈ L(H, F_A). Furthermore, we can replace L(H, F_G) in (7a) by L(H, F_G) ∩ L(H, F_A) without affecting the restriction (7a) imposes on the choice of f_1. Given singleton winning conditions F_A = {F_A} and F_G = {F_G}, we have L(H, F_G) ∩ L(H, F_A) = L(H, {F_G, F_A}). That is, we fulfill the conditional by replacing the system guarantee L(H, F_G) by L(H, {F_G, F_A}). However, this results in a GR(1) synthesis problem with m = 1 and n = 2, which we discuss next.
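The check described above can be sketched explicitly: remove the F_A states and then test Büchi emptiness, i.e., ask whether some F_G state is still reachable and lies on a cycle. The explicit-graph encoding and helper names below are illustrative assumptions; a symbolic implementation would perform the same computation with set operations on BDDs.

```python
def buechi_nonempty(edges, init, accepting):
    """Is some accepting state reachable from init and on a cycle? (sketch)"""
    def reach(sources, E):
        seen, frontier = set(sources), list(sources)
        while frontier:
            q = frontier.pop()
            for s in E.get(q, ()):
                if s not in seen:
                    seen.add(s)
                    frontier.append(s)
        return seen

    fwd = reach({init}, edges)
    for f in accepting:
        # f must be reachable from init and reachable from its own successors
        if f in fwd and f in reach(edges.get(f, ()), edges):
            return True
    return False

def guarantee_implies_assumption(edges, init, F_G, F_A):
    """Check L(H, F_G) ⊆ L(H, F_A): remove F_A, then test Büchi emptiness."""
    restricted = {q: {s for s in succs if s not in F_A}
                  for q, succs in edges.items() if q not in F_A}
    return not buechi_nonempty(restricted, init, F_G - F_A)
```

If every cycle through F_G also passes through F_A, the restricted graph has no accepting cycle and the subset inclusion holds.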
Applying the same idea to the 4-nested fixed-point algorithm (9) results in the vectorized 4-nested fixed-point (15), which contains one line per guarantee set a ∈ [1; n] and one conjunct per assumption set b ∈ [1; m], and where a⁺ denotes (a mod n) + 1.
The remainder of this section shows how soundness and completeness carries over from the 4-nested fixed-point algorithm (9) to its vectorized version in (15).

Soundness and Completeness
We refer to intermediate sets obtained during the computation of the fixpoints by similar notation as in Section 3. For example, the set ^aY_i is the i-th approximation of the fixpoint computing ^aY, and ^{ab}W_i^j is the j-th approximation of ^{ab}W while computing the i-th approximation of ^aY, i.e., computing ^aY_i and using ^aY_{i-1}. Similar to the above, we define a mode-based rank for every state q ∈ ^aZ^∞; we track the currently chased guarantee a ∈ [1; n] (similar to [6]) and the currently avoided assumption set b ∈ [1; m] as an additional internal mode. This rank is defined in (16), in analogy to (10). Again, we order ranks lexicographically, and, in analogy to (11), the rank can be expressed via the sets ^aY_i, ^{ab}W_i^j, ^aD, ^aE_i and ^{ab}R_i^j, which are interpreted in direct analogy to Section 3, where a and b annotate the used line and conjunct in (15). Figure 4 (right) shows a schematic representation of the ranking for an example with ^ak = 3, ^{a1}l_1 = 0, ^{a2}l_1 = 4, ^{a3}l_1 = 2, ^{a·}l_2 = 2, ^{a1}l_3 = 3, ^{a2}l_3 = 0, and ^{a3}l_3 = 2. Again, the set ^aD ⊆ ^aF_G is represented by the diamond at the top of the figure. Similarly, all ellipses represent sets ^aE_i added in the i-th iteration over line a of (15). Again, progress out of ellipses can be enforced by the system, indicated by the solid arrows leaving those shapes. However, this might not preserve the current b-mode; it might be the environment choosing which assumption to avoid next. Further, the environment might choose to change the b-mode along with decreasing the i-rank, as indicated by the colored dashed lines. This is further explained when discussing the example in Figure 6. Finally, the interpretation of the sets represented by rectangular shapes in Figure 4 (right), corresponding to (17c), is in direct analogy to the case with singleton winning conditions. It should be noted that this is the only place where we preserve the current b-mode when constructing a strategy.
Using this intuition, we define a system strategy that uses enforceable and existing transitions to decrease the rank if possible and preserves the current a-mode until the diamond shape is reached; the b-mode is only preserved within rectangular sets. This is formalized by a strategy f_1 with memory (a, b), and we say that a play π over H is compliant with f_1 if it can be generated by playing f_1. With this, it is easy to see that the intuition behind Theorem 1 directly carries over to every line of (15). Additionally, using Pre_1(^{a⁺}Z) in ^aD allows cycling through all the lines of (15), which ensures that the constructed system strategy attempts to attain every set ^aF_G ∈ F_G in a pre-defined order. This is formalized in Appendix B and summarized in Theorem 3 below.
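The memory update for the two modes can be sketched as follows. This is a minimal illustration: the predicates reached_goal and in_rectangle, and the environment-driven proposal b_env, are hypothetical stand-ins for the actual rank-based tests.

```python
def next_modes(a, b, n, reached_goal, in_rectangle, b_env):
    """Update the (a, b) strategy memory after one step (illustrative sketch).

    a: currently chased guarantee in [1; n]
    b: currently avoided assumption set
    """
    if reached_goal:
        a = (a % n) + 1      # a+ : chase the next guarantee in cyclic order
    if not in_rectangle:
        b = b_env            # b-mode is only preserved within rectangular sets
    return a, b
```

The cyclic update of a matches the definition of a⁺ = (a mod n) + 1, so every guarantee set is chased infinitely often along any infinite compliant play.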
To prove completeness, it is shown in Appendix B.2 that the negation of (15) can be over-approximated by negating every line separately. Therefore, the reasoning for every line of the negated fixed-point carries over from Section 3, resulting in the analogous completeness result. With this, we obtain soundness and completeness in direct analogy to Theorems 1 and 2, formalized in Theorem 3.

Example
We will explain the evaluation of the vectorized fixed-point in (15) using the game graph H_2 in Figure 6. In this example, the fixed-point terminates after one iteration over every line of (15) with ^1Z = ^2Z = Q. Therefore, the ranking induced by the first iteration over Z is also the final one. We discuss its construction for both lines separately.

a = 1: First consider a = 1 and b = 1. In this case, the first iteration over ^{11}W starts with ^1F_G = {q3} and successively adds all states except q4, as q4 ∈ ^1F_A. Applying the smallest fixed-point over ^{11}X to this set has no effect, and we obtain the sets ^{11}R_1^j, ending with ^{11}R_1^5 = {..., q9} and ^{11}R_1^6 = {q6}, as indicated by the upper four-digit number on the top-right of each state. Now we consider a = 1 and b = 2. Again, the first iteration over ^{12}W starts with ^1F_G = {q3} and successively adds all states except q7 and q6 (as q7 ∈ ^2F_A and q6 is its predecessor), beginning with q2. This results in ^{12}X = Q \ {q6, q7} for re-iterating ^{12}W, which does not allow adding q2 to ^{12}W, as not all successors of q2 (in particular q6) are contained in ^{12}X \ ^2F_A. The re-calculation of the fixed-point therefore terminates with ^{12}X_1^∞ = ^{12}W_1^∞ = {q3}, giving ^{12}R_1^j = ∅ for all j > 1. Now taking the union over the resulting fixed-points ^{11}X_1^∞ = Q \ {q4} and ^{12}X_1^∞ = {q3} gives ^1Y_1 = Q \ {q4}, and it is easy to see that q4 ∈ Pre_1(^1Y_1), giving ^1E_2 = {q4}. As all other states are already contained in ^1Y_1, we have ^{12}R_2^j = ∅ for all j > 1.

a = 2: We first consider a = 2 and b = 1. In this case, the first iteration over ^{21}W starts with ^2F_G = {q8, q10} (giving ^2D = {q8, q10}) and successively adds all states except q3 and q4 (as q4 ∈ ^1F_A and q3 is its predecessor). Similarly to the case where a = 1 and b = 2, this results in the removal of q2 from ^{21}W_1^∞ when re-iterating the fixed-point with ^{21}X = Q \ {q3, q4}, as its successor q3 is not contained in ^{21}X. However, this does not affect the remaining iterations, and we get ^{21}R_1^2 = {q7, q9}, ^{21}R_1^3 = {q6, q1}, ^{21}R_1^4 = {q0} and ^{21}R_1^5 = {q5}, as indicated by the lower four-digit number on the top-right of each state. Now we consider a = 2 and b = 2. Again, the first iteration over ^{22}W starts with ^2F_G = {q8, q10}, but no further states are added, as their only predecessors q7 and q9 are both in ^2F_A. Hence, ^{22}R_1^j = ∅ for all j > 1. Now taking the union over the resulting fixed-points gives ^2Y_1 = Q \ {q2, q3, q4}, and it is easy to see that q4 ∈ Pre_1(^2Y_1), giving ^2E_2 = {q4}. Now re-computing the fixed-points over ^{21}W and ^{22}W adds q2 and q3 in the first iteration in both cases. Hence ^{2·}R_2^2 = {q2, q3}, as indicated by the lower four-digit number 22·2 on the top-right of both states.
Given this example, we want to highlight that in q2 the environment can decide to switch the b-mode from 2 to 1 by transitioning to q6, which decreases the i-rank from 2 to 1. This is due to the fact that the re-evaluation of ^{22}W "copies" Pre_1(^2Y_1) to ^{22}W_2^1, which contains q6. Further, we see that for a = 2 the system strategy will always decide to move from q1 to q9, as this preserves the current b-mode. In this example, this also allows reaching the target state q10 ∈ ^2F_G faster, which might not necessarily be the case in general. On the other hand, for a = 1 the strategy will always transition from q1 to q2, as otherwise the rank increases. By this, the system must rely on the environment to eventually choose to transition from q2 to q3. While this might not always happen (the environment is allowed to increase the j-rank by transitioning from q2 to q6), we see that whenever the environment plays such that the assumption is satisfied, i.e., ^1F_A = {q4} is visited infinitely often, also ^1F_G = {q3} is visited infinitely often, resulting in a winning play.
A Solution for Problem 1

Given that L(H, F G ) ⊆ L(H, F A ), it follows from Theorem 3 that Problem 1 has a solution iff we can construct f 1 that wins the GR(1) condition in (7a) and is non-conflicting, as in (7b).
Using a similar construction as in Section 3, we can symbolically check

Complexity Analysis
We show that the search for a more elaborate strategy does not affect the worst-case complexity. In Section 6 we show that this is also the case in practice. We state this complexity formally below.

Theorem. The fixed-point in (15) can be computed in time O(m|Q| 2 |F G ||F A |).

Proof. Each line of the fixed-point is iterated O(|Q| 2 ) times [8]. As there are |F G ||F A | lines, the upper bound follows. As we have to compute |F G ||F A | different ranks for each state, it follows that the complexity is O(m|Q| 2 |F G ||F A |).
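The |Q| factors in this bound stem from Kleene iteration: evaluating a single µ (or ν) fixed point over subsets of Q adds (or removes) at least one state per round, so it stabilizes after at most |Q| + 1 evaluations of its body, and nesting two such iterations yields the O(|Q| 2 ) factor per line. A minimal sketch with a toy monotone operator (hypothetical states and edges, not the actual body of (15)):

```python
# Sketch: Kleene iteration for  mu X. F(X)  over subsets of a finite Q.
# F is a toy monotone operator (goal plus one-step predecessors).
Q = set(range(6))
edges = {0: {1}, 1: {2}, 2: {3}, 3: {4}, 4: {5}, 5: {5}}
goal = {5}

def F(X):
    """Monotone: goal plus all states with some successor in X."""
    return goal | {q for q in Q if edges[q] & X}

def lfp(F):
    """Iterate from the empty set; monotonicity bounds the number of
    body evaluations by |Q| + 1, since each round adds >= 1 state."""
    X, steps = set(), 0
    while True:
        Xn = F(X)
        steps += 1
        if Xn == X:
            return X, steps
        X = Xn

X, steps = lfp(F)
```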
We note that, implemented enumeratively, our approach is theoretically worse than the classical approach to GR(1). This follows from the straightforward reduction to the rank computation in the rank-lifting algorithm and the complexity of the new rank relative to the general GR(1) rank. We conjecture that more sophisticated approaches, e.g., a reduction to a parity game combined with other enumerative algorithms, could eliminate this gap.

Experiments
We have implemented the 4-nested fixed-point algorithm in (15) and the corresponding strategy extraction in (18). It is available as an extension to the GR(1) synthesis tool slugs [13]. In this section we show how this algorithm (called 4FP) performs in comparison to the usual 3-nested fixed-point algorithm for GR(1) synthesis (called 3FP) available in slugs. All experiments were run on a computer with an Intel i5 processor at 2 GHz running x86 Linux with 8 GB of memory.
We first ran both algorithms on a benchmark set obtained from the maze example in the introduction by changing the number of rows and columns of the maze. We first increased the number of rows in the maze and added a goal state for both the obstacle and the robot per row. This results in a maze where system and environment goals alternate in the first and last column and all adjacent cells are separated by a horizontal wall. Hence, both players need to cross the one-cell-wide white space in the middle infinitely often to visit all their goal states infinitely often. The computation times and the number of states in the resulting strategies are shown in Table 1, upper part, columns 3-6. Interestingly, we see that the 3FP always returns a strategy that blocks the environment. In contrast, the non-conflicting strategies computed by the 4FP are larger (in state count) and computed about 10 times more slowly (compare columns 3-4 and 5-6). When increasing the number of columns instead (lower part of Table 1), the number of goals is unaffected. We made the maze wider and left only a one-cell-wide passage in the middle of the maze to allow crossings between its upper and lower row. Still, the 3FP only returns strategies that falsify the assumption; these have fewer states and are computed much faster than the environment-respecting strategies returned by the 4FP. However, the speed of computing a strategy, or its size, is immaterial if the strategy so computed wins only by falsifying assumptions.
To rule out the discrepancy between the two algorithms w.r.t. the size of strategies, we slightly modified the above maze benchmark s.t. the environment assumptions are no longer falsifiable. We increased the capabilities of the obstacle by allowing it to move at most 2 steps in each round and to "jump over" the robot. Under these assumptions we repeated the above experiments. The computation times and the number of states in the resulting strategies are shown in Table 1, columns 9-12. We see that in this case the sizes of the strategies computed by the two algorithms are more similar. The larger number for the 4FP is due to the fact that we have to track both the a- and the b-mode, possibly resulting in multiple copies of the same a-mode state. We see that the state difference decreases with the number of goals (upper part of Table 1, columns 9-12) and increases with the number of (non-goal) states (lower part of Table 1, columns 9-12). In both cases, the 3FP still computes faster, but the difference decreases with the number of goals.
In addition to the 3FP and the 4FP we have also tested a sound but incomplete heuristic, which avoids the disjunction over all b's in every line of (15) by only investigating a = b. The state counts and computation times for this heuristic are shown in Table 1, columns 7-8 for the original maze benchmark, and in columns 13-14 for the modified one. We see that in both cases the heuristic only returns a winning strategy if the maze is no wider than 3 cells. This is due to the fact that in all other cases the robot cannot prevent the obstacle from attaining a particular assumption state until the robot has moved from one goal to the next. The 4FP handles this problem by changing between avoided assumptions in between visits to different goals. Intuitively, the computation times and state counts for the heuristic should be smaller than for the 4FP, as the exploration of the disjunction over b's is avoided, which is true for many scenarios of the considered benchmark. It should, however, be noted that this is not always the case (compare, e.g., line 3, columns 6 and 8). This stems from the fact that restricting the synthesis to avoiding one particular assumption might require more iterations over W and Y within the fixed-point computation.
In addition to the maze benchmark, we have also run our algorithm on the three safety benchmarks that are included in the slugs distribution. None of these benchmarks has liveness assumptions for either the system or the environment player. For all realizable specifications, both the 3FP and the 4FP return the same strategy (as there is only one maximally permissive strategy in a safety game) and need almost the same time to compute it.

Discussion
We believe the requirement that a winning strategy be non-conflicting is a simple way to disallow strategies that win by actively preventing the environment from satisfying its assumptions, without significantly changing the theoretical formulation of reactive synthesis (e.g., by adding different winning conditions or new notions of equilibria). It is not a trace property, but our main results show that adding this requirement retains the algorithmic niceties of GR(1) synthesis: in particular, symbolic algorithms have the same asymptotic complexity.
However, non-conflictingness makes the implicit assumption of a "maximally flexible" environment: it is possible that because of unmodeled aspects of the environment strategy, it is not possible for the environment to satisfy its specifications in the precise way allowed by a non-conflicting strategy. In the maze example discussed in Section 1, the environment needs to move the obstacle to precisely the goal cell which is currently rendered reachable by the system. If the underlying dynamics of the obstacle require it to go back to the lower left from state q 3 before proceeding to the upper right (e.g., due to a required battery recharge), the synthesized robot strategy prevents the obstacle from doing so.
Finally, if there is no non-conflicting winning strategy, one could look for a "minimally violating" strategy. We leave this for future work. Additionally, we leave for future work the consideration of non-conflictingness for general LTL specifications or (efficient) fragments thereof.

A.1 Soundness
As mentioned, we compute W i j as part of Y i and based on Y i−1 and W i j−1 ; the resulting set is given in (19). Suppose that f 1 is the system strategy in (12) based on the ranking in (10). We first show that the property in (11) holds.
Lemma 1. Given the premises of Theorem 1, properties (20)-(22) hold.

Proof. We show all claims separately.

Show (20): To see that (20a) ⇔ (20c) holds, recall that Z ∞ denotes the fixed-point set. We can show that Z ∞ is closed under Pre 1 (·), which immediately implies that (F G ∩ Pre 1 (Z ∞ )) = F G ∩ Z ∞ . Using (19) it can be easily observed that for i = j = 1 we have W 1 1 = (F G ∩ Pre 1 (Z ∞ )) = D. As Y 0 = W 0 0 = ∅ this implies that every state q ∈ D has rank(q) = (1, 1) and vice versa. By the definition of the rank in (10), this in turn means that rank(q ′ ) > (1, 1) implies q ′ ∉ D.

Show (21): To see that (21b)⇒(21a) holds, we pick q s.t. rank(q) = (i, 1) and i > 1. With j = 1 we know that W i 0 = ∅ and hence Θ i 0 = ∅. It furthermore follows from (20) and i > 1 that q ∉ D. As (10) further implies q ∈ W i 1 , we conclude from (19) that q ∈ Pre 1 (Y i−1 ). It follows again from (10) that q ∉ Y i−1 . To see that the other direction also holds, pick q ∈ Pre 1 (Y i−1 ) \ Y i−1 = E i and observe that E i ≠ ∅ only if i > 1, as Y 0 = ∅. This implies q ∈ W i 1 (from (19)) and hence q ∈ Y i by construction. Now observe that (10) determines the j-rank based on W i j \ W i j−1 . As we know that W i 1 contains Pre 1 (Y i−1 ) from before, we conclude j = 1.
We now show (21a)⇒(21c). By the nature of the fixed-point, this follows from (19) together with (20), which proves the statement. To see that (21a)⇐(21c) also holds, fix q ∈ (F A \ F G ) ∩ Z ∞ s.t. rank(q) = (i, j). As q ∉ F G , it follows from (20) that i > 1 and q ∉ D. With q ∈ F A ∩ Z ∞ we see that q ∉ Θ i j either. With this, it follows from (19) that q ∈ Pre 1 (Y i−1 ) and hence j = 1.

Show (22): First observe that for any q s.t. rank(q) = (i, j) and j > 1 we know that q ∈ W i j \ W i j−1 and q ∈ Y i \ Y i−1 . As (20) and (21) hold, we furthermore know that q ∉ D and q ∉ E i . With this it follows from (19) that q ∈ Θ i j . This immediately proves (22b)⇒(22a). For the other direction, we see that q ∈ Θ i j implies q ∈ W i j from (19). As q ∉ W i j−1 and q ∉ Y i−1 , we know that rank(q) = (i, j). As q ∉ D and q ∉ E i , it immediately follows from (20) and (21) that j > 1.
Based on this insight, we first show that any play over H started in a state q ∈ Z ∞ that complies with the system strategy f 1 and the environment transition rules stays in Z ∞ .

Lemma 2. Given the premises of Theorem 1, it holds for all q ∈ Z ∞ that δ(q) ∈ Z ∞ .
Proof. Suppose q ∈ Q 0 ∩ Z ∞ . Then rank(q) is defined and one of the cases (a')-(c') holds. As δ 0 (q) ⊆ Z ∞ holds in all cases, the claim follows. Now suppose q ∈ Q 1 ∩ Z ∞ . Then rank(q) is defined and one of the cases (a)-(c) holds. If (a) holds, q ′ = f 1 (q) implies q ′ ∈ Z ∞ from the second line of (12). If (b) or (c) holds, q ′ = f 1 (q) implies q ′ ∈ Z ∞ from the first line of (12).
Next we show that every play π on H consistent with f 1 and starting in q ∈ Z ∞ satisfies the GR(1) winning condition.

Lemma 3. Given the premises of Theorem 1, it holds for all q ∈ Z ∞ that L q (H, f 1 ) ⊆ L q (H, F A ) ∪ L q (H, F G ).
Proof. Let π ∈ L q (H, f 1 ), i.e., π(0) = q ∈ Z ∞ . Then it follows from Lemma 2 that π(k) ∈ Z ∞ for all k ∈ N, i.e., one of the cases (a)-(c') holds for every k. If π visits every F A infinitely often, then case (b) or (b') occurs infinitely often. It follows that the first component decreases infinitely often. The only option that allows the first component to increase is by going through case (a) or (a'). Hence, π visits F G infinitely often.
Next we show that there always exists a play π on H that complies with f 1 , starts in a state q ∈ Z ∞ and visits every F G infinitely often.

Lemma 4. Given the premises of Theorem 1, it holds for all q ∈ Z ∞ that L q (H, f 1 ) ∩ L q (H, F G ) ≠ ∅.
Proof. We will construct an infinite computation π in L q (H, f 1 ) ∩ L q (H, F G ) with π(0) = q ∈ Z ∞ . We construct π by induction such that for every k we have π(k) ∈ Z ∞ ; as π will be consistent with f 1 , this follows from Lemma 2. Suppose π has been constructed up to a state q ′ ∈ Z ∞ . Then one of the following cases applies. 1. rank(q ′ ) = (1, 1) -then q ′ ∈ D ⊆ F G and we extend π by choosing a successor q ′′ of q ′ compatible with f 1 such that q ′′ ∈ Z ∞ .
2. rank(q ′ ) = (i, 1) for i > 1 -then q ′ ∈ Pre 1 (Y i−1 ). We extend π by choosing a successor q ′′ of q ′ compatible with f 1 such that q ′′ ∈ Y i−1 . That is, the first component in the rank of q ′′ is smaller than i.
3. rank(q ′ ) = (i, j) with j > 1 -then q ′ ∈ Θ i j . By definition of CondPre we have q ′ ∈ Pre ∃ (W i j−1 ). We extend π by choosing a successor q ′′ of q ′ compatible with f 1 such that q ′′ ∈ W i j−1 . That is, rank(q ′′ ) < rank(q ′ ). We note that if q ′ ∈ Q 1 then the only option compatible with f 1 is q ′′ . However, if q ′ ∈ Q 0 then q ′′ is compatible with f 1 but not enforceable by player 1.
We show that π ∈ L q (H, F G ). In option 1 above, F G is visited and the rank is possibly increased. In options 2 and 3 above, the rank of π decreases. As π is infinite, it follows that infinitely many times option 1 is taken, implying that every F G is visited infinitely often, hence π ∈ L q (H, F G ).
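The play construction above also indicates how a concrete strategy is read off a computed ranking: at every system state, move to a successor of minimal rank, which either decreases the rank or visits a goal. The following hedged sketch illustrates this rank-decreasing choice on a hypothetical graph and ranking (made-up names, not the actual extraction defined in (12)):

```python
# Hedged sketch: extracting a rank-decreasing strategy from a ranking.
# Graph and ranks are hypothetical; ranks are (i, j) pairs as in the text,
# with (1, 1) marking a goal state.
edges = {"q0": ["q1"], "q1": ["q2", "q3"], "q2": ["q3"], "q3": ["q0"]}
rank = {"q3": (1, 1), "q2": (1, 2), "q1": (1, 2), "q0": (1, 3)}

def strategy(q):
    """Pick a successor of minimal rank (lexicographic order on (i, j))."""
    return min(edges[q], key=lambda s: rank[s])

# Following the strategy decreases the rank until the goal q3 is reached.
play = ["q0"]
while play[-1] != "q3":
    play.append(strategy(play[-1]))
```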
As an immediate consequence of Lemma 2 and Lemma 4 we can now show that pfx(L q (H, f 1 )) is contained in pfx(L q (H, f 1 ) ∩ L q (H, F A )). Interestingly, this is only true if L(H, F G ) ⊆ L(H, F A ).
Lemma 5. Given the premises of Theorem 1, let q ∈ Z ∞ and L q (H, F G ) ⊆ L q (H, F A ). Then pfx(L q (H, f 1 )) = pfx(L q (H, f 1 ) ∩ L q (H, F A )).

Proof. Observe that "⊇" above always holds. We therefore only prove the other direction. Pick π ∈ pfx(L q (H, f 1 )). Let q ′ be the last state of π. As q ∈ Z ∞ , it follows from Lemma 2 that q ′ ∈ Z ∞ . Then we can use Lemma 4 to pick β s.t. πβ ∈ L q (H, f 1 ) ∩ L q (H, F G ). As L q (H, F G ) ⊆ L q (H, F A ), we therefore have πβ ∈ L q (H, F A ) and hence πβ ∈ L q (H, f 1 ) ∩ L q (H, F A ). With this we immediately have π ∈ pfx(L q (H, f 1 ) ∩ L q (H, F A )).
Proof of Theorem 1. Combining the above properties of f 1 , we see that (13a) follows from Lemma 3, (13b) follows from Lemma 4, and (13c) follows from Lemma 5.

A.2 Completeness
We start by negating (9). We then use the induced ranking of this negated fixed-point to show that the environment can (i) render the negated winning set invariant, and (ii) can force the play to violate the guarantees. Based on this, we show that whenever (7a) holds for an arbitrary system strategy f 1 starting in [[ϕ 4 ]], then (7b) cannot hold.
Negating the fixed-point in (9). We use the negation rule of the µ-calculus, i.e., ¬(µX . F (X)) = νX . ¬F (¬X), to negate (9). This results in the fixed-point µZ.νY .µX.νW . (F G ∪ Pre 0 (Z)) ∩ Pre 0 (Y ) ∩ (F A ∪ CondPre(W , X ∪ F A )). (23) By using de Morgan's laws on the right-hand side of (23) we obtain four disjuncts, reorganized in (24). From the structure of the fixed-points, we know that Z ⊆ X ⊆ W ⊆ Y . As Pre 0 is monotonic, we have Pre 0 (Z) ⊆ Pre 0 (Y ). It follows that L 2 above simplifies to Pre 0 (Z) ∩ F A , that L 4 simplifies analogously, and that (23) therefore simplifies to (25).

The induced ranking of Z ∞ . Let Z 0 = ∅ and let Z i for i ≥ 1 denote the set obtained in the ith iteration over Z. For i ≥ 1 we denote by Y i = Z i the value of the fixpoint on Y that computes the i-th iteration of Z. Furthermore, let X i 0 = ∅ and denote by X i j for j ≥ 1 the set obtained in the j-th iteration over X performed while computing Y i (i.e., using Y i for Y and Z i−1 for Z). Then it follows from the properties of the fixed-point that after the ith iteration over Z has terminated, we obtain the set characterized in (26), and we define the ranking for every state q ∈ Z ∞ accordingly.
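For completeness, the negation rule used above, together with the one-step duality of the predecessor operators it relies on, can be written out as follows (standard µ-calculus identities; the duality is stated under the assumption that Pre 0 is the environment's controllable predecessor dual to Pre 1 , as in Section 2):

```latex
\neg\bigl(\mu X.\, F(X)\bigr) = \nu X.\, \neg F(\neg X), \qquad
\neg\bigl(\nu X.\, F(X)\bigr) = \mu X.\, \neg F(\neg X), \qquad
\neg \mathrm{Pre}_1(S) = \mathrm{Pre}_0(\neg S).
```

That is, exactly when the system cannot force the play into S in one step, the environment can force it into the complement of S; pushing the outer negation through all four fixed-point operators and applying this duality term by term yields (23).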
After termination of the inner fixed-point over W , giving W i j = X i j , we obtain the set in (27). Before interpreting this set, we look at the last term of (27) separately. Using the definitions of Pre ∀ , Pre ∃ and CondPre from Section 2, it can be rewritten as in (28). Using (27) and (28) we see that for every system state q ∈ Q 1 ∩ Z ∞ with rank(q) = (i, j) one of the cases (a)-(c) below holds. Similarly, for every environment state q ∈ Z ∞ ∩ Q 0 with rank(q) = (i, j) one of the cases (a')-(c') below holds.

Consequences for a game over H. Consider a system strategy f 1 over H starting in some state q ∈ Z ∞ and an environment playing in accordance with the properties (a)-(c) and (a')-(c'). We denote by R f 1 the subset of Z ∞ that is reachable under f 1 within such a game and construct this region by induction on the distance from q as follows.
By assumption q ∈ Z ∞ . Initially, we set q ∈ R f 1 . Consider, by induction, a state q ′ ∈ R f 1 with rank(q ′ ) = (i, j). Then we have two cases.
(1) If q ′ ∈ Q 1 , then based on (a), (b), and (c) above it follows that either (a) all successors of q ′ have rank at most (i − 1, ·) and δ 1 (q ′ ) ∈ Z ∞ , (b) all successors of q ′ have rank at most (i, ·) and δ 1 (q ′ ) ∈ Z ∞ , or (c) all successors of q ′ have rank at most (i, j) and δ 1 (q ′ ) ∈ Z ∞ . In particular, one of these cases holds for the successor q ′′ of q ′ that is compatible with f 1 . We add q ′′ to R f 1 .
(2) If q ′ ∈ Q 0 , then based on (a ′ ), (b ′ ), and (c ′ ) above it follows that either (a ′ ) there is a successor q ′′ of q ′ that has rank at most (i − 1, ·) and q ′′ ∈ Z ∞ , (b ′ ) there is a successor q ′′ of q ′ that has rank at most (i, ·) and q ′′ ∈ Z ∞ , (c ′ 2) there is a successor q ′′ of q ′ such that rank(q ′′ ) ≤ (i, j − 1), or (c ′ 3) there is a successor q ′′ of q ′ such that rank(q ′′ ) ≤ (i, j) and q ′′ ∈ F A . In all these cases, we add this identified successor q ′′ to R f 1 . As f 1 is a strategy for the system, the state q ′′ is compatible with f 1 . The remaining case is (c ′ 1), when all successors q ′′ of q ′ satisfy q ′′ ∈ Z ∞ and rank(q ′′ ) ≤ (i, j). In that case we add all successors q ′′ of q ′ to R f 1 ; as f 1 is a strategy for the system, all these successors are compatible with f 1 .

We denote by L q (H, f 1 , R f 1 ) the restriction of L q (H, f 1 ) to computations that remain within R f 1 . It is easy to see that the following lemma holds by construction; it is therefore stated without proof.

Lemma 6. Given the premises of Theorem 2, it holds that L q (H, f 1 ) = L q (H, f 1 , R f 1 ).

Hence, the environment can render Z ∞ invariant. Additionally, it can ensure that F G is only visited finitely often, as formalized in the following lemma.
Lemma 7. Given the premises of Theorem 2, it holds for all q ∈ Z ∞ and for every system strategy f 1 that L q (H, f 1 ) ∩ L(H, F G ) = ∅.

Proof. Pick π ∈ L q (H, f 1 ). By construction, for all k ∈ N one of the cases (a)-(c') holds. As F G can only be visited by going through cases (a) and (a'), every visit of π to F G causes the first component of the rank to decrease. As no case causes an increase of the first component of the rank, π ultimately gets trapped in states with some fixed i-rank and cannot visit F G any more. Hence, F G is not visited infinitely often and therefore π ∉ L(H, F G ).
Using Lemma 6 and Lemma 7 we can now show the essence of Theorem 2, i.e., that whenever (14a) holds for an arbitrary system strategy f 1 starting in [[ϕ 4 ]], then (14b) cannot hold. This is formalized in the following lemma.

Lemma 8. Given the premises of Theorem 2, for every system strategy f 1 starting in q ∈ [[ϕ 4 ]] for which (14a) holds, (14b) does not hold.

Proof. First observe that the left part of (14a) implies pfx(L q (H, f 1 )) ≠ ∅. The claim is therefore proven by showing that pfx(L q (H, f 1 ) ∩ L q (H, F A )) = ∅.
Consider the unwinding of the region R f 1 to an infinite tree T . Label every node in the tree according to the case (a) − (c) or (a ′ ) − (c ′ 3) that applies to it according to the construction of R f 1 . By Lemma 7 there are finitely many occurrences of cases (a) and (a ′ ). Assume by contradiction that cases (b), (b ′ ) or (c ′ 3) appear infinitely often in T . From König's lemma it follows that there is a path π in T along which these cases occur infinitely often. However, whenever (b), (b ′ ), or (c ′ 3) occurs, π visits F A . It follows that π visits infinitely many states in F A and only finitely many states in F G (from Lemma 7). This contradicts the assumption that f 1 satisfies (14a). It follows that cases (b), (b ′ ) and (c ′ 3) occur finitely often in T . Now consider a location in T under which there are no appearances of cases (b), (b ′ ) or (c ′ 3) and restrict attention to the sub-tree T ′ of T under this location. Suppose that case (c ′ 2) occurs infinitely often in T ′ . As (c ′ 2) leads to a decrease in the second component of the rank, and cases (c) and (c ′ 1) do not allow the rank to increase, this is impossible; it follows that there are finitely many occurrences of (c ′ 2) in T ′ .
This reasoning implies that along every branch of T (enumerated by k ∈ N) there exists a finite prefix s k ∈ pfx(L q (H, f 1 )) leading to a state q k at which a sub-tree T ′′ k is rooted in which only cases (c) and (c ′ 1) occur. By construction of R f 1 all sub-trees T ′′ k are closed under environment moves. This implies that pfx(L q (H, f 1 )) = ⋃ k pfx(s k · L q k (H, f 1 , T ′′ k )). Further, using the same reasoning as before we know that (14a) implies that T ′′ k only contains finitely many states in F A . This implies that L q k (H, f 1 , T ′′ k ) ∩ L q (H, F A ) = ∅ for all k. As s k also only contains finitely many states in F A (from above), combining the last two observations results in pfx(L q (H, f 1 ) ∩ L q (H, F A )) = ∅.
Proof of Theorem 2 It is easy to see that Theorem 2 directly follows from Lemma 8. If we pick some system strategy f 1 over H we have that either (14a) does not hold, or, if (14a) holds we know from Lemma 8 that (14b) does not hold.

B.1 Soundness
We start by recalling that the last iteration of the fixed-point in (15) computes, for every a, the set a Z ∞ . Now let a Y i be the set obtained after the i-th iteration of a Y in line a of (15), let ab X i denote the fixed-point of the iteration over ab X resulting in a Y i , and denote by ab W i j the set obtained in the jth iteration over ab W performed while computing ab X i in line a of (15). With this notation, we see that the computation of ab W i j as part of a Y i and based on a Y i−1 , ab X i , and ab W i j−1 results in the set given in (29). Using (29), we first show that (17) holds.
Lemma 9. Given the premises of Theorem 3, properties (30)-(32) hold.

Proof. We show each claim separately.

Show (30): Using (29) it can be easily observed that for i = j = 1 we have ab W 1 1 = ( a F G ∩ Pre 1 (Z ∞ )) = a D. As a Y 0 = ∅ and ab W 1 0 = ∅, this implies that every state q ∈ a D has ab rank(q) = (1, 1) for every b (from (16)). By the definition of the ab rank in (16), this in turn means that ab rank(q ′ ) > (1, 1) implies q ′ ∉ a D. Hence, (30a) ⇔ (30b) holds.
Show (31): First observe that for every q s.t. ab rank(q) = (i, 1) and i > 1 we know that ab W i j−1 = ab W i 0 = ∅ and with this ab Θ i j = ∅. As q ∈ ab W i 1 and ab W i 0 = ∅ we conclude q ∈ Pre 1 ( a Y i−1 ). Now observe from the definition of the ranking that we have q ∉ a Y i−1 . This immediately proves (31b)⇒(31a). To see that the other direction also holds, fix q ∈ Pre 1 ( a Y i−1 ) \ a Y i−1 = a E i and observe that a E i ≠ ∅ only if i > 1, as a Y 0 = ∅. This implies q ∈ ab W i 1 (from (29)) and hence q ∈ a Y i by construction. Now observe that (16) determines the j-rank based on ab W i j \ ab W i j−1 . As we know that ab W i 1 contains Pre 1 ( a Y i−1 ) (from (29)), we conclude j = 1.

Show (32): First observe that for every q s.t. ab rank(q) = (i, j) and j > 1 we know that q ∈ ab W i j \ ab W i j−1 and q ∈ a Y i \ a Y i−1 . As (30) and (31) hold, we furthermore know that q ∉ a D and q ∉ a E i . With this it follows from (29) that q ∈ ab Θ i j . This immediately proves (32b)⇒(32a). For the other direction, we see that q ∈ ab Θ i j implies q ∈ ab W i j from (29). As q ∉ ab W i j−1 and q ∉ a Y i−1 , we know that ab rank(q) = (i, j). As q ∉ a D and q ∉ a E i , it immediately follows from (30) and (31) that j > 1. To see that (32a)⇒(32c), observe that (32a) and (29) imply that q is contained in the last term of (29), from which it is easy to see that q ∉ b F A .
Even though the proven statements are somewhat weaker than those of Lemma 1, they are still sufficient to derive the same case distinction for states within Z ∞ as in the case of singleton winning conditions. In particular, observe that (32c) implies that any state in b F A ∩ a Z ∞ needs to have an ab rank(q) with j = 1. Therefore, the remaining proof of soundness follows the same lines as the one discussed in Section A.1, annotating the used sets with a and b modes. The resulting lemmas and proofs are given in the remainder of this section for the sake of completeness.
Based on this insight, we first show that every play over H started in a state q ∈ Z ∞ that complies with the system strategy f 1 and the environment transition rules stays in Z ∞ .

Lemma 10. Given the premises of Theorem 3, it holds for all q ∈ Z ∞ that δ(q) ∈ Z ∞ .
Proof. Suppose q ∈ Q 0 ∩ Z ∞ . Then ab rank(q) is defined and one of the cases (a')-(c') holds. As δ 0 (q) ⊆ a Z ∞ ⊆ Z ∞ holds in all cases, the claim follows. Now suppose q ∈ Q 1 ∩ Z ∞ . Then ab rank(q) is defined and one of the cases (a)-(c) holds. In each case, q ′ = f 1 (q, a, b) implies q ′ ∈ a Z ∞ ⊆ Z ∞ from the second and third line of (18).
Next we show that every play π on H consistent with f 1 and starting in q ∈ Z ∞ satisfies the GR(1) winning condition.
Lemma 11. Given the premises of Theorem 3, it holds for all q ∈ Z ∞ that L q (H, f 1 ) ⊆ L q (H, F A ) ∪ L q (H, F G ).
Proof. Let π ∈ L q (H, f 1 ), i.e., π(0) = q ∈ Z ∞ . Then it follows from Lemma 10 that π(k) ∈ Z ∞ for all k ∈ N, i.e., one of the cases (a)-(c') holds for every k. Now assume that π ∈ L q (H, F A ), i.e., π visits every b F A with b ∈ [1, m] infinitely often. It remains to show that in this case π needs to also pass a F G with a ∈ [1, n] infinitely often.
Consider some state q = π(k) s.t. (c) or (c') holds, i.e., there exists a, b s.t. ab rank(q) = (i, j) with j > 1 and q / ∈ b F A . In order to visit b F A again, the second component of the ab rank has to decrease to j = 1, entering case (a') or (b') (as q / ∈ b F A whenever case (c') holds for q). If we enter case (a'), a F G is visited and the rank gets reset. Then we can re-apply the same reasoning for a + and some b ′ . On the other hand, if we enter case (b'), the first component of the a-rank gets reduced and b possibly changes to some b ′′ . Re-applying the same reasoning as before shows that case (b') always eventually needs to occur in π, always reducing the first component of the rank for every b. The only option that allows the first component of the rank to increase is by going through case (a) or (a'). As π is infinite, while the ranking is finite, this implies that we eventually need to go through case (a) or (a') for a, passing a F G . With this, we reach a state π(k ′ ) s.t. a + b rank(π(k ′ )) is defined for some b. Then we can apply the same reasoning to show that we will eventually pass a + F G . Hence, π visits a F G for every a ∈ [1, n] infinitely often.
Next we show that there always exists a play π on H that complies with f 1 , starts in a state q ∈ Z ∞ and visits every a F G infinitely often.
Lemma 12. Given the premises of Theorem 3, it holds for all q ∈ Z ∞ that L q (H, f 1 ) ∩ L q (H, F G ) ≠ ∅.
Proof. We will construct an infinite computation π in L q (H, f 1 ) ∩ L q (H, F G ). We construct π by induction such that for every k we have π(k) ∈ Z ∞ ; as π will be consistent with f 1 , this follows from Lemma 10. Let π(0) = q ∈ Z ∞ . Suppose π has been constructed up to a state q ′ ∈ Z ∞ . Then one of the following cases applies. 1. There exists some b s.t. ab rank(q ′ ) = (1, 1) -then q ′ ∈ a D ⊆ a F G and we extend π by choosing a successor q ′′ of q ′ compatible with f 1 such that q ′′ ∈ Z ∞ , continuing in mode a + with an appropriate b ′ .
2. There exists some b s.t. ab rank(q ′ ) = (i, 1) with i > 1 -then q ′ ∈ Pre 1 ( a Y i−1 ) and we extend π by choosing a successor q ′′ of q ′ compatible with f 1 such that q ′′ ∈ a Y i−1 . That is, the first component in the rank of q ′′ is smaller than i. 3. There exists some b s.t. ab rank(q ′ ) = (i, j) with j > 1 -then q ′ ∈ ab Θ i j . By definition of CondPre we have q ′ ∈ Pre ∃ ( ab W i j−1 ). We extend π by choosing a successor q ′′ of q ′ compatible with f 1 such that q ′′ ∈ ab W i j−1 . That is, ab rank(q ′′ ) ≤ (i, j − 1) < ab rank(q ′ ). We note that if q ′ ∈ Q 1 then the only option compatible with f 1 is q ′′ . However, if q ′ ∈ Q 0 then q ′′ is compatible with f 1 but not enforceable by player 1.
We show that π ∈ L q (H, F G ). In option 1 above, a F G is visited, the mode is changed to a + b ′ and both components of the rank are possibly increased. In options 2 and 3 above, the rank of π decreases. As π is infinite, it follows that option 1 is taken infinitely many times, implying that every mode a ∈ [1, n] and every a F G is visited infinitely often; hence π ∈ L q (H, F G ).
As an immediate consequence of Lemma 10 and Lemma 12 we can now show that pfx(L q (H, f 1 )) is contained in pfx(L q (H, f 1 ) ∩ L q (H, F A )). Interestingly, this is only true if L(H, F G ) ⊆ L(H, F A ).
Lemma 13. Given the premises of Theorem 3, let q ∈ Z ∞ and L q (H, F G ) ⊆ L q (H, F A ). Then pfx(L q (H, f 1 )) = pfx(L q (H, f 1 ) ∩ L q (H, F A )).

Proof. Observe that "⊇" above always holds. We therefore only prove the other direction. Pick π ∈ pfx(L q (H, f 1 )). Let q ′ be the last state of π. As q ∈ Z ∞ , it follows from Lemma 10 that q ′ ∈ Z ∞ . Then we can use Lemma 12 to pick β s.t. πβ ∈ L q (H, f 1 ) ∩ L q (H, F G ). As L q (H, F G ) ⊆ L q (H, F A ), we therefore have πβ ∈ L q (H, F A ) and hence πβ ∈ L q (H, f 1 ) ∩ L q (H, F A ). With this we immediately have π ∈ pfx(L q (H, f 1 ) ∩ L q (H, F A )).
Proof of Theorem 3, part 1. Combining the above properties of f 1 , we see that (13a) follows from Lemma 11, (13b) follows from Lemma 12, and (13c) follows from Lemma 13.

B.2 Completeness
We first show that the negation of (15) can be over-approximated by negating every line separately. This implies that the reasoning for every line of the negated fixed-point carries over from Section A.2 by annotating the used sets with a and b modes. The resulting lemmas and proofs are re-stated in this section for the sake of completeness.
Negating the vectorized fixed-point in (15). First observe that negating line a of (15) results in the formula given in (33). One assumption that was made in the simplification of (25) was that Z ⊆ X ⊆ W ⊆ Y . When we consider the vectorized version, the right-hand side of a Z depends on a + Z. Although, ultimately, all the Z variables have the same value (as arises from our proofs), we cannot rely on this in the simplification of the fixpoint. Instead, we use an over-approximation of the fixpoint. Consider the reorganization of (23) appearing in (24). The reasoning that simplifies L 3 relies on X ⊆ W ⊆ Y . It is easy to see that we still have ab X ⊆ ab W ⊆ a Y , but the simplification of L 2 and L 4 to Pre 0 (Z) relies on Z ⊆ Y . However, we note that in both cases Pre 0 (Z) over-approximates L 2 and L 4 . It follows that if we replace L 2 and L 4 in (24) by Pre 0 (Z), we get a formula that characterizes more states. Applying this reasoning to (33) results in ν a Y . ⋂ m b=1 µ ab X . ν ab W . (Pre 0 ( a + Z) ∩ …), (34) which is the mode-annotated version of (25). We denote the vectorized versions of (33) and (34) accordingly. After termination of the inner fixed-point we obtain the set in (36). Before interpreting this set, we look at the last term of (36) separately; using the definitions of Pre ∀ , Pre ∃ and CondPre from Section 2, it can be rewritten analogously to (28). The remaining reasoning then carries over from Section A.2, with the analogues of Lemma 6-Lemma 8 stated as Lemma 14-Lemma 16.

Proof of Theorem 3, part 2. It is easy to see that the claim directly follows from Lemma 16. If we pick some system strategy f 1 over H, either (14a) does not hold or, if (14a) holds, we know from Lemma 16 that (14b) does not hold.