Solving Mean-Payoff Games via Quasi Dominions

We propose a novel algorithm for the solution of mean-payoff games that merges two seemingly unrelated concepts introduced in the context of parity games, namely small progress measures and quasi dominions. We show that the integration of the two notions can be highly beneficial and significantly speeds up convergence to the problem solution. Experiments show that the resulting algorithm performs orders of magnitude better than the asymptotically-best solution algorithm currently known, without sacrificing the worst-case complexity.


Introduction
In this article we consider the problem of solving mean-payoff games, namely infinite-duration perfect-information two-player games played on weighted directed graphs, each of whose vertices is controlled by one of the two players. The game starts at an arbitrary vertex and, during its evolution, each player can take moves at the vertices it controls, by choosing one of the outgoing edges. The moves selected by the two players induce an infinite sequence of vertices, called a play. The payoff of any prefix of a play is the sum of the weights of its edges. A play is winning if it satisfies the game objective, called the mean-payoff objective, which requires that the limit of the mean payoff, taken over the lengths of the prefixes, never falls below a given threshold ν.
Mean-payoff games were first introduced and studied by Ehrenfeucht and Mycielski in [20], who showed that positional strategies suffice to obtain the optimal value. A slightly generalized version was also considered by Gurvich et al. in [24]. Positional determinacy entails that the decision problem for these games lies in NPTIME ∩ CONPTIME [34], and it was later shown to belong to UPTIME ∩ COUPTIME [25], where UPTIME is the class of unambiguous non-deterministic polynomial time. This result gives the problem a rather peculiar complexity status, shared by very few other problems, such as integer factorization [22], [1] and parity games [25]. Despite various attempts [7], [19], [24], [30], [34], no polynomial-time algorithm for the mean-payoff game problem is known so far.
A different formulation of the game objective allows one to define another class of quantitative games, known as energy games. The energy objective requires that, given an initial value c, called credit, the sum of c and the payoff of every prefix of the play never falls below 0. These games, however, are tightly connected to mean-payoff games, as the two types of games have been proved to be log-space equivalent [11]. They are also related to other more complex forms of quantitative games. In particular, unambiguous polynomial-time reductions [25] exist from these games to discounted payoff [34] and simple stochastic games [18].
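The energy objective can be illustrated on a finite prefix of a play. The following is a minimal sketch, not part of the original development: the function name `satisfies_energy` and the sample weights are hypothetical, and only a finite prefix is checked, whereas the objective itself quantifies over all prefixes of an infinite play.

```python
from itertools import accumulate

def satisfies_energy(credit, weights):
    """Check the energy condition on a finite prefix (illustrative helper).

    `weights` lists the weights met along the prefix; the condition holds
    if credit plus the payoff of every non-empty prefix stays >= 0.
    """
    return all(credit + payoff >= 0 for payoff in accumulate(weights))

# Hypothetical sample: the running payoffs are -1, -2, 1, -1.
sample = [-1, -1, 3, -2]
enough = satisfies_energy(2, sample)      # credit 2 covers the dip to -2
too_little = satisfies_energy(1, sample)  # credit 1 does not
```

With the sample above, a credit of 2 suffices while a credit of 1 does not, since the payoff of the second prefix is -2.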
Recently, a fair amount of work in formal verification has been directed to consider, besides correctness properties of computational systems, also quantitative specifications, in order to express performance measures and resource requirements, such as quality of service, bandwidth and power consumption and, more generally, bounded resources. Mean-payoff and energy games also have important practical applications in system verification and synthesis. In [14] the authors show how quantitative aspects, interpreted as penalties and rewards associated to the system choices, allow for expressing optimality requirements encoded as mean-payoff objectives for the automatic synthesis of systems that also satisfy parity objectives. With similar application contexts in mind, [9] and [8] further contribute to that effort, by providing complexity results and practical solutions for the verification and automatic synthesis of reactive systems from quantitative specifications expressed in linear time temporal logic extended with mean-payoff and energy objectives. Further applications to temporal networks have been studied in [16] and [15]. Consequently, efficient algorithms to solve mean-payoff games become essential ingredients to tackle these problems in practice.
Several algorithms have been devised in the past for the solution of the decision problem for mean-payoff games, which asks whether there exists a strategy for one of the players that grants the mean-payoff objective. The very first deterministic algorithm was proposed in [34], where it is shown that the problem can be solved with $O(n^3 \cdot m \cdot W)$ arithmetic operations, with n and m the number of positions and moves, respectively, and W the maximal absolute weight in the game. A strategy improvement approach, based on iteratively adjusting a randomly chosen initial strategy for one player until a winning strategy is obtained, is presented in [31], which has an exponential upper bound. The algorithm by Lifshits and Pavlov [29], which runs in time $O(n \cdot m \cdot 2^n \cdot \log_2 W)$, computes the "potential" of each game position, which corresponds to the initial credit that the player needs in order to win the game from that position. Algorithms based on the solution of linear feasibility problems over the tropical semiring have also been provided in [2]-[4]. The best known deterministic algorithm to date, which requires $O(n \cdot m \cdot W)$ arithmetic operations, was proposed by Brim et al. [13]. They adapt to energy and mean-payoff games the notion of progress measures [28], as applied to parity games in [26]. The approach was further developed in [17] to obtain the same complexity bound for the optimal strategy synthesis problem. A strategy-improvement refinement of this technique has been introduced in [12]. Finally, Bjork et al. [6] proposed a randomized strategy-improvement based algorithm running in time […].

Our contribution is a novel mean-payoff progress measure approach that enriches such measures with the notion of quasi dominions, originally introduced in [5] for parity games. These are sets of positions with the property that, as long as the opponent chooses to play to remain in the set, it loses the game for sure, hence its best choice is always to try to escape. A quasi dominion from which escaping is not possible is a winning set for the other player. Progress measure approaches, such as the one of [13], typically focus on finding the best choices of the opponent and little information is gathered on the other player. In this sense, they are intrinsically asymmetric. Enriching the approach with quasi dominions can be viewed as a way to also encode the best choices of the player, information that can be exploited to speed up convergence significantly. The main difficulty here is that suitable lift operators in the new setting do not enjoy monotonicity. Such a property makes proving completeness of classic progress measure approaches almost straightforward, as monotonic operators do admit a least fixpoint. Instead, the lift operator we propose is only inflationary (specifically, non-decreasing) and, while still admitting fixpoints [10], [33], need not have a least one. Hence, providing a complete solution algorithm proves more challenging. The advantages, however, are significant. On the one hand, the new algorithm still enjoys the same worst-case complexity as the best known algorithm for the problem, proposed in [13]. On the other hand, we show that there exist families of games on which the classic approach requires a number of operations that can be made arbitrarily larger than the one required by the new approach. Experimental results also witness the fact that this phenomenon is by no means isolated, as the new algorithm performs orders of magnitude better than the algorithm developed in [13].

Mean-Payoff Games
A two-player turn-based arena is a tuple $\mathcal{A} = \langle Ps^{\oplus}, Ps^{\boxminus}, Mv \rangle$, with $Ps^{\oplus} \cap Ps^{\boxminus} = \emptyset$ and $Ps \triangleq Ps^{\oplus} \cup Ps^{\boxminus}$, such that $\langle Ps, Mv \rangle$ is a finite directed graph without sinks. $Ps^{\oplus}$ (resp., $Ps^{\boxminus}$) is the set of positions of player ⊕ (resp., ⊟) and $Mv \subseteq Ps \times Ps$ is a left-total relation describing all possible moves. A path in V ⊆ Ps is a finite or infinite sequence π ∈ Pth(V) of positions in V compatible with the move relation, i.e., $(\pi_i, \pi_{i+1}) \in Mv$, for all $i \in [0, |\pi| - 1)$. A positional strategy for player α ∈ {⊕, ⊟} on V ⊆ Ps is a function $\sigma_\alpha \in Str_\alpha(V) \subseteq (V \cap Ps^{\alpha}) \to Ps$, mapping each α-position v in the domain of $\sigma_\alpha$ to a position $\sigma_\alpha(v)$ compatible with the move relation, i.e., $(v, \sigma_\alpha(v)) \in Mv$. With $Str_\alpha(V)$ we denote the set of all α-strategies on V, while $Str_\alpha$ denotes $\bigcup_{V \subseteq Ps} Str_\alpha(V)$. A play in V ⊆ Ps from a position v ∈ V w.r.t. a pair of strategies […]

A mean-payoff game (MPG for short) is a tuple $\Game = \langle \mathcal{A}, Wg, wg \rangle$, where $\mathcal{A}$ is an arena, $Wg \subset \mathbb{Z}$ is a finite set of integer weights, and $wg : Ps \to Wg$ is a weight function assigning a weight to each position. $Ps^{+}$ (resp., $Ps^{-}$) denotes the set of positive-weight positions (resp., non-positive-weight positions). For convenience, we shall refer to non-positive weights as negative weights. Notice that this definition of MPG is equivalent to the classic formulation in which the weights label the moves, instead. The weight function naturally extends to paths, by setting $wg(\pi) \triangleq \sum_{i=0}^{|\pi|-1} wg(\pi_i)$. […] The pair of winning regions $(Wn^{\oplus}, Wn^{\boxminus})$ forms a ν-mean partition. Assuming ν integer, the ν-mean partition problem is equivalent to the 0-mean partition one, as we can subtract ν from the weights of all the positions. As a consequence, the MPG decision problem can be equivalently restated as deciding whether player ⊕ (resp., ⊟) has a strategy to enforce $\liminf_{i \to \infty} \frac{1}{i} \cdot wg(\pi_{\le i}) > 0$ (resp., $\liminf_{i \to \infty} \frac{1}{i} \cdot wg(\pi_{\le i}) \le 0$), for all the resulting plays π, where $\pi_{\le i}$ denotes the prefix of π of length i.
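Since positional strategies induce ultimately periodic plays, the mean payoff of such a play is determined by its cycle alone. The sketch below, which is only an illustration under stated assumptions, computes the limit mean payoff of a play of the form prefix followed by a cycle repeated forever, with weights on positions as in the definition above. The helper names (`path_weight`, `lasso_mean_payoff`, `prefix_mean`) and the three-position weight table are hypothetical.

```python
from fractions import Fraction
from itertools import chain, cycle as repeat_forever, islice

def path_weight(wg, path):
    """wg(pi): sum of the weights of the positions along a finite path."""
    return sum(wg[v] for v in path)

def lasso_mean_payoff(wg, prefix, loop):
    """Limit mean payoff of the play prefix . loop^omega.

    The finite prefix is averaged out in the limit, so only the
    mean weight of the loop matters.
    """
    return Fraction(path_weight(wg, loop), len(loop))

def prefix_mean(wg, prefix, loop, i):
    """Mean payoff of the first i positions of prefix . loop^omega."""
    play = chain(prefix, repeat_forever(loop))
    return sum(wg[v] for v in islice(play, i)) / i

# Hypothetical weights, not taken from the paper's figures.
wg = {"a": 3, "b": -1, "c": 0}
limit = lasso_mean_payoff(wg, ["a"], ["b", "c"])   # loop weight -1 over 2 positions
approx = prefix_mean(wg, ["a"], ["b", "c"], 10001)
```

Here the limit is non-positive, so, under the 0-mean reading of the objective, this particular play would be won by player ⊟ despite the positive weight of the prefix.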

Solving Mean-Payoff Games via Progress Measures
The abstract notion of progress measure [28] has been introduced as a way to encode global properties on paths of a graph by means of simpler local properties of adjacent vertices. In the context of MPGs, the graph property of interest, called the mean-payoff property, requires that the mean payoff of every infinite path in the graph be non-positive. More precisely, in game-theoretic terms, a mean-payoff progress measure witnesses the existence of a strategy $\sigma_{\boxminus}$ for player ⊟ such that each path in the graph induced by fixing that strategy on the arena satisfies the desired property. A mean-payoff progress measure associates with each vertex of the underlying graph a value, called measure, taken from the set of extended natural numbers $\mathbb{N}_\infty \triangleq \mathbb{N} \cup \{\infty\}$, endowed with an ordering relation ≤ and an addition operation +, which extend the standard ordering and addition over the naturals in the usual way. Measures are associated with positions in the game and the measure of a position v can intuitively be interpreted as an estimate of the payoff that player ⊕ can enforce on the plays starting in v. In this sense, they measure "how far" v is from satisfying the mean-payoff property, with the maximal measure ∞ denoting failure of the property for v. More precisely, the ⊟-strategy induced by a progress measure ensures that measures do not increase along the paths of the induced graph. This, in turn, ensures that every path eventually gets trapped in a non-positive-weight cycle, thereby witnessing a win for player ⊟.
To obtain a progress measure, one starts from some suitable association of the positions of the game with measures. The local information encoded by these measures is then propagated back along the edges of the underlying graph so as to associate with each position the information gathered along plays of some finite length starting from that position. The propagation process is performed according to the following intuition. The measures of positions adjacent to v are propagated back to v only if those measures push v further away from the property. This propagation is achieved by means of a measure stretch operation +, which adds, when appropriate, the weight of an adjacent position to the measure of a given position. This is established by comparing the measure of v with those of its adjacent positions, since, for each position v, the mean-payoff property is defined in terms of the sum of the weights encountered along the plays from that position. The process ends when no position can be pushed further away from the property and each position is not dominated by any, respectively one, of its adjacents, depending on whether that position belongs to player ⊕ or to player ⊟, respectively. The positions that did not reach measure ∞ are those from which player ⊟ can win the game and the set of measures currently associated with such positions forms a mean-payoff progress measure for the game.
To make the above intuitions precise, we introduce the notions of measure function and progress measure, and an algorithm for computing progress measures correctly. It is worth noticing that the progress-measure based approach as described in [13], called SEPM from now on, can be easily recast equivalently in the form below. A measure function $\mu : Ps \to \mathbb{N}_\infty$ maps each position v in the game to a suitable measure µ(v). The order ≤ of the measures naturally induces a pointwise partial order ⊑ on the measure functions defined in the usual way, namely, for any two measure functions $\mu_1$ and $\mu_2$, we write $\mu_1 \sqsubseteq \mu_2$ if $\mu_1(v) \le \mu_2(v)$, for all positions v. The set of measure functions over a measure space, together with the induced ordering ⊑, forms a measure-function space.
Definition 1 (Measure-Function Space). The measure-function space is the partial order $\mathcal{F} \triangleq \langle MF, \sqsubseteq \rangle$ whose components are defined as reported in the following: […] $\mu^{-1}(\infty)$ (resp., $\overline{\mu^{-1}(\infty)}$) of all positions having maximal (resp., non-maximal) measure associated within µ.
Assuming that a given position v has an adjacent with measure η, a measure update of η w.r.t. v is obtained by the stretch operator $+ : \mathbb{N}_\infty \times Ps \to \mathbb{N}_\infty$, defined as $\eta + v \triangleq \max\{0, \eta + wg(v)\}$, which corresponds to the payoff estimate that the given position will obtain by choosing to follow the move leading to the prescribed adjacent.
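The stretch operator is simple enough to be written down directly. The following sketch is illustrative only: the function name `stretch`, the use of `math.inf` to stand for the maximal measure ∞, and the sample weights are all assumptions introduced here.

```python
import math

INF = math.inf  # stands for the maximal measure ∞ (assumed encoding)

def stretch(eta, v, wg):
    """eta + v = max{0, eta + wg(v)}.

    The result never drops below 0, and ∞ is absorbing because
    math.inf plus any finite weight is still math.inf.
    """
    return max(0, eta + wg[v])

# Hypothetical weights for two positions.
wg = {"c": -5, "d": 1}
clipped = stretch(3, "c", wg)      # 3 - 5 is negative, so the result is 0
raised = stretch(2, "d", wg)       # 2 + 1
saturated = stretch(INF, "c", wg)  # ∞ stays ∞
```

Clipping at 0 reflects the fact that measures only record positive payoff estimates; a negative sum carries no information in favor of player ⊕.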
A mean-payoff progress measure is such that the measure associated with each game position v need not be increased further in order to beat the actual payoff of the plays starting from v. In particular, it can be defined by taking into account the opposite attitude of the two players in the game. While player ⊕ tries to push toward higher measures, player ⊟ will try to keep the measures as low as possible. A measure function in which the payoff of each ⊕-position (resp., ⊟-position) v is not dominated by the payoff of all (resp., some of) its adjacents augmented with the weight of v itself meets the requirements.
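Definition 2 (Progress Measure). A measure function µ ∈ MF is a progress measure if the following two conditions hold true, for all positions v ∈ Ps: (1) $\mu(u) + v \le \mu(v)$, for all adjacents u of v, if $v \in Ps^{\oplus}$; (2) $\mu(u) + v \le \mu(v)$, for some adjacent u of v, if $v \in Ps^{\boxminus}$.

The following theorem states the fundamental property of progress measures, namely, that every position associated with a non-maximal value is won by player ⊟.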
In order to obtain a progress measure from a given measure function, one can iteratively adjust the current measure values in such a way as to force the progress condition above among adjacent positions. To this end, we define the lift operator lift : MF → MF as follows: $lift(\mu)(v) \triangleq \max\{\mu(u) + v : u \in Mv(v)\}$, if $v \in Ps^{\oplus}$, and $lift(\mu)(v) \triangleq \min\{\mu(u) + v : u \in Mv(v)\}$, otherwise, where Mv(v) denotes the set of adjacents of v. Note that the lift operator is clearly monotone and, therefore, admits a least fixpoint. A mean-payoff progress measure can, then, be obtained by repeatedly applying this operator until a fixpoint is reached, starting from the minimal measure function $\mu_0 \triangleq \{v \in Ps \mapsto 0\}$ that assigns measure 0 to all the positions in the game. The following solver operator applied to $\mu_0$ computes the desired solution: […]

Observe that the measures generated by the procedure outlined above have a fairly natural interpretation. Each positive measure, indeed, under-approximates the weight that player ⊕ can enforce along finite prefixes of the plays from the corresponding positions. This follows from the fact that, while player ⊕ maximizes its measures along the outgoing moves, player ⊟ minimizes them. In this sense, each positive measure witnesses the existence of a positively-weighted finite prefix of a play that player ⊕ can enforce. Let $S \triangleq \sum \{wg(v) \in \mathbb{N} : v \in Ps \wedge wg(v) > 0\}$ be the sum of all the positive weights in the game. Clearly, the maximal payoff of a simple play in the underlying graph cannot exceed S.
Therefore, a measure greater than S witnesses the existence of a cycle whose payoff diverges to infinity and is won, thus, by player ⊕. Hence, any measure strictly greater than S can be substituted with the value ∞. This observation establishes the termination of the algorithm and is instrumental to its completeness proof. Indeed, at the fixpoint, the measures actually coincide with the highest payoff player ⊕ is able to guarantee. Soundness and completeness of the above procedure have been established in [13], where the authors also show that, despite the algorithm requiring $O(n \cdot S) = O(n^2 \cdot W)$ lift operations in the worst case, with n the number of positions and W the maximal positive weight in the game, the overall cost of these lift operations is […].

Let us consider the simple example game depicted in Figure 1, where the shape of each position indicates the owner, circles for player ⊕ and squares for its opponent ⊟, and, in each label of the form ℓ/w, the letter w corresponds to the associated weight, where we assume k > 1. Starting from the smallest measure function $\mu_0 = \{a, b, c, d \mapsto 0\}$, the first application of the lift operator returns $\mu_1 = \{a \mapsto k;\ b, c \mapsto 0;\ d \mapsto 1\} = lift(\mu_0)$. After that step, the following iterations of the fixpoint alternately update positions c and d, since the other ones already satisfy the progress condition. Being $c \in Ps^{\boxminus}$, the lift operator chooses for it the measure computed along the move (c, d), thus obtaining $\mu_2(c) = \mu_1(d) + c = 1$. A progress measure is obtained after exactly 2k + 1 iterations, when the measure of c reaches value k and d value k + 1.
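The iteration count claimed for the example can be checked with a small experiment. The sketch below is not the implementation of [13]; it is a naive fixpoint iteration of the lift operator that recomputes all positions simultaneously at each step. Since Figure 1 is not reproduced here, the arena is a reconstruction from the surrounding text and should be read as an assumption: a, b, c belong to ⊟ and d to ⊕, the moves are a→a, a→b, a→d, b→b, c→a, c→d, d→c, and the weights are a/k, b/0, c/0, d/1. The labels `"max"` and `"min"` for the two players and the function name `sepm` are also hypothetical.

```python
import math

INF = math.inf  # assumed encoding of the maximal measure ∞

def sepm(owner, moves, wg):
    """Iterate lift from the all-zero measure function until a fixpoint.

    Returns the fixpoint and the number of lift applications that
    changed at least one measure.
    """
    S = sum(w for w in wg.values() if w > 0)  # sum of the positive weights

    def stretch(eta, v):
        value = max(0, eta + wg[v])
        return INF if value > S else value    # measures above S collapse to ∞

    def lift(mu):
        pick = {"max": max, "min": min}       # ⊕ maximizes, ⊟ minimizes
        return {v: pick[owner[v]](stretch(mu[u], v) for u in moves[v])
                for v in mu}

    mu, steps = {v: 0 for v in wg}, 0
    while (nxt := lift(mu)) != mu:
        mu, steps = nxt, steps + 1
    return mu, steps

# Assumed reconstruction of the game of Figure 1, with k = 5.
k = 5
owner = {"a": "min", "b": "min", "c": "min", "d": "max"}
moves = {"a": ["a", "b", "d"], "b": ["b"], "c": ["a", "d"], "d": ["c"]}
wg = {"a": k, "b": 0, "c": 0, "d": 1}

mu, steps = sepm(owner, moves, wg)
```

Under these assumptions the loop performs 2k + 1 effective lifts and stops with c at measure k and d at measure k + 1, which agrees with the count stated above; most of those steps only raise c and d by one unit each.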
Note, however, that the choice of the move (c, d) is clearly a losing strategy for player ⊟, as remaining in the highlighted region would make the payoff from position c diverge. Therefore, the only reasonable choice for player ⊟ is to exit from that region by taking the move leading to position a. An operator able to diagnose this phenomenon early on could immediately discard the move (c, d) and jump directly to the correct payoff obtained by choosing the move to position a. As we shall see, such an operator might lose the monotonicity property and recovering the completeness of the resulting approach will prove more involved.
In the rest of this article we shall devise a progress operator that does precisely that. To this end, we start by providing a notion of quasi dominion, originally introduced for parity games in [5], which can be exploited in the context of MPGs.

Definition 3 (Quasi Dominion). An arbitrary set of positions
If the condition wg(π) > 0 holds only for infinite plays π, then Q is called a weak quasi ⊕-dominion.
Essentially, a quasi ⊕-dominion consists of a set Q of positions starting from which player ⊕ can force plays in Q of positive weight. Analogously, any infinite play that player ⊕ can force in a weak quasi ⊕-dominion has positive weight. Clearly, any quasi ⊕-dominion is also a weak quasi ⊕-dominion. Moreover, the latter are closed under subsets, while the former are not. It is an immediate consequence of the definition above that all infinite plays induced by the ⊕-witness, if any, necessarily have infinite weight and, thus, are winning for player ⊕. Indeed, every such play π is regular, i.e., it can be decomposed into a prefix $\pi'$ and a simple cycle $(\pi'')^\omega$, i.e., $\pi = \pi'(\pi'')^\omega$, since the strategies we are considering are memoryless. Now, $wg((\pi'')^\omega) > 0$, so, $wg(\pi'') > 0$, which implies $wg((\pi'')^\omega) = \infty$ and, hence, $wg(\pi) = \infty$. […]

From Proposition 1, it directly follows that, if a weak quasi ⊕-dominion Q is closed w.r.t. its ⊕-witness, namely all the induced plays are infinite, then it is a ⊕-dominion, hence is contained in $Wn^{\oplus}$.

Consider again the example of Figure 1. The set of positions $Q \triangleq \{a, c, d\}$ forms a quasi ⊕-dominion whose ⊕-witness is the only possible ⊕-strategy mapping position d to c. Indeed, any infinite play remaining in Q forever and compatible with that strategy (e.g., the play from position c when player ⊟ chooses the move from c leading to d or the one from a to itself or the one from a to d) grants an infinite payoff. Any finite compatible play, instead, ends in position a (e.g., the play from c when player ⊟ chooses the move from c to a and then the one from a to b) giving a payoff of at least k > 0. On the other hand, $Q^\star \triangleq \{c, d\}$ is only a weak quasi ⊕-dominion, as player ⊟ can force a play of weight 0 from position c, by choosing the exiting move (c, a). However, the internal move (c, d) would lead to an infinite play in $Q^\star$ of infinite weight.
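The divergence argument for regular plays can be spelled out as a one-line computation. In the display below, the notation $\pi'(\pi'')^{j}$ for the finite prefix that traverses the cycle j times is introduced here only for illustration.

```latex
\[
  wg\bigl(\pi'(\pi'')^{j}\bigr) \;=\; wg(\pi') + j \cdot wg(\pi'')
  \;\xrightarrow[\;j \to \infty\;]{}\; \infty
  \qquad \text{whenever } wg(\pi'') > 0 .
\]
```

Since the weights are integers, $wg(\pi'') > 0$ means $wg(\pi'') \ge 1$, so the payoff of the prefixes grows at least linearly in the number of traversals of the cycle, whatever the weight of $\pi'$.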
The crucial observation here is that the best choice for player ⊟ in any position of a (weak) quasi ⊕-dominion is to exit from it as soon as it can, while the best choice for player ⊕ is to remain inside it as long as possible. The idea of the algorithm we propose in this section is precisely to exploit the information provided by the quasi dominions in the following way. Consider the example above. In position a player ⊟ must choose to exit from Q = {a, c, d}, by taking the move (a, b), without changing its measure, which corresponds to its weight k. On the other hand, the best choice for player ⊟ in position c is to exit from the weak quasi-dominion $Q^\star = \{c, d\}$, by choosing the move (c, a) and lifting its measure from 0 to k. Note that this contrasts with the minimal measure-increase policy for player ⊟ employed in [13], which would keep choosing to leave c in the quasi-dominion by following the move to d, which gives the minimal increase in measure of value 1. Once c is out of the quasi-dominion, though, the only possible move for player ⊕ is to follow c, taking measure k + 1. The resulting measure function is a progress measure and the solution has, thus, been reached.
In order to make this intuitive idea precise, we need to be able to identify quasi dominions first. Interestingly enough, the measure functions µ defined in the previous section do allow us to identify a quasi dominion, namely the set of positions $\overline{\mu^{-1}(0)}$ having positive measure. Indeed, as observed at the end of that section, a positive measure witnesses the existence of a positively-weighted finite play that player ⊕ can enforce from that position onward, which is precisely the requirement of Definition 3. In the example of Figure 1, $\overline{\mu_0^{-1}(0)} = \emptyset$ and $\overline{\mu_2^{-1}(0)} = \{a, c, d\}$ are both quasi dominions, the first one w.r.t. the empty ⊕-witness and the second one w.r.t. the ⊕-witness $\sigma_{\oplus}(d) = c$.
We shall keep the quasi-dominion information in pairs (µ, σ), called quasi-dominion representations (QDR, for short), composed of a measure function µ and a ⊕-strategy σ, which corresponds to one of the ⊕-witnesses of the set of positions with positive measure in µ. The connection between these two components is formalized in the definition below, which also provides the partial order over which the new algorithm operates.
Definition 4 (QDR Space). The quasi-dominion-representation space is the partial order $\mathcal{Q} \triangleq \langle QDR, \sqsubseteq \rangle$, whose components are defined as prescribed in the following: […]

Condition 1a is obvious. Condition 1b, instead, requires that every position with infinite measure is indeed won by player ⊕ and is crucial to guarantee the completeness of the algorithm. Finally, Conditions 1c and 1d ensure that every positive measure under-approximates the actual weight of some finite play within the induced quasi dominion. This is formally captured by the following proposition, which can be easily proved by induction on the length of the play.

Proposition 2. Let $\varrho$ be a QDR and vπu a finite path starting at position v ∈ Ps and terminating in position u ∈ Ps compatible with the ⊕-strategy $\sigma_\varrho$. Then, $\mu_\varrho(v) \le wg(v\pi) + \mu_\varrho(u)$.
It is immediate to see that every MPG admits a non-empty QDR space, since the pair $(\mu_0, \sigma_0)$, with $\mu_0$ the smallest measure function and $\sigma_0$ the empty strategy, trivially satisfies all the required conditions.

Proposition 3. Every MPG has a non-empty QDR space associated with it.
The solution procedure we propose, called QDPM from now on, can intuitively be broken down into an alternation of two phases. The first one tries to lift the measures of positions outside the quasi dominion $Q(\varrho)$ in order to extend it, while the second one lifts the positions inside $Q(\varrho)$ that can be forced to exit from it by player ⊟. The algorithm terminates when no new position can be absorbed within the quasi dominion and no measure needs to be lifted to allow the ⊟-winning positions to exit from it, when possible. To this end, we define a controlled lift operator $lift : QDR \times 2^{Ps} \times 2^{Ps} \rightharpoonup QDR$ that works on QDRs and takes two additional parameters, a source and a target set of positions. The intended meaning is that we want to restrict the application of the lift operation to the positions in the source set S, while using only the moves leading to the target set T. The different nature of the two types of lifting operations is reflected in the actual values of the source and target parameters. $lift(\varrho, S, T) \triangleq \varrho^\star$, where […] and, for all […]

Except for the restriction on the outgoing moves considered, which are those leading to the targets in T, the lift operator acts on the measure component of a QDR very much like the original lift operator does. In order to ensure that the result is still a QDR, however, the lift operator must also update the ⊕-witness of the quasi dominion. This is required to guarantee that Conditions 1a and 1c of Definition 4 are preserved. If the measure of a ⊕-position v is not affected by the lift, the ⊕-witness must not change for that position. On the other hand, if the application of the lift operation increases the measure, then the ⊕-witness on v needs to be updated to any move (v, u) that grants measure $\mu_{\varrho^\star}(v)$ to v. In principle, more than one such move may exist and any one of them can serve the purpose as witness.
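The prose description of the controlled lift can be turned into a small sketch. What follows is one plausible reading of that description, not the paper's formal definition, which is not reproduced here. Several things are assumptions: the function name `controlled_lift`, the labels `"max"` and `"min"` for players ⊕ and ⊟, returning `None` to model partiality when a source position has no move into the target set, and the arena, which is a reconstruction of Figure 1 from the surrounding text (moves a→a, a→b, a→d, b→b, c→a, c→d, d→c, weights a/k, b/0, c/0, d/1).

```python
def controlled_lift(mu, sigma, source, target, owner, moves, wg):
    """Lift the positions in `source` using only moves into `target`.

    Returns a new (mu, sigma) pair, or None when some source position
    has no move leading into `target` (assumed reading of partiality).
    """
    new_mu, new_sigma = dict(mu), dict(sigma)
    for v in source:
        # Stretch of each admissible successor, paired with that successor.
        options = [(max(0, mu[u] + wg[v]), u) for u in moves[v] if u in target]
        if not options:
            return None
        best = max if owner[v] == "max" else min
        value, succ = best(options)
        new_mu[v] = value
        # The ⊕-witness changes only where the measure actually increased.
        if owner[v] == "max" and value > mu[v]:
            new_sigma[v] = succ
    return new_mu, new_sigma

# Assumed reconstruction of the game of Figure 1, with k = 5.
k = 5
owner = {"a": "min", "b": "min", "c": "min", "d": "max"}
moves = {"a": ["a", "b", "d"], "b": ["b"], "c": ["a", "d"], "d": ["c"]}
wg = {"a": k, "b": 0, "c": 0, "d": 1}
positions = set(wg)

mu1, sigma1 = {"a": k, "b": 0, "c": 0, "d": 1}, {"d": "c"}
# c escapes the weak quasi dominion {c, d}: only moves leaving it are allowed.
mu_c, sigma_c = controlled_lift(mu1, sigma1, {"c"}, positions - {"c", "d"},
                                owner, moves, wg)
# d then follows c, with no restriction on its moves.
mu2, sigma2 = controlled_lift(mu_c, sigma_c, {"d"}, positions,
                              owner, moves, wg)
```

On this reconstructed example the two calls lift c from 0 to k through the exiting move (c, a) and then d to k + 1, matching the informal walk-through given earlier, without the long alternation of unit increments on c and d.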
The solution algorithm can then be expressed as the inflationary fixpoint [10], [33] of the composition of the two phases mentioned above, defined by the progress operators $prg_0$ and $prg_+$.
The first phase is computed by the operator $prg_0 : QDR \rightharpoonup QDR$, defined as follows: […]

This operator is responsible for enforcing the progress condition on the positions outside the quasi dominion $Q(\varrho)$ that do not satisfy the inequalities between the measures along a move leading to $Q(\varrho)$ itself. It does that by applying the lift operator with the complement $\overline{Q(\varrho)}$ of the quasi dominion as source and no restrictions on the moves. Those positions that acquire a positive measure in this phase contribute to enlarging the current quasi dominion. Observe that the strategy component of the QDR is updated so that it is a ⊕-witness of the new quasi dominion. To guarantee that measures never decrease, the supremum w.r.t. the QDR-space ordering is taken as result.
The second phase, instead, implements the mechanism intuitively described above, while analyzing the simple example of Figure 1. This is achieved by the operator $prg_+$ reported in Algorithm 1. The procedure iteratively examines the current quasi dominion by lifting the measures of the positions that must exit from it. Specifically, it processes $Q(\varrho)$ layer by layer, starting from the outer layer of positions that must escape from it. The process ends when a, possibly empty, closed weak quasi dominion is obtained. Recall that all the positions in a closed weak quasi dominion are necessarily winning for player ⊕, due to Proposition 1. We distinguish two sets of positions in $Q(\varrho)$: those that already satisfy the progress condition and those that do not. The measures of the first ones already witness an escape route from $Q(\varrho)$. The other ones, instead, are those whose current choice is to remain inside it. For instance, when considering the measure function $\mu_2$ in the example of Figure 1, position a belongs to the first set, while positions c and d belong to the second one, since the choice of c is to follow the internal move (c, d).
Since the only positions that change measure are those in the second set, only such positions need to be examined. To identify them (they form a weak quasi dominion $\Delta(\varrho)$ strictly contained in $Q(\varrho)$), we proceed as follows. First, we collect the set $npp(\varrho)$ of positions in $Q(\varrho)$ that do not satisfy the progress condition, called the non-progress positions. Then, we compute the set of positions that will have no choice other than reaching $npp(\varrho)$. The non-progress positions are computed as follows. […]
The remaining positions in $\Delta(\varrho)$ are collected as the inflationary fixpoint of the following operator. […]
The final result is […]

Intuitively, $\Delta(\varrho)$ contains all the ⊕-positions that are forced to reach $npp(\varrho)$ via the quasi-dominion ⊕-witness and all the ⊟-positions that can only avoid reaching $npp(\varrho)$ by strictly increasing their measure, which player ⊟ obviously wants to prevent.
It is important to observe that, from a functional viewpoint, the progress operator $prg_+$ would work just as well if applied to the entire quasi dominion $Q(\varrho)$, since it would simply leave unchanged the measure of those positions that already satisfy the progress condition. However, it is crucial that only the positions in $\Delta(\varrho)$ are processed in order to achieve the best asymptotic complexity bound known to date. We shall return to this point later on.
Algorithm 1: Progress Operator $prg_+$
  input: a QDR $\varrho$
  1  Q ← $\Delta(\varrho)$
  2  while esc($\varrho$, Q) ≠ ∅ do
  3      E ← bep($\varrho$, Q)
  4      $\varrho$ ← lift($\varrho$, E, $\overline{Q}$)
  5      Q ← Q \ E
  6  return win($\varrho$, Q)

At each iteration of the while-loop of Algorithm 1, let Q denote the current (weak) quasi dominion, initially set to $\Delta(\varrho)$ (Line 1). It first identifies the positions in Q that can immediately escape from it (Line 2). Those are (i) all the ⊟-positions with a move leading outside of Q and (ii) the ⊕-positions v whose ⊕-witness $\sigma_\varrho$ forces v to exit from Q, namely $\sigma_\varrho(v) \notin Q$, and that cannot strictly increase their measure by choosing to remain in Q. While the condition for ⊟-positions is obvious, the one for ⊕-positions requires some explanation. The crucial observation here is that, while player ⊕ does indeed prefer to remain in the quasi dominion, it can only do so while ensuring that by changing strategy it does not enable infinite plays within Q that are winning for the adversary. In other words, the new ⊕-strategy must still be a ⊕-witness for Q and this can only be ensured if the new choice strictly increases its measure. The operator $esc : QDR \times 2^{Ps} \to 2^{Ps}$ formalizes the idea: […]

Consider, for instance, the example in Figure 2 and a QDR $\varrho$ such that $\mu_\varrho = \{a \mapsto 3;\ b \mapsto 2;\ c, d, f \mapsto 1;\ e \mapsto 0\}$ and $\sigma_\varrho = \{b \mapsto a;\ f \mapsto d\}$. In this case, we have $Q(\varrho) = \{a, b, c, d, f\}$ and $\Delta(\varrho) = \{c, d, f\}$, since c is the only non-progress position, d is forced to follow c in order to avoid the measure increase required to reach b, and f is forced by the ⊕-witness to reach d. Now, consider the situation where the current weak quasi dominion is Q = {c, f}, i.e., after d has escaped from $\Delta(\varrho)$. The escape set of Q is {c, f}. To see why the ⊕-position f is escaping, observe that $\mu_\varrho(f) + f = 1 = \mu_\varrho(f)$ and that, indeed, should player ⊕ choose to change its strategy and take the move (f, f) to remain in Q, it would obtain an infinite play with payoff 0, thus violating the definition of weak quasi dominion.
Before proceeding, we want to stress an easy consequence of the definition of the notion of escape set and Conditions 1c and 1d of Definition 4, i.e., that every escape position of the quasi dominion Q(̺) can only assume its weight as possible measure inside a QDR ̺, as reported is the following proposition.This observation, together with Proposition 2, precisely ensures that the measure of a position v ∈ Q(̺) is an under approximation of the weight of all finite plays leaving Q(̺).Proposition 4. Let ̺ be a QDR.Then, µ ̺ (v) = wg(v) > 0, for all v ∈ esc(̺, Q(̺)).Now, going back to the analysis of the algorithm, if the escape set is non-empty, we need to select the escape positions that need to be lifted in order to satisfy the progress condition.The main difficulty is to do so in such a way that the resulting measure function still satisfies Condition 1d of Definition 4, for all the ⊟-positions with positive measure.The problem occurs when a ⊟-position can exit either immediately or passing through a path leading to another position in the escape set.The solution to this problem is simply to lift in the current iteration only those positions that obtain the lowest possible measure increase, hence position d in the example, leaving the lift of c to some subsequent iteration of the algorithm that would choose the correct escape route via d.To do so, we first compute the minimal measure increase, called the best-escape forfeit, that each position in the escape set would obtain by exiting the quasi dominion immediately.The positions with the lowest possible forfeit, called best-escape positions, can all be lifted at the same time.The intuition is that the measure of all the positions that escape from a (weak) quasi dominion will necessarily be increased of at least the minimal best-escape forfeit.This observation is at the core of the proof of Theorem 2 (see the appendix) ensuring that the desired properties of QDRs are preserved by the operator prg + .The set of 
best-escape positions is computed by the operator bep : QDR × 2^Ps → 2^Ps as follows: where the operator bef : MF × 2^Ps × Ps → N_∞ computes, for each position v in a quasi dominion Q, its best-escape forfeit: In our example, bef(µ,

Once the set E of best-escape positions is identified (Line 3 of the algorithm), the procedure simply lifts them, restricting the possible moves to those leading outside the current quasi dominion (Line 4). Those positions are then removed from the set (Line 5), thus obtaining a smaller weak quasi dominion ready for the next iteration.
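The selection of best-escape positions can be illustrated with a short sketch. Since the defining formulas are not legible in this copy, the code rests on stated assumptions: the forfeit of a position is taken to be the measure increase obtained by leaving Q in one move, minimized over exiting moves for ⊟-positions and maximized for ⊕-positions, and a position with no exiting move gets forfeit ∞. The function signatures, the game encoding, and the toy weights are hypothetical and only chosen to match the values quoted in the running example.

```python
import math


def stretch(measure_u, weight_v):
    """Assumed measure stretch mu(u) + v: add the weight of v, truncated at 0."""
    return max(0, measure_u + weight_v)


def bef(owner, weight, moves, mu, Q, v):
    """Assumed best-escape forfeit of v: measure increase when leaving Q at once."""
    increases = [
        stretch(mu[u], weight[v]) - mu[v]
        for u in moves[v] if u not in Q
    ]
    if not increases:
        return math.inf  # assumption: no exiting move means infinite forfeit
    # assumption: ⊕ picks its best exit, ⊟ its cheapest one
    return max(increases) if owner[v] == "+" else min(increases)


def bep(owner, weight, moves, mu, Q, escape_set):
    """Escape positions whose forfeit is minimal among all escape positions."""
    if not escape_set:
        return set()
    forfeit = {v: bef(owner, weight, moves, mu, Q, v) for v in escape_set}
    lowest = min(forfeit.values())
    return {v for v, f in forfeit.items() if f == lowest}


# Hypothetical fragment consistent with the measures quoted in the text.
owner = {"c": "-", "d": "-", "f": "+"}
weight = {"c": 1, "d": 0, "f": 0}
moves = {"c": ["a", "f"], "d": ["b", "c"], "f": ["d", "f"]}
mu = {"a": 3, "b": 2, "c": 1, "d": 1, "f": 1, "e": 0}
Q = {"c", "d", "f"}

print(bef(owner, weight, moves, mu, Q, "d"))     # 1
print(bef(owner, weight, moves, mu, Q, "c"))     # 3
print(bep(owner, weight, moves, mu, Q, {"c", "d"}))  # {'d'}
```

Under these assumptions, d is the only best-escape position of {c, d, f}, in agreement with the discussion of the example, where d is lifted first and c is left to a later iteration.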
The algorithm terminates when the (possibly empty) current quasi dominion Q is closed. By virtue of Proposition 1, all those positions belong to Wn_⊕ and their measure is set to ∞ by means of the operator win : QDR × 2^Ps ⇀ QDR (Line 6), which also computes the winning ⊕-strategy on those positions.
and, for all Observe that, since we know that every ⊕-position v ∈ Q ∩ Ps ⊕ , whose current ⊕-witness leads outside Q, is not an escape position, any move (v, u) within Q that grants the maximal stretch µ ̺ (u) + v strictly increases its measure and, therefore, is a possible choice for a ⊕-witness of the ⊕-dominion Q.
At this point, it should be quite evident that the progress operator prg_+ is responsible for enforcing the progress condition on the positions inside the quasi dominion Q(ϱ); thus, the following necessarily holds.
 is already a progress measure, while ̺ ⋆  requires another application of prg + in order to solve the game, since ̺ ⋆  = prg + (̺ ⋆  ).In order to prove the correctness of the proposed algorithm, we first need to ensure that any quasidominion space Q is indeed closed under the operators prg  and prg + .This is established by the following theorem, which states that the operators are total functions on that space.
Since both operators are inflationary, so is their composition, which therefore admits a fixpoint. Hence, the operator sol is well defined. Moreover, following the same considerations discussed at the end of Section 3, it can be proved that the fixpoint is obtained after at most n · (S + 1) iterations. Let ifp k X . F(X) denote the k-th iteration of an inflationary operator F. Then, we have the following theorem.
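The iteration to an inflationary fixpoint mentioned above can be sketched generically. The function name `ifp`, its signature, and the toy operator are illustrative assumptions; in the setting of the paper the operator would be the composition of the two progress operators and the iteration bound would be n · (S + 1).

```python
def ifp(operator, start, max_iters):
    """Iterate an inflationary operator from `start` until nothing changes.

    Returns the fixpoint and the number of applications that changed the value.
    """
    current = start
    for changes in range(max_iters + 1):
        following = operator(current)
        if following == current:
            return current, changes
        current = following
    raise RuntimeError("no fixpoint reached within the given bound")


# Toy inflationary operator on integers: increase by one, saturating at 5.
fixpoint, steps = ifp(lambda x: min(x + 1, 5), 0, 10)
print(fixpoint, steps)  # 5 5
```

The explicit bound makes the sketch fail loudly if the operator passed in is not inflationary on a finite space, rather than looping forever.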
Consider, as a final example, the game depicted in Figure 4, with k > 2, where the numbers denote the weights of the positions of the game, in the picture labeled (0), and the measures assigned by the procedure, in the remaining ones. Each picture also features both the ⊕-witness strategy, in dashed blue, and the best counter ⊟-strategy, in dashed red, for the current quasi dominion. Moreover, solid colored moves are moves along which the measure strictly increases. Below each picture, we also indicate the phase, prg_0 or prg_+, that produces the displayed result. The computation starts from the initial QDR ϱ_0 = (µ_0, σ_0), which assigns measure 0 to all the positions of the game, with the associated empty strategy. The first iteration applies prg_0 to ϱ_0, which lifts positions a, f, and g to their respective weights, leading to the QDR shown in Picture (1). At this point, the quasi dominion is {a, f, g}, but the corresponding set ∆ is empty, as all those positions already satisfy the progress condition; thus, prg_+ does nothing. In the next iteration, prg_0 results in the lifting of positions c and d, as reported in Picture (2). Position c is a ⊕-position and the lift operator chooses (c, f) as its strategy. The resulting quasi dominion is {a, c, d, f, g}, with ∆ = {d, g} and g the only escape position that is also non-progress. The measure of g is lifted to µ(c) + g = 4. Finally, it is the turn of position d to be lifted to µ(g) + d = 3.
Picture (3) shows the resulting QDR. The final iteration first applies prg_0 to it (Picture (4)), lifting position b to measure 1 via the move (b, c). This change of measure triggers another application of prg_+, as position f is now non-progress. The resulting QDR ϱ is such that Q(ϱ) = {a, b, c, d, f, g} and ∆(ϱ) = {b, c, d, f, g}. The only escape position is b, which is lifted directly to measure k − 1. In the remaining set {c, d, f, g}, the only escape position is f, which is lifted to measure k + 1. The resulting weak quasi dominion {c, d, g}, however, is closed, since µ_ϱ(c) = 2 < µ_ϱ(d) + c = 3. Therefore, player ⊕ changes strategy and chooses the move (c, d). Since no escape positions remain, the set {c, d, g} is winning for player ⊕ and the win operator lifts all their measures to ∞, leading to the QDR in Picture (5). The resulting measure function is now a progress measure and the algorithm terminates. The total number of single measure updates for QDPM to reach the fixpoint on the example of Figure 4 is 13, regardless of the value of the maximal weight k in the game, assigned to position a.
On the other hand, it can easily be proved that SEPM [13] requires 3k+8 applications of its lift operator to compute a progress measure, for a total of 5k + 9 measure updates.Indeed, the first two evaluations of lift, starting from µ  , lead to µ  = {a → k; b, e → 0; c, f, g → 2; d → 1}, as in Picture (2), and require 5 measure lifts.Then, the algorithm iteratively increases the measures of b, g, d, f, and c by applying 3(k − 1) times the lift operator, for a total of 5(k − 1) measure lifts: which contribute with the remaining 3 lifts.From this observation, the next result immediately follows.
Theorem 4 (Efficiency).An infinite family of MPGs { k } k exists on which QDPM requires a constant number of measure updates, while SEPM requires O(k) such updates.
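As a sanity check, the update counts quoted in the example can be added up explicitly. The grouping below simply follows the four phases mentioned in the analysis of SEPM (the initial 5 lifts, the 5(k − 1) lifts of the iterative phase, the 6 lifts needed before g exceeds the bound, and the remaining 3 lifts); it is a restatement of those figures, not an independent derivation.

```latex
\[
\underbrace{5}_{\text{first two evaluations}}
+ \underbrace{5(k-1)}_{\text{iterative phase}}
+ \underbrace{6}_{\text{until } g \text{ exceeds } S}
+ \underbrace{3}_{\text{remaining lifts}}
= 5k + 9,
\qquad\text{against } 13 \text{ updates for QDPM, for every } k > 2 .
\]
```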
From Theorem 1, together with Lemmas 1 and 2, it follows that the solution provided by the algorithm is indeed a progress measure, hence establishing soundness.
On the other hand, Theorem 3, together with Condition 1b of Definition 4, ensures that all the positions with infinite measure are winning for player ⊕, hence the algorithm is also complete.
The following lemma ensures that each execution of the operator prg + strictly increases the measure of all the positions in ∆(̺).
Recall that each position can be lifted at most S + 1 = O(n · W) times and, by the previous lemma, the complexity of sol only depends on the cumulative cost of such lift operations. We can then express the total cost as the sum, over the set of positions in the game, of the cost of all the lift operations performed on that position. Each such operation can be computed in time linear in the number of incoming and outgoing moves of the corresponding lifted position v, namely O((|Mv(v)| + |Mv⁻(v)|) · log S), with O(log S) the cost of each arithmetic operation involved. Summing up, the actual asymptotic complexity of the procedure can, therefore, be expressed as O(n · m · W · log(n · W)) to solve an MPG with n positions, m moves, and maximal positive weight W.

Experimental Evaluation

In order to assess the effectiveness of the proposed approach, we implemented both QDPM and SEPM [13], the most efficient known solution to the problem and the one most closely related to QDPM, in C++ within OINK [32]. OINK has been developed as a framework to compare parity game solvers. However, extending the framework to deal with MPGs is not difficult. The form of the arenas of the two types of games essentially coincides, the only relevant difference being that MPGs allow negative numbers to label game positions. We ran the two solvers against randomly generated MPGs of various sizes.

Figure 5 compares the solution time, expressed in seconds, of the two algorithms on 4000 games, each with 5000 positions and randomly assigned weights in the range [−15000, 15000]. The scale of both axes is logarithmic. The experiments are divided into 4 clusters, each containing 1000 games. The benchmarks in different clusters differ in the maximal number m of outgoing moves per position, with m ∈ {10, 20, 40, 80}. These experiments clearly show that QDPM substantially outperforms SEPM. Most often, the gap between the two algorithms is between two and three orders of magnitude, as indicated by the dashed diagonal lines. They also show that SEPM is particularly sensitive to the density of the underlying graph, as its performance degrades significantly as the number of moves increases. The maximal solution time was 8940 seconds for SEPM and 0.5 seconds for QDPM.
Figure 6, instead, compares the two algorithms by fixing the maximal out-degree of the underlying graphs to 2, in the left-hand picture, and to 40, in the right-hand one, while increasing the number of positions from 10^3 to 10^5 along the x-axis. Each picture displays the performance results on 2800 games. Each point shows the total time to solve 100 randomly generated games with the given number of positions, which increases by 1000 up to size 2·10^4 and by 10000 thereafter. In both pictures the scale is logarithmic. For the experiments in the right-hand picture, we had to set a timeout of 45 minutes per game for SEPM, which was hit most of the time on the bigger games.
Once again, QDPM significantly outperforms SEPM on both kinds of benchmarks, with a gap of more than one order of magnitude on the first kind and of more than three orders of magnitude on the second. The results also confirm that the performance gap grows considerably as the number of moves per position increases.
We are not aware of actual concrete benchmarks for MPGs. However, exploiting the standard encoding of parity games into mean-payoff games [25], we can compare the behavior of SEPM and QDPM on concrete verification problems encoded as parity games. For completeness, Table 1 reports some experiments on such problems. It lists the execution times, expressed in seconds, required by the two algorithms to solve instances of two classic verification problems: the Elevator Verification and the Language Inclusion problems. These two benchmarks are included in the PGSolver [23] toolkit and are often used as benchmarks for parity game solvers. The first benchmark is a verification under fairness constraints of a simple model of an elevator, while the second one encodes the language inclusion problem between a non-deterministic Büchi automaton and a deterministic one. The results on various instances of those problems confirm that QDPM significantly outperforms the classic progress measure approach. Note also that the translation into MPGs, which encodes priorities as weights whose absolute value is exponential in the values of the priorities, leads to games with weights of high magnitude. Hence, the results in Table 1 provide further evidence that QDPM is far less dependent on the absolute value of the weights. They also show that QDPM can be very effective for the solution of real-world qualitative verification problems. It is worth noting, though, that the translation from parity games to MPGs gives rise to weights that are exponentially distant from each other [25]. As a consequence, the resulting benchmarks are not necessarily representative of MPGs, being a very restricted subclass. Nonetheless, they provide evidence of the applicability of the approach in practical scenarios.
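To make the remark on exponentially large weights concrete, the following sketch shows the classical encoding of priorities into weights, in which a position with priority p in a game with n positions receives weight (−n)^p. The function name and the dictionary-based input are illustrative assumptions, and the convention that even priorities favor the player maximizing the mean payoff is also assumed; the exact variant used to produce the benchmarks is not stated in this section.

```python
def parity_to_mean_payoff_weights(priorities):
    """Map each position's priority p to the weight (-n)**p, with n positions.

    Even priorities yield positive weights and odd ones negative weights,
    with absolute values growing exponentially in the priority.
    """
    n = len(priorities)
    return {v: (-n) ** p for v, p in priorities.items()}


# Hypothetical three-position game with priorities 0, 1 and 2.
weights = parity_to_mean_payoff_weights({"a": 0, "b": 1, "c": 2})
print(weights)  # {'a': 1, 'b': -3, 'c': 9}
```

Already with a handful of distinct priorities, the maximal weight W becomes very large, which is why these instances stress the dependence of a solver on the magnitude of the weights.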

Concluding Remarks
We proposed a novel solution algorithm for the decision problem of MPGs that integrates progress measures and quasi dominions. We argue that the integration of these two concepts may offer a significant speed-up in convergence to the solution, at no additional computational cost. This is evidenced by the existence of a family of games on which the combined approach can perform arbitrarily better than a classic progress measure based solution. Experimental results also show that the introduction of quasi dominions can often reduce solution times by up to three orders of magnitude, suggesting that the approach may be very effective in practical applications as well. We believe that the integration approach we devised is general enough to be applied to other types of games. In particular, the application of quasi dominions in conjunction with progress measure based approaches, such as those of [27] and [21], may lead to practically efficient quasi-polynomial algorithms for parity games and their quantitative extensions.
Proof.The proof proceeds by showing that, for each ̺ ∈ QDR, the elements prg  (̺) and prg + (̺) are QDR too.We also prove that ̺ ⊑ prg  (̺) and ̺ ⊑ prg + (̺).The two operators are analyzed separately. • , thus the appropriate condition between Conditions 1c and 1d of Definition 4 is verified, since for all adjacents u ∈ Mv (v), as required by Condition 1d.To complete the proof that prg  is a total function from QDR to itself, we need to show that ̺ ⋆ satisfies Conditions 1b and 1a too.It is immediate to see that and Mv (v) ⊆ µ ̺ ⊕ , otherwise.Therefore, µ ̺ ⋆ ⊕ is necessarily a ⊕-dominion, so Condition 1b is verified.Finally, let us focus on Condition 1a and consider a (σ ̺ ⋆ , v)-play vπ.If, on the one hand, π is infinite and does not meet v, thanks to Proposition 1, we have wg(π) = ∞, thus wg(vπ) = ∞ and, so, wg(vπ) > 0.
If π is finite, instead, it holds that lst(π) ∈ esc(̺, Q(̺)) and, so, µ ̺ ⋆ (lst(π)) = wg(lst(π)), due to Proposition 4. Now, by Proposition 2, we have that µ ̺ (fst(π)) ≤ µ ̺ (lst(π)) + wg(π <ℓ−1 ) = wg(lst(π)) + wg(π <ℓ−1 ) = wg(π), where ℓ ∈ N is the length of π Moreover, 0 < µ ̺ (v) ≤ µ ̺ (fst(π)) + v = µ ̺ (fst(π)) + wg(v), thanks to the previously proved Conditions 1c and 1d.Hence, 0 < µ ̺ (v) ≤ µ ̺ (fst(π)) + wg(v) ≤ wg(v) + wg(π) = wg(vπ), as required by the definition of quasi ⊕-dominion.Finally, if π is infinite and does meet v, it can be decomposed as (vπ ′ ) ω , where π is a non-empty finite path that does not meet v.Then, by exploiting the same reasoning done above for the case where π is finite, we have that wg(vπ ′ ) > 0, which implies wg(π) = wg((vπ prg + (̺) and consider the two infinite monotone sequences Q  ⊇ Q  ⊇ . . .and ̺  ⊑ ̺  ⊑ . . .defined as follows: We first prove, by induction on the index i ∈ N of the sequences, that every ̺ i satisfies Conditions 1a and 1c of Definition 4. Finally, we show that ̺ ⋆ is a QDR.The base case i = 0 is trivial, since ̺ i = ̺ is a QDR.Now, let us consider the inductive case i > 0. 
Since the lift operator only modifies the measure of positions belonging to where the latter equality is due to the fact that σ ̺i (v) ∈ E i− .Thus, by Lemma 4, it holds that σ ̺i is a ⊕-witness for Q(̺ i ), i.e., Condition 1a is verified.Also, Condition 1c directly follows from the definition of the ⊕-strategy inside the lift operator.At this point, we can conclude the proof by showing that ̺ ⋆ is a QDR.Indeed, by Lemma 4, σ ̺ ⋆ is a ⊕witness for Q(̺ k ) = Q(̺), so, Condition 1a is satisfied.Similarly to the inductive analysis developed above, Condition 1c directly follows from the definition of the ⊕-strategy inside the win function.Moreover, the set Q k is a closed subset of Q(̺ k ), since E k = ∅ and, so, esc(̺ k , Q k ) = ∅.Therefore, Q k ⊆ Wn ⊕ , by Proposition 1.In addition, all positions in As a consequence, Condition 1b is verified as well.It remains to prove Condition 1d.To do so, let f i min v∈esc(̺i,Qi) bef(µ ̺i , Q i , v).We now first show that the sequence of natural numbers f  , f  , . . . is monotone, i.e., f i ≤ f i+ .Suppose by contradiction that f i > f i+ , for some index i ∈ N.Then, there necessarily exists a position We proceeds by a case analysis on the owner of the position v.
-[v ∈ Ps ⊕ ].By definition of the best-escape forfeit function, we have that Therefore, the following equalities and inequalities hold, which lead to the contradiction f i ≤ f i+ < f i : Notice that the first and last equality are due to the definition of the measure stretch operator.
The second one is derived from the fact that σ ̺i (v) ∈ E i , while the third one from v ∈ E i+ , which implies µ ̺i+ (v) = µ ̺i (v).Finally, the last inequality follows from Condition 1c applied to ̺ i , i.e., Again by definition of the best-escape forfeit function, we have that the following equalities hold: Notice that the second and last equality are due to the definition of the measure stretch operator.The third one is derived from the fact that u ∈ Mv (v) \ Q i+ ⊆ E i , while the fourth one from v ∈ E i+ , which implies µ ̺i+ (v) = µ ̺i (v).Finally, the last inequality follows from Condition 1d applied to ̺, i.e., µ ̺i (v) = µ ̺ (v) ≤ µ ̺ (u) + v ≤ µ ̺i (u) + v, for all adjacents u ∈ Mv (v).Now suppose by contradiction that Condition 1d does not hold for ̺ ⋆ .Then, there exist a ⊟-position v ∈ Q(̺ ⋆ ) ∩ Ps ⊟ and one of its adjacents u ∈ Mv (v) such that µ ̺ ⋆ (u) + v < µ ̺ ⋆ (v).Due to the process used to compute ̺ ⋆ , there are indexes i, j ∈ [0, k] such that µ ̺ ⋆ (u) = µ ̺i+ (u) = µ ̺ (u) + f i and µ ̺ ⋆ (v) = µ ̺j+ (v) = µ ̺ (v)+f j .Now, by Condition 1d applied to ̺, we have µ ̺ (v) ≤ µ ̺ (u)+v, which implies that 0 ≤ µ ̺ (u) + v − µ ̺ (v) < f j − f i and, consequently, both i < j and u ∈ Q j .However, Notice that the first equality is due to the definition of the best-escape forfeit function.The second and third ones, instead, follows from the fact that v and u changed their values at iterations j + 1 and i + 1, respectively.Finally, the fourth equality derives from the operation of lift and best-escape forfeit computed on u.
Lemma 5. Let ̺ ⋆ prg + (̺), for some ̺ ∈ QDR, and Proof.Suppose by contradiction that there exists a position Moreover, there exists a finite path π compatible with the ⊕-strategy σ ̺ and entirely contained in Q(̺) \ ∆(̺), which starts in v and ends in esc(̺, Q(̺)), i.e., fst(π) = v and lst(π) ∈ esc(̺, Q(̺)).By Propositions 2 and 4, we have that S < µ where the last inequality is obviously due to the fact that there are no repeated positions in π, being it finite.However, S ⋆ ≤ S, which means that a contradiction has been reached with v ∈ Q(̺)\∆(̺).Thus, assume v ∈ ∆(̺) and consider the two infinite monotone sequences Q  ⊇ Q  ⊇ . . .and ̺  ⊑ ̺  ⊑ . . .defined as in the proof of Theorem 2: , for all i ∈ N. Also, let S  ≤ S  ≤ . . .< ⊤ be the sequence of natural numbers defined as Therefore, to prove the thesis, it suffices to show that µ ̺i (z) ≤ S i , for all positions z ∈ Q i and index i ∈ [0, k].The base case i = 0 follows by applying the same reasoning previously done for the case v ∈ Q(̺) \ ∆(̺) and by noticing that S  = S ⋆ .Now, let i > 0. By definition of the lift operator, there exists at least one adjacent x of z such that µ ̺i+ (z) = µ ̺i (x) + z.By the inductive hypothesis, µ ̺i (x) ≤ S i .Thus, µ ̺i+ (z) ≤ S i + wg(z) ≤ S i+ , since wg(z) ∈ S i .
Proof.By definition of inflationary fixpoint, ̺ ⋆ is a fixpoint of the composition of the two progress operators, i.e., ̺ ⋆ = prg + (prg  (̺ ⋆ )), which are inflationary functions, due to Theorem 2. As a consequence, we have that Proof.By definition of the progress operator prg  , we have that ̺ = prg  (̺) = sup{̺, lift(̺, Q(̺), Ps)}, from which we derive ̺ ⋆ lift(̺, Q(̺), Ps) ⊑ ̺.Now, consider an arbitrary position v ∈ Q(̺) and observe that µ ̺ ⋆ (v) ≤ µ ̺ (v), due to Item 2 of Definition 4. At this point, the proof proceeds by a case analysis on the owner of the position v itself.
• [v ∈ Ps ⊕ ].By definition of the lift operator, we have that µ Again by definition of the lift operator, we have that µ Hence, Condition 2 of Definition 2 is satisfied on Q(̺) as well.
Proof. Let us consider the infinite monotone sequence of position sets Q_0 ⊇ Q_1 ⊇ . . . defined as follows: By definition of the progress operator prg_+ and the equality ϱ = prg_+(ϱ), we have that ϱ = lift(ϱ, E_i, Q_i), for all i ∈ [0, k), and ϱ = win(ϱ, Q_k). Now, consider an arbitrary position v ∈ Q(ϱ). If v ∉ ∆(ϱ), then, by definition of the set ∆(ϱ), the position v satisfies the appropriate condition of Definition 2 on Q(ϱ). Therefore, let us assume v ∈ ∆(ϱ). Then, it is obvious that either In the first case, we have µ_ϱ(v) = ∞, due to the definition of the function win. Therefore, v is a progress position. In the other case, the proof proceeds by a case analysis on the owner of the position v itself.
Thus, due to the definition of the function esc, we have that µ ̺ (u) + v ≤ µ ̺ (v), for all positions u ∈ Mv (v) ∩ Q i .Now, by the definition of the lift operator, we have that µ Again by definition of the lift operator, we have that µ Hence, Condition 2 of Definition 2 is satisfied on Q(̺) as well.
Thus, µ ̺ (v) = min{µ ̺ (u) + v : u ∈ Mv (v) \ ∆(̺)} > µ ̺ (v).Hence, µ ̺ ⋆ (v) > µ ̺ (v) in this case as well.Now, consider a position v ∈ ∆(̺) \ E. Obviously, µ ̺ (v) < ∞.If µ ̺ ⋆ (v) = ∞, the thesis immediately follows.Otherwise, it will be considered as an escape of some weak quasi dominion Q ⊂ ∆(̺), after the removal of the first escape positions in E. Due to the non-decreasing property of the sequence of best-escape forfeit shown in the proof of Theorem 2, v exits from Q with a forfeit f ⋆ at least as great as the one f of E that we just proved to be strictly positive.Indeed, f = µ ̺ ⋆ (z) − µ ̺ (z) > 0, for all z ∈ E. Therefore, µ ̺ ⋆ (v) − µ ̺ (v) = f ⋆ ≥ f > 0, which implies µ ̺ ⋆ (v) > µ ̺ (v).Proof.To compute sol efficiently, we now provide an imperative reformulation of the functional fixpoint algorithm sol ifp ̺ .prg + (prg  (̺)) with the desired complexity.Recall that, by Lemma 5, each position can only be lifted S + 1 times, where S {wg(v) ∈ N : v ∈ Ps ∧ wg(v) > 0} = O(n • W ). 
Therefore, to obtain the claimed complexity, we have to guarantee that the cost of all the computational steps be linear in the number of measure increases.To do so, it suffices to ensure that the algorithm explores the incoming and outgoing moves only of those positions whose measures are actually lifted.This is clearly the case for the lift operator itself, since it only explores the outgoing moves of each position in its source set.The only remaining problem is to be able to identify the positions that need to be lifted in the next iteration, by only exploring the incoming moves of the positions just lifted.Solving this problem requires some technical tricks.Specifically, inspired by [13], will employ vectors of counters, namely c, d and g, that associates with ⊕-positions the number of moves that do not satisfy the progress condition, and with ⊟-positions the number of moves that satisfy it.In addition, we will also use a priority queue T to allow an efficient identification of the best-escape positions during the computation of the operator prg + .Algorithm 2 reports the procedural implementation of sol(̺  ), where ̺  is the smallest possible QDR, as defined at Line 1.At the beginning of each iteration i ∈ N of the while-loop at Line 4, the variable ̺ maintains the QDR ̺ i computed by applying to ̺  the composition prg + • prg  i times.Moreover, the sets N  and N + contain, respectively, the positions that need to be lifted by prg  and the non-progress positions in ̺ i .The formal invariants at Line 4 are: N  = {v ∈ Ps : µ ̺i (v) = 0 = µ ̺i+ (v)} and N + = npp(̺ i ).Observe that these invariants are trivially satisfied for i = 0, thanks to Line 3. 
After the execution of the progress procedure prg  at Line 5, whose code is reported in Algorithm 3, we have that N  ⊆ {v ∈ Ps : µ ̺i+ (v) = 0 = µ ̺i+ (v)} and N + ∪ A = npp(̺ ⋆ i ), where ̺ ⋆ i prg  (̺ i ).Thus, Line 6 ensures that N + = npp(̺ ⋆ i ).Line 7 calls the progress procedure prg + , which is reported in Algorithm 5, and forces the lift of the measures of all the positions in ∆(̺ ⋆ i ), as stated by Lemma 3. In addition, the verified invariants are N  ∪ A = {v ∈ Ps : µ ̺i+ (v) = 0 = µ ̺i+ (v)} and N + = npp(̺ i+ ).Finally, after Line 8, it holds that N  = {v ∈ Ps : µ ̺i+ (v) = 0 = µ ̺i+ (v)}, as required by the previously discussed invariants for the next iteration i + 1. Observe that Line 2 is used to initialize, for each ⊟-position v ∈ Ps ⊟ , the counter c(v) to the number of adjacents u ∈ Mv (v) of v that satisfy the progress inequality µ ̺ (v) ≥ µ ̺ (u) + v.
The subsequent analysis of Algorithms 3 and 5 shows that the procedures prg  (̺, c, N  ) and prg + (̺, c, N + ) require time respectively, where npp(̺) = N + .In particular, the factor log S is due to all the arithmetic operations required to compute the stretch of the measures.Since during the entire execution of the algorithm each position v ∈ Ps can appear at most once in some N  and at most S times in some ∆(̺), it follows that the total cost of Algorithm 2 is where the term n is due to the initialization operations at Lines 1-3.
Observe that the two procedures prg_0 and prg_+, together with the auxiliary one reported in Algorithm 4, share with Algorithm 2 both the current QDR ϱ and the counter c as global variables.
with m the number of moves and O(log S) the cost of each arithmetic operation necessary to compute the stretch of the measures.
Consider again the example above, where Q = ∆(ϱ) = {c, d, f}. If position d immediately escapes from Q using the move (d, b), it changes its measure to µ′(d) = µ(b) + d = 2 > µ(d) = 1. Now, position c has two ways to escape: either directly, with the move (c, a), or by reaching the other escape position d, passing through f. The first choice would set its measure to µ(a) + c = 4. The resulting measure function, however, would not satisfy Condition 1d of Definition 4, as the new measure of c would be greater than µ′(d) + c = 2, which prevents us from obtaining a QDR. Similarly, if position d escapes from Q passing through c via the move (c, a), we would have µ′′(d) = µ′′(c) + d = (µ(a) + c) + d = 4 > 2 = µ(b) + d, still violating Condition 1d. Therefore, in this specific case, the only possible way to escape is to reach b.
, and µ i+ = µ i+ [c → i+2], for all i ∈ [1, k − 1].At this point, b and f have obtained measures k − 1 and k + 1, respectively, which suffice to satisfy the progress relation along the moves (f, b) and (b, a).However, the ⊟-position g does not satisfy such a relation along its unique move (g, c), since µ k− (g) = k + 2 < µ k− (c) + g = (k + 1) + 2 = k + 3. Therefore, other six applications of lift are needed before g can exceed the bound S = wg(a) + wg(f) + wg(g) = k + 4. Each one of them modifies the measure of one position only, for a total of 6 lifts: µ

Figure 6: Total solution times in seconds of SEPM and QDPM on 5600 random games.

Theorem 7 (Complexity). QDPM requires time O(n · m · W · log(n · W)) to solve an MPG with n positions, m moves, and maximal positive weight W.
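The bound can be recovered from the per-lift cost by a short calculation. The sketch below assumes only what is stated in the text: each position is lifted at most S + 1 times, a lift of v costs O((|Mv(v)| + |Mv⁻(v)|) · log S), and S = O(n · W). It also uses the fact that the outgoing and the incoming move counts each sum to m over all positions.

```latex
\begin{align*}
\sum_{v \in \mathrm{Ps}} (S + 1) \cdot O\big((|Mv(v)| + |Mv^{-}(v)|) \cdot \log S\big)
  &= O\big((S + 1) \cdot 2m \cdot \log S\big) \\
  &= O\big(n \cdot m \cdot W \cdot \log(n \cdot W)\big).
\end{align*}
```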

Table 1: Experiments on concrete verification problems.