On the Comparison of Discounted-Sum Automata with Multiple Discount Factors

We look into the problems of comparing nondeterministic discounted-sum automata on finite and infinite words. That is, the problems of checking for automata $A$ and $B$ whether or not it holds that for all words $w$, $A(w)=B(w)$, $A(w) \leq B(w)$, or $A(w)<B(w)$. These problems are known to be decidable when both automata have the same single integral discount factor, while decidability is open in all other settings: when the single discount factor is a non-integral rational; when each automaton can have multiple discount factors; and even when each has a single integral discount factor, but the two are different. We show that it is undecidable to compare discounted-sum automata with multiple discount factors, even if all of the discount factors are integral, while it is decidable to compare them if each has a single, possibly different, integral discount factor. To this end, we also provide algorithms to check for given nondeterministic automaton $N$ and deterministic automaton $D$, each with a single, possibly different, rational discount factor, whether or not $N(w) = D(w)$, $N(w) \geq D(w)$, or $N(w)>D(w)$ for all words $w$.


Introduction
Equivalence and containment checks of Boolean automata, namely the checks of whether L(A) = L(B), L(A) ⊆ L(B), or L(A) ⊂ L(B), where L(A) and L(B) are the languages that A and B recognize, are central in the usage of automata theory in diverse areas, and in particular in formal verification (e.g., [34,26,17,33,35,28]). Likewise, comparison of quantitative automata, which extends the equivalence and containment checks by asking whether A(w) = B(w), whether A(w) ≤ B(w), or whether A(w) < B(w) for all words w, is essential for harnessing quantitative-automata theory to the service of diverse fields, and in particular to the service of quantitative formal verification (e.g., [15,14,21,11,27,3,5,22]).
Discounted summation is a common valuation function in quantitative automata theory (e.g., [19,12,14,15]), as well as in various other computational models, such as games (e.g., [37,4,1]), Markov decision processes (e.g., [23,29,16]), and reinforcement learning (e.g., [32,36]), as it formalizes the concept that an immediate reward is better than a potential one in the far future, as well as that a potential problem (such as a bug in a reactive system) in the far future is less troubling than a current one.
A nondeterministic discounted-sum automaton (NDA) has rational weights on the transitions, and a fixed rational discount factor λ > 1. The value of a (finite or infinite) run is the discounted summation of the weights on the transitions, such that the weight on the i-th transition of the run is divided by λ^i. The value of a (finite or infinite) word is the infimum value of the automaton runs on it. An NDA thus realizes a function from words to real numbers.
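As a concrete illustration of these definitions, here is a small Python sketch (the toy automaton and all names are ours, not from the paper) that computes the value of a finite run, and the value of a finite word as the minimum over all runs on it:

```python
from fractions import Fraction

def run_value(weights, lam):
    """Discounted sum of a run, given the sequence of its transition weights:
    the i-th weight is divided by lam**i."""
    return sum(Fraction(w) / lam**i for i, w in enumerate(weights))

def word_value(delta, initial, word, lam):
    """Value of a finite word: the minimum over all runs on it.
    delta maps (state, letter) -> list of (weight, next_state)."""
    best = None

    def explore(state, i, acc, disc):
        nonlocal best
        if i == len(word):
            best = acc if best is None else min(best, acc)
            return
        for w, q in delta.get((state, word[i]), []):
            explore(q, i + 1, acc + Fraction(w) / disc, disc * lam)

    for q0 in initial:
        explore(q0, 0, Fraction(0), Fraction(1))
    return best

# A toy nondeterministic automaton over {'a'}: from q0, the letter 'a' either
# stays in q0 with weight 1 or moves to q1 with weight 0; q1 loops with weight 2.
lam = Fraction(2)
delta = {('q0', 'a'): [(1, 'q0'), (0, 'q1')],
         ('q1', 'a'): [(2, 'q1')]}
```

On the word "aa", the three runs have values 1 + 1/2, 1 + 0/2, and 0 + 2/2, so the word value is 1, obtained by a run that is not weight-minimal on its first transition; this is exactly the effect of taking the infimum over runs.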
NDAs cannot always be determinized [15], they are not closed under basic algebraic operations [8], and their comparison is not known to be decidable, relating to various longstanding open problems [9]. However, restricting NDAs to have an integral discount factor λ ∈ N \ {0, 1} provides a robust class of automata that is closed under determinization and under algebraic operations, and for which comparison is decidable [8].
Various variants of NDAs are studied in the literature, among which are functional, k-valued, probabilistic, and more [21,20,13]. Yet, until recently, all of these models were restricted to have a single discount factor. This is a significant restriction of the general discounted-summation paradigm, in which multiple discount factors are considered. For example, Markov decision processes and discounted-sum games allow multiple discount factors within the same entity [23,4]. In [6], NDAs were extended to NMDAs, allowing for multiple discount factors, where each transition can have a different one. Special attention was given to integral NMDAs, namely to those with only integral discount factors, analyzing whether they preserve the good properties of integral NDAs. It was shown that they are generally not closed under determinization and under algebraic operations, while a restricted class of them, named tidy-NMDAs, in which the choice of discount factors depends on the prefix of the word read so far, does preserve the good properties of integral NDAs.
While comparison of tidy-NMDAs with the same choice function is decidable in PSPACE [6], it was left open whether comparison of general integral NMDAs A and B is decidable. It is even open whether comparison of two integral NDAs with different (single) discount factors is decidable.
We show that it is undecidable to resolve for given NMDA N and deterministic NMDA (DMDA) D, even if both have only integral discount factors, on both finite and infinite words, whether N ≡ D and whether N ≤ D, and on finite words also whether N < D. We prove the undecidability result by reduction from the halting problem of two-counter machines. The general scheme follows similar reductions, such as in [18,2], yet the crux is in simulating a counter by integral NMDAs. Upfront, discounted summation is not suitable for simulating counters, since a current increment has, in the discounted setting, a much higher influence than that of a far-away decrement. However, we show that multiple discount factors allow, in a sense, eliminating the influence of time, having automata in which no matter where a letter appears in the word, it will have the same influence on the automaton value. (See Lemma 1 and Fig. 3.) Another main part of the proof is in showing how to nondeterministically adjust the automaton weights and discount factors in order to "detect" whether a counter currently has the value 0. (See Figs. 5, 6, 8 and 9.) On the positive side, we provide algorithms to decide for given NDA N and deterministic NDA (DDA) D, with arbitrary, possibly different, rational discount factors, whether N ≡ D, N ≥ D, or N > D (Theorem 4). Our algorithms work on both finite and infinite words, and run in PSPACE when the automata weights are represented in binary and their discount factors in unary. Since integral NDAs can always be determinized [8], our method also provides an algorithm to compare two integral NDAs, though not necessarily in PSPACE, since determinization might exponentially increase the number of states. (Even though determinization of NDAs is in PSPACE [8,6], the exponential number of states might require exponential space in our algorithms for comparing NDAs with different discount factors.)
The challenge with comparing automata with different discount factors comes from the combination of their different accumulations, which tends to be intractable, resulting in the undecidability of comparing integral NMDAs, and in the open problems of comparing rational NDAs and of analyzing the representation of numbers in a non-integral basis [30,24,25,9]. Yet, the main observation underlying our algorithm is that when each automaton has a single discount factor, we may unfold the combination of their computation trees only up to some level k, after which we can analyze their continuations separately, first handling the automaton with the lower (slower decreasing) discount factor and then the other one. The idea is that after level k, since the accumulated discounting of the second automaton is already much more significant, even a single non-optimal transition of the first automaton cannot be compensated by a continuation that is better with respect to the second automaton. We thus compute the optimal suffix words and runs of the first automaton from level k, on top of which we compute the optimal runs of the second automaton.

Preliminaries
Words. An alphabet Σ is an arbitrary finite set, and a word over Σ is a finite or infinite sequence of letters in Σ, with ε for the empty word. We denote the concatenation of a finite word u and a finite or infinite word w by u · w, or simply by uw. We define Σ^+ to be the set of all finite words except the empty word, i.e., Σ^+ = Σ^* \ {ε}. For a word w = σ_0 σ_1 σ_2 · · · and indexes i ≤ j, we denote the letter at index i by w[i] = σ_i, and the sub-word from i to j by w[i..j] = σ_i σ_{i+1} · · · σ_j.
For a finite word w and letter σ ∈ Σ, we denote the number of occurrences of σ in w by #(σ, w), and for a set S ⊆ Σ, we denote Σ_{σ∈S} #(σ, w) by #(S, w).
For a finite or infinite word w and a letter σ ∈ Σ, we define the prefix of w up to σ, pref_σ(w), as the minimal prefix of w that contains a σ letter, if there is a σ letter in w, and as w itself otherwise.

Automata. A nondeterministic discounted-sum automaton (NDA) [15] is an automaton with rational weights on the transitions, and a fixed rational discount factor λ > 1. A nondeterministic discounted-sum automaton with multiple discount factors (NMDA) [6] is similar to an NDA, but with possibly a different discount factor on each of its transitions. They are formally defined as follows. A nondeterministic discounted-sum automaton with multiple discount factors (NMDA), on finite or infinite words, is a tuple A = ⟨Σ, Q, ι, δ, γ, ρ⟩ over an alphabet Σ, with a finite set of states Q, an initial set of states ι ⊆ Q, a transition relation δ ⊆ Q × Σ × Q, a weight function γ : δ → Q, and a discount-factor function ρ : δ → Q ∩ (1, ∞), assigning to each transition its discount factor, which is a rational greater than one.
- A run of A is a sequence of states and alphabet letters, p_0, σ_0, p_1, σ_1, p_2, · · ·, such that p_0 ∈ ι is an initial state and, for every i, (p_i, σ_i, p_{i+1}) ∈ δ.
- The length of a run r, denoted by |r|, is n for a finite run r = p_0, σ_0, p_1, · · ·, σ_{n−1}, p_n, and ∞ for an infinite run.
- For an index i < |r|, we define the i-th transition of r as r[i] = (p_i, σ_i, p_{i+1}), and the prefix run with i transitions as r[0..i−1].
- The value of a finite/infinite run r is A(r) = Σ_{i=0}^{|r|−1} ( γ(r[i]) / Π_{j=0}^{i−1} ρ(r[j]) ).
For example, the value of the run r_1 = q_0, a, q_0, a, q_1, b, q_2 of A from Fig. 1 is A(r_1) = 1 + (1/2)·(1/3) + (1/(2·3))·2 = 3/2.
- The value of A on a finite or infinite word w is A(w) = inf{A(r) | r is a run of A on w}.
- For every finite run r = p_0, σ_0, p_1, · · ·, σ_{n−1}, p_n, we define the target state as δ(r) = p_n and the accumulated discount factor as ρ(r) = Π_{i=0}^{n−1} ρ(r[i]).
- When all discount factors are integers, we say that A is an integral NMDA.
- In the case where |ι| = 1 and for every q ∈ Q and σ ∈ Σ we have |{q′ | (q, σ, q′) ∈ δ}| ≤ 1, we say that A is deterministic, denoted by DMDA, and view δ as a function from words to states.
- When the discount-factor function ρ is constant, ρ ≡ λ ∈ Q ∩ (1, ∞), we say that A is a nondeterministic discounted-sum automaton (NDA) [15] with discount factor λ (a λ-NDA). If A is deterministic, it is a λ-DDA.
- For a state q ∈ Q, we write A^q for the NMDA A^q = ⟨Σ, Q, {q}, δ, γ, ρ⟩.

Counter machines. A two-counter machine [31] M is a sequence (l_1, . . . , l_n) of commands, for some n ∈ N, involving two counters x and y. We refer to {1, . . . , n} as the locations of the machine. For every i ∈ {1, . . . , n} we refer to l_i as the command in location i. There are five possible forms of commands:

inc(c), dec(c), goto l_k, if c=0 goto l_k else goto l_k′, halt,

where c ∈ {x, y} is a counter and 1 ≤ k, k′ ≤ n are locations. For not decreasing a zero-valued counter c ∈ {x, y}, every dec(c) command is preceded by the command if c=0 goto <current_line> else goto <next_line>, and there are no other direct goto-commands to it. The counters are initially set to 0. An example of a two-counter machine is given in Fig. 2. Let L be the set of possible commands in M. A run of M is a sequence ψ = ψ_1, . . . , ψ_m ∈ (L × N × N)^* such that the following hold:
1. ψ_1 = ⟨l_1, 0, 0⟩.
2. For all 1 < i ≤ m, let ψ_{i−1} = (l_j, α_x, α_y) and ψ_i = (l′, α′_x, α′_y). Then, the following hold.
- If l_j is an inc(x) command (resp. inc(y)), then α′_x = α_x + 1 and α′_y = α_y (resp. α′_x = α_x and α′_y = α_y + 1), and l′ = l_{j+1}.
- If l_j is a dec(x) command (resp. dec(y)), then α′_x = α_x − 1 and α′_y = α_y (resp. α′_x = α_x and α′_y = α_y − 1), and l′ = l_{j+1}.
- If l_j is a goto l_k command, then α′_x = α_x, α′_y = α_y, and l′ = l_k.
- If l_j is if x=0 goto l_k else goto l_k′, then α′_x = α_x, α′_y = α_y, and l′ = l_k if α_x = 0, and l′ = l_k′ otherwise.
- If l_j is if y=0 goto l_k else goto l_k′, then α′_x = α_x, α′_y = α_y, and l′ = l_k if α_y = 0, and l′ = l_k′ otherwise.
-If l ′ is halt then i = m, namely a run does not continue after halt.
If, in addition, we have that ψ_m = ⟨l_j, α_x, α_y⟩ such that l_j is a halt command, we say that ψ is a halting run. We say that a machine M 0-halts if its run is halting and ends in ⟨l, 0, 0⟩ for a halt command l. We say that a sequence of commands τ ∈ L^* fits a run ψ, if τ is the projection of ψ on its first component.
The command trace π = σ_1, . . . , σ_m of a halting run ψ = ψ_1, . . . , ψ_m describes the flow of the run, including a description of whether a counter c was equal to 0 or larger than 0 in each occurrence of an if c=0 goto l_k else goto l_k′ command. It is formally defined as follows: σ_m = halt, and for every 1 < i ≤ m, we define σ_{i−1} according to ψ_{i−1} = (l_j, α_x, α_y) in the following manner:
- if l_j is an inc, dec, or goto command, then σ_{i−1} = l_j;
- if l_j is an if c=0 goto l_k else goto l_k′ command, then σ_{i−1} = (goto l_k, c = 0) if α_c = 0, and σ_{i−1} = (goto l_k′, c > 0) otherwise.
(The command trace of the halting run of the machine in Fig. 2 exemplifies this definition.) Deciding whether a given counter machine M halts is known to be undecidable [31]. Deciding whether M halts with both counters having value 0, termed the 0-halting problem, is also undecidable. Indeed, the halting problem can be reduced to the latter by adding some commands that clear the counters before every halt command.
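To make the machine model concrete, the following Python sketch (the encoding and the example program are ours, not the paper's) simulates a two-counter machine for a bounded number of steps and reports whether it 0-halts; the step bound is needed since halting is undecidable in general:

```python
def run_machine(cmds, max_steps=10_000):
    """Simulate a two-counter machine given as a 1-indexed list of commands.
    Commands: ('inc', c), ('dec', c), ('goto', k),
              ('ifz', c, k, k2)  # if c=0 goto l_k else goto l_k2
              ('halt',).
    Returns (status, x, y, trace of visited locations)."""
    x = y = 0
    loc, trace = 1, []
    for _ in range(max_steps):
        trace.append(loc)
        op = cmds[loc - 1]
        if op[0] == 'halt':
            return ('0-halt' if x == y == 0 else 'halt'), x, y, trace
        if op[0] == 'inc':
            if op[1] == 'x': x += 1
            else: y += 1
            loc += 1
        elif op[0] == 'dec':
            if op[1] == 'x': x -= 1
            else: y -= 1
            loc += 1
        elif op[0] == 'goto':
            loc = op[1]
        else:  # 'ifz'
            c = x if op[1] == 'x' else y
            loc = op[2] if c == 0 else op[3]
    return 'timeout', x, y, trace

# A toy program: increment x twice, then a guarded loop decrements x to 0.
prog = [
    ('inc', 'x'),        # l1
    ('inc', 'x'),        # l2
    ('ifz', 'x', 6, 4),  # l3: if x=0 goto l6 else goto l4
    ('dec', 'x'),        # l4
    ('goto', 3),         # l5
    ('halt',),           # l6
]
```

The projection of the returned trace on the commands, together with the zero-tests taken, corresponds to the command-trace notion defined above.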

Comparison of NMDAs
We show that comparison of (integral) NMDAs is undecidable by reduction from the halting problem of two-counter machines. Notice that our NMDAs only use integral discount factors, while they do have non-integral weights. Yet, weights can be easily changed to integers as well, by multiplying them all by a common denominator and making the corresponding adjustments in the calculations.
We start with a lemma on the accumulated value of certain series of discount factors and weights. Observe that by the lemma, no matter where the pair of a discount factor λ ∈ N \ {0, 1} and a weight w = (λ−1)/λ appears along the run, it will have the same effect on the accumulated value. This property will play a key role in simulating counting by NMDAs.

Lemma 1. For every sequence λ_1, · · ·, λ_m of integers larger than 1 and weights w_i = (λ_i − 1)/λ_i, we have Σ_{i=1}^{m} ( w_i / Π_{j=1}^{i−1} λ_j ) = 1 − 1/Π_{i=1}^{m} λ_i.

Proof. We show the claim by induction on m.
The base case, i.e., m = 1, is trivial, as w_1 = (λ_1 − 1)/λ_1 = 1 − 1/λ_1. For the induction step we have Σ_{i=1}^{m+1} ( w_i / Π_{j=1}^{i−1} λ_j ) = w_1 + (1/λ_1) · Σ_{i=2}^{m+1} ( w_i / Π_{j=2}^{i−1} λ_j ) = (λ_1 − 1)/λ_1 + (1/λ_1) · (1 − 1/Π_{i=2}^{m+1} λ_i) = 1 − 1/Π_{i=1}^{m+1} λ_i.
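The lemma is easy to check numerically; the following Python sketch (a sanity script, not part of the paper) verifies that the accumulated value equals 1 − 1/(λ_1 · · · λ_m), and in particular is invariant under reordering the (w_i, λ_i) pairs:

```python
from fractions import Fraction
from itertools import permutations

def accumulate(lams):
    """Discounted accumulation of the weights (λ_i - 1)/λ_i, where each
    weight is divided by the product of the preceding discount factors."""
    total, disc = Fraction(0), Fraction(1)
    for lam in lams:
        total += Fraction(lam - 1, lam) / disc
        disc *= lam
    return total

def factor_product(lams):
    p = 1
    for lam in lams:
        p *= lam
    return p
```

For instance, every ordering of the factors 2, 3, 5, 7 accumulates to 1 − 1/210, illustrating that the position of each (w, λ) pair along the run is irrelevant.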

The Reduction
We turn to our reduction from the halting problem of two-counter machines to the problem of NMDA containment. We provide the construction and the correctness lemma with respect to automata on finite words, and then show in Section 3.2 how to use the same construction also for automata on infinite words.
Given a two-counter machine M with the commands (l 1 , . . . , l n ), we construct an integral DMDA A and an integral NMDA B on finite words, such that M 0-halts iff there exists a word w ∈ Σ + such that B(w) ≥ A(w) iff there exists a word w ∈ Σ + such that B(w) > A(w).
The automata A and B operate over the following alphabet Σ, which consists of 5n + 5 letters, standing for the possible elements in a command trace of M:

Σ = {inc(x), dec(x), inc(y), dec(y), halt} ∪ {goto l_k, (goto l_k, x = 0), (goto l_k, x > 0), (goto l_k, y = 0), (goto l_k, y > 0) | 1 ≤ k ≤ n}.

When A and B read a word w ∈ Σ^+, they intuitively simulate a sequence of commands τ_u that induces the command trace u = pref_halt(w). If τ_u fits the actual run of M, and this run 0-halts, then the minimal run of B on w has a value strictly larger than A(w). If, however, τ_u does not fit the actual run of M, or it does fit the actual run but it does not 0-halt, then the violation is detected by B, which has a run on w with value strictly smaller than A(w).
In the construction, we use the following partial discount-factor functions ρ p , ρ d : Σ nohalt → N and partial weight functions γ p , γ d : Σ nohalt → Q.
For every σ ∈ Σ_nohalt, we set γ_p(σ) = 1 − 1/ρ_p(σ) and γ_d(σ) = 1 − 1/ρ_d(σ), with the primal discount factors ρ_p(inc(x)) = 5, ρ_p(dec(x)) = 4, ρ_p(inc(y)) = 7, ρ_p(dec(y)) = 6, and ρ_p(σ) = 15 for every goto letter σ, while the dual discount factors swap the inc and dec values, namely ρ_d(inc(c)) = ρ_p(dec(c)) and ρ_d(dec(c)) = ρ_p(inc(c)) for every c ∈ {x, y}. We say that ρ_p and γ_p are the primal discount-factor and weight functions, while ρ_d and γ_d are the dual functions. Observe that for every c ∈ {x, y} we have that

ρ_p(inc(c)) · ρ_p(dec(c)) = ρ_d(inc(c)) · ρ_d(dec(c)). (1)

Intuitively, we will use the primal functions for A's discount factors and weights, and the dual functions for identifying violations. Notice that if changing the primal functions to the dual ones in more occurrences of inc(c) letters than of dec(c) letters along some run, then by Lemma 1 the run will get a value lower than the original one.
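Under this reading of the primal and dual functions (the concrete factor values below are our rendering of the construction), a balanced number of primal-to-dual swaps on inc(c) and dec(c) letters preserves the product of discount factors, and hence, by Lemma 1, the accumulated value; an unbalanced swap on an extra inc(c) lowers it. A small sanity check:

```python
from fractions import Fraction

# Assumed primal factors (our rendering of the construction); the dual
# functions swap the inc/dec values of each counter.
rho_p = {'inc(x)': 5, 'dec(x)': 4, 'inc(y)': 7, 'dec(y)': 6}
rho_d = {'inc(x)': rho_p['dec(x)'], 'dec(x)': rho_p['inc(x)'],
         'inc(y)': rho_p['dec(y)'], 'dec(y)': rho_p['inc(y)']}

def lemma1_value(factors):
    """1 - 1/prod: the accumulated value when each weight is (λ-1)/λ,
    by Lemma 1, depending only on the product of the discount factors."""
    prod = 1
    for lam in factors:
        prod *= lam
    return 1 - Fraction(1, prod)

# A run with two inc(x) and two dec(x) letters, all primal:
primal = [rho_p[s] for s in ['inc(x)', 'dec(x)', 'inc(x)', 'dec(x)']]
# Dual-swapping one inc(x) AND one dec(x) preserves the product:
mixed = [rho_d['inc(x)'], rho_p['dec(x)'], rho_p['inc(x)'], rho_d['dec(x)']]
```

Swapping only the inc(x) occurrence (without a matching dec(x) swap) replaces a factor 5 by 4, shrinking the product and hence the value, which is the detection mechanism described above.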
We continue with their formal definitions. The integral DMDA A is depicted in Fig. 3. Observe that its initial state q_A has self loops for every alphabet letter in Σ_nohalt, with weights and discount factors according to the primal functions, and a transition (q_A, halt, q_h^A) with a weight of 14/15 and a discount factor of 15.
Fig. 3. The DMDA A, with self loops such as dec(x) with (3/4, 4), inc(y) with (6/7, 7), Σ_goto with (14/15, 15), and dec(y) with (5/6, 6), and the halt transition with (14/15, 15).

The integral NMDA B = ⟨Σ, Q_B, ι_B, δ_B, γ_B, ρ_B⟩ is the union of the following eight gadgets (checkers), each responsible for checking a certain type of violation in the description of a 0-halting run of M. It also has the states q_freeze, q_halt ∈ Q_B such that for all σ ∈ Σ, there are 0-weighted transitions (q_freeze, σ, q_freeze) ∈ δ_B and (q_halt, σ, q_halt) ∈ δ_B with an arbitrary discount factor. Observe that in all of B's gadgets, the transition over the letter halt to q_halt has a weight higher than the weight of the corresponding transition in A, so that when no violation is detected, the value of B on a word is higher than the value of A on it.
1. Halt Checker. This gadget, depicted in Fig. 4, checks for violations of non-halting runs. Observe that its initial state q_HC has self loops identical to those of A's initial state, a transition to q_halt over halt with a weight higher than the corresponding weight in A, and a transition to the state q_last over every letter that is not halt, "guessing" that the run ends without a halt command.
2. Negative-Counters Checker. This gadget, whose part for the counter x is depicted in Fig. 5, checks that no counter is decreased below zero. Compared to A, it nondeterministically changes the discount factors of dec(x) and inc(x) from 4 and 5 to 2 and 10, respectively (and of dec(y) and inc(y) from 6 and 7 to 3 and 14), so that, by Lemma 1, it gets a value lower than that of A exactly when the changes apply to more dec(c) than inc(c) letters.

3. Positive-Counters Checker. The third gadget, depicted in Fig. 6, checks that for every c ∈ {x, y}, the input prefix u has no more inc(c) than dec(c) commands. It is similar to A, while having self loops in its initial state according to the dual functions rather than the primal ones, and a halt transition with a weight of 15/16 and a discount factor of 16.

4. Command Checker. The next gadget checks for local violations of successive commands. That is, it makes sure that the letter w_i represents a command that can follow the command represented by w_{i−1} in M, ignoring the counter values. For example, if the command in location l_2 is inc(x), then from state q_2, which is associated with l_2, we move with the letter inc(x) to q_3, which is associated with l_3. The test is local, as this gadget does not check for violations involving illegal jumps due to the values of the counters. An example of the command checker for the counter machine in Fig. 2 is given in Fig. 7.

The command checker, which is a DMDA, consists of states q_1, . . ., q_n that correspond to the commands l_1, . . ., l_n, and the states q_halt and q_freeze. For two locations j and k, there is a transition from q_j to q_k on the letter σ iff l_k can locally follow l_j in a run of M that has σ in the corresponding location of the command trace. That is, either l_j is a goto l_k command (meaning l_j = σ = goto l_k); k is the next location after j and l_j is an inc or a dec command (meaning k = j + 1 and l_j = σ ∈ Σ_incdec); l_j is an if c=0 goto l_k else goto l_k′ command with σ = (goto l_k, c = 0); or l_j is an if c=0 goto l_s else goto l_k command with σ = (goto l_k, c > 0). The weights and discount factors of the Σ_nohalt transitions mentioned above are according to the primal functions γ_p and ρ_p, respectively.
For every location j such that l_j = halt, there is a transition from q_j to q_halt labeled by the letter halt, with a weight of 15/16 and a discount factor of 16. Every other transition that was not specified above leads to q_freeze with weight 0 and some discount factor.

5,6. Zero-Jump Checkers. The next gadgets, depicted in Fig. 8, check for violations in conditional jumps. In this case, we use a different checker instance for each counter c ∈ {x, y}, ensuring that for every if c=0 goto l_k else goto l_k′ command, if the jump goto l_k is taken, then the value of c is indeed 0.
Intuitively, q^c_ZC profits from words that have more inc(c) than dec(c) letters, while q^c continues like A. If the move to q^c occurred after a balanced number of inc(c) and dec(c), as it should be in a real command trace, neither the prefix word before the move to q^c nor the suffix word after it results in a profit. Otherwise, provided that the counter is 0 at the end of the run (as guaranteed by the negative- and positive-counters checkers), both the prefix and the suffix words get profits, resulting in a smaller value for the run.

7,8. Positive-Jump Checkers. The last gadgets, depicted in Fig. 9, are dual to the zero-jump checkers, checking for the dual violations in conditional jumps. Similarly to the zero-jump checkers, we have a different instance for each counter c ∈ {x, y}, ensuring that for every if c=0 goto l_k else goto l_k′ command, if the jump goto l_k′ is taken, then the value of c is indeed greater than 0.

Intuitively, if the counter is 0 on a (goto l_k′, c > 0) command when there was no inc(c) command yet, the gadget benefits by moving from q^c_PC0 to q_freeze. If there was an inc(c) command, it benefits by having the dual functions on the move from q^c_PC0 to q^c_PC1 over inc(c) and the primal functions on one additional self loop of q^c_PC1 over dec(c).
Lemma 2. Given a two-counter machine M, we can compute an integral DMDA A and an integral NMDA B on finite words, such that M 0-halts iff there exists a word w ∈ Σ + such that B(w) ≥ A(w) iff there exists a word w ∈ Σ + such that B(w) > A(w).
Proof. Given a two-counter machine M, consider the DMDA A and the NMDA B constructed in Section 3.1, and an input word w. Let u = pref halt (w). I. We start with the case that u correctly describes a 0-halting run of M, and show that B(w) > A(w).
Observe that in all of B's checkers, the transition over the halt command to the q halt state has a weight higher than the weight of the corresponding transition in A. Thus, if a checker behaves like A over u, namely uses the primal functions, it generates a value higher than that of A.
We show below that each of the checkers generates a value higher than the value of A on u (which is also the value of A on w), also if it nondeterministically "guesses a violation", behaving differently than A.
1. Halt Checker. Since u does have the halt command, the run of the halt checker on u, if guessing a violation, will end in the pair of transitions from q HC to q last to q freeze with discount factor 2 and weights 0 and 2, respectively.
Let D be the accumulated discount factor in the gadget up to this pair of transitions. According to Lemma 1, the accumulated weight at this point is 1 − 1/D, hence the value of the run will be 1 − 1/D + (1/D)·0 + (1/(2D))·2 = 1, which is, according to Lemma 1, larger than the value of A on any word.
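A quick arithmetic check of this computation (the helper name is ours): the two final transitions contribute 0 and 2 under the accumulated discounts D and 2D, closing the accumulated 1 − 1/D to exactly 1, independently of D:

```python
from fractions import Fraction

def halt_checker_run_value(D):
    """Value of the violation-guessing run: accumulated weight 1 - 1/D
    (Lemma 1), then weight 0 discounted by D and weight 2 discounted by 2D."""
    D = Fraction(D)
    return (1 - 1 / D) + Fraction(0) / D + Fraction(2) / (2 * D)
```

Since every word value of A is of the form 1 − 1/ρ(r) < 1 by Lemma 1, the guessed-violation run is indeed never profitable on words that do contain halt.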

2,3. Negative-and Positive-Counters Checkers.
Since u has the same number of inc(c) and dec(c) letters, by Eq. (1) and Lemma 1, these gadgets and A will have the same value on the prefix of u until the last transition, on which the gadgets will have a higher weight.

Command Checker.
As this gadget is deterministic, it cannot "guess a violation", and its value on u is larger than A(u) due to the weight on the halt command.

5,6. Zero-Jump Checkers. Consider a counter c ∈ {x, y} and a run r of the gadget on u. If r did not move to q^c, we have B(r) > A(w), similarly to the analysis of the negative- and positive-counters checkers. Otherwise, denote by t the transition that r used to move to q^c. Observe that since u correlates to the actual run of M, we have that t was indeed taken when c = 0. In this case the value of the run is not affected, since before t we have the same number of inc(c) and dec(c) letters, and after t we also have the same number of inc(c) and dec(c) letters. Hence, due to the last transition over the halt command, we have B(r) > A(u).

7,8. Positive-Jump Checkers.
Consider a counter c ∈ {x, y} and a run r of the gadget on u. If r never reaches q^c_PC1, it has the same sequence of weights and discount factors as A, except for the higher-valued halt transition. If r reaches q^c_PC1 but never reaches q^c_PC2, then since u ends with a halt letter, r ends with a transition to q_freeze that has a weight of 1, hence B(r) = 1 > A(w).
If r reaches q^c_PC2, let u = y · inc(c) · z · v, where y has no inc(c) letters, t = r[|y| + 1 + |z|] is the first transition in r targeted at q^c_PC2, and α_c ≥ 1 is the value of the counter c when t is taken. We have that 1 + #(inc(c), z) = #(dec(c), z) + α_c. Since u is balanced, we also have that #(dec(c), v) = #(inc(c), v) + α_c. For the first inc(c) letter, r gets a discount factor of ρ_d(inc(c)) = ρ_p(dec(c)). All the following inc(c) and dec(c) letters contribute discount factors according to ρ_p in z and according to ρ_d in v. Hence, r gets the discount factor ρ_p(dec(c)) a total of 1 + #(dec(c), z) + #(inc(c), v) times, and the discount factor ρ_p(inc(c)) a total of #(inc(c), z) + #(dec(c), v) times. Therefore, the value of r is at least as big as the value of A on the prefix of u until the halt transition, and due to the higher weight of r on the latter, we have B(r) > A(u).

II.
We continue with the case that u does not correctly describe a 0-halting run of M, and show that B(w) < A(w). Observe that the incorrectness must fall into one of the following cases, each of which results in a lower value of one of B's gadgets on u, compared to the value of A on u:
- The word u has no halt command. In this case the minimal-valued run of the halt checker on u will be the same as that of A until the last transition, on which the halt checker has a 0 weight, compared to a strictly positive weight in A.
- The word u does not describe a run that ends up with value 0 in both counters.
Then there are the following sub-cases:
• The word u has more dec(c) than inc(c) letters for some counter c ∈ {x, y}. For c = x, in the negative-counters checker, more discount factors were changed from 4 to 2 than from 5 to 10, compared to their values in A, implying that the total value of the gadget until the last letter will be lower than that of A on it. For c = y, we have a similar analysis with respect to the discount factors 6 changed to 3, and 7 changed to 14.
• The word u has more inc(c) than dec(c) letters for some counter c ∈ {x, y}. By Eq. (1) and Lemma 1, the value of the positive-counters checker until the last transition will be lower than that of A until the last transition.
Observe, though, that the weight of the gadgets on the halt transition (with discount factor 16) is still higher than that of A on it (with discount factor 15). Nevertheless, since a "violation detection" results in replacing at least one discount factor from 4 to 2, from 6 to 3, from 5 to 4, or from 7 to 6 (and replacing the corresponding weights, for preserving the (ρ−1)/ρ ratio), and the ratio difference between 16 and 15 is less significant than between the other pairs of weights, we have that the gadget's value, and therefore B's value on u, is smaller than A(u).
- The word u does not correctly describe the conditional jumps of M's run. Then there are the following sub-cases:
* A counter c > 0 at a position i of M's run, while u[i] = (goto l_k, c = 0). Let r be a minimal-valued run of the zero-jump checker on u, moving to q^c at position i. Compared to A, more weights changed from γ_p(inc(c)) to γ_d(inc(c)) = γ_p(dec(c)) than weights changed from γ_p(dec(c)) to γ_d(dec(c)) = γ_p(inc(c)), resulting in a lower total value of r than of A on u. (As shown for the negative- and positive-counters checkers, the higher weight of the halt transition is less significant than the lower values above.)
* A counter c = 0 at a position i of M's run, while u[i] = (goto l_k′, c > 0). Let r be a minimal-valued run of the positive-jump checker on u.
If there are no inc(c) letters in u before position i, then r has the same weights and discount factors as A until the i-th letter, on which it moves from q^c_PC0 to q_freeze, continuing with 0-weight transitions, compared to strictly positive ones in A. Otherwise, we have that the first inc(c) letter of u takes r from q^c_PC0 to q^c_PC1 with a discount factor of ρ_d(inc(c)). Then in q^c_PC1 we have more dec(c) transitions than inc(c) transitions, and in q^c_PC2 we have the same number of dec(c) and inc(c) transitions. (We may assume that u passed the previous checkers, and thus has the same total number of inc(c) and dec(c) letters.) Hence, we get two more discount factors of ρ_d(inc(c)) than of ρ_p(inc(c)), resulting in a value smaller than A(u). (As in the previous cases, the higher value of the halt transition is less significant.)

Undecidability of Comparison
For finite words, the undecidability result directly follows from Lemma 2 and the undecidability of the 0-halting problem of counter machines [31].

Theorem 1. Strict and non-strict containment of (integral) NMDAs on finite words are undecidable. More precisely, it is undecidable whether, for a given integral NMDA N and integral DMDA D, we have N(w) ≤ D(w) for all finite words w, and whether N(w) < D(w) for all finite words w.
For infinite words, undecidability of non-strict containment also follows from the reduction given in Section 3.1, as the reduction considers prefixes of the word until the first halt command. We leave open the question of whether strict containment is also undecidable for infinite words. The problem with the latter is that a halt command might never appear in an infinite word w that incorrectly describes a halting run of the two-counter machine, in which case both automata A and B of the reduction will have the same value on w. On words w that have a halt command but do not correctly describe a halting run of the two-counter machine we have B(w) < A(w), and on a word w that does correctly describe a halting run we have B(w) > A(w). Hence, the reduction only relates to whether B(w) ≤ A(w) for all words w, but not to whether B(w) < A(w) for all words w.
Theorem 2. Non-strict containment of (integral) NMDAs on infinite words is undecidable. More precisely, it is undecidable whether, for a given integral NMDA N and integral DMDA D, we have N(w) ≤ D(w) for all infinite words w.
Proof. The automata A and B in the reduction given in Section 3.1 can operate as is on infinite words, ignoring the Halt-Checker gadget of B which is only relevant to finite words.
Since the values of both A and B on an input word w only relate to the prefix u = pref_halt(w) of w until the first halt command, we still have that B(w) > A(w) if u correctly describes a halting run of the two-counter machine M, and that B(w) < A(w) if u is finite and does not correctly describe a halting run of M.
Yet, for infinite words there is also the possibility that the word w does not contain the halt command. In this case, the value of both A and the command checker of B will converge to 1, getting A(w) = B(w).
Hence, if M 0-halts, there is a word w such that B(w) > A(w), and otherwise, for all words w, we have B(w) ≤ A(w). ⊓⊔

Observe that for NMDAs, equivalence and non-strict containment are inter-reducible.
Theorem 3. Equivalence of (integral) NMDAs on finite as well as infinite words is undecidable. That is, the problem of deciding for given integral NMDAs A and B on finite or infinite words whether A(w) = B(w) for all words w.
Proof. Assume toward contradiction the existence of a procedure for the equivalence check of NMDAs. Given NMDAs A and B, we can use nondeterminism to obtain an automaton C = A ∪ B, having C(w) = min(A(w), B(w)) ≤ A(w) for all words w. We can then check whether C is equivalent to A, which holds if and only if A(w) ≤ B(w) for all words w. Indeed, if A(w) ≤ B(w) for all words w, then C(w) = min(A(w), B(w)) = A(w), while if there exists a word w such that B(w) < A(w), we have C(w) = min(A(w), B(w)) < A(w), implying that C and A are not equivalent. Thus, such a procedure contradicts the undecidability of non-strict containment, shown in Theorems 1 and 2.
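The reduction can be illustrated on plain value functions over a finite set of words (a toy stand-in for automata; all names are ours): taking C as the pointwise minimum of A and B, which is what the nondeterministic union achieves, equivalence of C and A coincides with containment of A in B:

```python
def union_min(A, B):
    """The value function of the nondeterministic union: pointwise minimum."""
    return lambda w: min(A(w), B(w))

def equivalent(F, G, words):
    return all(F(w) == G(w) for w in words)

def contained(A, B, words):
    """A(w) <= B(w) for all words."""
    return all(A(w) <= B(w) for w in words)

# Toy value functions over a tiny word domain.
words = ['a', 'ab', 'ba', 'abb']
A = lambda w: len(w)
B = lambda w: len(w) + (1 if w.startswith('a') else -1)  # not containing A
C = union_min(A, B)
```

Here B('ba') < A('ba'), so containment fails, and indeed C differs from A on 'ba'; replacing B with a function dominating A makes both checks succeed together.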

Comparison of NDAs with Different Discount Factors
We present below our algorithm for the comparison of NDAs with different discount factors. We start with automata on infinite words, and then show how to solve the case of finite words by reduction to the case of infinite words. The algorithm is based on our main observation that, due to the difference between the discount factors, we only need to consider the combination of the automata computation trees up to some level k, after which we can consider first the best/worst continuation of the automaton with the smaller discount factor, and on top of it the worst/best continuation of the second automaton.
For an NDA A, we define its lowest (resp. highest) infinite run value by lowrun(A) = min{A(r) | r is an infinite run of A (on some word w ∈ Σ^ω)} (resp. highrun(A) = max{A(r) | r is an infinite run of A}).
Observe that we can use min and max (rather than inf and sup), since the infimum and supremum values are indeed attained by specific infinite runs of the NDA (cf. [10, Proof of Theorem 9]). Notice that lowrun(A) and highrun(A) can be calculated in PTIME by a simple reduction to one-player discounted-payoff games [4].
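As a concrete illustration, lowrun can be approximated by iterating the Bellman operator of the corresponding one-player discounted-payoff game; since the operator is a contraction with ratio 1/λ, the iteration converges geometrically. (This fixed-point sketch is ours; the PTIME bound in the paper relies on exact algorithms for discounted-payoff games. Replacing min with max gives highrun.)

```python
def lowrun(transitions, states, factor, iters=200):
    """Approximate, for every state q, the minimal value of an infinite
    run starting at q, by iterating the Bellman operator
        v(q) <- min over (q, w, q') of  w + v(q') / factor,
    a contraction with ratio 1/factor.
    transitions: list of (source, weight, target); every state is assumed
    to have at least one outgoing transition."""
    v = {q: 0.0 for q in states}
    for _ in range(iters):
        # One Bellman update: each state picks its cheapest outgoing step.
        v = {q: min(w + v[q2] / factor
                    for (q0, w, q2) in transitions if q0 == q)
             for q in states}
    return v
```

For example, with a single state carrying a 1-weighted self-loop and factor 2, the iteration converges to 1 + 1/2 + 1/4 + … = 2.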
Considering word values, we also refer to the lowest (resp. highest) word value of A, defined by lowword(A) = min{A(w) | w ∈ Σ^ω} (resp. highword(A) = max{A(w) | w ∈ Σ^ω}). For an NMDA A with states Q, we define the maximal difference between suffix runs of A as maxdiff(A) = max{|A^q(w) − A^{q′}(w′)| : q, q′ ∈ Q and w, w′ ∈ Σ^ω}. Notice that maxdiff(A) ≥ 0, and that for all q, q′ ∈ Q and w, w′ ∈ Σ^ω, A^q(w) is bounded by A^{q′}(w′) + maxdiff(A).
Lemma 3. There is an algorithm that computes for every input discount factors λ_A, λ_D ∈ Q ∩ (1, ∞), λ_A-NDA A and λ_D-DDA D on infinite words the value of min{A(w) − D(w) | w ∈ Σ^ω}.
Proof. Consider an alphabet Σ, discount factors λ_A, λ_D ∈ Q ∩ (1, ∞), a λ_A-NDA A = ⟨Σ, Q_A, ι_A, δ_A, γ_A⟩ and a λ_D-DDA D = ⟨Σ, Q_D, ι_D, δ_D, γ_D⟩. When λ_A = λ_D, we can generate a λ_A-NDA C ≡ A − D over the product of A and D and compute lowword(C).
When λ_A ≠ λ_D, we consider first the case that λ_A < λ_D. Our algorithm unfolds the computation trees of A and D, up to a level in which only the minimal-valued suffix words of A remain relevant: due to the massive difference between the accumulated discount factor in A compared to the one in D, any "penalty" for not continuing with a minimal-valued suffix word in A, defined below as m_A, cannot be compensated even by the maximal-valued word of D, whose "profit" is at most as high as maxdiff(D). Hence, at that level, it is enough to look among the minimal-valued suffixes of A for the one that implies the highest value in D.
For every transition t = (q, σ, q′) ∈ δ_A, let minval(t) = γ_A(q, σ, q′) + (1/λ_A) · lowword(A^{q′}) be the best (minimal) value that A^q can get by taking t as the first transition. We say that t is preferred if it starts a minimal-valued infinite run of A^q, namely δ_pr = { t = (q, σ, q′) ∈ δ_A | minval(t) = lowword(A^q) } is the set of preferred transitions of A. Observe that an infinite run of A^q that takes only transitions from δ_pr has a value equal to lowrun(A^q) (cf. [10, Proof of Theorem 9]).

If all the transitions of A are preferred, A has the same value on all words, and then min{A(w) − D(w) | w ∈ Σ^ω} = lowrun(A) − highword(D). (Recall that since D is deterministic, we can easily compute highword(D).) Otherwise, let m_A be the minimal penalty for not taking a preferred transition in A, meaning m_A = min{ minval(t) − lowword(A^q) | t = (q, σ, q′) ∈ δ_A \ δ_pr }.

Considering the connection between m_A and maxdiff(D), notice first that if maxdiff(D) = 0, D has the same value on all words, and then we have min{A(w) − D(w) | w ∈ Σ^ω} = lowrun(A) − lowrun(D). Otherwise, meaning maxdiff(D) > 0, we unfold the computation trees of A and D for the first k levels, until the maximal difference between suffix runs in D, divided by the accumulated discount factor of D, is smaller than the minimal penalty for not taking a preferred transition in A, divided by the accumulated discount factor of A. That is, k is the minimal integer such that maxdiff(D)/λ_D^k < m_A/λ_A^k. Starting at level k, the penalty gained by taking a non-preferred transition of A cannot be compensated by a higher-valued word of D.
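The computation of the preferred transitions, the penalty m_A, and the unfolding level k can be sketched as follows, assuming lowword(A^q) has already been computed for every state q (the function and parameter names are ours, for illustration only):

```python
from fractions import Fraction

def penalty_and_level(transitions, lowword, lam_A, lam_D, maxdiff_D):
    """Given lowword(A^q) for every state q, compute the preferred
    transitions delta_pr, the minimal penalty m_A, and the minimal
    level k with maxdiff(D)/lam_D**k < m_A/lam_A**k.
    transitions: set of tuples (q, letter, weight, q2)."""
    lam_A, lam_D = Fraction(lam_A), Fraction(lam_D)
    # minval(t) = weight(t) + lowword(A^{target}) / lam_A
    minval = {t: t[2] + lowword[t[3]] / lam_A for t in transitions}
    pr = {t for t in transitions if minval[t] == lowword[t[0]]}
    non_pr = transitions - pr
    if not non_pr:
        # All transitions preferred: A has the same value on all words.
        return pr, None, None
    m_A = min(minval[t] - lowword[t[0]] for t in non_pr)
    k = 0
    while maxdiff_D / lam_D**k >= m_A / lam_A**k:
        k += 1
    return pr, m_A, k
```

For instance, with a single state q having a 0-weighted (preferred) and a 1-weighted (non-preferred) self-loop, λ_A = 2, λ_D = 3 and maxdiff(D) = 4, the penalty is m_A = 1 and k = 4 is the first level with 4/3^k < 1/2^k.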
At level k, we consider separately every run ψ of A on every prefix word u of length k. We should look for a suffix word w that minimizes A(ψ) + A^{δ_A(ψ)}(w)/λ_A^k − D(u) − D^{δ_D(u)}(w)/λ_D^k. A central point of the algorithm is that every word that minimizes A − D must take only preferred transitions of A starting at level k (see Lemma 4). As all possible remaining continuations after level k yield the same value in A, we can choose among them the continuation that yields the highest value in D.
Let B be the partial automaton with the states of A, but only its preferred transitions δ_pr. (We ignore words on which B has no runs.) We define C^{(δ_A(ψ), δ_D(u))} as a partial λ_D-NDA over the product B^{δ_A(ψ)} × D^{δ_D(u)}, while considering the weights of D; it forces suffix words that only take preferred transitions of A, while calculating among them the highest value in D.
A word w has a run in A^{δ_A(ψ)} that uses only preferred transitions iff w has a run in C^{(δ_A(ψ), δ_D(u))}. Also, observe that the nondeterminism in C is only related to the nondeterminism in A, and the weight function of C only depends on the weights of D; hence all the runs of C^{(δ_A(ψ), δ_D(u))} on the same word result in the same value, which is the value of that word in D. Combining both observations, we get that a word w has a run in A^{δ_A(ψ)} that uses only preferred transitions iff w has a run r in C^{(δ_A(ψ), δ_D(u))} such that C^{(δ_A(ψ), δ_D(u))}(r) = D^{δ_D(u)}(w). Hence, after taking the k-sized run ψ of A, and under the notations defined in Eq. (4), a suffix word w that can take only preferred transitions of A, and maximizes D^{δ_D(u)}(w), has a value of D^{δ_D(u)}(w) = highrun(C^{(δ_A(ψ), δ_D(u))}). This leads to

min{A(w) − D(w) | w ∈ Σ^ω} = min{ A(ψ) + lowrun(A^{δ_A(ψ)})/λ_A^k − D(u) − highrun(C^{(δ_A(ψ), δ_D(u))})/λ_D^k | ψ is a run of A on a length-k prefix word u },

and it is only left to calculate this value for every k-sized run of A, meaning for every leaf in the computation tree of A.
For the case that λ_A > λ_D, the algorithm is analogous, with the following changes:
- The preferred transitions of D are the ones that start a maximal-valued infinite run, that is δ_pr = { t = (p, σ, p′) ∈ δ_D | maxval(t) = highrun(D^p) }, and the minimal penalty m_D is m_D = min{ maxval(t″) − maxval(t′) | t″ = (p, σ″, p″) ∈ δ_pr, t′ = (p, σ′, p′) ∈ δ_D \ δ_pr }.
- k should be the minimal integer such that maxdiff(A)/λ_A^k < m_D/λ_D^k.
- We define B to be the restriction of D to its preferred transitions, and C^{(δ_A(ψ), δ_D(u))} as a partial λ_A-NDA on the product of A^{δ_A(ψ)} and B^{δ_D(u)}, while considering the weights of A. We then calculate lowrun(C^{(δ_A(ψ), δ_D(u))}) for every k-sized run ψ of A, and conclude that min{A(w) − D(w) | w ∈ Σ^ω} is equal to min{ A(ψ) + lowrun(C^{(δ_A(ψ), δ_D(u))})/λ_A^k − D(u) − highword(D^{δ_D(u)})/λ_D^k | ψ is a run of A on a length-k prefix word u }. Observe that in this case, it might not hold that all runs of C^{(δ_A(ψ), δ_D(u))} on the same word have the same value, but such a property is not required, since we look for the minimal run value (which is the minimal word value).

⊓ ⊔
Notice that the algorithm of Lemma 3 does not work if switching the direction of containment, namely if considering a deterministic A and a nondeterministic D. The determinism of D is required for finding the maximal value of a valid word in B^{δ_A(ψ)} × D^{δ_D(u)}. If D is not deterministic, the maximal-valued run of B^{δ_A(ψ)} × D^{δ_D(u)} on some word w equals the value of some run of D on w, but not necessarily the value of D on w. We also need D to be deterministic for computing highword(D^p) in the case that λ_A > λ_D.
To show the correctness of Lemma 3, we present the following claim.
Lemma 4. For every input discount factors λ_A, λ_D ∈ Q ∩ (1, ∞) such that λ_A < λ_D, λ_A-NDA A and λ_D-DDA D with maxdiff(D) > 0, and k the minimal integer such that maxdiff(D)/λ_D^k < m_A/λ_A^k, every infinite word w that minimizes A(w) − D(w) must take a preferred transition of A at every level n ≥ k.

Proof. Assume toward contradiction the existence of a word v that minimizes A − D, while a minimal-valued run ψ_A of A on v does not take a preferred transition at some level n ≥ k. Let u be the n-sized prefix of v, w the corresponding suffix (meaning v = u · w), ψ the prefix run of ψ_A on u, and w′ some minimal-valued word of A^{δ_A(ψ)}. The first transition taken by ψ_A when continuing with w is not preferred, meaning A^{δ_A(ψ)}(w) ≥ lowword(A^{δ_A(ψ)}) + m_A, and thus A(v) ≥ A(ψ) + (lowword(A^{δ_A(ψ)}) + m_A)/λ_A^n. Hence,

A(u·w′) − D(u·w′) ≤ A(ψ) + lowword(A^{δ_A(ψ)})/λ_A^n − D(u) − D^{δ_D(u)}(w′)/λ_D^n
≤ A(v) − D(v) − m_A/λ_A^n + (D^{δ_D(u)}(w) − D^{δ_D(u)}(w′))/λ_D^n
≤ A(v) − D(v) − m_A/λ_A^n + maxdiff(D)/λ_D^n < A(v) − D(v),

where the last inequality holds since n ≥ k and λ_A < λ_D, leading to a contradiction.

⊓ ⊔
Moving to automata on finite words, we reduce the problem to the corresponding problem handled in Lemma 3, by adding to the alphabet a new letter that represents the end of the word, and making some required adjustments.

Lemma 5. There is an algorithm that computes, for every input discount factors λ_A, λ_D ∈ Q ∩ (1, ∞), λ_A-NDA A and λ_D-DDA D on finite words, the value of inf{A(w) − D(w) | w ∈ Σ^+}, and determines whether this infimum is attained by some finite word.

Proof. Without loss of generality, we assume that initial states of automata have no incoming transitions. (Every automaton can be changed in linear time to an equivalent automaton with this property.) We convert, as described below, an NDA N on finite words to an NDA N̂ on infinite words, such that N̂ intuitively simulates the finite runs of N. For an alphabet Σ, a discount factor λ ∈ Q ∩ (1, ∞), and a λ-NDA (DDA) N = ⟨Σ, Q_N, ι_N, δ_N, γ_N⟩ on finite words, we define the λ-NDA (DDA) N̂ = ⟨Σ̂, Q_N ∪ {q_τ}, ι_N, δ̂, γ̂⟩ on infinite words. The new alphabet Σ̂ = Σ ∪ {τ} contains a new letter τ ∉ Σ that indicates the end of a finite word. The new state q_τ has 0-valued self loops on every letter of Σ̂, and there are 0-valued transitions from every non-initial state to q_τ on the new letter τ. Formally, δ̂ = δ_N ∪ {(q, τ, q_τ) | q ∈ Q_N \ ι_N} ∪ {(q_τ, σ, q_τ) | σ ∈ Σ̂}, and γ̂(t) = γ_N(t) if t ∈ δ_N and γ̂(t) = 0 otherwise. Observe that for every state q ∈ Q_N, the following hold.
1. For every finite run r of N^q, there is an infinite run r̂ of N̂^q such that N̂^q(r̂) = N^q(r), and r̂ takes some τ transitions. (r̂ can start as r and then continue with only τ transitions.)
2. For every infinite run r̂ of N̂^q that has a τ transition, there is a finite run r of N^q such that N̂^q(r̂) = N^q(r). (r can be the longest prefix of r̂ up to the first τ transition.)
3. For every infinite run r̂ of N̂^q that has no τ transition, there is a series of finite runs of N^q such that the values of the runs in N^q converge to N̂^q(r̂). (For example, the series of all prefixes of r̂.)
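The construction of N̂ can be sketched as follows (the tuple-based representation and helper names are ours, for illustration only):

```python
TAU = "τ"  # fresh end-of-word letter, assumed not to occur in Sigma

def hat(states, initial, transitions, sigma):
    """Build hat-N from a finite-word NDA N, with transitions given as
    (source, letter, weight, target) tuples: add a sink q_tau with
    0-weighted self-loops on every letter, and 0-weighted tau-transitions
    from every non-initial state to q_tau.  (Initial states are assumed
    to have no incoming transitions, so they get no tau-exit.)"""
    q_tau = "q_tau"
    sigma_hat = set(sigma) | {TAU}
    trans_hat = set(transitions)
    for q in set(states) - set(initial):
        trans_hat.add((q, TAU, 0, q_tau))
    for a in sigma_hat:
        trans_hat.add((q_tau, a, 0, q_tau))
    return set(states) | {q_tau}, set(initial), trans_hat, sigma_hat
```

A finite run of N then corresponds to an infinite run of N̂ that switches to q_τ on τ and accumulates nothing further, exactly as in observations 1 and 2 above.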
Hence, for every q ∈ Q_N we have inf{N^q(r) | r is a run of N^q} = lowrun(N̂^q) and sup{N^q(r) | r is a run of N^q} = highrun(N̂^q). (For a non-initial state q, we also consider the "run" of N^q on the empty word, and define its value to be 0.) Notice that the infimum (supremum) run value of N^q is attained by an actual run of N^q iff there is an infinite run of N̂^q that gets this value and takes a τ transition. For every state q, we can determine, as follows, whether lowrun(N̂^q) is attained by an infinite run taking a τ transition. We calculate lowrun(N̂^q) for all states, and then start a process that iteratively marks the states of N̂, such that at the end, q is marked iff lowrun(N̂^q) can be achieved by a run with a τ transition. We start with q_τ as the only marked state. In each iteration we further mark every state q from which there exists a preferred transition t = (q, σ, q′) ∈ δ_pr to some marked state q′. The process terminates when an iteration has no new states to mark. Analogously, we can determine whether highrun(N̂^q) is attained by a run that reaches q_τ.
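The marking process is a simple least-fixpoint computation over the preferred transitions; a sketch (the function name is ours):

```python
def mark_attainable(preferred, q_tau="q_tau"):
    """Iteratively mark the states q for which lowrun(hat-N^q) is
    attained by a run that reaches q_tau via preferred transitions only.
    preferred: set of preferred transitions (source, letter, target)."""
    marked = {q_tau}
    changed = True
    while changed:
        changed = False
        for (q, _a, q2) in preferred:
            # Mark q if a preferred transition leads to a marked state.
            if q2 in marked and q not in marked:
                marked.add(q)
                changed = True
    return marked
```

Each iteration marks at least one new state or terminates, so the process runs in time polynomial in the number of transitions.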
Consider discount factors λ_A, λ_D ∈ Q ∩ (1, ∞), a λ_A-NDA A and a λ_D-DDA D on finite words. When λ_A = λ_D, similarly to Lemma 3, the algorithm finds the infimum value of C ≡ A − D using Ĉ, and determines whether an actual finite word attains this value using the process described above.
Otherwise, the algorithm converts A and D to Â and D̂, and proceeds as in Lemma 3 over Â and D̂. According to the above observations, we have that inf{A(u) − D(u) | u ∈ Σ^+} = min{Â(w) − D̂(w) | w ∈ Σ̂^ω}, and that inf{A(u) − D(u)} is attainable iff min{Â(w) − D̂(w)} is attained by some word that takes a τ transition. Hence, whenever computing lowrun or highrun, we also perform the process described above, to determine whether this value is attainable by a run that has a τ transition. We determine that inf{A(u) − D(u)} is attainable iff there exists a leaf of the computation tree that leads to it, for which the relevant lowrun and highrun values are attainable.
⊓⊔

Complexity analysis. We show below that the algorithm of Lemmas 3 and 5 only needs polynomial space, with respect to the size of the input automata, implying a PSPACE algorithm for the corresponding decision problems. We define the size of an NDA N, denoted by |N|, as the maximum between the number of its transitions, the maximal binary representation of any weight in it, and the maximal unary representation of the discount factor. (Binary representation of the discount factors might cause our algorithm to use exponential space, in case the two factors are very close to each other.) The input NDAs may have rational weights, yet it will be more convenient to consider equivalent NDAs with integral weights, obtained by multiplying all the weights by their common denominator [6]. (Observe that this multiplies the values of all words by the same ratio, and it keeps the same input size, up to a polynomial change.) Before proceeding to the complexity analysis, we provide an auxiliary lemma.
Lemma 6. Consider a λ-NDA A with integral weights, where λ = p/q ∈ Q ∩ (1, ∞) is in reduced form. Then the value of every lasso run of A, with a prefix of length x and a loop of length y, is of the form a/(p^x(p^y − q^y)) for some integer a.

Proof. Let λ = p/q be A's discount factor, and γ its weight function. Consider a lasso run r = t_0, t_1, …, t_{x−1}, (t_x, t_{x+1}, …, t_{x+y−1})^ω of A. Let v_f = γ(t_0) + (1/λ)·γ(t_1) + … + (1/λ^{x−1})·γ(t_{x−1}) be its prefix value, and v_ℓ = γ(t_x) + (1/λ)·γ(t_{x+1}) + … + (1/λ^{y−1})·γ(t_{x+y−1}) its loop value. Since all the weights are integers, we have that v_f = a_f/p^x and v_ℓ = a_ℓ/p^y for some integers a_f and a_ℓ. Recall that for a loop of length y and accumulated value v_ℓ in a λ-NDA, the accumulated value of its infinite repetition is v_ℓ · λ^y/(λ^y − 1) = v_ℓ · p^y/(p^y − q^y). Hence the value of r is

v_f + (1/λ^x) · v_ℓ · p^y/(p^y − q^y) = a_f/p^x + a_ℓ·q^x/(p^x(p^y − q^y)) = (a_f·(p^y − q^y) + a_ℓ·q^x)/(p^x(p^y − q^y)).

⊓ ⊔
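The closed form of Lemma 6 can be checked mechanically; the following sketch (ours) computes the value of a lasso run with exact rational arithmetic, so that one can verify that its denominator divides p^x(p^y − q^y):

```python
from fractions import Fraction

def lasso_value(prefix, loop, p, q):
    """Value of the lasso run prefix . loop^omega in a (p/q)-NDA with
    integer weights:  v_f + (1/lam^x) * v_l * lam^y / (lam^y - 1)."""
    lam = Fraction(p, q)
    x, y = len(prefix), len(loop)
    # Discounted value of the prefix and of a single traversal of the loop.
    v_f = sum(Fraction(w) / lam**i for i, w in enumerate(prefix))
    v_l = sum(Fraction(w) / lam**i for i, w in enumerate(loop))
    return v_f + v_l * lam**y / (lam**y - 1) / lam**x
```

For example, a 1-weighted self-loop with λ = 2 evaluates to 1 + 1/2 + 1/4 + … = 2, and in every case the reduced denominator of the result divides p^x(p^y − q^y), as the lemma states.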
Proceeding to the complexity analysis, let the input size be S = |A| + |D|, the reduced forms of λ_A and λ_D be p/q and p_D/q_D respectively, the number of states in A be n, and the maximal difference between transition weights in D be M. Observe that n ≤ S, p ≤ S, M ≤ 2·2^S, λ_D/(λ_D − 1) = p_D/(p_D − q_D) ≤ p_D ≤ S, and for λ_D > λ_A > 1, we also have λ_D/λ_A = (p_D·q)/(q_D·p) ≥ 1 + 1/S^2.

Observe that A has a best infinite run (and D has a worst infinite run) in a lasso form as in Lemma 6, with x, y ∈ [1..n]. Indeed, following preferred transitions, a run must complete a lasso, and then may forever repeat its choices of preferred transitions. Hence m_A, being the difference between two lasso runs, is of the form

m_A = b_1/(p^{x_1}(p^{y_1} − q^{y_1})) − b_2/(p^{x_2}(p^{y_2} − q^{y_2})) = b_3/(p^{x_1+x_2}(p^{y_1} − q^{y_1})(p^{y_2} − q^{y_2}))

for some x_1, x_2, y_1, y_2 ≤ n and some integers b_1, b_2, b_3. As m_A > 0, the numerator b_3 is at least 1, and thus m_A > 1/2^{3S^2}. (Similarly, we can show that m_D > 1/2^{3S^2}.) We have maxdiff(D) ≤ M · λ_D/(λ_D − 1) ≤ 2·2^S·S. Recall that we unfold the computation tree until level k, which is the minimal integer such that (λ_D/λ_A)^k > maxdiff(D)/m_A. Observe that for S ≥ 1 we have (λ_D/λ_A)^{S^2} ≥ (1 + 1/S^2)^{S^2} ≥ 2, while maxdiff(D)/m_A < 2^{4S^2}, meaning that k is polynomial in S. Similar analysis shows that k is polynomial in S also for λ_D < λ_A. Considering decision problems that use our algorithm, due to the equivalence of NPSPACE and PSPACE, the algorithm can nondeterministically guess an optimal prefix word u of size k, letter by letter, as well as a run ψ of A on u, transition by transition, and then compute the value of A(ψ) + lowrun(A^{δ_A(ψ)})/λ_A^k − D(u) − highrun(C^{(δ_A(ψ), δ_D(u))})/λ_D^k.
Observe that along the run of the algorithm, we need to save the following information, which can be done in polynomial space:
- The automaton C ≡ B × D (or A × B), which requires polynomial space.
- λ_A^k (for A(ψ)) and λ_D^k (for D(u)). Since we save them in binary representation, we have log_2(λ^k) = k·log_2(λ) ≤ k·log_2(S), requiring polynomial space.
We thus get the following complexity result.
Theorem 4. For input discount factors λ A , λ D ∈ Q ∩ (1, ∞), λ A -NDA A and λ D -DDA D on finite or infinite words, it is decidable in PSPACE whether A(w) ≥ D(w) and whether A(w) > D(w) for all words w.
Proof. We use Lemma 3 in the case of infinite words and Lemma 5 in the case of finite words, checking whether min { A(w) − D(w) } < 0 and whether min { A(w) − D(w) } ≤ 0. In the case of finite words, we also use the information of whether there is an actual word that gets the desired value.
⊓⊔

Since integral NDAs can always be determinized [8], we get as a corollary that there is an algorithm to decide equivalence and strict and non-strict containment of integral NDAs with different (or the same) discount factors. Note, however, that it might not be in PSPACE, since determinization exponentially increases the number of states, resulting in k that is exponential in S, and storing in binary representation values in the order of λ^k might require exponential space.

Corollary 1.
There are algorithms to decide for input integral discount factors λ A , λ B ∈ N, λ A -NDA A and λ B -NDA B on finite or infinite words whether or not A(w) > B(w), A(w) ≥ B(w), or A(w) = B(w) for all words w.

Conclusions
The new decidability result, providing an algorithm for comparing discounted-sum automata with different integral discount factors, may allow extending the usage of discounted-sum automata in formal verification, while the undecidability result strengthens the justification for restricting discounted-sum automata with multiple integral discount factors to tidy NMDAs. The new algorithm also extends the possible, more limited, usage of discounted-sum automata with rational discount factors, though further research is needed in this direction.