Weighted Tree Automata with Constraints

The HOM problem, which asks whether the image of a regular tree language under a given tree homomorphism is again regular, is known to be decidable [Godoy & Giménez: The HOM problem is decidable. JACM 60(4), 2013]. However, the problem remains open for regular weighted tree languages. It is demonstrated that the main notion used in the unweighted setting, the tree automaton with equality and inequality constraints, can straightforwardly be generalized to the weighted setting and can represent the image of any regular weighted tree language under any nondeleting and nonerasing tree homomorphism. Several closure properties as well as decision problems are also investigated for the weighted tree languages generated by weighted tree automata with constraints.


Introduction
Numerous extensions of nondeterministic finite-state string automata have been proposed in the past few decades. On the one hand, the qualitative evaluation of inputs was extended to a quantitative evaluation in the weighted automata of [23]. This development led to the fruitful study of recognizable formal power series [22], which are well-suited for representing factors such as costs, consumption of resources, or time and probabilities related to the processed input. The main algebraic structures for the weight calculations are semirings [16,17], which offer a nice compromise between generality and efficiency of computation (due to their distributivity). On the other hand, finite-state automata have been generalized to other input structures such as infinite words [21] and trees [4]. Finite-state tree automata were introduced independently in [7,24,25], and they and the tree languages they generate, called regular tree languages, have been intensively studied since their inception [4]. They are successfully utilized in various applications in many diverse areas like natural language processing [18], picture generation [8], and compiler construction [28]. Indeed, several applications require the combination of the two mentioned generalizations, and a broad range of weighted tree automaton (WTA) models has been studied (see [13, Chapter 9] for an overview).
It is well-known that finite-state tree automata cannot ensure that two subtrees (of potentially arbitrary size) are always equal in an accepted tree [14]. An extension proposed in [20] aims to remedy this problem and introduces a tree automaton model that can explicitly require certain subtrees to be equal or different. Such models are very useful when investigating (tree) transformation models (see [13] for an overview) that can copy subtrees (thus resulting in equal subtrees in the output), and they are the main tool used in the seminal paper [15] that proved that the HOM problem is decidable. The HOM problem was a long-standing open problem in the theory of tree languages and was recently solved in [15]. It asks whether the image of an (effectively presented) regular tree language under a given tree homomorphism is again regular. This is not necessarily the case as tree homomorphisms can create copies of subtrees. Indeed, removing this ability from the tree homomorphism, obtaining a linear tree homomorphism, yields that the mentioned image is always regular [14]. In the solution to the HOM problem provided in [15] the image is first represented by a tree automaton with constraints, and then it is investigated whether this tree automaton actually generates a regular tree language.
The HOM problem is also interesting in the weighted setting as it once again answers whether a given homomorphic image of a regular weighted tree language can be represented efficiently. While preservation of regularity has also been investigated in the weighted setting [3,10,11,12], the decidability of the HOM problem remains wide open. With the goal of investigating this problem, we introduce weighted tree grammars with constraints (WTGc for short) in this contribution. We demonstrate that those WTGc can again represent all (nondeleting and nonerasing) homomorphic images of the regular weighted tree languages. Thus, in principle, it only remains to provide a decision procedure for determining whether a given WTGc generates a regular weighted tree language. We approach this task by providing some common closure properties following essentially the steps also taken in [15]. For zero-sum free semirings we can also show that decidability of support emptiness and finiteness are directly inherited from the unweighted case [15].
The present work is a revised and extended version of [29] presented at the 26th Int. Conf. Developments in Language Theory (DLT 2022). We provide additional proof details and examples, as well as a new pumping lemma for the class of (nondeleting and nonerasing) homomorphic images of regular weighted tree languages. We utilize this pumping lemma to show that for any zero-sum free semiring, the class of homomorphic images of regular weighted tree languages is properly contained in the class of weighted tree languages generated by all positive WTGc, which are WTGc that utilize only equality constraints.

Preliminaries
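We denote the set of nonnegative integers by $\mathbb{N}$, and we let $[k] = \{i \in \mathbb{N} \mid 1 \le i \le k\}$ for every $k \in \mathbb{N}$. For all sets $T$ and $Z$ let $T^Z$ be the set of all mappings $\varphi \colon Z \to T$, and correspondingly we sometimes write $\varphi_z$ instead of $\varphi(z)$ for every $\varphi \in T^Z$. The inverse image $\varphi^{-1}(S)$ of $\varphi$ for a subset $S \subseteq T$ is $\varphi^{-1}(S) = \{z \in Z \mid \varphi(z) \in S\}$, and we write $\varphi^{-1}(t)$ instead of $\varphi^{-1}(\{t\})$ for every $t \in T$. The range of $\varphi$ is $\mathrm{ran}(\varphi) = \{\varphi(z) \mid z \in Z\}$. Finally, the cardinality of $Z$ is denoted by $|Z|$.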
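A ranked alphabet $(\Sigma, \mathrm{rk})$ is a pair consisting of a finite set $\Sigma$ and a map $\mathrm{rk} \in \mathbb{N}^{\Sigma}$ that assigns a rank to each symbol of $\Sigma$. If there is no risk of confusion, we denote a ranked alphabet $(\Sigma, \mathrm{rk})$ by $\Sigma$. We write $\sigma^{(k)}$ to indicate that $\mathrm{rk}(\sigma) = k$, and we let $\Sigma_k = \mathrm{rk}^{-1}(k)$ for every $k \in \mathbb{N}$. Given a ranked alphabet $\Sigma$ and a set $Z$, the set $T_\Sigma(Z)$ of $\Sigma$-trees indexed by $Z$ is the smallest set such that $Z \subseteq T_\Sigma(Z)$ and $\sigma(t_1, \dots, t_k) \in T_\Sigma(Z)$ for every $k \in \mathbb{N}$, $\sigma \in \Sigma_k$, and $t_1, \dots, t_k \in T_\Sigma(Z)$. We abbreviate $T_\Sigma(\emptyset)$ simply to $T_\Sigma$, and any subset $L \subseteq T_\Sigma$ is called a tree language.
Let $\Sigma$ be a ranked alphabet, $Z$ a set, and $t \in T_\Sigma(Z)$. The set $\mathrm{pos}(t) \subseteq \mathbb{N}^*$ of positions of $t$ is inductively defined by $\mathrm{pos}(z) = \{\varepsilon\}$ for all $z \in Z$ and by $\mathrm{pos}(\sigma(t_1, \dots, t_k)) = \{\varepsilon\} \cup \{iw \mid i \in [k],\, w \in \mathrm{pos}(t_i)\}$ for all $k \in \mathbb{N}$, $\sigma \in \Sigma_k$, and $t_1, \dots, t_k \in T_\Sigma(Z)$. The size $|t|$ of $t$ is defined as $|t| = |\mathrm{pos}(t)|$, and its height $\mathrm{ht}(t)$ is $\mathrm{ht}(t) = \max_{w \in \mathrm{pos}(t)} |w|$. For $w \in \mathrm{pos}(t)$ and $t' \in T_\Sigma(Z)$, the label $t(w)$ of $t$ at $w$, the subtree $t|_w$ of $t$ at $w$, and the substitution $t[t']_w$ of $t'$ into $t$ at $w$ are defined by $z(\varepsilon) = z|_\varepsilon = z$ and $z[t']_\varepsilon = t'$ for all $z \in Z$ and for $t = \sigma(t_1, \dots, t_k)$ by $t(\varepsilon) = \sigma$, $t(iw') = t_i(w')$, $t|_\varepsilon = t$, $t|_{iw'} = t_i|_{w'}$, $t[t']_\varepsilon = t'$, and $t[t']_{iw'} = \sigma(t_1, \dots, t_{i-1}, t_i[t']_{w'}, t_{i+1}, \dots, t_k)$ for all $i \in [k]$ and $w' \in \mathrm{pos}(t_i)$. For all $S \subseteq \Sigma \cup Z$, we let $\mathrm{pos}_S(t) = \{w \in \mathrm{pos}(t) \mid t(w) \in S\}$ and $\mathrm{var}(t) = \{x \in X \mid \mathrm{pos}_x(t) \neq \emptyset\}$. For a single $\sigma \in \Sigma \cup Z$ we abbreviate $\mathrm{pos}_{\{\sigma\}}(t)$ simply by $\mathrm{pos}_\sigma(t)$.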
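The notions of position, label, subtree, and substitution can be made concrete with a small sketch. The encoding below is an assumption made for illustration only and is not part of the formal development: a tree is a nested Python tuple `(label, child_1, ..., child_k)`, and a position is a tuple of 1-based child indices, with `()` playing the role of ε.

```python
# Trees are nested tuples (label, child_1, ..., child_k); a leaf is (label,).
# Positions are tuples of 1-based child indices; () stands for the empty word.

def positions(t):
    """Return pos(t) as a list, root first, children from left to right."""
    result = [()]
    for i, child in enumerate(t[1:], start=1):
        result.extend((i,) + w for w in positions(child))
    return result

def subtree(t, w):
    """Return the subtree of t at position w (w must be in pos(t))."""
    for i in w:
        t = t[i]  # child i is stored at tuple index i
    return t

def label(t, w):
    """Return the label of t at position w."""
    return subtree(t, w)[0]

def substitute(t, w, s):
    """Return t with the subtree at position w replaced by s."""
    if not w:
        return s
    i = w[0]
    return t[:i] + (substitute(t[i], w[1:], s),) + t[i + 1:]

def size(t):
    """Number of positions of t."""
    return len(positions(t))

def height(t):
    """Length of a longest position of t."""
    return max(len(w) for w in positions(t))

alpha = ("α",)
t = ("σ", ("γ", ("γ", alpha)), ("γ", alpha))
```

For the tree `t` above, `subtree(t, (1, 1))` and `subtree(t, (2,))` are both `γ(α)`, which is exactly the kind of equality that the constraints introduced later can test.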

Weighted Tree Grammars with Constraints
Let us start with the formal definition of our weighted tree grammars. They are a weighted variant of the tree automata with equality and inequality constraints originally introduced in [1,5]. Compared to [1,5] our model is slightly more expressive as we allow arbitrary constraints, whereas constraints were restricted to subtrees occurring in the productions in [1,5]. This more restricted version will be called classic in the following. An overview of further developments for these automata can be found in [26]. We essentially use the version recently utilized to solve the HOM problem [15, Definition 4.1]. For the rest of this section, let $(S, +, \cdot, 0, 1)$ be a commutative semiring.
Definition 1 (see [15, Definition 4.1]) A weighted tree grammar with constraints (WTGc) is a tuple $G = (Q, \Sigma, F, P, \mathrm{wt})$ such that
- $Q$ is a finite set of nonterminals and $F \in S^Q$ assigns final weights,
- $\Sigma$ is a ranked alphabet of input symbols,
- $P$ is a finite set of productions of the form $(\ell, q, E, I)$, where $\ell \in T_\Sigma(Q) \setminus Q$, $q \in Q$, and $E, I \subseteq \mathbb{N}^* \times \mathbb{N}^*$ are finite sets, and
- $\mathrm{wt} \in S^P$ assigns a weight to each production.
⊓ ⊔
In the following, let $G = (Q, \Sigma, F, P, \mathrm{wt})$ be a WTGc. The components of a production $p = (\ell, q, E, I) \in P$ are the left-hand side $\ell$, the target nonterminal $q$, the set $E$ of equality constraints, and the set $I$ of inequality constraints. Correspondingly, the production $p$ is also written $\ell \xrightarrow{E,I} q$ or even $\ell \xrightarrow{E,I}_{\mathrm{wt}_p} q$ if we want to indicate its weight. Additionally, we simply list an equality constraint $(v, v') \in E$ as $v = v'$ and an inequality constraint $(v, v') \in I$ as $v \neq v'$. A production $\ell \xrightarrow{E,I} q$ is normalized if $\ell = \sigma(q_1, \dots, q_k)$ for some $k \in \mathbb{N}$, $\sigma \in \Sigma_k$, and $q_1, \dots, q_k \in Q$, it is positive if $I = \emptyset$; i.e., it has no inequality constraints, and it is unconstrained if $E = \emptyset = I$; i.e., the production has no constraints at all. Instead of $\ell \xrightarrow{\emptyset,\emptyset} q$ we also write just $\ell \to q$. The production is classic if $\{v, v'\} \subseteq \mathrm{pos}_Q(\ell)$ for all constraints $(v, v') \in E \cup I$. In other words, in a classic production the constraints can only refer to nonterminal-labeled subtrees of the left-hand side. The WTGc $G$ is a weighted tree automaton with constraints (WTAc) if all productions $p \in P$ are normalized, and it is a weighted tree grammar (WTG) [14] if all productions $p \in P$ are unconstrained. If $G$ is both a WTAc as well as a WTG, then it is a weighted tree automaton (WTA) [14]. All these devices have Boolean final weights if $F \in \{0, 1\}^Q$, they are positive if every $p \in P$ is positive, and they are classic if every production $p \in P$ is classic. Finally, if we utilize the Boolean semiring $\mathbb{B}$, then we reobtain the unweighted versions and omit the 'W' in the abbreviations and the mapping 'wt' from the tuple.
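The semantics for our WTGc $G$ is a slightly non-standard derivation semantics when compared to [15, Definitions 4.3 & 4.4]. Let $(v, v') \in \mathbb{N}^* \times \mathbb{N}^*$ and $t \in T_\Sigma$. If $v, v' \in \mathrm{pos}(t)$ and $t|_v = t|_{v'}$, we say that $t$ satisfies $(v, v')$, otherwise $t$ dissatisfies $(v, v')$. Let now $C \subseteq \mathbb{N}^* \times \mathbb{N}^*$ be a finite set of constraints. We write $t \models C$ if $t$ satisfies all $(v, v') \in C$, and $t \not\models^{\forall} C$ if $t$ dissatisfies all $(v, v') \in C$. Universally dissatisfying $C$ is generally stronger than simply not satisfying $C$.
Definition 2 A sentential form (for $G$) is simply a tree $\xi \in T_\Sigma(Q)$. Given an input tree $t \in T_\Sigma$, sentential forms $\xi, \zeta \in T_\Sigma(Q)$, a production $p = \ell \xrightarrow{E,I} q \in P$, and a position $w \in \mathrm{pos}(\xi)$, we write $\xi \Rightarrow^{p,w}_{G,t} \zeta$ if $\xi|_w = \ell$, $\zeta = \xi[q]_w$, and the constraints $E$ and $I$ are fulfilled on $t|_w$; i.e., $t|_w \models E$ and $t|_w \not\models^{\forall} I$. A sequence $d = (p_1, w_1) \cdots (p_n, w_n) \in (P \times \mathbb{N}^*)^*$ is a derivation (of $G$) for $t$ if there exist $\xi_1, \dots, \xi_n \in T_\Sigma(Q)$ such that $t \Rightarrow^{p_1,w_1}_{G,t} \xi_1 \Rightarrow^{p_2,w_2}_{G,t} \cdots \Rightarrow^{p_n,w_n}_{G,t} \xi_n$. It is left-most if additionally $w_1 \prec w_2 \prec \cdots \prec w_n$, where $\preceq$ is the lexicographic order on $\mathbb{N}^*$ in which prefixes are larger, so $\varepsilon$ is the largest element.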
⊓ ⊔
Note that the sentential forms $\xi_1, \dots, \xi_n$ are uniquely determined if they exist, and for any derivation $d$ for $t$ there exists a unique permutation of $d$ that is a left-most derivation for $t$. The derivation $d$ is complete if $\xi_n \in Q$, and in that case it is also called a derivation to $\xi_n$. The set of all complete left-most derivations for $t$ to $q \in Q$ is denoted by $D^q_G(t)$. Let $p = \ell \xrightarrow{E,I} q \in P$ be a production. Since there exist unique $k = |\mathrm{pos}_Q(\ell)|$, $c \in C_\Sigma(X_k)$, and $q_1, \dots, q_k \in Q$ such that $\ell = c[q_1, \dots, q_k]$, we also simply write $c[q_1, \dots, q_k] \xrightarrow{E,I} q$ instead of $p$. Using this notation, we can present a recursion for the set $D^q_G(t)$. Let $d = (p_1, w_1) \cdots (p_n, w_n)$ be a complete derivation for some tree $t \in T_\Sigma$. For a given position $w \in \{w_1, \dots, w_n\}$, we let $k \in \mathbb{N}$ and $1 \le i_1 < \cdots < i_k \le n$ be the indices of the derivation steps applied to positions below $w$ with $w'_i$ being the suffix of $w_i$ following the prefix $w$ for all $i \in \{i_1, \dots, i_k\}$. The derivation for $t|_w$ incorporated in $d$ is the derivation $(p_{i_1}, w'_{i_1}), \dots, (p_{i_k}, w'_{i_k})$. Conversely, for every $w \in \mathbb{N}^*$ we abbreviate the derivation $(p_1, ww_1) \cdots (p_n, ww_n)$ by simply $wd$.
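The satisfaction notions for constraints that precede Definition 2 can be sketched in a few lines. The tuple encoding of trees and positions and the function names are assumptions made for illustration; in particular, a position outside of pos(t) makes a constraint dissatisfied, as in the text.

```python
def subtree(t, w):
    """Return the subtree of t at w, or None if w is not a position of t."""
    for i in w:
        if not 1 <= i < len(t):  # children occupy tuple indices 1..k
            return None
        t = t[i]
    return t

def satisfies(t, constraint):
    """t satisfies (v, v') iff both are positions of t with equal subtrees."""
    v, v2 = constraint
    s, s2 = subtree(t, v), subtree(t, v2)
    return s is not None and s2 is not None and s == s2

def models(t, C):
    """t |= C: every constraint of C is satisfied."""
    return all(satisfies(t, c) for c in C)

def dissatisfies_all(t, C):
    """Universal dissatisfaction: every constraint of C is dissatisfied."""
    return all(not satisfies(t, c) for c in C)

t = ("σ", ("γ", ("γ", ("α",))), ("γ", ("α",)))
C = [((1, 1), (2,)), ((1,), (2,))]  # first constraint holds, second fails
```

With the set `C` above, `models(t, C)` and `dissatisfies_all(t, C)` are both false, which illustrates why universally dissatisfying a set of constraints is stronger than merely not satisfying it.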
The weighted tree language generated by $G$, written simply $G \in S^{T_\Sigma}$, is defined for every $t \in T_\Sigma$ by $G_t = \sum_{q \in Q,\, d \in D^q_G(t)} F_q \cdot \mathrm{wt}_G(d)$, where $\mathrm{wt}_G(d) = \prod_{i=1}^{n} \mathrm{wt}_{p_i}$ is the weight of the derivation $d = (p_1, w_1) \cdots (p_n, w_n)$. Two WTGc are equivalent if they generate the same weighted tree language. Finally, a weighted tree language is regular if it is generated by some WTG, positive constraint-regular if it is generated by some positive WTGc, classic constraint-regular if it is generated by some classic WTGc, and constraint-regular if it is generated by some WTGc.
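Since the weights of productions are multiplied, we can assume without loss of generality that $\mathrm{wt}_p \neq 0$ for all $p \in P$.
Example 1 Consider the WTGc $G = (Q, \Sigma, F, P, \mathrm{wt})$ over the arctic semiring $\mathbb{A}$ with nonterminals $Q = \{q, q'\}$, $\Sigma = \{\alpha^{(0)}, \gamma^{(1)}, \sigma^{(2)}\}$, $F_q = -\infty$, $F_{q'} = 0$, and $P$ and 'wt' given by the productions $p_1 = \alpha \to_0 q$, $p_2 = \gamma(q) \to_1 q$, and $p_3 = \sigma(\gamma(q), q) \xrightarrow{11=2}_1 q'$. Clearly, $G$ is positive and classic, but not a WTAc. The tree $t = \sigma(\gamma(\gamma(\alpha)), \gamma(\alpha))$ has the unique left-most derivation $d = (p_1, 111)(p_2, 11)(p_1, 21)(p_2, 2)(p_3, \varepsilon)$ to the nonterminal $q'$, which is illustrated in Figure 1. Overall, we have $G_t = F_{q'} + \mathrm{wt}_{p_1} + \mathrm{wt}_{p_2} + \mathrm{wt}_{p_1} + \mathrm{wt}_{p_2} + \mathrm{wt}_{p_3} = 0 + 0 + 1 + 0 + 1 + 1 = 3$, where the product of the arctic semiring is the ordinary addition.
Next, we introduce another semantics, called initial algebra semantics, which is based on the presented recursive presentation of derivations and often more convenient in proofs.
Definition 4 For every nonterminal $q \in Q$ we recursively define the map $\mathrm{wt}^q_G \in S^{T_\Sigma}$ for every $t \in T_\Sigma$ by
$$\mathrm{wt}^q_G(t) = \sum_{\substack{p = c[q_1, \dots, q_k] \xrightarrow{E,I} q \,\in\, P \\ t_1, \dots, t_k \in T_\Sigma \\ t = c[t_1, \dots, t_k],\; t \models E,\; t \not\models^{\forall} I}} \mathrm{wt}_p \cdot \prod_{i=1}^{k} \mathrm{wt}^{q_i}_G(t_i)$$
⊓ ⊔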
It is a routine matter to verify that $\mathrm{wt}^q_G(t) = \sum_{d \in D^q_G(t)} \mathrm{wt}_G(d)$ for every $q \in Q$ and $t \in T_\Sigma$. This utilizes the presented recursive decomposition of complete derivations as well as distributivity of the semiring $S$.
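The initial algebra semantics lends itself to a direct recursive evaluation. The following sketch evaluates the WTGc of Example 1 under several assumptions that are ours and not the paper's: the arctic semiring is taken to be (max, +) with zero −∞ and one 0, trees are nested tuples, nonterminals in left-hand sides are plain strings, and a production is a tuple `(lhs, target, E, I, weight)`.

```python
NEG_INF = float("-inf")  # zero of the arctic semiring (assumed to be (max, +))

def subtree(t, w):
    for i in w:
        if not 1 <= i < len(t):
            return None
        t = t[i]
    return t

def satisfies(t, constraint):
    s, s2 = subtree(t, constraint[0]), subtree(t, constraint[1])
    return s is not None and s2 is not None and s == s2

def match(lhs, t):
    """Match lhs = c[q_1..q_k] against t; return [(q_i, t_i), ...] or None."""
    if isinstance(lhs, str):  # a nonterminal stands for an arbitrary subtree
        return [(lhs, t)]
    if lhs[0] != t[0] or len(lhs) != len(t):
        return None
    bindings = []
    for l_child, t_child in zip(lhs[1:], t[1:]):
        sub = match(l_child, t_child)
        if sub is None:
            return None
        bindings.extend(sub)
    return bindings

def wt(q, t, productions):
    """Recursive weight of t in nonterminal q (arctic sum = max, product = +)."""
    total = NEG_INF
    for lhs, target, E, I, weight in productions:
        bindings = match(lhs, t) if target == q else None
        if bindings is None:
            continue
        if not all(satisfies(t, c) for c in E) or any(satisfies(t, c) for c in I):
            continue
        value = weight
        for q_i, t_i in bindings:
            value += wt(q_i, t_i, productions)
        total = max(total, value)
    return total

def generated_weight(t, productions, final):
    return max(f + wt(q, t, productions) for q, f in final.items())

# Productions p1, p2, p3 and final weights of Example 1.
P = [(("α",), "q", [], [], 0),
     (("γ", "q"), "q", [], [], 1),
     (("σ", ("γ", "q"), "q"), "q'", [((1, 1), (2,))], [], 1)]
F = {"q": NEG_INF, "q'": 0}
t = ("σ", ("γ", ("γ", ("α",))), ("γ", ("α",)))
```

Under these assumptions `generated_weight(t, P, F)` evaluates to 3, while a tree such as σ(γ(α), γ(α)) violates the constraint 11 = 2 and receives the weight −∞.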
As for WTG and WTA [13], also every (positive) WTGc can be turned into an equivalent (positive) WTAc at the expense of additional nonterminals by decomposing the left-hand sides.
Proof Let G = (Q, Σ , F, P, wt) be a WTGc with a non-normalized production Q) be an injective map such that ϕ q = q for all q ∈ Q.We de- fine the WTGc q for all q ∈ Q and F ′ q ′ = 0 for all q ′ ∈ Q ′ \ Q, and and for every p To prove that G ′ is equivalent to G we observe that for every left-most derivation of G, there exists a corresponding derivation d ′ of G ′ , which is obtained by replacing each derivation step (p a , w a ) with p a = p by the sequence of derivation steps of G ′ (yielding also a unique corresponding left-most derivation).This replacement preserves the weight of the derivation.Vice versa any left-most derivation of G ′ that utilizes the production σ (ϕ ℓ 1 , . . . ,ϕ ℓ k ) E,I −→ q ∈ P ′ at w needs to previously utilize the productions ℓ i → ϕ ℓ i ∈ P ′ at wi for all i ∈ [k] with ℓ i / ∈ Q since these are the only productions that generate the nonterminal ϕ ℓ i .Thus, we established a weight-preserving bijection between the left-most derivations of G and G ′ , so it is obvious that G ′ = G.Repeated application of the normalization eventually (after finitely many steps) yields an equivalent WTAc.Finally, we note that the constructed WTAc is positive if the original WTGc is positive.
⊓ ⊔ As we will see in the next example, the construction used in the proof of Lemma 1 does not preserve the classic property.
Example 2 Consider the classic and positive WTGc $G$ of Example 1 and its non-normalized production $p = \sigma(\gamma(q), q) \xrightarrow{11=2}_1 q'$. Applying the construction in the proof of Lemma 1 we replace $p$ by the productions $\sigma(q'', q) \xrightarrow{11=2}_1 q'$, which is not classic, and $\gamma(q) \to_0 q''$, where $q''$ is some new nonterminal. The WTGc obtained this way is already a positive WTAc.

⊓ ⊔
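The decomposition of left-hand sides used for normalization can be sketched as follows. This is a simplified illustration under our own assumptions: trees are nested tuples, nonterminals are strings, a production is a tuple `(lhs, target, E, I, weight)`, and every non-nonterminal direct subtree receives its own fresh nonterminal (hypothetical names `n0`, `n1`, ...) instead of being looked up through an injective map on subtrees.

```python
import itertools

def is_normalized(lhs):
    """A left-hand side is normalized if all its direct subtrees are nonterminals."""
    return all(isinstance(child, str) for child in lhs[1:])

def normalize_step(production, one, fresh):
    """Split off every direct subtree of the left-hand side that is not a nonterminal."""
    lhs, target, E, I, weight = production
    children, extra = [], []
    for child in lhs[1:]:
        if isinstance(child, str):
            children.append(child)
        else:
            q_new = next(fresh)
            children.append(q_new)
            # unconstrained helper production carrying the semiring one
            extra.append((child, q_new, [], [], one))
    root = ((lhs[0],) + tuple(children), target, E, I, weight)
    return [root] + extra

def normalize_all(productions, one):
    """Apply the step repeatedly until every production is normalized."""
    fresh = (f"n{i}" for i in itertools.count())
    todo, done = list(productions), []
    while todo:
        p = todo.pop()
        if is_normalized(p[0]):
            done.append(p)
        else:
            todo.extend(normalize_step(p, one, fresh))
    return done

# Production p3 of Example 1; the one of the arctic semiring is 0.
p3 = (("σ", ("γ", "q"), "q"), "q'", [((1, 1), (2,))], [], 1)
result = normalize_all([p3], one=0)
```

The constraint 11 = 2 is carried over unchanged, so in the resulting root production it refers to a position strictly below the new nonterminal, which is why the classic property is lost.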
Another routine normalization turns the final weights into Boolean final weights following the approach of [2, Lemma 6.1.1]. This is achieved by adding special copies of all nonterminals that terminate the derivation and pre-apply the final weight.
Lemma 2 WTGc and WTGc with Boolean final weights are equally expressive. This also applies to positive WTGc, classic WTGc, and classic positive WTGc as well as the corresponding WTAc.
We construct the WTGc belongs to P ′ and wt ′ p ′ = wt p •F q for every p = ℓ E,I −→ q ∈ P. No other productions belong to P ′ .Finally, F ′ q = 0 for all q ∈ Q and F c = 1 for all c ∈ C. The proof of equivalence is straightforward showing for every t ∈ T Σ and q ∈ Q that wt q G ′ (t) = wt q G (t) and wt The construction trivially preserves the properties normalized, positive, and classic.⊓ ⊔ t) be a derivation for some q ∈ Q and t ∈ T Σ .Since we often argue with the help of such derivations d, it is a nuisance that we might have wt G (d) = 0.This anomaly can occur even if wt p = 0 for all p ∈ P due to the presence of zerodivisors, which are elements s, s ′ ∈ S \ {0} such that s • s ′ = 0.However, we can fortunately avoid such anomalies altogether utilizing a construction of [19], which has been lifted to tree automata in [9].
).This also applies to positive WTGc, classic WTGc, and classic positive WTGc as well as the same WTAc.The construction also preserves Boolean final weights.
Proof Let G = (Q, Σ , F, P, wt).Obviously, (S, •, 1, 0) is a commutative monoid with zero.Let (s 1 , . . ., s n ) be an enumeration of the finite set wt(P) \ {1} ⊆ S. We consider the monoid homomorphism h : N n → S, which is given by for every m 1 , . . ., m n ∈ N. According to DICKSON's lemma [6] the set min h −1 (0) is finite, where the partial order is the standard pointwise order on N n .Hence there is u ∈ N such that min h −1 (0) ⊆ {0, . . ., u} n = U.We define the operation ⊕ : q,v = F q for all q, v ∈ Q ′ , and P ′ and wt ′ are given as follows.For every production belongs to P ′ and its weight is wt ′ p ′ = wt p .No further productions are in P ′ .The construction trivially preserves the properties positive, classic, and normalized.For correctness, let q ′ = q, v ∈ Q ′ , t ′ ∈ T Σ , and d ′ ∈ D q ′ G ′ (t ′ ).We suitably (for the purpose of zero-divisors) track the weight of the derivation in v and h v = 0 by definition.Consequently, wt ′ G ′ (d ′ ) = 0 as required.We note that possibly wt For zero-sum free semirings [16,17] we obtain that the support supp(G) of an WTGc can be generated by a TGc.A semiring is zero-sum free if s = 0 = s ′ for every s, s ′ ∈ S such that s + s ′ = 0. Clearly, rings are never zero-sum free, but the mentioned semirings B, N, T, and A are all zero-sum free.
Proof We apply Lemma 2 to obtain an equivalent WTGc with Boolean final weights and then Lemma 3 to obtain the WTGc G ′ = (Q ′ , Σ , F ′ , P ′ , wt ′ ) with Boolean final weights.As mentioned we can assume that wt ′ p ′ = 0 for all p ′ ∈ P ′ .Let G ′ (t ′ ) and s + s ′ = 0 for all s, s ′ ∈ S \ {0} due to zero-sum freeness, we obtain t ′ ∈ supp(G ′ ).Thus, the existence of a complete derivation for t ′ to an accepting nonterminal (i.e., one with final weight 1) characterizes whether we have t ′ ∈ supp(G ′ ).Consequently, the TGc Q ′ , Σ , supp(F ′ ), P ′ generates the tree language supp(G ′ ), which is thus constraint-regular.The properties positive and classic are preserved in all the constructions.

Closure Properties
Next we investigate several closure properties of the constraint-regular weighted tree languages. We start with the (point-wise) sum, which is given by $(A + A')_t = A_t + A'_t$ for every $t \in T_\Sigma$ and $A, A' \in S^{T_\Sigma}$. Given WTGc $G$ and $G'$ generating $A$ and $A'$ we can trivially use a disjoint union construction to obtain a WTGc generating $A + A'$. We omit the details.
Proposition 1 The (positive, classic) constraint-regular weighted tree languages (over a fixed ranked alphabet) are closed under sums.

⊓ ⊔
The corresponding (point-wise) product is the HADAMARD product, which is given by $(A \cdot A')_t = A_t \cdot A'_t$ for every $t \in T_\Sigma$ and $A, A' \in S^{T_\Sigma}$. With the help of a standard product construction we show that the (positive) constraint-regular weighted tree languages are also closed under HADAMARD product. As preparation we introduce a special normal form.
In other words, two productions cannot differ only in the sets of constraints. It is straightforward to turn any (positive) WTAc into an equivalent constraint-determined (positive) WTAc by introducing additional nonterminals (e.g., by annotating the constraints to the nonterminal on the right-hand side).
Theorem 1 The (positive) constraint-regular weighted tree languages (over a fixed ranked alphabet) are closed under HADAMARD products.
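Proof Let $A, A' \in S^{T_\Sigma}$ be constraint-regular. Without loss of generality (see Lemma 1) we can assume constraint-determined WTAc $G = (Q, \Sigma, F, P, \mathrm{wt})$ and $G' = (Q', \Sigma, F', P', \mathrm{wt}')$ that generate $A$ and $A'$, respectively. We construct the direct product WTAc $G \times G' = (Q \times Q', \Sigma, F'', P'', \mathrm{wt}'')$ such that $F''_{\langle q, q' \rangle} = F_q \cdot F'_{q'}$ for every $q \in Q$ and $q' \in Q'$ and for every production $p = \sigma(q_1, \dots, q_k) \xrightarrow{E,I} q \in P$ and production $p' = \sigma(q'_1, \dots, q'_k) \xrightarrow{E',I'} q' \in P'$ the production $p'' = \sigma(\langle q_1, q'_1 \rangle, \dots, \langle q_k, q'_k \rangle) \xrightarrow{E \cup E',\, I \cup I'} \langle q, q' \rangle$ belongs to $P''$ and its weight is $\mathrm{wt}''_{p''} = \mathrm{wt}_p \cdot \mathrm{wt}'_{p'}$. No other productions belong to $P''$. It is straightforward to see that the property positive is preserved. The correctness proof shows that $\mathrm{wt}^{\langle q, q' \rangle}_{G \times G'}(t) = \mathrm{wt}^q_G(t) \cdot \mathrm{wt}^{q'}_{G'}(t)$ for all $t \in T_\Sigma$, $q \in Q$, and $q' \in Q'$ using the initial algebra semantics. The WTAc $G$ and $G'$ are required to be constraint-determined, so that we can uniquely identify the basic productions $p \in P$ and $p' \in P'$ that construct a newly formed production $p'' \in P''$.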
We can obtain a constraint-determined WTAc at the expense of a polynomial increase in the number of productions (assuming that the ranked alphabet of input symbols is fixed). Let $r = \max_{\sigma \in \Sigma} \mathrm{rk}(\sigma)$ be the maximal rank of an input symbol and $c = |P|$ be the number of productions of the given WTAc $G = (Q, \Sigma, F, P, \mathrm{wt})$. First, we modify the target nonterminal $q$ of each production $\rho = (\ell, q, E, I) \in P$ to additionally include the identifier $\rho$, which yields the production $(\ell, \langle q, \rho \rangle, E, I)$. This effectively yields the new nonterminal set $Q \times P$, which has size $|Q| \cdot c$. Then we replace the production $(\sigma(q_1, \dots, q_k), \langle q, \rho \rangle, E, I)$ by the set of productions $\{(\sigma(\langle q_1, \rho_1 \rangle, \dots, \langle q_k, \rho_k \rangle), \langle q, \rho \rangle, E, I) \mid \rho_1, \dots, \rho_k \in P\}$.
Clearly, this turns each production into at most $c^r$ productions since $k \le r$, so the overall number of productions after all replacements is at most $c^{r+1}$. The product construction itself is then quadratic.

⊓ ⊔
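The two point-wise operations discussed in this section can be illustrated on finitely supported weighted tree languages. The sketch below is an illustration only and not a grammar construction: a weighted tree language is assumed to be a dictionary from trees to weights in which absent trees carry the semiring zero, and the arctic semiring (max, +) with zero −∞ serves as the example semiring.

```python
NEG_INF = float("-inf")  # zero of the (assumed) arctic semiring

def pointwise_sum(A, B, add, zero):
    """(A + B)_t = A_t + B_t, where absent trees have weight zero."""
    return {t: add(A.get(t, zero), B.get(t, zero)) for t in A.keys() | B.keys()}

def hadamard(A, B, mul):
    """(A . B)_t = A_t . B_t; trees outside both supports stay at zero."""
    return {t: mul(A[t], B[t]) for t in A.keys() & B.keys()}

a, b, c = ("a",), ("b",), ("c",)
A = {a: 1, b: 2}
B = {b: 5, c: 0}

total = pointwise_sum(A, B, add=max, zero=NEG_INF)
product = hadamard(A, B, mul=lambda x, y: x + y)
```

Restricting the HADAMARD product to the intersection of the key sets is sound here because the semiring zero is annihilating, so any tree missing from one argument has weight zero in the product.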
We note that the previous construction also works for classic WTAc.
Hence we obtain the equality ⊓ ⊔ Next, we use an extended version of the classical power set construction to obtain an unambiguous WTAc that keeps track of the reachable nonterminals, but preserves only the homomorphic image of its weight.The unweighted part of the construction mimics a power-set construction and the handling of constraints roughly follows [15, Definition 3.1].
Theorem 2 Let h ∈ T S be a semiring homomorphism into a finite semiring T. For every (classic) WTAc G = (Q, Σ , F, P, wt) over S there exists an unambiguous (classic) WTAc G ′ = (T Q , Σ , F ′ , P ′ , wt ′ ) such that for every tree t ∈ T Σ and ϕ Proof For every σ ∈ Σ , let −→ q ∈ P be the constraints that occur in productions of G whose left-hand side contains σ .We let No additional productions belong to P ′ .Finally, we set wt ′ p ′ = 1 for all p ′ ∈ P ′ .In general, the WTAc G ′ is certainly not deterministic due to the choice of constraints, but G ′ is unambiguous since the resulting 2 |C σ | rules for each left-hand side have mutually exclusive constraint sets.In fact, for each t ∈ T Σ there is exactly one left-most complete derivation of G ′ for t, and it derives to ϕ ∈ T Q such that ϕ q = h wt q G (t) for every q ∈ Q.The weight of that derivation is 1.These statements are proven inductively.The final statement G ′ t = h(G t ) for every t ∈ T Σ is an easy consequence of the previous statements.If G is classic, then also the constructed WTAc G ′ is classic.⊓ ⊔ Example 4 Recall the WTAc G and G ′ from Example 3. Consider the WTAc generating their disjoint union, as well as the semiring homomorphism h ∈ B A given by h a = 1 for all a ∈ A \ {−∞} and h −∞ = 0.The sets C γ and C σ of utilized constraints are C γ = (11,12) and C σ = (1, 2) , and we write ϕ ∈ B Q simply as subsets of Q.We obtain the unambiguous WTAc G ′′ with the following sensible (i.e., having satisfiable constraints) productions for all Q ′ , Q ′′ ⊆ {q, z}, which all have weight 1.
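The semiring homomorphism of Example 4 can be checked mechanically on sample values. The sketch assumes that the arctic semiring is (max, +) with zero −∞ and one 0 and that the Boolean semiring is (or, and) with zero `False` and one `True`; the sample values are arbitrary and do not constitute a proof.

```python
import itertools

NEG_INF = float("-inf")

def h(a):
    """Map the arctic zero to Boolean 0 and every other element to Boolean 1."""
    return a != NEG_INF

def is_homomorphism_on(samples):
    """Check the homomorphism laws for all pairs of sample values."""
    units_ok = (h(NEG_INF) is False) and (h(0) is True)
    pairs_ok = all(
        h(max(a, b)) == (h(a) or h(b)) and h(a + b) == (h(a) and h(b))
        for a, b in itertools.product(samples, repeat=2)
    )
    return units_ok and pairs_ok

samples = [NEG_INF, 0, 1, 2, 7]
```

The check succeeds because the arctic semiring is zero-sum free and has no zero-divisors: a maximum or a sum of two elements equals −∞ only in the cases mirrored by Boolean disjunction and conjunction.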
Corollary 2 (of Theorem 2) Let $S$ be finite. For every (classic) WTAc over $S$ there exists an equivalent unambiguous (classic) WTAc.

⊓ ⊔
Corollary 3 (of Theorem 2) Let $S$ be zero-sum free. For every (classic) WTAc $G$ over $S$ there exists an unambiguous (classic) TAc generating $\mathrm{supp}(G)$.
Proof Utilizing Lemma 2 we can first construct an equivalent WTAc with Boolean final weights. If $S$ is zero-sum free, then there exists a semiring homomorphism $h \in \mathbb{B}^S$ by [27]. By Lemma 3 we can assume that each derivation of $G$ has non-zero weight and sums of non-zero elements remain non-zero by zero-sum freeness. Thus we can simply replace the factor $h(\mathrm{wt}_p)$ by 1 in (2). The TAc obtained in this way generates $\mathrm{supp}(G)$.

⊓ ⊔
Corollary 4 (of Theorem 2) Let $S$ be zero-sum free. For every (classic) WTAc $G$ over $S$ there exists an unambiguous (classic) TAc generating $T_\Sigma \setminus \mathrm{supp}(G)$.
Proof Let G ′ = (Z, Σ , Z 0 , P ′ ) be the unambiguous TAc given by Corollary 3. Since G ′ is also complete in the sense that every input tree has a derivation, the desired unambiguous TAc G ′′ is simply [15,Definition 4.11]) to restrict A to the support of A ′ but without changing the weights of those trees inside the support.Formally, we define and A| supp(A ′ ) (t) = 0 otherwise.Utilizing unambiguous WTAc and the HADAMARD product, we can show that A| supp(A ′ ) is constraint-regular if A and A ′ are constraintregular and the semiring S is zero-sum free.
Theorem 3 Let $S$ be zero-sum free. For all (classic) WTAc $G$ and $G'$ there exists a (classic) WTAc $H$ such that $H = G|_{\mathrm{supp}(G')}$.
Proof By Corollary 1 the support $\mathrm{supp}(G')$ is constraint-regular. Hence we can obtain an unambiguous WTAc $G''$ for $\mathrm{supp}(G')$ using Theorem 2. Without loss of generality we assume that both $G$ and $G''$ are constraint-determined; we note that the normalization preserves unambiguous WTAc. Finally, we construct $G \times G''$, which by Theorem 1 generates exactly $G|_{\mathrm{supp}(G')}$ as required.
⊓ ⊔
In the following, we establish a special property for classic WTGc. To this end, we first need another notion. Let $G = (Q, \Sigma, F, P, \mathrm{wt})$ be a WTGc.
In other words, for every sink nonterminal $\bot$ the production $\sigma(\bot, \dots, \bot) \to \bot$ belongs to $P$ with weight 1 for every symbol $\sigma \in \Sigma$. Additionally, no other productions have the sink nonterminal $\bot$ as target nonterminal. Given a set $E \subseteq \mathbb{N}^* \times \mathbb{N}^*$ of equality constraints, we let $\equiv_E = (E \cup E^{-1})^*$ be the smallest equivalence relation containing $E$ and $[w]_{\equiv_E}$ be the equivalence class of $w \in \mathbb{N}^*$. Additionally, for every production $c[q_1, \dots, q_k] \xrightarrow{E,I} q \in P$ we let

be a representation of the equality constraints on the indices [k].
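The equivalence relation generated by a finite set of equality constraints can be computed with a standard union-find structure. The sketch below is an illustration under our own encoding (positions as tuples of integers, function names hypothetical); it returns only the classes of positions mentioned in E, since every other position forms a singleton class.

```python
def equivalence_classes(E):
    """Classes of the smallest equivalence relation containing E (mentioned positions only)."""
    parent = {}

    def find(w):
        parent.setdefault(w, w)
        while parent[w] != w:
            parent[w] = parent[parent[w]]  # path halving
            w = parent[w]
        return w

    for v, v2 in E:
        parent[find(v)] = find(v2)  # union the two classes

    classes = {}
    for w in list(parent):
        classes.setdefault(find(w), set()).add(w)
    return list(classes.values())

def class_of(w, E):
    """Equivalence class of w; positions not mentioned in E are only related to themselves."""
    for cls in equivalence_classes(E):
        if w in cls:
            return cls
    return {w}

E = {((1,), (2,)), ((2,), (3,)), ((4,), (5,))}
```

For the set `E` above, the positions 1, 2, and 3 end up in one class although the pair (1, 3) is not listed explicitly, which is the effect of taking the reflexive, symmetric, and transitive closure.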
Definition 5 A classic WTGc $G = (Q, \Sigma, F, P, \mathrm{wt})$ is eq-restricted if there exists a sink nonterminal $\bot \in Q$ such that for every production $p = c[q_1, \dots, q_k] \xrightarrow{E,I} q \in P$ and index $i \in [k]$ there exists a nonterminal $q' \in Q$ such that
1. $\{q_j \mid j \in [i]_{\equiv_{c(E)}}\} \subseteq \{q', \bot\}$ and
2. there exists exactly one index $j \in [i]_{\equiv_{c(E)}}$, also called governing index for $i$ in $p$, such that $q_j = q'$.
The mapping $g_p \colon [k] \to [k]$ assigns to each index its governing index in $p$.
In other words, in an eq-restricted classic WTGc one subtree is generated normally by the WTGc and all the subtrees that are required to be equal by means of the equality constraints are generated by the sink nonterminal $\bot$, which can generate any tree with weight 1. In this manner, the restrictions on subtree and weight generation induced by the WTGc are exhibited completely on a single subtree and the "copies" are only provided by the equality constraint, but not further restricted by the WTGc. We will continue to use $\bot$ for the suitable sink nonterminal of an eq-restricted classic WTGc.
Finally, we show that the weighted tree languages generated by eq-restricted positive classic WTGc are closed under relabelings. A relabeling is a tree homomorphism $\pi \in T_\Delta(X)^{\Sigma}$ such that for every $k \in \mathbb{N}$ and $\sigma \in \Sigma_k$ there exists $\delta \in \Delta_k$ with $\pi_\sigma = \delta(x_1, \dots, x_k)$. In other words, a relabeling deterministically replaces symbols respecting their rank. We often specify a relabeling just as a mapping $\pi \in \Delta^{\Sigma}$ such that $\pi_\sigma \in \Delta_k$ for every $k \in \mathbb{N}$ and $\sigma \in \Sigma_k$.
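A relabeling and its effect on a finitely supported weighted tree language can be sketched as follows. Two assumptions are ours: trees are nested tuples, and the image of a weighted tree language A under π is taken to assign to each output tree u the sum of the weights A_t over all trees t that π maps to u (the usual definition, which is not restated in the text above).

```python
NEG_INF = float("-inf")

def relabel(t, pi):
    """Apply a relabeling given as a symbol-to-symbol mapping to every node of t."""
    return (pi[t[0]],) + tuple(relabel(child, pi) for child in t[1:])

def image(A, pi, add, zero):
    """Sum (in the semiring) the weights of all trees with the same relabeled tree."""
    result = {}
    for t, weight in A.items():
        u = relabel(t, pi)
        result[u] = add(result.get(u, zero), weight)
    return result

pi = {"a": "c", "b": "c", "f": "g"}      # a and b are merged into c
A = {("f", ("a",)): 2, ("f", ("b",)): 5}
B = image(A, pi, add=max, zero=NEG_INF)  # arctic semiring: sum is max
```

Since a relabeling preserves the shape of a tree, each output tree has only finitely many preimages, so the sum in the image is always finite.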
Theorem 4 The weighted tree languages generated by eq-restricted positive classic WTGc are closed under relabelings.
Proof Let WTGc G = (Q, Σ , F, P, wt) be an eq-restricted positive classic WTGc with sink nonterminal ⊥.Without loss of generality, suppose that Σ ∩ X = / 0.Moreover, let π ∈ ∆ Σ be a relabeling.We first extend π to a mapping ′ ∈ (∆ ∪ X) Σ ∪X , in which we treat the elements of X as nullary symbols, for every σ ∈ Σ and x ∈ X by π ′ σ = π σ and π ′ x = x.Let G ′ = (Q, ∆ , F, P ′ , wt ′ ) be the eq-restricted positive classic WTGc such that and for every production Finally, wt ′ δ (⊥, . . ., ⊥) → ⊥ = 1 for all δ ∈ ∆ .For correctness we prove the following equality for every u ∈ T ∆ and q ∈ Q by induction on u The second case is immediate since there is a single derivation, namely the one utilizing only nonterminal ⊥, for u to ⊥ and its weight is 1.In the remaining case we have q = ⊥.Then Recall that g p : [k] → [k] assigns to each index its governing index.For better readability, we write just g ′ .Note that due to the special form of substitution we automatically fulfill u |= E and can thus drop it. ( We note that g p ′ = g p for all used productions p, so we just write g.Additionally, for every q i with i ∈ [k] \ ran(g) we have q i = ⊥ and thus wt q i G (t g(i) ) = 1 because there is exactly one such derivation with weight 1.
We complete the proof for every u ∈ T ∆ as follows.

Towards the HOM Problem
The strategy of [15] for deciding the HOM problem first represents the homomorphic image $L' = h(L)$ of the regular tree language $L$ with the help of a WTGc $G'$. For deciding whether $L'$ is regular, a tree automaton $G''$ simulating the behavior of $G'$ up to a certain bounded height is constructed. If the automata $G'$ and $G''$ are equivalent, i.e., $G'' = G'$, then $L'$ is regular. In the remaining case, pumping arguments are used to prove that it is impossible to find any TA for $L'$. Overall, this reduces the HOM problem to an equivalence problem. Towards solving the HOM problem in the weighted case we now proceed similarly. First, we show that WTGc can encode each (well-defined) homomorphic image of a regular weighted tree language. This ability motivated their definition in the unweighted case [15, Proposition 4.6], and it also applies in the weighted case with minor restrictions that just enforce that all obtained sums are finite.
Theorem 5 Let $G = (Q, \Sigma, F, P, \mathrm{wt})$ be a WTA and $h \in T_\Delta^{T_\Sigma}$ be a nondeleting and nonerasing tree homomorphism. There exists an eq-restricted positive classic WTGc $G'$ with $G' = h(G)$.
Proof We construct a WTGc G ′ for h(G) in two stages.First, let such that for every p = σ (q 1 , . . ., q k ) → q ∈ P and h σ = u = δ (u 1 , . . ., u n ), 2 , in which the substitution δ , p (u 1 , . . ., u n ) q 1 , . . ., q k replaces for every i ∈ [k] only the left-most occurrence of x i in δ , p (u 1 , . . ., u n ) by q i and all other occurrences by ⊥.Moreover wt ′′ p ′′ = wt p .Additionally, we let No other productions are in P ′′ .Finally, we let F ′′ q = F q for all q ∈ Q and F ′′ ⊥ = 0. Obviously, G ′′ is eq-restricted, positive, and classic.
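In order to better describe the behaviour of $G''$, let us introduce the following notation. Given a tree $t = \sigma(t_1, \dots, t_k) \in T_\Sigma$ and a complete left-most derivation $d$ of $G$ for $t$ that utilizes the production $p$ last, let $d_1, \dots, d_k$ be the derivations for $t_1, \dots, t_k$, respectively, that are incorporated in $d$ and $h_\sigma = \delta(u_1, \dots, u_n)$. Then we define the tree $h(t, d) \in T_{\Delta \cup \Delta \times P}$ inductively by
$$h(t, d) = \langle\delta, p\rangle(u_1, \dots, u_n)\bigl[x_1 \leftarrow h(t_1, d_1), \dots, x_k \leftarrow h(t_k, d_k)\bigr],$$
where every occurrence of $x_i$ is replaced by $h(t_i, d_i)$. Using this notation, let us now prove that for each $q \in Q$ we have
$$\bigl\{ s \in T_{\Delta \cup \Delta \times P} \mid D^q_{G''}(s) \neq \emptyset \bigr\} = \bigl\{ h(t, d) \mid t \in T_\Sigma,\, d \in D^q_G(t) \bigr\}$$
and, in turn, every such $D^q_{G''}(s)$ is a singleton set with $\mathrm{wt}_{G''}(d'') = \mathrm{wt}_G(d)$ for the unique $d'' \in D^q_{G''}\bigl(h(t, d)\bigr)$.
We start with the inclusion from right to left. To this end, let $t \in T_\Sigma$ be a tree and $d = (p_1, w_1) \cdots (p_m, w_m)$ be a complete left-most derivation of $G$ for $t$ to some nonterminal $q \in Q$. Let $t = \sigma(t_1, \dots, t_k)$ be the input tree with $h_\sigma = \delta(u_1, \dots, u_n)$, let $p_m = \sigma(q_1, \dots, q_k) \to q$ be the production utilized last in $d$, and let $d_i$ be the complete left-most derivation for $t_i$ to $q_i$ incorporated in $d$ for every $i \in [k]$. For every $i \in [k]$, we utilize the induction hypothesis to conclude that $D^{q_i}_{G''}\bigl(h(t_i, d_i)\bigr)$ is a singleton set. Let $d''_i$ be its unique element, for which we additionally have $\mathrm{wt}_{G''}(d''_i) = \mathrm{wt}_G(d_i)$. Moreover, let $d^\bot_i$ be the derivation of $G''$ for $h(t_i, d_i)$ with weight $1$ that exclusively utilizes the nonterminal $\bot$. We define $s = h(t, d)$. For every $i \in [k]$, let $v_i$ be the left-most occurrence of $x_i$ in $h_\sigma$. We consider the derivations $v_1 d''_1, \dots, v_k d''_k$, and for every other occurrence $v$ of $x_i$ in $h_\sigma$ we consider the derivation $v d^\bot_i$. Let $d''$ be the derivation assembled from the considered subderivations followed by $(p''_m, \varepsilon)$, where the production $p''_m$ at the root is
$$p''_m = \langle\delta, p_m\rangle(u_1, \dots, u_n)\llbracket q_1, \dots, q_k\rrbracket \xrightarrow{E, \emptyset} q \qquad\text{with the constraints}\qquad E = \bigcup_{i = 1}^{k} \mathrm{pos}_{x_i}(h_\sigma)^2.$$
Clearly, the production $p''_m$ is the only applicable one since the only other production whose left-hand side is labeled by $\langle\delta, p_m\rangle$ at the root reaches $\bot \neq q$. Reordering the derivation $d''$ to be left-most, we obtain the desired complete left-most derivation $d''$ for $s$, for which we also have $\mathrm{wt}_{G''}(d'') = \mathrm{wt}_G(d)$. This proves that $d''$ is the required single element of $D^q_{G''}(s) = D^q_{G''}\bigl(h(t, d)\bigr) \neq \emptyset$.
On the other hand, consider $s \in T_{\Delta \cup \Delta \times P}$ such that there exists a complete left-most derivation $d''$ of $G''$ for $s$ to $q$. The production that is applied last must be of the form
$$\langle\delta, p\rangle(u_1, \dots, u_n)\llbracket q_1, \dots, q_k\rrbracket \xrightarrow{E, \emptyset} q$$
with $\delta(u_1, \dots, u_n)\llbracket q_1, \dots, q_k\rrbracket = h_\sigma\llbracket q_1, \dots, q_k\rrbracket$ for some symbol $\sigma \in \Sigma_k$ and production $p = \sigma(q_1, \dots, q_k) \to q$. For every $i \in [k]$, we denote by $w_i$ the unique position in $h_\sigma\llbracket q_1, \dots, q_k\rrbracket$ labeled by $q_i$. By the induction hypothesis applied to $s|_{w_i}$, for which the complete left-most derivation $d''_i$ for $s|_{w_i}$ to $q_i$ incorporated in $d''$ exists, there exist a tree $t_i \in T_\Sigma$ and a complete left-most derivation $d_i$ of $G$ for $t_i$ to $q_i$ such that $s|_{w_i} = h(t_i, d_i)$ and $\mathrm{wt}_G(d_i) = \mathrm{wt}_{G''}(d''_i)$. For the tree $t = \sigma(t_1, \dots, t_k)$ we obtain that $s = h(t, d)$ for the complete left-most derivation $d \in D^q_G(t)$ assembled from the subderivations $d_1, \dots, d_k$ followed by $(p, \varepsilon)$, for which we also have $\mathrm{wt}_G(d) = \mathrm{wt}_{G''}(d'')$, which completes this proof.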
So far, $Q''$ and $P''$ are larger than $Q$ and $P$ only by a constant (assuming a fixed alphabet $\Sigma$) caused by the additional sink nonterminal $\bot$ and its productions, but the alphabet size increases by the summand $|\Delta| \cdot |P|$.
We now delete the annotation with the help of the relabeling $\pi \in \Delta^{\Delta \cup \Delta \times P}$ given for every $\delta \in \Delta$ and $p \in P$ by $\pi_\delta = \pi_{\langle\delta, p\rangle} = \delta$ following the construction in Theorem 4.

We obtain $\pi(G'')(u) = h(G)(u)$ for every $u \in T_\Delta$. The construction of Theorem 4 is applicable because $\bot$ is clearly a sink nonterminal in $G''$ and $G''$ is an eq-restricted, positive, and classic WTGc.

□
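The second stage of the proof, which deletes the production annotation via the relabeling $\pi$, can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the paper: trees are encoded as nested tuples, annotated symbols as pairs, and the production names `p1`, `p2`, `p3` are hypothetical placeholders.

```python
# Trees are nested tuples (label, child_1, ..., child_n).
# A label is a plain symbol such as "f" or an annotated pair such as ("f", "p1"),
# where the production name "p1" is a hypothetical placeholder.

def pi(label):
    """Relabeling: keep a plain symbol, drop the production from an annotated pair."""
    return label[0] if isinstance(label, tuple) else label


def relabel(tree):
    """Apply pi to the label of every node of the tree."""
    label, *children = tree
    return (pi(label), *(relabel(child) for child in children))


# Hypothetical annotated tree over the symbols f, g, and a.
s = (("f", "p1"), (("g", "p2"), (("a", "p3"),)), ("a",))
u = relabel(s)  # ("f", ("g", ("a",)), ("a",))
```

Several annotated trees can be mapped to the same tree `u`, which is why the weight of `u` under the relabeled grammar collects the weights of all its preimages.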
Let us illustrate the construction on a simple example.
Definition 6 Let $G = (Q, \Sigma, F, P, \mathrm{wt})$ be an eq-restricted, positive, and classic WTGc with sink nonterminal $\bot$. Moreover, let $q, q' \in Q$, $t, t' \in T_\Sigma$, and $d \in D^q_G(t)$ as well as $d' \in D^{q'}_G(t')$ such that $q \neq \bot \neq q'$ and $d$ ends with $(p, \varepsilon)$ for the final utilized production $p = c[q_1, \dots, q_k] \xrightarrow{E, \emptyset} q \in P$. For every $i \in [k]$ let $w_i = \mathrm{pos}_{x_i}(c)$ and $d_i$ be the unique derivation for $t_i = t|_{w_i}$ incorporated in $d$. Finally, for every tree $u \in T_\Sigma$ let $d^\bot_u$ be the unique derivation for $u$ to $\bot$. For every $w \in \mathrm{pos}(t)$, for which the derivation for $t|_w$ incorporated in $d$ yields $q'$, we recursively define the derivation substitution $d\llbracket d'\rrbracket_w$ of $d'$ into $d$ at $w$ and the resulting tree $t\llbracket t'\rrbracket^d_w$ as follows. If $w = \varepsilon$, then $d\llbracket d'\rrbracket_\varepsilon = d'$ and $t\llbracket t'\rrbracket^d_\varepsilon = t'$. Otherwise $w_j$ is a prefix of $w$ for some $j \in [k]$ and we have […], where for each $i \in [k]$ we have […] if $q_i = \bot$ and $w_i \in [w_j]_{\equiv_E}$ (i.e., it is a position that is equality restricted to $w_j$), […]

Example 6 We consider the WTGc $G = \bigl(\{q, \bot\}, \Sigma, F, P, \mathrm{wt}\bigr)$ with input ranked alphabet $\Sigma = \{a^{(0)}, g^{(2)}, f^{(2)}\}$, final weights $F_q = 1$ and $F_\bot = 0$ as well as productions
$$p_a = a \to_1 q \qquad p_g = g(q, \bot) \xrightarrow{1 = 2}_1 q \qquad\text{and}\qquad p_f = f\bigl(q, f(q, \bot)\bigr) \xrightarrow{1 = 22} q$$
besides the sink nonterminal productions $p^\bot_\sigma = \sigma(\bot, \dots, \bot) \to_1 \bot$ for all $\sigma \in \Sigma$. As before, for every $u \in T_\Sigma$ we let $d^\bot_u \in D^\bot_G(u)$ be the unique derivation of $G$ for $u$ to $\bot$, which utilizes only the nonterminal $\bot$. According to Definition 6 we choose the states $q = q'$ and the trees $t$ and $t'$ and derivations $d$ and $d'$ as given in Figure 2 and below.
We select the position $w = 11$ and observe that the derivation for $t|_{11}$ is $(p_a, \varepsilon)$, which yields $q = q'$. We compute $d\llbracket d'\rrbracket_w$ as follows: […] $(p_g, \varepsilon)$ and $u = g\bigl(g(a, a), g(a, a)\bigr)$. We note that $w = 11$ is explicitly equality constrained to position $12$ in $d$ via the constraint $1 = 2$ at position $1$ and implicitly equality constrained to positions $221$ and $222$ via the constraint $1 = 22$ at the root $\varepsilon$.

Fig. 2 Input trees $t$ and $t'$ from Example 6.

As our example illustrates, the tree $t\llbracket t'\rrbracket^d_w$ is obtained from $t$ by (i) identifying the set of all positions of $t$ that are explicitly or implicitly equality constrained to $w$ by the productions in the derivation $d$ and (ii) substituting $t'$ into $t$ at every such position. If $w' \in \mathrm{pos}(t)$ is parallel to all positions constrained to $w$, like position $21$ in Example 6, then $t\llbracket t'\rrbracket^d_w|_{w'} = t|_{w'}$. Note that $t|_{21}$ is equal to the replaced subtree $t|_{11}$, but we only replace constrained subtrees and not all equal subtrees.
This substitution allows us to prove a pumping lemma for eq-restricted, positive, and classic WTGc, which can generate all (nondeleting and nonerasing) homomorphic images of regular weighted tree languages by Theorem 5. To this end, we need some final notions. Let $G = (Q, \Sigma, F, P, \mathrm{wt})$ be a WTGc. Moreover, let $p = \ell \xrightarrow{E, D} q \in P$ be a production. We define the height $\mathrm{ht}(p)$ of $p$ by $\mathrm{ht}(p) = \mathrm{ht}(\ell)$ (i.e., the height of its left-hand side). Moreover, we let $\mathrm{ht}(P) = \max \{\mathrm{ht}(p) \mid p \in P\}$ and $\mathrm{ht}(G) = (|Q| + 1) \cdot \mathrm{ht}(P)$.
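The tree part of the substitution in Example 6 can be replayed on concrete data. The following Python sketch rests on assumptions: the trees $t = f(g(a, a), f(a, g(a, a)))$ and $t' = g(a, a)$ are inferred from the constraints $1 = 2$ and $1 = 22$ discussed in the example and are not taken from Figure 2, and the set of positions constrained to $w = 11$ is supplied by hand instead of being computed from the derivation $d$.

```python
# Trees are nested tuples (label, child_1, ..., child_n);
# positions are tuples of 1-based child indices.

def subtree(tree, pos):
    """Return the subtree at position pos."""
    for i in pos:
        tree = tree[i]  # index 0 holds the label, so child i sits at index i
    return tree


def replace_at(tree, pos, new):
    """Return a copy of tree in which the subtree at pos is replaced by new."""
    if not pos:
        return new
    label, *children = tree
    children[pos[0] - 1] = replace_at(children[pos[0] - 1], pos[1:], new)
    return (label, *children)


a = ("a",)
g_aa = ("g", a, a)
t = ("f", g_aa, ("f", a, g_aa))   # assumed input tree t
t_prime = g_aa                    # assumed tree t' to be substituted

# Position 11 together with the positions that are explicitly (12) or
# implicitly (221, 222) equality constrained to it.
constrained = [(1, 1), (1, 2), (2, 2, 1), (2, 2, 2)]

result = t
for pos in constrained:           # the positions are pairwise parallel
    result = replace_at(result, pos, t_prime)

u = ("g", g_aa, g_aa)
# result is f(u, f(a, u)); position 21 still carries the leaf a.
```

Both equality constraints remain satisfied in `result`, whereas the unconstrained leaf at position 21 is left untouched although it equals the replaced subtree at position 11.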
Lemma 4 Let $G = (Q, \Sigma, F, P, \mathrm{wt})$ be an eq-restricted, positive, and classic WTGc with sink nonterminal $\bot$. There exists $n \in \mathbb{N}$ such that for every tree $t_0 \in T_\Sigma$, nonterminal $q \in Q \setminus \{\bot\}$, and derivation $d \in D^q_G(t_0)$ such that $\mathrm{ht}(t_0) > n$ and $\mathrm{wt}_G(d) \neq 0$ there are infinitely many trees $t_1, t_2, \dots$ and derivations $d_1, d_2, \dots$ such that $d_i \in D^q_G(t_i)$ and $\mathrm{wt}_G(d_i) \neq 0$ for all $i \in \mathbb{N}$.
Proof Without loss of generality, suppose that for every $c[q_1, \dots, q_k] \xrightarrow{E, \emptyset} q' \in P$ with $q' \neq \bot$ and $k \neq 0$ there exists $i \in [k]$ such that $q_i \neq \bot$. This can easily be achieved by introducing a copy $\top$ of nonterminal $\bot$ and replacing one instance of $\bot$ by $\top$ in offending productions. Similarly, we can assume without loss of generality that the construction in the proof of Lemma 3 has been applied to $G$. If this is the case, then we can select $n = \mathrm{ht}(G)$. Let $t_0 \in T_\Sigma$ be such that $\mathrm{ht}(t_0) > n$. Let $Q' = Q \setminus \{\bot\}$, $d \in D^q_G(t_0)$ be a derivation with $\mathrm{wt}_G(d) \neq 0$, and select a position $w \in \mathrm{pos}(t_0)$ of maximal length such that $d$ incorporates a derivation for $t_0|_w$ to some nonterminal of $Q'$. Since $\mathrm{ht}(t_0) > n$, at least $|Q|$ proper prefixes $w'$ of $w$ exist such that $d$ incorporates a derivation for $t_0|_{w'}$ to some $q' \in Q'$. Hence there exist prefixes $w', w''$ of $w$ such that $d$ incorporates a derivation $d'$ for $t' = t_0|_{w'}$ to $q' \in Q'$ as well as a derivation for $t_0|_{w''}$ to the same nonterminal $q'$. Then $d_1 = d\llbracket d'\rrbracket_{w''}$ is a derivation of $G$ for $t_1 = t_0\llbracket t'\rrbracket^d_{w''}$ to $q$ with $\mathrm{ht}(t_1) > \mathrm{ht}(t_0)$. Since we achieve the same state $q$, the annotation of the proof of Lemma 3 guarantees that $\mathrm{wt}_G(d_1) \neq 0$. Iterating this substitution yields the desired trees $t_1, t_2, \dots$ and derivations $d_1, d_2, \dots$.

□
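The pigeonhole step of the proof of Lemma 4 selects two prefixes of $w$ at which the derivation reaches the same nonterminal. A minimal Python sketch of that selection is given below; the list encoding of the nonterminals along the path and the function name are assumptions made for illustration only.

```python
def find_repeated_nonterminal(path):
    """Find two positions on a path that carry the same nonterminal.

    path[i] is the nonterminal reached at the i-th considered prefix of w,
    ordered by increasing length. Returns a pair (i, j) with i < j and
    path[i] == path[j], or None if all entries are distinct.
    """
    first_seen = {}
    for j, nonterminal in enumerate(path):
        if nonterminal in first_seen:
            return first_seen[nonterminal], j
        first_seen[nonterminal] = j
    return None


# With two non-sink nonterminals, any path with three entries contains a repetition.
pair = find_repeated_nonterminal(["q", "r", "q"])  # (0, 2)
```

Whenever the path has more entries than there are non-sink nonterminals, the function cannot return `None`, which is exactly the situation enforced by the height bound of the lemma.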
A WTGc generating a (nondeleting and nonerasing) homomorphic image of a regular weighted tree language, if constructed as described in Theorem 5, will never have overlapping constraints since constraints always point to leaves of the left-hand sides of productions as required by classic WTGc.It is intuitive that this limitation to the operating range of constraints leads to an actual restriction in the expressive power of WTGc, but we will only prove it for eq-restricted, positive, and classic WTGc.
Proposition 2 Let S be a zero-sum free semiring. The class of positive constraint-regular weighted tree languages is strictly more expressive than the class of weighted tree languages generated by eq-restricted, positive, and classic WTGc.
Proof We consider the WTGc $G$ with the productions
$$a \to_1 q' \qquad g(q', q') \to_1 q \qquad f(q, q) \xrightarrow{12 = 21}_1 q \qquad \bar f(q, q) \xrightarrow{12 = 21}_1 q.$$
The first two productions are only used on leaves and on subtrees of the form $g(a, a)$. Every other position $w$ (i.e., neither leaf nor position with two leaves as children) is labeled either $f$ or $\bar f$ and additionally every derivation enforces the constraint $12 = 21$, so the subtrees $t|_{w12}$ and $t|_{w21}$ of the input tree $t$ need to be equal for a complete derivation of $G$ to exist.
For the sake of a contradiction, suppose that an eq-restricted, positive, and classic WTGc $G' = (Q', \Sigma, F', P', \mathrm{wt}')$ exists that is equivalent to $G$. We recursively define the trees $t_n \in T_\Sigma$ and $t'_n \in T_\Sigma$ for every $n \in \mathbb{N}$ with $n \geq 1$ by
$$t_0 = a \qquad t_1 = g(t_0, t_0) \qquad t_{n+1} = f(t_n, t_n)$$
$$t'_0 = a \qquad t'_1 = g(t'_0, t'_0) \qquad t'_{n+1} = \bar f(t'_n, t_n).$$
Clearly, $t_n$ and $t'_n$ are both complete binary trees of height $n$. Naturally, the leaves are labeled $a$, and the penultimate level in both trees is always labeled $g$. In $t_n$ the remaining levels are universally labeled $f$, whereas in $t'_n$ the left-most spine on those levels is labeled $\bar f$. We illustrate an example tree $t'_n$ in Figure 4. Obviously $G(t_n) = 1$ as well as $G(t'_n) = 1$ for every $n \in \mathbb{N}$ with $n \geq 1$. Furthermore we note that the derivations of $G$ only enforce equality constraints on positions of the form $w12$ or $w21$, but since $\mathrm{pos}_{\bar f}(t'_n) \subseteq \{1\}^*$, the positions, in which the labels in $t_n$ and $t'_n$ differ, are not affected by any equality constraint. This can be used to verify that $G(t'_n) = 1$ for each $n \geq 1$.
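The claim that both tree families are accepted by $G$ can be checked mechanically for small $n$. The Python sketch below is an illustration under assumptions: the second binary symbol on the left-most spine is written `fbar`, the trees $t'_n$ are built as $t'_{n+1} = \mathit{fbar}(t'_n, t_n)$ in line with the description of the left-most spine, $q$ is taken to be the only final nonterminal, and only membership in the support is computed because all production weights are $1$.

```python
# Trees are nested tuples (label, child_1, child_2) or ("a",) for leaves.

def t(n):
    """Tree t_n: leaves a, penultimate level g, all remaining levels f."""
    if n == 0:
        return ("a",)
    sub = t(n - 1)
    return ("g" if n == 1 else "f", sub, sub)


def t_prime(n):
    """Tree t'_n: like t_n, but the left-most spine above level g is labeled fbar."""
    if n == 0:
        return ("a",)
    if n == 1:
        return ("g", t_prime(0), t_prime(0))
    return ("fbar", t_prime(n - 1), t(n - 1))


def state(tree):
    """Return the nonterminal reached by G on tree, or None if no derivation exists."""
    label, *children = tree
    if label == "a" and not children:
        return "q'"
    if label == "g" and len(children) == 2:
        return "q" if all(state(c) == "q'" for c in children) else None
    if label in ("f", "fbar") and len(children) == 2:
        left, right = children
        # Constraint 12 = 21: second subtree of the left child equals
        # first subtree of the right child.
        if state(left) == "q" and state(right) == "q" and left[2] == right[1]:
            return "q"
    return None


def accepts(tree):
    """Assumption: q is the only final nonterminal of G."""
    return state(tree) == "q"
```

For instance, `accepts(t_prime(4))` holds, whereas swapping the two direct subtrees at the root as in `("f", t(3), t_prime(3))` violates the constraint $12 = 21$ and is rejected.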
In the following, let $n = 3\,\mathrm{ht}(G') + 2$. Since $G'$ is equivalent to $G$, we need to have $G'(t'_n) = 1$ as well, which requires a complete derivation of $G'$ for $t'_n$ to some final nonterminal $q_0 \in Q'$. Let $d \in D^{q_0}_{G'}(t'_n)$ be such a derivation, whose production at the root is $p = c[q_1, \dots, q_k] \xrightarrow{E, \emptyset} q_0 \in P'$. Since the left-most spine of the input tree $t'_n$ is longer than the height of $c$, there must exist $j \in \mathbb{N}$ such that $c(1^j) = x_1$; i.e., position $1^j$ is labeled $x_1$ in $c$. Obviously, $j \leq \mathrm{ht}(G')$, so the height of the subtree $t'' = t'_n|_{1^j}$, which is still a complete binary tree, is at least $2\,\mathrm{ht}(G') + 2$. We can thus apply Lemma 4 to the tree $t''$ in such a way that it modifies its second direct subtree (starting from $1^j \in \mathrm{pos}(t'_n)$, we descend to $1^j 2$; from there, we either find a subderivation to some nonterminal different from $\bot$, or all subtrees below $1^j 2$ are copies of subtrees below $1^j 1$, and in that case, we apply the pumping to an equality constrained subtree below $1^j 1$, which then also modifies the corresponding subtree below $1^j 2$). Let $u$ be the pumped tree obtained in this way, which according to zero-sum freeness and Lemma 4 is also in the support of $G'$; i.e., $u \in \mathrm{supp}(G')$. Let $d'$ be the derivation constructed in Lemma 4 corresponding to $u$. We have $u(1^{j-1}) = \bar f$, so the position $1^{j-1}$ is labeled $\bar f$. Since $G$ and $G'$ are equivalent, there must be a derivation of $G$ for $u$ as well, which enforces the equality constraint $u|_{1^{j-1}12} = u|_{1^{j-1}21}$. By construction we have $t'_n|_{1^{j-1}12} \neq u|_{1^{j-1}12}$. Since the positions $1^{j-1}12$ and $1^{j-1}21$ have no common suffix, this equality can only be guaranteed by $G'$ if $1^{j-1}12$ and $1^{j-1}21$ are themselves (explicitly or implicitly) equality constrained in $d'$. The potentially several constraints that achieve this must of course be located at prefixes of $1^{j-1}12$ and $1^{j-1}21$, and since the production used in $d'$ at the root is still $p$ and stretches all the way to $1^j$, this can only be achieved if $d'$ enforces $1^{j-1}1 = 1^{j-1}2$ via $p$ at the root as well as $1 = 2$ at $1^{j-1}1$ or at $1^{j-1}2$. However, this is a contradiction as $u(1^{j-1}1) = \bar f \neq f = u(1^{j-1}2)$. Hence we cannot have an explicit or implicit equality constraint between $1^{j-1}12$ and $1^{j-1}21$, so $u|_{1^{j-1}21} = t'_n|_{1^{j-1}21}$, which contradicts that $G$ has a complete derivation for $u$.

□
Although the support of a regular weighted tree language is again regular for zero-sum free semirings, the converse is not true in general, so we cannot apply the decision procedure of [15] to the support of a homomorphic image in order to decide its regularity. Instead, we hope to extend the unweighted argument in a way that tracks the weights sufficiently closely. For this, we prepare two decidability results, which rely mostly on the corresponding results in the unweighted case. To this end, we need to relate our WTGc constructed in Theorem 5 to the classic TGc used in [15].

Fig. 1 Illustration of the derivation mentioned in Example 1.

Fig. 4 A snippet of the tree $t'_n$ and the productions used by $G'$.

and the constraints E and I are fulfilled on t| w ; i.e., t| w |= E and t| w | ∀
and otherwise $d'_i = w_i d_i$ and $t'_i = t_i$ (i.e., derivation and tree remain unchanged).
Thus, we obtain $d\llbracket d'\rrbracket_{11}$ by substituting $d'$ into $d$ at position $11$ as well as substituting $d^\bot_{t'}$ into $d$ at positions $12$, $221$, and $222$. The obtained tree $t\llbracket t'\rrbracket^d$