Kleene Algebra with Hypotheses

We study the Horn theories of Kleene algebras and star continuous Kleene algebras, from the complexity point of view. While their equational theories coincide and are PSpace-complete, their Horn theories differ and are undecidable. We characterise the Horn theory of star continuous Kleene algebras in terms of downward closed languages, and we show that when restricting the shape of allowed hypotheses, the problems lie in various levels of the arithmetical or analytical hierarchy. We also answer a question posed by Cohen about hypotheses of the form 1 = S where S is a sum of letters: we show that it is decidable.


Introduction
Kleene algebras [10,6] are idempotent semirings equipped with a unary operation star, such that x* intuitively corresponds to the sum of all powers of x. They admit several models which are important in practice: formal languages, where L* is the Kleene star of a language L; binary relations, where R* is the reflexive transitive closure of a relation R; matrices over various semirings, where M* can be used to perform flow analysis.
A fundamental result is that their equational theory is decidable, and actually PSpace-complete. This follows from a completeness result which was proved independently by Kozen [11] and Krob and Boffa [17,3], and from the fact that checking language equivalence of two regular expressions is PSpace-complete: given two regular expressions e and f, we have KA ⊢ e ≤ f if and only if [e] ⊆ [f] (where KA ⊢ e ≤ f denotes provability from the Kleene algebra axioms, and [e] is the language of a regular expression e).
Because of their interpretation in the algebra of binary relations, Kleene algebras and their extensions have been used to reason abstractly about program correctness [12,15,2,9,1]. For instance, if two programs can be abstracted into the two relational expressions (R*; S)* and ((R ∪ S)*; S)^= (where ^= denotes reflexive closure), then we can deduce that these programs are equivalent by checking that the regular expressions (a*b)* and (a+b)*b + 1 denote the same language. This technique made it possible to automate reasoning steps in proof assistants [4,16,19].
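The language-equivalence check behind this example can be sketched with Brzozowski derivatives and a bisimulation (a standard technique; the code and the tuple representation below are ours, not the paper's PSpace procedure, and the light simplifications in `plus`/`cat` suffice for this example):

```python
# Regular expressions as tuples: ('0',), ('1',), ('chr', a), ('+', e, f),
# ('.', e, f), ('*', e). Smart constructors do light simplification so that
# the set of derivatives stays small on these examples.

def plus(e, f):
    if e == ('0',): return f
    if f == ('0',): return e
    if e == f: return e
    return ('+', e, f)

def cat(e, f):
    if e == ('0',) or f == ('0',): return ('0',)
    if e == ('1',): return f
    if f == ('1',): return e
    return ('.', e, f)

def nullable(e):
    t = e[0]
    if t in ('1', '*'): return True
    if t in ('0', 'chr'): return False
    if t == '+': return nullable(e[1]) or nullable(e[2])
    return nullable(e[1]) and nullable(e[2])        # '.'

def deriv(e, a):
    """Brzozowski derivative: the language {w | aw in [e]}."""
    t = e[0]
    if t in ('0', '1'): return ('0',)
    if t == 'chr': return ('1',) if e[1] == a else ('0',)
    if t == '+': return plus(deriv(e[1], a), deriv(e[2], a))
    if t == '.':
        d = cat(deriv(e[1], a), e[2])
        return plus(d, deriv(e[2], a)) if nullable(e[1]) else d
    return cat(deriv(e[1], a), e)                   # '*'

def equivalent(e, f, alphabet):
    """Check [e] = [f] by building a bisimulation on derivative pairs."""
    todo, seen = [(e, f)], set()
    while todo:
        e, f = todo.pop()
        if (e, f) in seen: continue
        seen.add((e, f))
        if nullable(e) != nullable(f): return False
        for a in alphabet:
            todo.append((deriv(e, a), deriv(f, a)))
    return True

A, B = ('chr', 'a'), ('chr', 'b')
e1 = ('*', cat(('*', A), B))                   # (a*b)*
e2 = plus(cat(('*', plus(A, B)), B), ('1',))   # (a+b)*b + 1
print(equivalent(e1, e2, 'ab'))                # True: both denote {eps} ∪ Σ*b
```

Both expressions denote the empty word plus all words ending in b, so the bisimulation closes after two derivative pairs.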
In such a scenario, one often has to reason under assumptions. For instance, if we can abstract our programs into relational expressions (R + S)* and S*; R*, then we can deduce algebraically that the starting programs are equal if we know that R; S = R (i.e., that S is a no-op when executed after R). When doing so, we move from the equational theory of Kleene algebras to their Horn theory: we want to know whether a given set of equations, the hypotheses, entails another equation in all Kleene algebras. Unfortunately, this theory is undecidable in general [13]. In this paper, we continue the work initiated by Cohen [5] and pursued by Kozen [13], by characterising the precise complexity of new subclasses of this general problem.
A few cases have been shown to be decidable in the literature, when we restrict the form of the hypotheses: when they are of the form e = 0 [5]; when they are of the form a ≤ 1 for a a letter [5]; when they are of the form 1 = w or a = w for a a letter and w a word, provided that these equations, seen as a word rewriting system, satisfy certain properties [18,14]; this includes equations like idempotency (x = xx) or self-invertibility (1 = xx).
(In the first two cases, the complexity can be shown to remain in PSpace.) We add one positive case, which was listed as open by Cohen [5], and which is typically useful to express that a certain number of predicates cover all cases: when hypotheses are of the form S = 1 for S a sum of letters.
Conversely, Kozen also studied the precise complexity of various undecidable subclasses of the problem [13]. For those, one has to be careful about the precise definition of Kleene algebras. Indeed, these only form a quasi-variety (their definition involves two implications), and one often considers *-continuous Kleene algebras [6], which additionally satisfy an infinitary implication (we define these formally in Sect. 2). While the equational theory of Kleene algebras coincides with that of *-continuous Kleene algebras, this is not the case for their Horn theories: there exist Horn sentences which are valid in all *-continuous Kleene algebras but not in all Kleene algebras.
Kozen [13] showed for instance that when hypotheses are of the form pq = qp for pairs of letters (p, q), validity of an implication in all *-continuous Kleene algebras is Π⁰₁-complete, while it is only known to be ExpSpace-hard for plain Kleene algebras. In fact, for plain Kleene algebras, the only known negative result is that the problem is undecidable for hypotheses of the form u = v for pairs (u, v) of words (Kleene star plays no role in this undecidability result: this is just the word problem). We show that it is already undecidable, and in fact Σ⁰₁-complete, when hypotheses are of the form a ≤ S where a is a letter and S is a sum of letters. We use an encoding similar to that of [13] to relate the Horn theories of KA and KA* to runs of Turing machines and alternating linearly bounded automata. This allows us to show that deciding whether an inequality w ≤ f holds in the presence of sum-of-letters hypotheses, where w is a word, is ExpTime-complete. We also refine the Π¹₁-completeness result obtained in [13] for general hypotheses, by showing that hypotheses of the form a ≤ g, where a is a letter, already make the problem Π¹₁-complete.

The key notion we define and exploit in this paper is the following: given a set H of equations and a language L, write cl_H(L) for the smallest language containing L such that for all hypotheses (e ≤ f) ∈ H and all words u, v, if u[f]v ⊆ cl_H(L) then u[e]v ⊆ cl_H(L). This notion makes it possible to characterise the Horn theory of *-continuous Kleene algebras, and to approximate that of Kleene algebras: we have

KA_H ⊢ e ≤ f  ⟹  KA*_H ⊢ e ≤ f  ⟺  [e] ⊆ cl_H([f]),

where KA_H ⊢ e ≤ f (resp. KA*_H ⊢ e ≤ f) denotes provability in Kleene algebra (resp. *-continuous Kleene algebra) from the hypotheses in H. We study downward closed languages and prove the above characterisation in Sect. 3.
The first implication can be strengthened into an equivalence in a few cases, for instance when the regular expression e and the right-hand sides of all hypotheses denote finite languages, or when hypotheses have the form 1 = S for S a sum of letters. We obtain decidability in those cases (Sect. 4).
Then we focus on cases where hypotheses are of the form a ≤ e for a a letter, and we show that most problems are already undecidable there. We do so by exploiting the characterisation in terms of downward closed languages to provide encodings of various undecidable problems on Turing machines, total Turing machines, and linearly bounded automata (Sect. 5).
We summarise our results in Fig. 1. The top of each column restricts the type of allowed hypotheses. Variables e, f stand for general expressions, u, w for words, and a, b for letters. Grayed statements are implied by non-grayed ones.
Notations We let a, b range over the letters of a finite alphabet Σ. We let u, v, w range over the words over Σ, whose set is written Σ*. We write ε for the empty word, uv for the concatenation of two words u, v, and |w| for the length of a word w. We write Σ⁺ for the set of non-empty words. We let e, f, g range over the regular expressions over Σ, whose set is written Exp_Σ. We write [e] ⊆ Σ* for the language of such an expression e. We sometimes implicitly regard a word as a regular expression. If X is a set, P(X) (resp. P_fin(X)) is the set of its subsets (resp. finite subsets), and |X| is its cardinality.
A Kleene algebra is *-continuous if it satisfies the following implication: for all elements x, y, z, t, if x yⁿ z ≤ t for every n ∈ ℕ, then x y* z ≤ t. A hypothesis is an inequation of the form e ≤ f, where e and f are regular expressions. If H is a set of hypotheses and e, f are regular expressions, we write KA_H ⊢ e ≤ f (resp. KA*_H ⊢ e ≤ f) if e ≤ f is derivable from the axioms and implications of KA (resp. KA*) as well as the hypotheses from H. We omit the subscript when H is empty.
Note that the letters appearing in the hypotheses are constants: they are not universally quantified. In particular, if H = {aa ≤ a}, we may deduce KA_H ⊢ aaa ≤ a, but not KA_H ⊢ bb ≤ b. Languages over the alphabet Σ form a *-continuous Kleene algebra, and so do binary relations over an arbitrary set.
In the absence of hypotheses, provability in KA coincides with provability in KA* and with language inclusion:

Theorem 1 (Kozen [11]). For all regular expressions e, f, we have KA ⊢ e ≤ f ⟺ KA* ⊢ e ≤ f ⟺ [e] ⊆ [f].
We will classify the theories based on the shape of the hypotheses we allow; we list them below (I is a finite non-empty set): Name of the hypothesis | Its shape

Closure of regular languages
It is known that provability in KA and KA* can be characterised by language inclusions (Thm. 1). In the presence of hypotheses, this is not the case anymore: we need to take the hypotheses into account in the semantics. We do so by using the following notion of downward closure of a language.

Definition of the closure
Definition 2 (H-closure). Let H be a set of hypotheses and let L ⊆ Σ* be a language. The H-closure of L, denoted cl_H(L), is the smallest language K such that L ⊆ K and, for all hypotheses e ≤ f ∈ H and all words u, v ∈ Σ*, u[f]v ⊆ K implies u[e]v ⊆ K. Alternatively, cl_H(L) can be defined as the least fixed point of the function φ_L : P(Σ*) → P(Σ*) defined by φ_L(X) = L ∪ ψ_H(X), where ψ_H(X) is the union of the languages u[e]v for (e ≤ f) ∈ H and u, v ∈ Σ* such that u[f]v ⊆ X.

In order to manipulate closures more conveniently, we introduce a syntactic object witnessing membership in a closure: derivation trees.

Definition 3. Let H be a set of hypotheses and L a regular language. We define an infinitely branching proof system related to cl_H(L), whose statements are regular expressions, and whose rules are the following, called respectively axiom, extension, and hypothesis: We write ⊢_{H,L} e if e is derivable in this proof system, i.e., if there is a well-founded tree using these rules, with root e and all leaves labelled by words in L. Such a tree will be called a derivation tree for [e] ⊆ cl_H(L) (or for e ∈ cl_H(L) if e is a word).
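Definition 2 can be turned into a brute-force fixpoint computation over a bounded set of words (a sketch of ours, not an algorithm from the paper; because of the length bound, the result only under-approximates cl_H(L) when derivations pass through longer intermediate words):

```python
from itertools import product

def closure(L, H, alphabet, maxlen):
    """Approximate cl_H(L) on words of length <= maxlen.
    A hypothesis e <= f is given as a pair (E, F) of finite word sets
    standing for [e] and [f]."""
    K = set(L)
    # all context words u, v up to the length bound
    words = [''.join(p) for n in range(maxlen + 1)
             for p in product(alphabet, repeat=n)]
    changed = True
    while changed:
        changed = False
        for (E, F) in H:
            for u in words:
                for v in words:
                    # closure rule: if u.[f].v ⊆ K then u.[e].v joins K
                    if all(u + f + v in K for f in F):
                        for e in E:
                            w = u + e + v
                            if len(w) <= maxlen and w not in K:
                                K.add(w)
                                changed = True
    return K
```

For example, with the single hypothesis aa ≤ a (E = {'aa'}, F = {'a'}) and L = {a}, the bounded closure up to length 4 is {a, aa, aaa, aaaa}, matching the intuition that the rule pumps a into longer runs of a's.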

Properties of the closure operator
We summarise in this section some useful properties of the closure. Lem. 1 shows in particular that the closure is idempotent, monotonic (both in the set of hypotheses and in its language argument) and invariant under context application. Lem. 2 shows that internal closure operators can be removed in the evaluation of regular expressions.
Lemma 1. Let A, B, U, V ⊆ Σ*. We have:

3.3 Relating closure and provability in KA_H and KA*_H
We show that provability in KA* can be characterised by closure inclusions. In KA, provability implies closure inclusion, but the converse does not hold in general.
Theorem 2. Let H be a set of hypotheses and e, f be two regular expressions. Then KA_H ⊢ e ≤ f implies KA*_H ⊢ e ≤ f, and KA*_H ⊢ e ≤ f holds if and only if [e] ⊆ cl_H([f]).

Let us prove that [e] ⊆ cl_H([f]) implies KA*_H ⊢ e ≤ f, for any regular expressions e, f. Let e, f be two such expressions and let T be a derivation tree for [e] ⊆ cl_H([f]), i.e., witnessing ⊢_{H,[f]} e. We show that we can transform this tree T into a proof tree in KA*_H. The extension rule is an occurrence of Lem. 12. Finally, the hypothesis rule is also provable in KA*_H, using the hypothesis e ≤ f together with compatibility of ≤ with concatenation, and completeness of KA* for the membership u ∈ [e]. We can therefore build from the tree T a proof in KA*_H.

When we restrict the shape of the expression e to words, and the hypotheses to (w ≤ w)-hypotheses, we get the implication missing from Thm. 2.
We proceed by induction on the height of a derivation tree for w ∈ cl_H([f]). If this tree is just a leaf, then w ∈ [f] and by Thm. 1, KA ⊢ w ≤ f. Otherwise, this derivation starts with the following steps: . . .
4 Decidability of KA and KA* with (1 = x)-hypotheses

In this section, we answer positively the decidability problem of KA_H, where H is a set of (1 = x)-hypotheses, posed by Cohen [5]. To prove this theorem, we show that in the case of (1 = x)-hypotheses:

(P1) KA_H ⊢ e ≤ f if and only if [e] ⊆ cl_H([f]);
(P2) cl_H([f]) is regular, and we can compute effectively an expression for it.
Decidability of KA_H follows immediately from (P1) and (P2), since it then amounts to checking a language inclusion between two regular expressions.
To show (P1) and (P2), it is enough to prove the following result.

Theorem 4. Let H be a set of (1 = x)-hypotheses and let f be a regular expression. The language cl_H([f]) is regular, and we can compute effectively an expression c such that [c] = cl_H([f]) and KA_H ⊢ c ≤ f.

To prove Thm. 4, we first show that the closure for (1 = x)-hypotheses can be decomposed into the closure for (x ≤ 1)-hypotheses followed by the closure for (1 ≤ x)-hypotheses. We set H_id = {x ≤ 1 | (1 = x) ∈ H} and H_sum = {1 ≤ x | (1 = x) ∈ H}.

Sketch. We show that rules from H_id can be locally permuted with rules from H_sum in a derivation tree. This allows us to compute a derivation tree in which all rules from H_id occur after (i.e., closer to the leaves than) the rules from H_sum. Now, we will show results similar to Thm. 4, but which apply to (x ≤ 1)-hypotheses and (1 ≤ x)-hypotheses (Prop. 5 and 6 below). To prove Thm. 4, the idea is to decompose H into H_id and H_sum using the decomposition property (Prop. 3), and then to apply Prop. 5 and Prop. 6 to H_id and H_sum respectively.
To show these two propositions, we make use of a result from [7]. Let A = (Σ, Q, q₀, F, Δ) be an NFA, H be a set of hypotheses, and ϕ : Q → Exp_Σ a function from states to expressions. We say that ϕ is H-compatible with A if:
- KA_H ⊢ 1 ≤ ϕ(q) whenever q ∈ F,
- KA_H ⊢ a ϕ(r) ≤ ϕ(q) for all transitions (q, a, r) ∈ Δ.

Proposition 4 ([7]). Let A be an NFA, H be a set of hypotheses, and ϕ be a function H-compatible with A. We can construct a regular expression f_A such that [f_A] = [A] and KA_H ⊢ f_A ≤ ϕ(q₀).

Proposition 5. Let H be a set of (x ≤ 1)-hypotheses and let f be a regular expression. The language cl_H([f]) is regular, and we can compute effectively an expression c such that [c] = cl_H([f]) and KA_H ⊢ c ≤ f.

Proof. Let Γ = {a | (a ≤ 1) ∈ H} and let K = cl_H([f]). If A is an NFA for f, an NFA A_id recognising K can be built from A by adding a Γ-labelled loop on every state. It is straightforward to verify that the resulting NFA recognises K, since the loops allow it to ignore any letter from Γ. For every q ∈ Q, let f_q be a regular expression such that [f_q] = [q]_A, where [q]_A denotes the language accepted from q in A.
If q ∈ F, then 1 ∈ [f_q], so by completeness of KA, we have KA ⊢ 1 ≤ f_q. Let (p, a, q) be a transition of A_id. Either (p, a, q) ∈ Δ, in which case we have a[f_q] ⊆ [f_p], and so by Thm. 1, KA ⊢ a f_q ≤ f_p. Or p = q (this transition is a loop that we added); then KA_H ⊢ a ≤ 1, so KA_H ⊢ a f_p ≤ f_p, and this concludes the proof.
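The Γ-loop construction used in this proof is easy to make concrete (a toy sketch of ours, with a hypothetical example automaton; NFAs are tuples with a transition set):

```python
def with_gamma_loops(nfa, gamma):
    """Given an NFA for [f] and Gamma = {a | (a <= 1) in H}, add a
    Gamma-labelled self-loop on every state: the result accepts exactly
    the words of [f] with letters of Gamma freely inserted, i.e. cl_H([f]).
    nfa = (states, init, finals, delta), delta a set of (q, a, r)."""
    states, init, finals, delta = nfa
    loops = {(q, a, q) for q in states for a in gamma}
    return (states, init, finals, delta | loops)

def accepts(nfa, word):
    """Standard NFA acceptance by subset simulation."""
    states, init, finals, delta = nfa
    current = {init}
    for a in word:
        current = {r for (q, b, r) in delta for p in current
                   if q == p and b == a}
    return bool(current & finals)

# hypothetical example: [f] = {ab}, hypothesis c <= 1
nfa = ({0, 1, 2}, 0, {2}, {(0, 'a', 1), (1, 'b', 2)})
nfa_id = with_gamma_loops(nfa, {'c'})
```

Here `nfa_id` accepts ab, cab, acb, abc, etc., while still rejecting words such as aab that are not obtained from ab by inserting c's.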
By Prop. 4, we can now construct a regular expression c which satisfies the desired properties.

Definition 5. Let Γ be a set of letters. A language L is said to be Γ-closed if uv ∈ L implies uav ∈ L, for all words u, v ∈ Σ* and all a ∈ Γ.

Remark 1. If H is a set of (x ≤ 1)-hypotheses, and Γ = {a | (a ≤ 1) ∈ H}, then cl_H(L) is Γ-closed for every language L.

Proposition 6. Let H be a set of (1 ≤ x)-hypotheses and let f be a regular expression whose language is Γ-closed. The language cl_H([f]) is regular, and we can compute effectively an expression c such that [c] = cl_H([f]) and KA_H ⊢ c ≤ f.

Let us show that cl_H(L) is regular. The idea is to construct a set of words L′, where each word u′ is obtained from a word u of cl_H(L) by adding, at each position where a rule (1 ≤ S_j) is applied in the derivation tree witnessing u ∈ cl_H(L), a new symbol □_j. We will show that this set satisfies the two following properties:
- cl_H(L) is obtained from L′ by erasing the symbols □_j;
- L′ is regular.

Since the operation that erases letters preserves regularity, we obtain as a corollary that cl_H(L) is regular.
Let us now introduce more precisely the language L′ and show the properties that it satisfies. Let Θ = {□_j | j ∈ J} be a set of new letters and Σ′ = Σ ∪ Θ be the alphabet Σ enriched with these new letters.
We define the function exp : Σ′ → P(Σ) that expands every letter □_j into the set of letters of its rule in H (exp(□_j) = [S_j]) and fixes the other letters (exp(a) = {a} for a ∈ Σ). This function can naturally be extended to exp : (Σ′)* → P(Σ*).
If L ⊆ Σ*, we define L′ ⊆ (Σ′)* as follows: We define the morphism π : (Σ′)* → Σ* that erases the letters from Θ as follows: π(a) = a if a ∈ Σ, and π(□_j) = ε for all j ∈ J. Our goal is to prove that cl_H(L) = π(L′) and that L′ is regular. To prove the first part, we need an alternative presentation of L′ as the closure of a new set of hypotheses H′, which we define as follows: See [8, App. B] for a detailed proof of Lem. 4.
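The morphisms exp and π can be illustrated concretely (a toy sketch of ours; the marker letter 'J' below is a hypothetical stand-in for one of the new letters □_j of Θ):

```python
from itertools import product

def make_exp(expansion):
    """expansion: marker letter -> set of plain letters [S_j].
    exp fixes ordinary letters and expands markers; on words it returns
    the set of all letter-by-letter expansions."""
    def exp(word):
        choices = [expansion.get(a, {a}) for a in word]
        return {''.join(p) for p in product(*choices)}
    return exp

def pi(word, markers):
    """The erasing morphism: drop all marker letters of Theta."""
    return ''.join(a for a in word if a not in markers)

# hypothetical hypothesis 1 <= a + b, with marker 'J' for its rule
exp = make_exp({'J': {'a', 'b'}})
```

With this single rule, exp('aJ') = {'aa', 'ab'} (the marker expands into each letter of the sum), while pi('aJb', {'J'}) = 'ab' recovers the underlying word of cl_H(L).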
By Lem. 4, there is a derivation tree T_v for v ∈ cl_{H′}(L). Erasing all occurrences of the letters □_j in T_v yields a derivation tree for u ∈ cl_H(L). Conversely, if u ∈ cl_H(L) is witnessed by some derivation tree T_u, we show by induction on T_u that there exists v ∈ L′ ∩ π⁻¹(u). If T_u is a single leaf, we have u ∈ L, and it therefore suffices to take v = u.
Otherwise, the rule applied at the root of T_u splits u as u = wz and has premises {wbz | b ∈ [S_j]} for some j ∈ J and w, z ∈ Σ*. By the induction hypothesis, for all b ∈ [S_j], there is v_b ∈ L′ ∩ π⁻¹(wbz).

Lemma 6. L′ is a regular language, computable effectively.

Sketch. From a DFA A = (Σ, Q, q₀, F, δ) for L, we first build a DFA A∧ = (Σ, P(Q), {q₀}, P(F), δ∧), which corresponds to a powerset construction, except that the accepting states are P(F). This means that the semantics of a state P is the conjunction of its members. We then build A′ = (Σ′, P(Q), {q₀}, P(F), δ′) based on A∧, which can additionally read letters of the form □_j, expanding them using the powerset structure of A∧.

Lemma 7. We can construct a regular expression c such that [c] = cl_H(L) and KA_H ⊢ c ≤ f.

Proof. Let A′ be the DFA constructed for L′ in the proof of Lem. 6; we use the notations of that proof in the following.
Let π(A′) = (Σ, P(Q), {q₀}, P(F), π(δ′)) be the NFA obtained from A′ by replacing every transition δ′(P, □_j) = R, where j ∈ J, by an ε-transition π(δ′)(P, ε) = R. By Lem. 5, the automaton π(A′) recognises the language cl_H(L). Let us construct a regular expression c for this automaton such that KA_H ⊢ c ≤ f.
For every P ∈ P(Q), let f_P be a regular expression such that [f_P] = [P]_{A∧}. Let ϕ : P(Q) → Exp_Σ be the function which maps each state P of π(A′) to ϕ(P) = f_P. Let us show that ϕ is H-compatible.
If P ∈ P(F), then P is a final state of A∧, so 1 ∈ [f_P], and by completeness of KA, KA ⊢ 1 ≤ f_P. We also have KA_H ⊢ □_j ≤ S_j, so KA_H ⊢ □_j f_R ≤ f_P. By Prop. 4, we can construct the desired regular expression c.
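The conjunctive powerset construction A∧ from the proof of Lem. 6 can be made concrete (a toy sketch of ours, with a hypothetical two-state DFA): a set-state P accepts iff P ⊆ F, so P recognises the intersection of the languages of its member states.

```python
def conj_accepts(delta, finals, P, word):
    """Run the conjunctive powerset automaton A^ from set-state P.
    delta: dict (state, letter) -> state of the underlying total DFA A.
    Accepts iff every state reached is final, i.e. [P] = ∩_{q in P} [q]_A."""
    current = set(P)
    for a in word:
        current = {delta[(q, a)] for q in current}
    return current <= set(finals)

# hypothetical DFA over {a, b}: state 1 is reached iff the last letter was a
delta = {(0, 'a'): 1, (0, 'b'): 0, (1, 'a'): 1, (1, 'b'): 0}
finals = {1}
```

From state 0 the DFA accepts words ending in a; from state 1 it additionally accepts the empty word. The set-state {0, 1} therefore accepts exactly the intersection: words ending in a, but not ε.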

Complexity results for letter hypotheses
In this section, we give a recursion-theoretic characterisation of KA_H and KA*_H where H is a set of letter hypotheses or (w ≤ w)-hypotheses. Throughout the section, by "deciding KA^(*)_H" we mean deciding whether KA^(*)_H ⊢ e ≤ f, given e, f, H as input.

These various complexity classes will be obtained by reduction from known problems concerning Turing machines (TM) and alternating linearly bounded automata (LBA), such as the halting problem and universality.
To obtain these reductions, we build on a result which bridges TMs and LBAs on the one hand and closures on the other: the set of co-reachable configurations of a TM (resp. LBA) can be seen as the closure of a well-chosen set of hypotheses.
We present this result in Section 5.1, and show in Section 5.2 how to instantiate it to get our complexity classes.
Let #_L, #_R ∉ Γ be fresh symbols marking the ends of the tape. A configuration is a word uqav ∈ #_L Γ* Q Γ⁺ #_R, meaning that the head of the TM points to the letter a. We denote by C the set of configurations of M. The execution of the TM M over an input w ∈ Σ* may be seen as a game-like scenario between two players, ∃loise and ∀belard, over the graph C ∪ (C × P({L, R} × Γ × Q)), with initial position ιw, which proceeds as follows.
- Over a configuration uqav with a ∈ Γ, u ∈ #_L Γ*, v ∈ Γ* #_R, ∃loise picks a transition X ∈ ∆(q, a) to move to position (uqav, X).
- Over a position (uqav, X) with a ∈ Γ, ∀belard picks a triple (d, c, r) ∈ X; the machine writes c, moves the head in direction d, and switches to state r, yielding the next configuration.

Given a subset of configurations D ⊆ C, we define the ∃loise attractor for D, Attr_∃loise(D), as the set of configurations from which ∃loise may force the execution to go through D.
A deterministic TM M is one where every ∆(q, a) is either empty or of the form {{(d, c, r)}} for some (d, c, r) ∈ {L, R} × Γ × Q. In such a case, we may identify M with the underlying partial function [M] : Σ* ⇀ Q_F. An alternating linearly bounded automaton (LBA) over the alphabet Σ is a TM that does not insert B symbols. This means that the head can point to #_d, and for every X ∈ ∆(q, #_d) and (d′, a, r) ∈ X, we have d′ = d and a = #_d.
An LBA is deterministic if its underlying TM is.
The following lemma generalizes a similar construction from [13].
Lemma 8. For every TM M with working alphabet Γ, there exists a set of (w ≤ w)-hypotheses H_M over the alphabet Θ = Q ∪ Γ such that, for any set of configurations D ⊆ C, we have cl_{H_M}(D) = Attr_∃loise(D). Furthermore, this reduction is polytime computable, and H_M is length-preserving if M is an LBA.
A configuration c is co-reachable if ∃loise has a strategy to reach a final configuration from c. Lem. 8 shows that the set of co-reachable configurations can be seen as the closure by (w ≤ w)-hypotheses. Since we are also interested in (x ≤ x)-hypotheses, we will show that (w ≤ w)-hypotheses can be transformed into letter hypotheses; moreover, this transformation preserves the length-preserving property.
Theorem 5. Let Σ be an alphabet and H be a set of (w ≤ w)-hypotheses over Σ. There exist an extended alphabet Σ′ ⊇ Σ, a set of (x ≤ w)-hypotheses H′ over Σ′, and a regular expression h ∈ Exp_{Σ′} such that the following holds for every f ∈ Exp_Σ and w ∈ Σ*: w ∈ cl_H([f]) if and only if w ∈ cl_{H′}([f + h]).
Furthermore, we guarantee the following:
- (Σ′, H′, h) can be computed in polynomial time from (Σ, H);
- H′ is length-preserving whenever H is.

Complexity results
Lemma 9. If H is a set of length-preserving (w ≤ w)-hypotheses (resp. a set of (x ≤ x)-hypotheses), then deciding whether w ∈ cl_H([f]) is ExpTime-complete.

Proof. We actually show that our problem is complete for alternating PSpace (APSpace), which enables us to conclude since ExpTime and APSpace coincide. First, notice that by completeness of KA_H over this fragment (Prop. 2), we have KA_H ⊢ w ≤ f ⟺ w ∈ cl_H([f]); hence we work directly with the latter notion. It suffices to show hardness for the (x ≤ x) case and membership for the (w ≤ w) case.
Given an arbitrary alternating Turing machine M in APSpace, there exists a polynomial p ∈ ℕ[X] such that the executions of M over words w are bisimilar to the executions of the LBA LBA(M) over w B^{p(|w|)}. Hence, by Lem. 8 and Thm. 5, the problem with (x ≤ x)-hypotheses is APSpace-hard. Conversely, we may show that our problem with (w ≤ w)-hypotheses falls into APSpace. On input w, the alternating algorithm first checks whether w ∈ [f], in linear time. If it is the case, it returns "yes". Otherwise, it non-deterministically picks a factorisation w = uxv with x ∈ Σ* together with a hypothesis x ≤ Σᵢ yᵢ. It then universally picks a yᵢ ∈ Σ^{|x|} and replaces x by yᵢ on the tape, so that the new tape content is w′ = u yᵢ v. Then the algorithm loops back to its first step. In parallel, we keep track of the number of steps and halt by returning "no" as soon as we reach |Σ|^{|w|} steps. This is correct because, if there is a derivation tree witnessing w ∈ cl_H([f]), there is one where on every path all nodes have distinct labels, so the non-deterministic player can play according to this tree, while the universal player selects a branch.
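The finite search space underlying this argument can be sketched by a brute-force fixpoint (code ours, not the alternating algorithm itself): since the hypotheses are length-preserving, every word in a derivation for w has length |w|, so membership in the closure is AND-OR reachability over a finite set of words.

```python
from itertools import product

def in_closure(w, in_f, H):
    """Decide w in cl_H([f]) for length-preserving letter/word hypotheses.
    H: list of (x, Y) encoding the hypothesis x <= sum of Y, with |y| = |x|
    for every y in Y; in_f: membership predicate for [f]."""
    alphabet = sorted({a for (x, Y) in H for y in Y for a in x + y} | set(w))
    # all words of length |w| over the relevant alphabet
    space = {''.join(p) for p in product(alphabet, repeat=len(w))}
    good = {u for u in space if in_f(u)}          # base: words of [f]
    changed = True
    while changed:
        changed = False
        for u in space - good:
            for (x, Y) in H:
                for i in range(len(u) - len(x) + 1):
                    # u = u1 x u2 joins the closure if every u1 y u2 is in it
                    if u[i:i + len(x)] == x and \
                       all(u[:i] + y + u[i + len(x):] in good for y in Y):
                        good.add(u)
                        changed = True
                        break
                if u in good:
                    break
    return w in good
```

For instance, with the hypothetical letter hypothesis a ≤ b and [f] = {bb}, the word ab is in the closure (replace a by b); with a ≤ b + c it is not, since cb ∉ [f].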
Proof. By Lem. 9 and the fact that regular expressions are in recursive bijection with natural numbers, our set is clearly Π⁰₁. To show completeness, we effectively reduce the set of universal LBAs, which is known to be Π⁰₁-complete, to our set of triples. Indeed, by Lem. 8, an LBA A is universal if and only if #_L {ι} Σ* #_R ⊆ cl_{H_A}(C_F), where C_F is the set of final configurations.
Proof. As KA_H is a recursively enumerable theory, our set is Σ⁰₁. By the completeness theorem (Prop. 2), we have KA_H ⊢ w ≤ f ⟺ w ∈ cl_H([f]), so we may work directly with closures. In order to show completeness, we reduce the halting problem for Turing machines (on the empty input) to this problem. Let M be a Turing machine with alphabet Σ and final state q_f, and let H_M be the set of (w ≤ w)-hypotheses given effectively by Lem. 8. Let f = Σ* q_f Σ*; by Lem. 8, M halts on the empty input if and only if q₀ ∈ cl_{H_M}([f]). Notice that the hypotheses of H_M are of the form u ≤ V where u ∈ Θ³ and V ⊆ Θ³. By Thm. 5, we can compute a set H′ of (x ≤ x)-hypotheses and an expression h on an extended alphabet such that the corresponding equivalence holds.

Proof. This set is Π⁰₂ by Thm. 7. It is complete by reduction from the set of Turing machines accepting all inputs, which is known to be Π⁰₂-complete. Indeed, let M be a Turing machine on alphabet Σ with final state q_f. By Lem. 8, we can compute a set of (w ≤ w)-hypotheses H_M with finite languages in second components such that c ∈ cl_{H_M}({c′}) if and only if configuration c′ is reachable from c. As before, by Thm. 5, we can compute a set of letter hypotheses H′ with finite languages in second components, and a regular expression h on an extended alphabet, such that the analogous equivalence holds for the corresponding closures.

Sketch. It is shown in [13] that the problem is complete for hypotheses of the form H = H_w ∪ {x ≤ g}, where H_w is a set of length-preserving (w ≤ w)-hypotheses. A slight refinement of Thm. 5 allows us to reduce this problem to hypotheses of the form x ≤ g.

A universal total function F : Σ* → {0, 1} is a function such that, for every total Turing machine M and input w ∈ Σ*, we have F(⟨M⟩, w) = [M](w). In particular, F should be total, and it is not uniquely determined over codes of partial Turing machines. The next folklore lemma follows from an easy diagonal argument.

Lemma 10. There is no universal total Turing machine.

Undecidability of KA H for sums of letters
Our strategy is to show that decidability of KA_H with (x ≤ x)-hypotheses would imply the existence of a universal total TM. To do so, we need one additional lemma.
Lemma 11. Suppose that M = (Q, Q_F, Γ, ι, B, ∆) is a total Turing machine with final states {0, 1} and initial state ι, and let w ∈ Σ* be an input word for M. Then we can effectively construct a set of length-preserving (w ≤ w)-hypotheses H and expressions e_w, h such that [M](w) = 1 if and only if KA_H ⊢ e_w ≤ h.

Theorem 10. KA_H is undecidable for (x ≤ x)-hypotheses.
Proof. Assume that KA_H is decidable. This means that we have an algorithm A taking tuples (Σ, w, f, H), with H consisting only of sum-of-letters hypotheses, and returning true when KA_H ⊢ w ≤ f and false otherwise. Without loss of generality, we can assume that A is total. By Thm. 5, we may then provide an algorithm A′ taking as input tuples (w, f, H), where H is a set of length-preserving (w ≤ w)-hypotheses, with a similar behaviour: A′ returns true when KA_H ⊢ w ≤ f and false otherwise.
Given A′, consider M defined so that [M](⟨N⟩, w) = [A′](e_w, h, H), where the last tuple is given by Lem. 11. We show that M is a total universal Turing machine. Since such a machine cannot exist by Lem. 10, this is enough to conclude. Since A′ is total, so is M. For total Turing machines N, Lem. 11 ensures that [M](⟨N⟩, w) = [N](w), so M is universal.

A Proofs of section 3

If e = e₁e₂, writing IH_{e_i} for the induction hypothesis on e_i, we have

We prove Prop. 1:

Proof. Assume [e] ⊆ cl_H(L). This means that there is an ordinal α such that [e] ⊆ φ_L^α(∅), by definition of cl_H(L) as the least fixed point of φ_L and the Knaster–Tarski theorem. We prove by transfinite induction on α that ⊢_{H,L} e, i.e., that there is a derivation tree for [e] ⊆ cl_H(L). The case α = 0 is trivial, as φ_L⁰(∅) = ∅. If α > 0, we get [e] ⊆ φ_L(∪_{β<α} φ_L^β(∅)). We build the tree T in the following way: for each w ∈ [e], we build a tree T_w for w ∈ cl_H(L). If w ∈ L, then the tree T_w is just the single leaf w, an axiom of the system. If w ∈ φ_L^β(∅) for some β < α, we can conclude by the induction hypothesis. The last possibility is w ∈ ψ_H(φ_L^β(∅)) for some β < α. This means that there is a hypothesis e_H ≤ f_H ∈ H and words u, v such that w ∈ u[e_H]v and u[f_H]v ⊆ φ_L^β(∅); by the induction hypothesis, there is a tree T′ for u[f_H]v ⊆ cl_H(L). We can therefore build a tree T_w for w ∈ cl_H(L) by appending a hypothesis rule at the root of T′. Using an extension rule combining all these trees T_w, we finally build the derivation tree T for [e] ⊆ cl_H(L).
Conversely, assume that there is a well-founded derivation tree T for [e] ⊆ cl_H(L); we want to show that indeed [e] ⊆ cl_H(L). Again, this can be shown by induction on the transfinite height α of the tree. If the tree is an axiom, then e is a word of L, so e ∈ cl_H(L). Otherwise, consider the rule applied at the root of the tree. If it is an extension rule, then by the induction hypothesis, for all u ∈ [e] we have u ∈ cl_H(L), so [e] ⊆ cl_H(L). If it is a hypothesis rule, then there is a hypothesis e_H ≤ f_H ∈ H and words u, v ∈ Σ* such that e is a word w ∈ u[e_H]v, and there is a tree T′ for u[f_H]v ⊆ cl_H(L). By the induction hypothesis we indeed have u[f_H]v ⊆ cl_H(L), and by definition of φ_L we get w ∈ cl_H(L).

A.1 Proof of closure properties
We prove Lem. 1:

Proof. The first four items follow from the definition of cl_H as the smallest fixed point of φ_L. Finally, assume A ⊆ cl_H(B) and let (u, w, v) ∈ U × A × V. We need to show that uwv ∈ cl_H(U B V). Consider a derivation tree T for w ∈ cl_H(B). Applying the mapping x ↦ uxv to all nodes of T yields a derivation tree for uwv ∈ cl_H(uBv). By Prop. 1, we obtain uwv ∈ cl_H(uBv) ⊆ cl_H(U B V).
We prove Lem. 2:

Proof. Using Lem. 1, to prove the first item it suffices to prove that cl_H(A) ∪ cl_H(B) ⊆ cl_H(A ∪ B). To show the second item, it again suffices to show cl_H(A)cl_H(B) ⊆ cl_H(AB). By stability under concatenation (last item of Lem. 1), for any X ⊆ Σ*, we have X cl_H(B) ⊆ cl_H(XB), so cl_H(A)cl_H(B) ⊆ cl_H(cl_H(A)B). Using this stability again, we can now show cl_H(A)B ⊆ cl_H(AB), and thus by Lem. 1, cl_H(cl_H(A)B) ⊆ cl_H(AB), thereby concluding the second item.
We finally show the last item, by proving cl_H(A)* ⊆ cl_H(A*).

B Proofs of section 4
We prove Prop. 3:

Proposition 3. Set H_id = {x ≤ 1 | (1 = x) ∈ H} and H_sum = {1 ≤ x | (1 = x) ∈ H}. For every language L ⊆ Σ*, we have cl_H(L) = cl_{H_sum}(cl_{H_id}(L)).

Proof. We have cl_{H_sum}(cl_{H_id}(L)) ⊆ cl_H(L) by the monotonicity of the closure (items 3 and 4 of Lem. 1). Let us show the other inclusion. Let u ∈ cl_H(L), and let T be a derivation tree witnessing this membership. Note that T is finite, since it is well-founded and finitely branching. We first show that T can be transformed into a derivation tree for u ∈ cl_H(L) in which the applications of the rules from H_id are delayed until after the rules from H_sum; in other words, no rule from H_sum appears after a rule from H_id. For that, we define a rewriting system where a redex is a pattern consisting of an application of a rule from H_id followed immediately by a rule from H_sum, followed by an extension rule. Thus a redex is a derivation of one of the following forms: We define the following rewriting rules, which delay the application of the hypothesis rules from H_id.
Using these rewriting rules, we can transform T into a redex-free derivation tree T′.
Let T″ be the subtree of T′ (with the same root) such that:
- no hypothesis rule from H_id is applied in T″;
- for every leaf l of T″, the subtree of T′ rooted in l, which we denote T_l, does not contain a hypothesis rule from H_sum.

This decomposition of T′ is possible because all the rule applications of H_id are delayed after those of H_sum. Note that T_l is a derivation tree for l ∈ cl_{H_id}(L); thus T″ is a derivation tree for u ∈ cl_{H_sum}(cl_{H_id}(L)).
Proof of Lem. 4.

Proof. If v ∈ (Σ′)*, let |v|_Θ be the number of letters of v from Θ. We show, by induction on |v|_Θ, that for all v ∈ (Σ′)*, v ∈ L′ if and only if v ∈ cl_{H′}(L). For the base case, suppose |v|_Θ = 0, and consider a derivation tree T_v for v ∈ cl_{H′}(L). Since v does not contain any occurrence of □_j for any j ∈ J, no hypothesis from H′ can be applied at the root of T_v, so T_v is necessarily a single leaf, and v ∈ L; this completes the base case of the induction.

We now proceed to the inductive case. Conversely, assume v ∈ cl_{H′}(L), witnessed by a derivation tree T_v.
- If the rule applied at the root of T_v is of the form □_j ≤ 1, then it splits v as v₁ □_j v₂, and by the induction hypothesis we have exp(v₁v₂) ⊆ L. Since L is Γ-closed, and letters from Γ ⊆ Σ are preserved by exp, for each i ∈ I_j we have exp(v₁ a_i v₂) ⊆ L, hence v ∈ L′.
- Or the rule applied at the root of T_v splits v as v₁ □_j v₂ for some j ∈ J, with premises {v₁bv₂ | b ∈ [S_j]}. By the induction hypothesis, for all b ∈ [S_j], v₁bv₂ ∈ L′, i.e., exp(v₁bv₂) ⊆ L. As before, this yields v ∈ L′.
Proof of Lem. 6:

Lemma 6. L′ is a regular language, computable effectively.
Proof.Let A = (Σ, Q, q 0 , F, δ) be a DFA for L. Let A ∧ = (Σ, P(Q), q 0 , P(F ), δ ∧ ) be the DFA where δ ∧ is defined as follows: If q is the state of an automaton A, we denote by [q] A the language of this automaton with initial state q.Note that if P is a state of A ∧ , then [P ] A∧ = ∩ q∈P [q] A .Let us construct A , the automaton for L .We set A = (Σ , P(Q), q 0 , P(F ), δ ) be the DFA where δ is defined as follows: We will show that for all u ∈ Σ * , for all P ∈ P(Q), u ∈ [P ] A ⇔ exp(u) ⊆ [P ] A∧ , by induction on |u|.This implies in particular [A ] = L .Furthermore, we guarantee the following: -(Σ , H , h) can be computed in polynomial time from (Σ, H).
- H' is length-preserving whenever H is.
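For concreteness, the transition structure of A∧ can be sketched in code. This is an illustration under the assumption δ∧(P, a) = {δ(q, a) | q ∈ P}, which is the definition forced by the identity [P]_{A∧} = ∩_{q∈P} [q]_A above; the two-state DFA below is a made-up example, not one from the paper.

```python
# Sketch of the powerset automaton A∧ for a DFA A = (Σ, Q, q0, F, δ).
# Assumption (forced by the identity [P]_A∧ = ∩_{q∈P} [q]_A stated above):
# δ∧(P, a) = {δ(q, a) | q ∈ P}, and a state P of A∧ is accepting iff P ⊆ F.

def run(delta, q, word):
    # Run the transition function delta of A from state q over word.
    for a in word:
        q = delta[(q, a)]
    return q

def delta_wedge(delta, P, a):
    # One step of A∧: the image of the state set P under the letter a.
    return frozenset(delta[(q, a)] for q in P)

def run_wedge(delta, P, word):
    # Run A∧ from the state set P over word.
    for a in word:
        P = delta_wedge(delta, P, a)
    return P

# A made-up two-state DFA over {a, b}: from state 0 it accepts the words
# with an even number of b's (accepting states F = {0}).
delta = {(0, 'a'): 0, (0, 'b'): 1, (1, 'a'): 1, (1, 'b'): 0}
F = {0}
```

By construction, run_wedge(delta, P, w) ⊆ F holds exactly when run(delta, q, w) ∈ F for every q ∈ P, i.e. [P]_{A∧} is the intersection of the languages [q]_A.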
Proof. Let Σ be an alphabet and H ∈ P_fin(Σ+ × P_fin(Σ*)) be a set of word hypotheses.
We will show that it is possible to design h and a set of letter hypotheses H' such that performing derivations for cl_{H'}([f + h]) simulates derivations for cl_H([f]). This is done by simulating each hypothesis (w, X) from H by several hypotheses in H', one for each letter of w, with the expression h ensuring that we process the letters in the right order. This also requires extra alphabets to store information about which hypothesis (w, X) and which position in w we are currently processing.
Notice that if H is length-preserving, then t is actually a function Θ → Σ.
The set H' is defined as follows. It is straightforward to verify that (Σ', H', h) is computable in polynomial time from (Σ, H). We need to show the announced equivalence, for arbitrary u ∈ Σ* and f ∈ Exp_Σ. We start with the left-to-right implication. To do so, we first show an auxiliary lemma.
Proof. We first show that, for every transition word δ_{w,X,w'} and factorization δ_{w,X,w'} = uv, we have t(u)v ∈ cl_{H'}([X + h]), by induction over the length of v. If v is the empty word, we have t(u)v = t(δ_{w,X,w'}) = w' ∈ X ⊆ cl_{H'}([X + h]). Otherwise, v = av' with a = (w, X, w', i) for some i ∈ N. By the inductive hypothesis, t(ua)v' ∈ cl_{H'}([X + h]). By definition of H', we have (a, {t(a), ⊥_{(w,X)}}) ∈ H'. Clearly, t(u)⊥_{(w,X)}v' ∈ [h] and t(u)t(a)v' = t(ua)v', so we may conclude. Now, we show that for every factorization δ_{w,X,w'} = uv we have u s(v) ∈ cl_{H'}([X + h]), by another induction over |v|. If v is empty, this is given by the previous induction, taking the trivial factorization. Otherwise, v = av' with a = (w, X, w', i) for some i ∈ N. Let Hyp = (w, X). We have two subcases.
- If i > 0 and u is non-empty, then we show u w_i s(v') ∈ cl_{H'}([u a s(v') + h]), from which we conclude by the inductive hypothesis.
- Otherwise, v = δ_{w,X,w'} and u is empty. In this case, note that for all w', w'' ∈ X, we have s(δ_{w,X,w'}) = w = s(δ_{w,X,w''}). For any w' ∈ X, call b^0_{w'} = (w, X, w', 0) the first letter of δ_{w,X,w'} and factorize w' as av' with a ∈ Σ.

We can now prove the left-to-right implication of Thm. 5. If u ∈ [f], then we also have u ∈ [f + h] and we are done. Otherwise, we have some factorization u = u' w u'' for u', u'' ∈ Σ*, w ∈ Σ+, and some X such that (w, X) ∈ H and u' X u'' ⊆ cl_{H'}([f + h]) by the induction hypothesis. By Lem. 13, we know that w ∈ cl_{H'}([X + h]), and by stability under concatenation (last item of Lem. 1), we have u' w u'' ∈ cl_{H'}([f + h]).

To prove the converse, i.e. the right-to-left implication of Thm. 5, we first need auxiliary lemmas concerning |=_letters and the consistency of transitions.
Proof. The right-to-left direction is easy. For the left-to-right direction, first notice that H' does not allow removing letters from ⊥.

Proof. Straightforward induction.

The following lemma states that if we completely process a word hypothesis from H letter by letter according to cl_{H'}([f + h]), we have indeed performed a step according to cl_H([f]).
Lemma 16. Let δ_{w,X,w'} be a transition word, and u, v ∈ Θ* such that uv = δ_{w,X,w'}.

Proof. We show the result for all (w, X, w'), and proceed by induction. For notational convenience, let us define the following sets and function, for any (w, X) ∈ H.
Notice that F is well-defined because u is non-empty in the first case, and therefore contains the information w = uv. In the second case, it suffices to project the letters of v according to t. The function F describes the result obtained once we finish processing the current word hypothesis from H letter by letter.
We show that, for every α ∈ Y, F(α) ∈ cl_H([f]). We proceed by induction on the derivation tree.
- If xay ∈ Y_{w,X,1} for some (w, X) ∈ H, we have xay = u' u s(v) v'' with u, v ≠ ε.
According to the hypotheses in H', Lem. 14 and the shape of h_1, we necessarily have: a ∈ Σ, x = u' u, s(v) = a s(v'), and y = s(v') v'' for some v' ∈ Θ*. We also have x ⊥_s y ∈ [h_1]. By Lem. 15 and since u ≠ ε, we must have a letter a' such that δ_{w,X,w'} = u a' v'. Hence we have xa'y = u' u a' s(v') v'' ∈ Y and thus, by the induction hypothesis, F(xa'y) ∈ cl_H([f]). We can then conclude, since F(xay) = F(xa'y).
- If xay ∈ Y_{w,X,2} for some (w, X) ∈ H, we have xay = u' t(u) v v''. We make a case distinction on whether v = ε.
• It is then easy to check that cl_{H_A}, when restricted to C, and Attr_∃(D) are fixed points of the same operator. Notice that if A is an LBA, H_A is length-preserving.
For Turing machines, the hypotheses are no longer length-preserving, and as shown in [13], the rules H_right, H_left are replaced with appropriate rules.

C.3 Proof of Lem. 11

Lemma 11. Suppose that M = (Q, Q_F, Γ, ι, B, ∆) is a total Turing machine with final states {0, 1} and initial state ι. Let w ∈ Σ* be an input word for M.
Then there is, effectively, a set of length-preserving word hypotheses H and expressions e_w, h such that [M](w) = 1 if and only if KA_H ⊢ e_w ≤ h.

Proof. Consider the linearly bounded automaton LBA(M) associated with M and take H = H_{LBA(M)} to be the set of length-preserving hypotheses given by Lem. 8. Notice that this LBA gets stuck on configurations where the head reaches an extremity of the configuration. Take accordingly h to be the sum h_1 + h_2 + h_3. By the semantics, the right-to-left implication is trivial in light of Lem. 8: indeed, as soon as the number of B symbols on the left-hand side is sufficient, the expression h forces the result of M to be 1.

Full version of the extended abstract to appear in Proc. FoSSaCS 2019. This work has been supported by the European Research Council (ERC) under the European Union's Horizon 2020 programme (CoVeCe, grant agreement No 678157) and by the LABEX MILYON (ANR-10-LABX-0070) of Université de Lyon, within the program "Investissements d'Avenir" (ANR-11-IDEX-0007) operated by the French National Research Agency (ANR).

Example 2. The following derivation is a derivation tree for bababa ∈ cl_H([b*a*]), where H = {ab ≤ ba}:

bbbaaa
bbabaa
bbaaba
bababa

Derivation trees witness membership in the closure, as shown by the following proposition.

Proposition 1. [e] ⊆ cl_H(L) iff ⊢_{H,L} e.
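Since the hypothesis ab ≤ ba preserves word length, membership in cl_H([b*a*]) can be decided by exhaustive rewriting: a word is in the closure iff repeatedly replacing some factor ab by ba reaches a word of b*a*. A minimal sketch (the function name is ours):

```python
import re
from collections import deque

def in_closure(word, lang=r'b*a*'):
    # word ∈ cl_H([lang]) for H = {ab ≤ ba}: the closure rule adds u·ab·v
    # whenever u·ba·v is already present, so membership amounts to asking
    # whether rewriting some factor "ab" into "ba", repeatedly, reaches a
    # word of [lang].  Each step preserves the length of the word, so the
    # breadth-first search below terminates.
    seen = {word}
    queue = deque([word])
    while queue:
        w = queue.popleft()
        if re.fullmatch(lang, w):
            return True
        for i in range(len(w) - 1):
            if w[i:i + 2] == 'ab':
                w2 = w[:i] + 'ba' + w[i + 2:]
                if w2 not in seen:
                    seen.add(w2)
                    queue.append(w2)
    return False
```

For instance, in_closure('bababa') retraces Example 2, reaching bbbaaa ∈ b*a* via bbaaba and bbabaa.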
since the other implication is always true (Thm. 2). Let e, f be such that [e] ⊆ cl_H([f]). If c is the expression given by Thm. 4, we have KA_H ⊢ c ≤ f and [e] ⊆ [c], so by Thm. 1, KA ⊢ e ≤ c, and this concludes the proof.

5.1 Closure and co-reachable states of TMs and LBAs

Definition 6. An alternating Turing machine over Σ is a tuple M = (Q, Q_F, Γ, ι, B, ∆) consisting of a finite set of states Q with final states Q_F ⊆ Q, a finite working alphabet Γ ⊇ Σ, an initial state ι ∈ Q, a blank symbol B ∈ Γ, and a transition function ∆.
which takes care of the base step, where T_α is a single leaf. For the inductive step, suppose that the root of T_α uses the rule (a, X') ∈ H' in the following way, with α = xay:

  xa'y  (for all a' ∈ X')
  -----------------------
          xay

where, by the induction hypothesis, for each a' ∈ X', if xa'y ∈ Y, then F(xa'y) ∈ cl_H([f]). We perform a case analysis according to which component of Y the word xay belongs to.

C.4 Proof of Lem. 17
Proof. For the left-to-right direction, suppose that [M](w) = 1. Then, from ιw, M may execute to a final configuration u1v. Considering the execution of LBA(M) over #_L B^k ιwB^{k'} #_R, we may show that there are n, n' ∈ N such that a stuck configuration c occurs in one of the following patterns:
- if k ≥ n and k' ≥ n', then the computation faithfully simulates the execution of M, and c ∈ [h_3];
- if k < n and k' ≥ n', then the computation cannot faithfully simulate the execution of M, for lack of space on the left of the tape, and c ∈ [h_1];
- if k ≥ n and k' < n', then c ∈ [h_2] for similar reasons;
- if k < n and k' < n', we have c ∈ [h_2 + h_1].

Let e_w = #_L B* ιwB* #_R. We partition e_w into (n+1)(n'+1) regular expressions e, for which we can prove KA_H ⊢ e ≤ h. We detail the different cases below.
- The expressions #_L B^k ιwB^{k'} #_R with k < n and k' < n'. The wanted inequality can be shown in KA_H by Cor. 9.
- The expressions #_L B^k ιwB^{n'} B* #_R with k < n. Using the proof of Lem. 8, we can show that #_L B^k ιwB^{n'} ∈ cl_H(Q#_L Γ*), which by Prop. 2 establishes that KA_H ⊢ #_L B^k ιwB^{n'} ≤ Q#_L Γ*. Then, we have KA ⊢ B* #_R ≤ Γ* #_R, thus by concatenation and KA ⊢ Γ*Γ* ≤ Γ*, we have KA_H ⊢ #_L B^k ιwB^{n'} B* #_R ≤ Q#_L Γ* #_R = h_1.
- The expressions #_L B* B^n ιwB^{k'} #_R with k' < n' are treated in the same way.
- The expression #_L B* B^n ιwB^{n'} B* #_R also gets a fairly similar treatment: KA_H ⊢ B^n ιwB^{n'} ≤ Γ* 1, from which we conclude by cutting with a proof in KA.

Proof of Lem. 17: Lemma 17. There is no universal total Turing machine.

Proof. Suppose that M is a universal total Turing machine. Consider the diagonal function D(w) = 1 − M(⟨w, w⟩). Notice that D is total. So, by universality, we have a contradiction:

[D](⟨D⟩) = 1 − [M](⟨⟨D⟩, ⟨D⟩⟩) = 1 − [D](⟨D⟩).

Lemma. F_{H,Σ} = (CReg_{H,Σ}, ⊕, ⊙, ⊛) is a *-continuous Kleene algebra. The inequality ≤ of F_{H,Σ} coincides with inclusion of languages.

Proof. By Lem.
2, the function cl_H : (P(Σ*), +, ·, *) → (CReg_{H,Σ}, ⊕, ⊙, ⊛) is a homomorphism. We show that F_{H,Σ} is a *-continuous Kleene algebra. First, identities of Lang_Σ = (P(Σ*), +, ·, *) are propagated through the morphism cl_H, so only the Horn formulas defining *-continuous Kleene algebras remain to be verified. It suffices to prove that F_{H,Σ} satisfies the *-continuity implication, because the implication xy ≤ y → x*y ≤ y and its dual can be deduced from it. Let A, B, C, D ∈ F_{H,Σ} be such that for all i ∈ N, A ⊙ B^i ⊙ C ≤ D. Since A ⊙ B^i ⊙ C = cl_H(AB^iC), we have cl_H(AB^iC) ≤ D, and in particular AB^iC ≤ D for all i. By *-continuity of Lang_Σ, we obtain AB*C ≤ D. By Lem. 1 and using D = cl_H(D), we obtain cl_H(AB*C) ≤ D, and finally, by Lem. 2, A ⊙ B^⊛ ⊙ C ≤ D. This completes the proof that F_{H,Σ} is a *-continuous Kleene algebra. Let A, B ∈ CReg_{H,Σ}. We have A ≤ B ⇔ A ⊕ B = B ⇔ cl_H(A + B) = B ⇔ A ⊆ B. Finally, if e ≤ f is a hypothesis from H, then we have cl_H([e]) ⊆ cl_H([f]), so the hypothesis is satisfied in F_{H,Σ}.
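The chain of implications in this *-continuity argument can be summarised as follows (⊙ and ⊛ stand for the product and star of F_{H,Σ}; this display is a summary of the proof above, not an additional claim):

```latex
\forall i,\; A \odot B^{i} \odot C \le D
\;\Rightarrow\; \forall i,\; \mathrm{cl}_H(A B^{i} C) \subseteq D
\;\Rightarrow\; \forall i,\; A B^{i} C \subseteq D
\;\Rightarrow\; A B^{*} C \subseteq D
\;\Rightarrow\; \mathrm{cl}_H(A B^{*} C) \subseteq D
\;\Rightarrow\; A \odot B^{\circledast} \odot C \le D
```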
by the induction hypothesis. Since exp(u) = ⋃_{b∈[S_j]} b·exp(v), we obtain u ∈ [P]_{A'} ⇔ exp(u) ⊆ [P]_{A∧}.

Theorem 5. Let Σ be an alphabet and H a set of word hypotheses over Σ. There exists an extended alphabet Σ' ⊇ Σ, a set of letter hypotheses H' over Σ' and a regular expression h ∈ Exp_{Σ'} such that the following holds, for every f ∈ Exp_Σ and w ∈ Σ*.
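The property elided after "the following holds" can be reconstructed from the two implications proved in this appendix (left-to-right via the auxiliary lemma on transition words, right-to-left via the consistency lemmas); under that reading it is:

```latex
w \in \mathrm{cl}_H([\,f\,])
\quad\Longleftrightarrow\quad
w \in \mathrm{cl}_{H'}([\,f + h\,])
```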
+ h]), from which we may conclude using the inductive hypothesis and the properties of the closure. By definition of H' and setting Y = {b ∈ Θ_Hyp | s(b) = w_i} ∪ {⊥_s}, we have (w_i, Y) ∈ H'. It thus suffices to check that u y s(v') ∈ [h] for all y ∈ Y \ {a} to conclude.
empty. Moreover, any rule of H' either leaves the word unchanged, or introduces a new letter from ⊥. This means that if a word w ∈ (Σ ∪ Θ)* is in cl_{H'}([f + h]), then by the previous remark it must be in [f + h]. Since [f] ⊆ Σ*, we get w ∈ [h].