Alternating complexity of counting first-order logic for the subword order

This paper considers the structure consisting of the set of all words over a given alphabet together with the subword relation, regular predicates, and constants for every word. We are interested in the counting extension of first-order logic by threshold counting quantifiers. The main result shows that the two-variable fragment of this logic can be decided in twofold exponential alternating time with linearly many alternations (and therefore in particular in twofold exponential space as announced in the conference version (Kuske and Schwarz, in: MFCS’20, Leibniz International Proceedings in Informatics (LIPIcs) vol. 170, pp 56:1–56:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020) of this paper) provided the regular predicates are restricted to piecewise testable ones. This result improves prior insights by Karandikar and Schnoebelen by extending the logic and saving one exponent in the space bound. Its proof consists of two main parts: First, we provide a quantifier elimination procedure that results in a formula with constants of bounded length (this generalises the procedure by Karandikar and Schnoebelen for first-order logic). From this, it follows that quantification in formulas can be restricted to words of bounded length, i.e., the second part of the proof is an adaptation of the method by Ferrante and Rackoff to counting logic and deviates significantly from the path of reasoning by Karandikar and Schnoebelen.


Introduction
The subword relation is one of the simplest nontrivial examples of a well-quasi-ordering [5] and can be used in the verification of infinite state systems [4]. It can be understood as embeddability of one word into another. This embeddability relation has been considered for other classes of structures like trees, posets, semilattices, lattices, graphs, Mazurkiewicz traces, etc. [7,8,12,13,15,23,24].
Many of these papers study logical aspects of the embeddability relation. Regarding the subword relation, the literature provides a rather sharp description of the border between decidable and undecidable fragments of first-order logic: For the subword order alone, the ∃*-theory is decidable [14] and the ∃*∀*-theory is undecidable [9]. For the subword order together with regular predicates, the two-variable theory is decidable [9] (this holds even for the two-variable fragment of the logic C+MOD, i.e., the extension of first-order logic by threshold- and modulo-counting quantifiers [16]) and the three-variable theory [9] as well as the ∃*-theory are undecidable [6] (these two undecidabilities already hold if we only consider singleton predicates, i.e., constants). Recently, Baumann et al. [1] strengthened the last undecidability result by showing that all semi-decidable languages can be defined by an existential formula using constants (even more, a language belongs to the n-th existential level of the arithmetical hierarchy if, and only if, it can be defined by a Σ_n-formula).
Dietrich Kuske, dietrich.kuske@tu-ilmenau.de, Technische Universität Ilmenau, Ilmenau, Germany
We next sketch the decision procedure for the 2-variable fragment of the first-order theory of the subword relation together with regular predicates from [9]. Let ϕ(x) be a formula with a single free variable. It may contain regular predicates that are given in any familiar formalism. Then, the crucial insight from [9] is that the set of words satisfying ϕ(x) can be obtained from the regular predicates by a fixed set of rational transductions and Boolean operations. Hence, one can inductively build the minimal deterministic finite automaton (henceforth dfa) accepting this set. The only known upper bound for the size of this minimal dfa is nonelementary since any quantification requires applying one of the rational transductions to the language of a minimal dfa (which leads to a nondeterministic finite automaton, i.e., nfa) and then determinising and minimising this nfa. The crucial insight from the follow-up paper [10] by the same authors is that the size of these minimal dfas is at most triply exponential if, instead of regular predicates, one allows only constants (alternatively: singleton predicates). Since determinisation and minimisation of an nfa can be done in space polynomial in the resulting minimal dfa (and logarithmic in the nfa), the above construction can be carried out in threefold exponential space, which is also an upper bound for the said theory (the best lower bound we know so far is PSPACE [9]). This bound on the size of the minimal dfas is possible since all defined languages are piecewise testable [20]. A useful complexity measure for piecewise testable languages is their height. The new and innovative contribution of the proof from [10] consists of bounds for the height of the upwards closure L↑, the downwards closure L↓, and the incomparability set L∥ of a piecewise testable language L; these new bounds are polynomial in the height of L (assuming a fixed alphabet).
We improve this 3EXPSPACE upper bound for the theory in three aspects:
1. We prove an upper bound of twofold exponential alternating time with linearly many alternations (which implies an upper bound of twofold exponential space, i.e., the result we announced in the conference version of this paper [11]).
2. We allow piecewise testable predicates given by so-called pt-nfas [17,18] (which are more succinct than minimal dfas). Further, the upper bound is measured in the depth of these pt-nfas as opposed to their size.
Remark Any piecewise testable predicate can be defined in the one-variable fragment of first-order logic. Consequently, these predicates do not increase the expressive power. Since a pt-nfa of depth k accepts a piecewise testable language of height k, the naive translation of a pt-nfa into a formula yields a formula of size exponential in the depth of the pt-nfa. Whether this size increase is necessary seems not to be known.
3. We extend the two-variable fragment of first-order logic by threshold counting quantifiers ∃^{≥t} (from [16], we know that this theory is decidable, even with regular predicates).
Following and extending the ideas from [10], we first prove new results on the height of piecewise testable languages. Namely, we extend the above-mentioned results about L↑, L↓, and L∥ to, e.g., L↑^{≥t}, the set of words that have at least t subwords in L (and similarly for L↓^{≥t} and L∥^{≥t}). These considerations can be found in Sect. 3.
From these results, it follows that a language L defined by a formula (that uses threshold counting quantifiers and piecewise testable predicates given by pt-nfas) is piecewise testable of height at most doubly exponential in the size of the formula (Theorem 4.3).
Remark Consequently, L can be defined by a quantifier-free first-order formula. It follows that the addition of counting quantifiers ∃^{≥t} does not increase the expressive power of the logic either. But the use of counting quantifiers allows one to write exponentially more succinct formulas (Theorem 4.5).
So far, this parallels the development in [10] where the corresponding result was shown for first-order logic. But at this point, instead of building automata (as done in [10]), we follow another path of argument, that is an adaptation of Ferrante and Rackoff's method [3].
The language-theoretic considerations imply that any formula is equivalent to a quantifier-free formula that uses constants of doubly exponential length and no piecewise testable predicates (Corollary 4.4). From this, we derive that quantification in formulas can be restricted to words of doubly exponential length. This implies that the two-variable fragment of the threshold counting extension of first-order logic is decidable in twofold exponential alternating time with linearly many alternations (allowing piecewise testable predicates in the formula given by pt-nfas).

Definitions and main results
Throughout this paper, we fix an alphabet Σ. We denote by Σ* the set of (finite) words over Σ. A word u ∈ Σ* is a subword of v ∈ Σ* if u = u_1 u_2 ⋯ u_n and v = v_0 u_1 v_1 u_2 v_2 ⋯ u_n v_n for some n ∈ ℕ and u_i, v_i ∈ Σ*. We write u ⊑ v for this fact and alternatively say that v is a superword of u. Finally, we write u ∥ v if neither u is a subword of v nor vice versa; we say that u and v are incomparable. Note that for any two distinct words u and v, we have precisely one of the three relations u ⊏ v, u ⊐ v, or u ∥ v (where ⊏ and ⊐ denote the strict subword and superword relations).
Let L ⊆ Σ* be a language. Its upwards closure is the language L↑ = {v ∈ Σ* | ∃u ∈ L : u ⊑ v} of all words v that have some subword u in L. Dually, the downwards closure of L is the language L↓ = {u ∈ Σ* | ∃v ∈ L : u ⊑ v} of all words u that have some superword v in L. Finally, the incomparability set of L is the language L∥ = {u ∈ Σ* | ∃v ∈ L : u ∥ v} of all words u that have some incomparable word v in L.
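The definitions above can be illustrated by a minimal Python sketch. It assumes that the language L is given as a finite set of strings (an assumption made only for this illustration; the languages in the paper are in general infinite), and all function names are hypothetical.

```python
def is_subword(u: str, v: str) -> bool:
    """Return True if u embeds into v as a (scattered) subword."""
    remaining = iter(v)
    # 'ch in remaining' consumes the iterator up to the first match,
    # so the letters of u are matched from left to right.
    return all(ch in remaining for ch in u)


def relation(u: str, v: str) -> str:
    """Classify a pair of words: equal, subword, superword, or incomparable."""
    if u == v:
        return "equal"
    if is_subword(u, v):
        return "subword"
    if is_subword(v, u):
        return "superword"
    return "incomparable"


# Membership tests for the three closures of a FINITE language (assumption).
def in_upward_closure(v: str, language: set) -> bool:
    return any(is_subword(u, v) for u in language)


def in_downward_closure(u: str, language: set) -> bool:
    return any(is_subword(u, v) for v in language)


def in_incomparability_set(u: str, language: set) -> bool:
    return any(relation(u, v) == "incomparable" for v in language)
```

For instance, with L = {"ab"}, the word "aab" lies in the upwards closure, "b" in the downwards closure, and "ba" in the incomparability set.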

Piecewise testable languages and the main result for language theorists
The length of a word u ∈ Σ* is denoted |u|, and Σ^{≤n} denotes the set of words of length at most n. We next define Simon's congruences ∼_n that play an important role in our considerations.
Definition Let u, v ∈ Σ* and n ∈ ℕ. Then, u and v are n-equivalent (denoted u ∼_n v) if they have the same subwords of length at most n. We denote by [u]_n the equivalence class containing the word u wrt. the equivalence relation ∼_n.
A language L ⊆ Σ* is piecewise testable if there exists n ∈ ℕ such that L is a union of languages [u]_n for some words u ∈ Σ* (which is equivalent to saying that L is closed under ∼_n). The minimal such n is called the height of L. We write PT(n) for the class of piecewise testable languages of height at most n. Note that PT(n) ⊆ PT(n + 1), and that both ∅ and Σ* are of height 0. Since the set of equivalence classes [u]_n forms a partition of Σ*, the class PT(n) is closed under Boolean operations. Since Σ^{≤n} is finite, there are only finitely many equivalence classes of ∼_n. Hence, for any n ∈ ℕ, there are only finitely many languages L ⊆ Σ* in PT(n).
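Simon's congruence can be tested by brute force for short words. The following Python sketch compares the sets of subwords of length at most n; it is an exponential-time illustration only, and the function names are hypothetical.

```python
from itertools import combinations


def subwords_up_to(w: str, n: int) -> set:
    """Return the set of all subwords of w of length at most n."""
    result = set()
    for k in range(min(n, len(w)) + 1):
        # Every choice of k positions of w yields one subword of length k.
        for positions in combinations(range(len(w)), k):
            result.add("".join(w[i] for i in positions))
    return result


def simon_equivalent(u: str, v: str, n: int) -> bool:
    """Return True if u and v have the same subwords of length at most n."""
    return subwords_up_to(u, n) == subwords_up_to(v, n)
```

For example, "ab" and "ba" are 1-equivalent (both contain exactly the letters a and b) but not 2-equivalent, since only the first word has the subword "ab".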
Let L ⊆ Σ* be piecewise testable. Then, the upwards closure L↑, the downwards closure L↓ and the incomparability set L∥ are all piecewise testable of height polynomial in that of L (the degree of the polynomial is the size of the alphabet Σ) [10]. We will extend these results to the following more general operations.
Let L ⊆ Σ* be some language and t ∈ ℕ some threshold. Then L↑^{≥t} = {v ∈ Σ* | ∃u_1, …, u_t ∈ L pairwise distinct : u_i ⊑ v for all 1 ≤ i ≤ t} denotes the set of words v that have at least t subwords in L. In particular, L↑^{≥0} = Σ* and L↑^{≥1} is the usual upwards closure L↑ of L. Note that any language L↑^{≥t} is upwards closed (i.e., satisfies L↑^{≥t}↑ = L↑^{≥t}) and therefore piecewise testable. Dually, the set L↓^{≥t} = {u ∈ Σ* | ∃v_1, …, v_t ∈ L pairwise distinct : u ⊑ v_i for all 1 ≤ i ≤ t} consists of all words u that have at least t superwords in L; the above remarks on L↑^{≥t} apply mutatis mutandis. Let L∥^{≥t} = {u ∈ Σ* | ∃v_1, …, v_t ∈ L pairwise distinct : u ∥ v_i for all 1 ≤ i ≤ t} contain all words u that are incomparable with at least t words from L. We will also write, e.g., L∥^{<t} for the complement of L∥^{≥t}, i.e., for the set of words that are incomparable with at most t − 1 words from L.
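Membership of a word v in the threshold upwards closure can be decided by counting the distinct subwords of v that belong to L. The following Python sketch does this by brute force; it assumes that L is given as a membership predicate (the hypothetical parameter `in_language`), which is an assumption made for illustration only.

```python
from itertools import combinations


def distinct_subwords(v: str) -> set:
    """Return the set of all distinct subwords of v (including the empty word)."""
    return {
        "".join(v[i] for i in positions)
        for k in range(len(v) + 1)
        for positions in combinations(range(len(v)), k)
    }


def in_up_threshold(v: str, in_language, t: int) -> bool:
    """Return True if v has at least t pairwise distinct subwords in the language."""
    return sum(1 for u in distinct_subwords(v) if in_language(u)) >= t


def only_a(u: str) -> bool:
    """Example language a* (hypothetical choice): words consisting of the letter a only."""
    return set(u) <= {"a"}
```

With L = a*, the word "aba" has exactly the three subwords ε, "a", and "aa" in L; hence it belongs to the threshold upwards closure for t = 3 but not for t = 4.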
Before we turn to a consequence in logic, we shortly recall some results on the relation of nondeterministic finite automata (abbreviated nfa) and piecewise testable languages.
There are different characterisations of piecewise testable languages using nfas; we only rely on one by Masopust and Thomazo [17,18] (see the following remark for the missing definition). They define a class of nondeterministic finite automata, called pt-nfas, and prove the following:
- A language is piecewise testable iff it is accepted by some pt-nfa [18, Thm. 25].
- Further, the depth ||A|| of a pt-nfa A (i.e., the maximal length of a simple path) bounds the height of the accepted language [17, Thm. 8].

Remark
The concrete definition of a pt-nfa is of no importance for this paper; we only recall it for the convenience of the interested reader. An nfa is a tuple A = (Q, I, T, F) such that Q is a finite set, I, F ⊆ Q, and T ⊆ Q × Σ × Q. For p, q ∈ Q and Γ ⊆ Σ, we write p −Γ*→ q whenever there exists a word over Γ that labels some path from p to q. The depth of the nfa A is the maximal length of a simple path. The language L(A) of the nfa A is the set of words over Σ that label some path from some element of I to some element of F.
Let A = (Q, I, T, F) be an nfa. For r ∈ Q, we write Σ_r for the set of letters a ∈ Σ with (r, a, r) ∈ T. The nfa A is a pt-nfa [17, Def. 3] if the following hold:
- The reachability relation is a partial order (i.e., p −Σ*→ q −Σ*→ p implies p = q). (An nfa satisfying this property is called acyclic.)
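The notions of accepted language and depth of an nfa can be made concrete by the following Python sketch. It assumes that an nfa is a tuple (Q, I, T, F) of Python sets as in the remark above and that the length of a simple path is counted in transitions (this counting convention is an assumption, as is the example automaton, which is not claimed to be a pt-nfa).

```python
def accepts(nfa, word: str) -> bool:
    """Subset simulation: does some path from I to F carry the label word?"""
    states, initial, transitions, final = nfa
    current = set(initial)
    for letter in word:
        current = {q for (p, a, q) in transitions if p in current and a == letter}
    return bool(current & set(final))


def depth(nfa) -> int:
    """Maximal number of transitions on a simple path (self-loops are ignored)."""
    states, initial, transitions, final = nfa
    successors = {q: set() for q in states}
    for (p, a, q) in transitions:
        if p != q:
            successors[p].add(q)

    def longest_from(q, visited):
        # Exhaustive search; exponential in general, fine for small automata.
        best = 0
        for r in successors[q]:
            if r not in visited:
                best = max(best, 1 + longest_from(r, visited | {r}))
        return best

    return max((longest_from(q, {q}) for q in states), default=0)


# Hypothetical example: words over {a, b} with exactly one b.
EXAMPLE_NFA = ({0, 1}, {0}, {(0, "a", 0), (0, "b", 1), (1, "a", 1)}, {1})
```

The example automaton accepts "aba", rejects "abb", and has depth 1 under the stated counting convention.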

The logic C 2 and the main result for logicians
Let NFA be the set of all nfas over the alphabet Σ (to make this a set as opposed to a class, we require that states of these nfas belong to ℕ). Consider the structure S = (Σ*, ⊑, (L(A))_{A∈NFA}, (w)_{w∈Σ*}) whose universe is the set of words, whose only binary relation is the subword relation, that has a unary relation L(A) for each nfa A ∈ NFA and a constant for every word over Σ.
We can make statements about this structure using some variant of classical first-order logic. To control the use of nfas in these formulas, let 𝒜 ⊆ NFA be a set of nfas (e.g., 𝒜 = NFA, 𝒜 = ∅, or 𝒜 = ptNFA ⊆ NFA which is the set of pt-nfas). Then, formulas from C²_𝒜 are defined by the following syntax: where c, d ∈ {x, y} ∪ Σ* are variables from {x, y} or words over Σ, A ∈ 𝒜 is some nfa over Σ, t ∈ ℕ, and z ∈ {x, y} is a variable. Note that we allow only the variables x and y. The semantics of these formulas is defined in the obvious way with the understanding that ∃^{≥t} x ϕ holds if there are t mutually distinct words that all make the formula ϕ true. Consequently, ∃^{≥1} is the usual existential quantifier and ∃^{≥0} x ϕ is always true. Let FO²_𝒜 denote the subset of C²_𝒜 that only uses the quantifier ∃^{≥1}, i.e., the classical first-order quantifier. For arbitrary structures, the introduction of threshold counting quantifiers ∃^{≥t} in conjunction with the restriction to two variables extends the expressive power. Later, we will see that in our context, the logics C²_ptNFA and FO²_∅ are equally expressive by Corollary 4.4, but C²_ptNFA is exponentially more succinct than FO²_∅ by Theorem 4.5. As a side remark, we prove that constants of length at most 2 suffice for the whole expressive power.

Theorem 2.2
Let 𝒜 ⊆ NFA. For every formula ϕ ∈ C²_𝒜, there exists an equivalent formula ψ ∈ C²_𝒜 that uses only constants of length at most 2. The same applies to the logic FO²_𝒜.
Proof We show that, for every word w ∈ Σ*, there exists a formula λ_w(x) ∈ FO²_∅ using only constants of length at most 2 such that w is the only word satisfying λ_w(x).
Before we start the construction of λ w (x), consider the following inductively defined formula α n (z) (where z is any variable from {x, y} and z is the other variable): Then, S | α n (u) iff |u| n.
We now come to the construction of λ w (x) by induction on the length of w. If |w| 2, we simply set λ w (x) = w. Now let n = |w| > 2 and define m = n /2 The first two conjuncts express |x| = n, i.e., the length of x equals that of w. By the induction hypothesis, λ u (y) expresses y = u. Consequently, the latter two conjuncts are equivalent to x ∼ m w.
In other words, The size of a formula is defined with the understanding that the size |A| of an nfa A is its number of states, the size of a variable is 1, the size of a word is its length, and the size of the quantifier ∃ t is the length |bin(t)| of the binary encoding of t.
Besides the size, we also define the norm ||ϕ|| of a formula ϕ from C 2 ptNFA (recall that ||A|| denotes the depth of the pt-nfa A): Note that this norm ||ϕ|| forms a mixture between the size of a formula and its quantifier depth: It depends on the maximal size of constants and simple paths in automata appearing in ϕ as well as on the quantifier depth (where the quantifier ∃ t , that intuitively corresponds to a sequence of t quantifiers, contributes only log(t) to the norm). In particular, ||ϕ|| bounds the length of constants and the depth of pt-nfas occurring in ϕ. Note further that the norm ||ϕ|| of any formula ϕ is at most its size |ϕ|, i.e., ||ϕ|| |ϕ|. From Theorem 2.1, we infer in Sect. 4 that all definable languages are piecewise testable of bounded height (Theorem 4.3). This allows to derive a quantifier elimination result that reads as follows: ptNFA -formula ϕ is equivalent to some quantifier-and automata-free formula ψ ∈ FO 2 ∅ with ||ψ|| < 2 c 2||ϕ|| .
Karandikar and Schnoebelen [10] showed that any non-empty piecewise testable language of height n has elements of length polynomial in n. Based on Corollary 4.4, we can therefore restrict quantification in a formula ϕ to words of bounded length, implying our main result for logicians.
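The idea of restricting quantification to words of bounded length can be sketched as follows. The Python fragment below evaluates a threshold quantifier by enumerating all candidate witnesses up to a length bound; the parameter `max_len` is a hypothetical stand-in for the (doubly exponential) bound of the paper, and the code is an illustration under this assumption rather than the decision procedure itself.

```python
from itertools import product


def is_subword(u: str, v: str) -> bool:
    """Return True if u embeds into v as a (scattered) subword."""
    remaining = iter(v)
    return all(ch in remaining for ch in u)


def words_up_to(alphabet: str, max_len: int):
    """Enumerate all words over the alphabet of length at most max_len."""
    for k in range(max_len + 1):
        for letters in product(sorted(alphabet), repeat=k):
            yield "".join(letters)


def exists_at_least(t: int, predicate, alphabet: str, max_len: int) -> bool:
    """Bounded evaluation of a threshold quantifier: are there at least t
    distinct words of length at most max_len satisfying the predicate?"""
    if t == 0:
        return True
    count = 0
    for w in words_up_to(alphabet, max_len):
        if predicate(w):
            count += 1
            if count >= t:
                return True
    return False
```

For example, the word "ab" has exactly four subwords (ε, "a", "b", "ab"), so the bounded evaluation of "there are at least t words y with y ⊑ ab" succeeds for t = 4 and fails for t = 5.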
The C²_ptNFA-theory of S belongs to STA(∗, 2^{2^{poly(n)}}, O(n)), i.e., can be decided in doubly exponential alternating time with linearly many alternations.
Recall that, by [2], STA(s, t, a) is the class of all languages, for which membership can be decided by an alternating Turing machine whose space, time, and alternations are bounded by the functions s, t, and a, respectively. Typically, * is used to denote that no restriction is placed on a specific resource. Thus, STA is a combined complexity measure that is particularly useful when describing the complexity of logical theories (see, e.g., [2,3]).

Closure of the class of piecewise testable languages
The purpose of this section is to prove Theorem 2.1, i.e., our main result for language theorists.

Notions and results used in the proof
It is a chain if it is linearly ordered by the subword order and if it is infinite. Since the subword order is well-founded, any chain is isomorphic to (N, ). An example of a singleton equivalence class is [u] |u|+1 for any u ∈ Σ * ; if u contains two distinct letters, then even For a set L ⊆ Σ * of words, let min(L) denote the set of words v ∈ L that have no proper subword in L. Since the subword relation is well-founded, any word from L is a superword of some word from min(L), i.e., L ⊆ min(L)↑.
Imre Simon found a description of the set of minimal elements of an equivalence class [u]_n that uses the following concept.
Deleting all empty sets from the tuple (B_1, B_2, …, B_k) makes the above presentation of min [u]_n unique. The theorem implies in particular that all words from min [u]_n have the same Parikh image. Further, they all have the same length ∑_{1≤i≤k} |B_i|, which is at most g_{|Σ|}(n) (by the very definition of that function) and therefore at most (n + 2)^{|Σ|} (by [10, Thm. 3.7 and Eq. (3.12)]).

Theorem 3.3
Let Σ be an alphabet, w ∈ Σ*, and n ∈ ℕ. Then, there exists a word v ∼_n w with |v| ≤ g_{|Σ|}(n) and v ⊑ w.
Proof The definition of the function g_{|Σ|} implies the existence of some word u′ ∼_n w with |u′| ≤ g_{|Σ|}(n). Since the subword order is well-founded, there exist words u, v ∈ min([w]_n) with u ⊑ u′ and v ⊑ w. Now Theorem 3.2 implies |v| = |u| ≤ |u′| ≤ g_{|Σ|}(n).
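For short words, a witness v as in Theorem 3.3 can be found by exhaustive search. The following Python sketch returns a shortest subword of w that is n-equivalent to w; it is a brute-force illustration with hypothetical function names and does not compute the bound g_{|Σ|}(n).

```python
from itertools import combinations


def subwords_up_to(w: str, n: int) -> set:
    """Return the set of all subwords of w of length at most n."""
    return {
        "".join(w[i] for i in positions)
        for k in range(min(n, len(w)) + 1)
        for positions in combinations(range(len(w)), k)
    }


def shortest_equivalent_subword(w: str, n: int) -> str:
    """Return a shortest subword v of w with the same subwords of length <= n."""
    target = subwords_up_to(w, n)
    for k in range(len(w) + 1):
        for positions in combinations(range(len(w)), k):
            v = "".join(w[i] for i in positions)
            if subwords_up_to(v, n) == target:
                return v
    return w  # not reached: w itself is always a candidate
```

For instance, the class of "aaaa" under ∼_2 has the minimal element "aa", whereas "abab" is already minimal in its class under ∼_2.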

Upward closures
The following result verifies the first claim of Theorem 2.1.
Proposition 3.4 Let L ∈ PT(n) be a piecewise testable language of height n and t ∈ ℕ. Then, the language L↑^{≥t} is piecewise testable of height at most g_{|Σ|}(n) + t − 1.
Choosing the elements of Y as short as possible, we can assume Since x is a subword of y, there are more than |y| − |x| words x with x x y. Since the equivalence class [y] n is convex, any such word x satisfies x ∼ n y and therefore x ∈ L. Consequently, But this implies |y| g |Σ| (n) + t − 1.
So far, we proved that all words from Y have length at most g_{|Σ|}(n) + t − 1. Since they all are subwords of z ∼_{g_{|Σ|}(n)+t−1} z′, we obtain y ⊑ z′ for all y ∈ Y. From |Y| = t and Y ⊆ L, we derive z′ ∈ L↑^{≥t}, i.e., L↑^{≥t} is closed under ∼_{g_{|Σ|}(n)+t−1}.

Downward closures
To verify the second claim of Theorem 2.1, we first prove that only singleton equivalence classes [x]_n have maximal elements. We will use this lemma in the following proof.
Lemma 3.5 Let n ∈ ℕ and x, y ∈ Σ* be distinct with x ∼_n y. Then, there exists z ∈ Σ* with y ∼_n z, y ⊑ z, and y ≠ z.
Proof Since [x]_n = [y]_n is not a singleton, it is infinite by Lemma 3.1(4), thus contains in particular a word w of length |w| > |y|. By Lemma 3.1(2), there exists a z ∈ [y]_n with y, w ⊑ z, implying |z| ≥ |w| > |y|, and therefore z ≠ y.
Proposition 3.6 Let L ∈ PT(n) be a language over Σ and t ∈ ℕ. Then, the language L↓^{≥t} belongs to PT((|Σ| + 1) · (g_{|Σ|}(n) + 1)).
Proof Since L ∈ PT(n) and since ∼_n has finite index, there are finitely many words x_1, …, x_m with L = ⋃_{1≤i≤m} [x_i]_n and x_i ≁_n x_j for all 1 ≤ i < j ≤ m. By the definition of the function g_{|Σ|}, we can assume |x_i| ≤ g_{|Σ|}(n) for all 1 ≤ i ≤ m. Let F be the union of the finite classes [x_i]_n and I the union of the infinite classes [x_i]_n, so that L = F ∪ I. We first show L↓^{≥t} = F↓^{≥t} ∪ I↓. For the inclusion "⊆", let x ∈ L↓^{≥t} \ F↓^{≥t}. Then, x has t superwords in L = F ∪ I, but at most t − 1 many in F. Hence, it has at least one superword in I, i.e., x ∈ I↓. For the converse inclusion, note that F↓^{≥t} ⊆ L↓^{≥t} is trivial since F ⊆ L. So let x ∈ I↓. Then, there exists y ∈ I with x ⊑ y. Since y ∈ I, the equivalence class [y]_n ⊆ I is infinite and therefore contains no maximal element by Lemma 3.5. Hence, there are infinitely many (and therefore in particular t) superwords of y ⊒ x in I ⊆ L. Consequently, x ∈ L↓^{≥t}.
Note that the height of I is at most n since it is a union of equivalence classes of ∼_n. Consequently, the height of I↓ is at most (|Σ| + 1) · g_{|Σ|}(n) + 1 by [10, Thm. 5.5].

Incomparability set
There are three types of equivalence classes [x]_n: the singletons, the chains (i.e., infinite languages ordered linearly by the subword order), and the infinite ones which are no chains. Note that by Lemma 3.1(4) this is a complete characterization of the equivalence classes. Propositions 3.7, 3.9, and 3.14, respectively, bound the heights of [x]_n∥^{≥t} for these three types of equivalence classes and collectively verify Theorem 2.1(3).
Proposition 3.7 Let n, t ∈ ℕ and x ∈ Σ* such that L = [x]_n is a singleton. Then, L∥^{≥t} ∈ PT(g_{|Σ|}(n)).
Proof If t ≥ 2, then L∥^{≥t} = ∅ since L is a singleton. If t = 0, then L∥^{≥t} = Σ*. Note that both these languages belong to PT(0) ⊆ PT(g_{|Σ|}(n)).
Finally, consider the case t = 1. Then, L∥^{≥t} = Σ* \ (L↑ ∪ L↓) since L is a singleton. Note that L↑ ∪ L↓ = L↑ ∪ (L↓ \ {x}) since x ∈ L↑. The height of the former language L↑ is at most |x|. The latter language L↓ \ {x} is finite, and all its elements have length < |x|; hence, the height of that language is at most |x| as well. Thus, the height of L↑ ∪ L↓ is at most |x| and the same applies to its complement L∥^{≥t}. Since L = [x]_n is a singleton, the definition of the function g_{|Σ|} implies |x| ≤ g_{|Σ|}(n).
Next, we consider the case that [x]_n is a chain and bound the height of [x]_n∥^{≥t}. The following lemma provides the central argument that will also be used later.
Note that, provided x_0 ≠ ε, the chain C is not maximal since it can be extended to the left.
Proof We first demonstrate the inclusion "⊇". Since any two elements of C are comparable, we clearly have C∥^{<t} ⊇ C. Further, any subword of x_{t−1} is a subword of all words x_{t−1+i} for i ∈ ℕ and therefore incomparable with at most the t − 1 words x_0, x_1, …, x_{t−2} from C.
For the converse inclusion, let y ∈ C <t . Then, y is comparable with infinitely many words from C. Since C has only finitely many words that are shorter than y, there is ∈ N with y x . Let ∈ N be minimal with this property. We distinguish three cases of the relation between x 0 and y: . , x −1 y since was chosen minimal with y x and x 0 x 1 · · · x l−1 is a chain. From y ∈ C <t , we obtain < t and therefore y x x t−1 . -If y x 0 , we have x 0 y x and therefore y ∈ C since C is convex.
Thus, in any case, y ∈ C ∪ {x_{t−1}}↓.
Proposition 3.9 Let n, t ∈ ℕ and x ∈ Σ* such that C = [x]_n is a chain. Then, C∥^{≥t} ∈ PT(g_{|Σ|}(n) + t).
We list the elements of the chain C in increasing order: x_0 ⊏ x_1 ⊏ x_2 ⊏ ⋯. Since C = [x]_n is a chain, it is a convex chain by Lemma 3.1(1) such that |x_i| = |x_0| + i holds for all i ≥ 0. From Lemma 3.8, we obtain C∥^{<t} = C ∪ {x_{t−1}}↓. The height of C is n ≤ g_{|Σ|}(n) by assumption; thus, the height of C∥^{<t} is at most g_{|Σ|}(n) + t. But then the same bound applies to the height of C∥^{≥t} = Σ* \ C∥^{<t}.
It remains to prove a similar statement for infinite equivalence classes [x]_n that are not a chain. The proof of the case t = 1 from [10] first shows that [x]_n contains at least two elements of every length > |x|. Consequently, every word of length > |x| is incomparable with some word from [x]_n, i.e., [x]_n∥^{≥1} is cofinite and therefore piecewise testable. Our proof for t > 1 shows that the set of pairs of words of equal length can be grouped into two convex chains, i.e., the equivalence class [x]_n contains two convex chains that intersect, at most, in min [x]_n (Lemma 3.13). Then, we apply Lemma 3.8. But first, we need some insight into convex chains which is the topic of the following considerations.
Lemma 3.10 Let x, y ∈ Σ* and a ∈ Σ. Then, xa*y is a convex chain.
Proof Let x′ be the longest prefix of x not ending with a (i.e., x′ ∈ Σ* \ Σ*a and x ∈ x′a*) and y′ the longest suffix of y not beginning with a. Then, xa*y ⊆ x′a*y′ is convex in (x′a*y′, ⊑) and we prove the stronger claim that the latter set is a convex chain. Hence, we may assume that x does not end with a and that y does not begin with a.
Let w ∈ Σ* and i, ℓ ∈ ℕ with xa^i y ⊑ w ⊑ xa^ℓ y. We have to show that w belongs to xa*y. Note that xy ⊑ w since xy ⊑ xa^i y ⊑ w. Let w_1 be the prefix of length |x| of w, w_3 be the suffix of length |y| of w, and w_2 be the unique word with w = w_1 w_2 w_3. Since w is a subword of xa^ℓ y, dropping the first |x| letters in both w and xa^ℓ y preserves the subword relation. The same holds when dropping the last |y| letters, hence w_2 ⊑ a^ℓ, i.e., w_2 ∈ a*. By similar reasoning for xy ⊑ w_1 w_2 w_3, we can conclude that x ⊑ w_1 w_2. Since w_2 ∈ a*, but x does not end with a, x has to be a subword of w_1 and thus x = w_1, as both words are of the same length. Symmetrically, we can show y = w_3. Consequently, we have w ∈ xa*y.
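The convexity claim of Lemma 3.10 can be checked by brute force on small instances. The following Python sketch enumerates all words between xy and x a^k y in the subword order and reports those that do not belong to xa*y; the bound `max_power` and the function names are hypothetical, and the check is an illustration rather than a proof.

```python
from itertools import product


def is_subword(u: str, v: str) -> bool:
    """Return True if u embeds into v as a (scattered) subword."""
    remaining = iter(v)
    return all(ch in remaining for ch in u)


def in_chain(w: str, x: str, y: str, a: str) -> bool:
    """Return True if w belongs to the language x a* y."""
    if len(w) < len(x) + len(y):
        return False
    middle = w[len(x):len(w) - len(y)]
    return w.startswith(x) and w.endswith(y) and set(middle) <= {a}


def convexity_counterexamples(x, y, a, alphabet, max_power):
    """List all words w with x y ⊑ w ⊑ x a^max_power y that are not in x a* y."""
    low, high = x + y, x + a * max_power + y
    bad = []
    for k in range(len(low), len(high) + 1):
        for letters in product(alphabet, repeat=k):
            w = "".join(letters)
            if is_subword(low, w) and is_subword(w, high) and not in_chain(w, x, y, a):
                bad.append(w)
    return bad
```

In line with the lemma, the search finds no counterexample for x = "ab", y = "ba", a = "a" and powers up to 3.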
The third item of the following lemma implies, together with Theorem 3.2, that the maximal a-prefixes of two words from min [x]_n differ in length by at most one.
Towards a contradiction, assume u, v ∉ aΣ*, |m − n| > 1, and a^m u, a^n v ∈ Perm(B_1, …, B_k). Without loss of generality, we may assume m ≥ n + 2. By the second claim, we get B_1 = {a} from m ≥ n + 2 ≥ 2. Hence, by induction, we obtain a^{m−n} u, v ∈ Perm(B_{n+1}, …, B_k). Since m − n ≥ 2, the second claim implies B_{n+1} = {a} and therefore v ∈ aΣ*. But this contradicts our choice of v.
Lemma 3.12 Let B be a tuple of finite nonempty sets of letters, x_1, x_2, y_1, y_2 ∈ Σ* be words with x_1 x_2, y_1 y_2 ∈ Perm(B), and a, b ∈ Σ letters with x_1 a x_2 ≠ y_1 b y_2.
Then, x_1 a* x_2 and y_1 b* y_2 are convex chains that intersect, at most, in x_1 x_2.
It remains to be shown that their intersection is contained in {x_1 x_2} = {y_1 y_2}. So let v ∈ C_1 ∩ C_2. Then, there exist non-negative integers ℓ and m with v = x_1 a^ℓ x_2 = y_1 b^m y_2.
Since the words x_1 x_2 and y_1 y_2 are of equal length, we have ℓ = m. If ℓ = 0, then v = x_1 a^0 x_2 = y_1 b^0 y_2 is in {x_1 x_2} and we are done. Thus, we may assume ℓ > 0.
Since x_1 x_2 and y_1 y_2 both belong to Perm(B), we get |x_1 x_2|_a = |y_1 y_2|_a, implying 0 < ℓ = |a^ℓ|_a = |x_1 a^ℓ x_2|_a − |x_1 x_2|_a = |y_1 b^ℓ y_2|_a − |y_1 y_2|_a = |b^ℓ|_a and therefore a = b. Since x_1 and y_1 both are prefixes of x_1 a^ℓ x_2 with |x_1| ≤ |y_1|, the word x_1 is a prefix of y_1, i.e., there is a word x_1′ with y_1 = x_1 x_1′. Symmetrically, we get a word y_2′ with x_2 = y_2′ y_2. From x_1 x_1′ a^ℓ y_2 = y_1 a^ℓ y_2 = x_1 a^ℓ x_2 = x_1 a^ℓ y_2′ y_2, we conclude x_1′ a^ℓ = a^ℓ y_2′ (and therefore in particular |x_1′| = |y_2′|). Aiming at a contradiction, assume |x_1′| = |y_2′| ≤ ℓ. Then, x_1′ is a prefix of a^ℓ and similarly y_2′ a suffix of a^ℓ, hence x_1′ = y_2′ = a^k for some nonnegative integer k ∈ ℕ. But then y_1 b y_2 = y_1 a y_2 = x_1 a^k a y_2 = x_1 a a^k y_2 = x_1 a x_2, as opposed to our assumption. Consequently |x_1′| = |y_2′| > ℓ, implying that there exist k ∈ ℕ and a word w ∈ Σ* \ aΣ* such that x_1′ = a^ℓ a^k w and y_2′ = a^k w a^ℓ. If w = ε, then x_1′ = y_2′ = a^{k+ℓ} and therefore (as above) y_1 b y_2 = y_1 a y_2 = x_1 a^{k+ℓ} a y_2 = x_1 a a^{k+ℓ} y_2 = x_1 a x_2, as opposed to our assumption. Hence, w = c w′ for some letter c ≠ a and some word w′ ∈ Σ*.
Note that x_1 a^k c w′ a^ℓ y_2 = x_1 x_2 ∈ Perm(B) and x_1 a^{ℓ+k} c w′ y_2 = y_1 y_2 ∈ Perm(B).
Recall that we considered an arbitrary word v ∈ C_1 ∩ C_2 and derived v ∈ {x_1 x_2}. Hence, indeed, C_1 ∩ C_2 ⊆ {x_1 x_2}.
Note that, in the lemma above, the two words x_1 x_2 and y_1 y_2 have the same Parikh image. However, replacing the requirement x_1 x_2, y_1 y_2 ∈ Perm(B) by this weaker property does not suffice for the claim of the lemma: consider x_1 = aac, y_2 = caa, x_2 = y_1 = ε, and a = b. Then, x_1 x_2 = aac and y_1 y_2 = caa satisfy the modified prerequisites, but x_1 a* x_2 = aaca* and y_1 b* y_2 = b*caa = a*caa are two convex chains that intersect in aacaa.
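The counterexample above is easy to confirm mechanically. The following Python sketch enumerates finite initial segments of the two chains aaca* and a*caa (the bound `max_power` is a hypothetical cut-off) and computes their intersection.

```python
def chain_segment(prefix: str, letter: str, suffix: str, max_power: int) -> set:
    """Return the words prefix letter^i suffix for 0 <= i <= max_power."""
    return {prefix + letter * i + suffix for i in range(max_power + 1)}


# x_1 = aac, x_2 = ε and y_1 = ε, y_2 = caa, with a = b.
C1 = chain_segment("aac", "a", "", 6)   # initial segment of aaca*
C2 = chain_segment("", "a", "caa", 6)   # initial segment of a*caa

COMMON = C1 & C2
SAME_PARIKH_IMAGE = sorted("aac") == sorted("caa")
```

The two segments share exactly the word "aacaa", which differs from x_1 x_2 = "aac", although "aac" and "caa" have the same Parikh image.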

Lemma 3.13
Let u ∈ Σ* and n ∈ ℕ such that [u]_n is infinite but not a single chain. Then, [u]_n contains two convex chains C_1 and C_2 with C_1 ∩ C_2 ⊆ min [u]_n and C_i ∩ min [u]_n ≠ ∅ for i ∈ {1, 2}.
By Theorem 3.2, there exists a tuple B of nonempty subsets of Σ such that x_1 x_2, y_1 y_2 ∈ min [u]_n ⊆ Perm(B). By Lemma 3.12, x_1 a* x_2 and y_1 b* y_2 are convex chains whose intersection is contained in {x_1 x_2}.
Proposition 3.14 Let n, t ∈ ℕ and x ∈ Σ* such that L = [x]_n is infinite but not a chain. Then, L∥^{≥t} ∈ PT(g_{|Σ|}(n) + t).
By Lemma 3.13, there exist two convex chains C_1, C_2 ⊆ L such that C_1 ∩ C_2 ⊆ min(L) and C_i ∩ min(L) ≠ ∅ for i ∈ {1, 2}. We prove that all words in L∥^{<t} have length less than g_{|Σ|}(n) + t. Let v ∈ Σ* with |v| ≥ g_{|Σ|}(n) + t > g_{|Σ|}(n). Then, by Theorem 3.2 and the definition of the function g_{|Σ|}, v ∉ min(L), implying v ∉ C_1 ∩ C_2; without loss of generality, we assume v ∉ C_1. Since C_1 ∩ min(L) ≠ ∅, the chain C_1 contains some word of length at most g_{|Σ|}(n). Consequently, its word x_{t−1} number t − 1 satisfies |x_{t−1}| < g_{|Σ|}(n) + t ≤ |v|, i.e., v cannot be a subword of x_{t−1}. By Lemma 3.8, v is therefore incomparable with at least t words from C_1 ⊆ L. Consequently, v ∉ L∥^{<t}, which proves the above claim. Since all words in L∥^{<t} are "short", we obtain L∥^{<t} ∈ PT(g_{|Σ|}(n) + t) and the same holds for the complement L∥^{≥t} of this set.
We can now put the above three propositions together to verify the last claim of Theorem 2.1.

Proposition 3.15
Let L ∈ PT(n) be a language over Σ and t ∈ ℕ. Then, L∥^{≥t} ∈ PT(g_{|Σ|}(n) + t).
Proof Since L is of height n, there is a finite set of words {x_1, …, x_m} with x_i ≁_n x_j for all 1 ≤ i < j ≤ m such that L is the union of the equivalence classes [x_i]_n. Since equivalence classes are disjoint, we obtain L∥^{≥t} = ⋃_g ⋂_{1≤i≤m} [x_i]_n∥^{≥g(i)}, where the union is taken over all functions g : {1, 2, …, m} → {0, 1, …, t} with ∑_{1≤i≤m} g(i) = t. The previous propositions show that any of the languages [x_i]_n∥^{≥s} with s ≤ t is piecewise testable of height at most g_{|Σ|}(n) + t. Since the class PT(g_{|Σ|}(n) + t) is closed under Boolean operations, the claim follows.
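The decomposition used in this proof is a general counting argument over a disjoint union. The following Python sketch checks it on a small example; as an assumption made for illustration, the disjoint parts are finite sets of words rather than equivalence classes of ∼_n, and all names are hypothetical.

```python
from itertools import product


def is_subword(u: str, v: str) -> bool:
    """Return True if u embeds into v as a (scattered) subword."""
    remaining = iter(v)
    return all(ch in remaining for ch in u)


def incomparable_count(u: str, language: set) -> int:
    """Number of words in the finite language that are incomparable with u."""
    return sum(1 for v in language
               if not is_subword(u, v) and not is_subword(v, u))


def in_threshold_direct(u: str, parts: list, t: int) -> bool:
    """u is incomparable with at least t words from the union of the parts."""
    return incomparable_count(u, set().union(*parts)) >= t


def in_threshold_via_parts(u: str, parts: list, t: int) -> bool:
    """Same test via the decomposition: some split g of t over the parts
    such that u is incomparable with at least g(i) words of part i."""
    for g in product(range(t + 1), repeat=len(parts)):
        if sum(g) == t and all(incomparable_count(u, part) >= s
                               for part, s in zip(parts, g)):
            return True
    return False
```

For the disjoint parts {"ab", "ba"} and {"aa"}, the word "bb" is incomparable with all three words, and both tests agree for every threshold t from 0 to 4.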

Expressive power and quantifier elimination
Having completed the language-theoretic part of this paper, we now come to its consequences in logic, i.e., we consider the threshold counting logic C²_ptNFA that has two variables x and y, unary predicates for each piecewise testable language (represented by some pt-nfa), the subword order, a constant for every word, and threshold quantifiers of the form ∃^{≥t} for t ∈ ℕ. The central result, Theorem 4.3, states that every language definable in this logic is piecewise testable of height bounded in terms of the norm of the defining formula. But first a simple result on the expressive power of quantifier-free formulas.
(1) Any language L ∈ PT(n) is defined by some quantifier- and automata-free formula ϕ(x) ∈ FO²_∅ with ||ϕ|| ≤ n.
(2) If ϕ(x) ∈ FO²_ptNFA is a quantifier-free formula with ||ϕ|| ≤ n, then it defines a language from PT(n + 1).
Proof (1) Since L ∈ PT(n), it is a finite union of equivalence classes [v]_n for v ∈ Σ*. Such an equivalence class [v]_n can be defined by the formula ⋀_{u ∈ Σ^{≤n}, u ⊑ v} u ⊑ x ∧ ⋀_{u ∈ Σ^{≤n}, u ⋢ v} ¬(u ⊑ x). Since ϕ uses only constants of length at most n, we have ||ϕ|| ≤ n.
(2) Now let ϕ(x) ∈ FO²_ptNFA be a quantifier-free formula with ||ϕ(x)|| ≤ n. First, suppose x ∈ L(A) is a subformula of ϕ(x). Then, the depth of the pt-nfa A is at most n. Hence, by [17, Thm. 8], L(A) ∈ PT(n). By the first statement, any subformula x ∈ L(A) can be replaced by a quantifier- and automata-free formula λ(x) ∈ FO²_∅ with ||λ(x)|| ≤ n. Consequently, we can assume that ϕ(x) is automata-free, i.e., belongs to FO²_∅. Now replace subformulas of the form x ⊑ v (with v a word) by the disjunction ⋁_{u ⊑ v} u = x such that the formula ϕ(x) becomes a Boolean combination of formulas u ⊑ x and u = x with constants u of length at most n. Note that {u}↑ is of height |u| and {u} is of height |u| + 1. Hence, ϕ(x) defines a Boolean combination of languages from PT(n + 1), i.e., a language from PT(n + 1).
Proof We prove the claim by induction on the construction of the formula ϕ.
There exists a finite set $A$ of formulas of the following form such that $\varphi'(x, y)$ is a Boolean combination of formulas from $A$:
- formulas where at most $x$ or $y$, but not both, are free
- atomic formulas $x \sqsubseteq y$, $x = y$, and $y \sqsubseteq x$
Note that all formulas $\alpha$ from $A$ satisfy $\|\alpha\| \le \|\varphi'\|$ since they are subformulas of $\varphi'(x, y)$.
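Then, there is a set $\mathcal{B}$ of subsets of $A$ such that $\varphi'(x, y)$ is equivalent to
$$\bigvee_{B \in \mathcal{B}} \delta_B(x, y) \quad\text{with}\quad \delta_B(x, y) = \bigwedge_{\alpha \in B} \alpha \;\wedge \bigwedge_{\alpha \in A \setminus B} \neg\alpha .$$
Since any pair of words can satisfy at most one formula $\delta_B(x, y)$, the formula $\varphi(x) = \exists^{\ge t} y\colon \varphi'(x, y)$ is equivalent to
$$\bigvee_{(t_B)_{B \in \mathcal{B}}} \ \bigwedge_{B \in \mathcal{B}} \exists^{\ge t_B} y\colon \delta_B(x, y)$$
where the disjunction extends over all tuples $(t_B)_{B \in \mathcal{B}}$ of natural numbers from $\{0, 1, \dots, t\}$ that sum up to $t$. So far, we expressed the formula $\varphi(x)$ as a Boolean combination of formulas $\exists^{\ge s} y\colon \delta(x, y)$ with $s \le t$ and $\delta(x, y)$ a conjunction of possibly negated formulas from $A$. Note that any such formula is equivalent to the disjunction over all formulas
$$\exists^{\ge s_1} y\colon (x \sqsubset y \wedge \delta(x, y)) \;\wedge\; \exists^{\ge s_2} y\colon (x \sqsupset y \wedge \delta(x, y)) \;\wedge\; \exists^{\ge s_3} y\colon (x = y \wedge \delta(x, y)) \;\wedge\; \exists^{\ge s_4} y\colon (x \parallel y \wedge \delta(x, y))$$
where the disjunction extends over all tuples $(s_1, s_2, s_3, s_4)$ of natural numbers from $\{0, 1, \dots, s\}$ that sum up to $s$.
So far, we expressed the formula $\varphi(x)$ as a Boolean combination of formulas $\exists^{\ge s} y\colon x\,\theta\,y \wedge \delta(x, y)$ with $s \le t$, $\delta(x, y)$ a conjunction of possibly negated formulas from $A$, and $\theta \in \{\sqsubset, \sqsupset, =, \parallel\}$.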
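The splitting of one threshold into a tuple of smaller thresholds over mutually exclusive cases can be sanity-checked on a finite universe. The sketch below is an illustration only: it replaces $\Sigma^*$ by the words of length at most 3 over $\{a, b\}$, it reads the four values of $\theta$ as "proper subword", "proper superword", "equal", and "incomparable", and all names (`relation`, `holds_direct`, `holds_split`, `UNIVERSE`) are hypothetical.

```python
from itertools import product

def is_subword(u, x):
    """True iff u embeds into x as a scattered subword."""
    it = iter(x)
    return all(ch in it for ch in u)

THETAS = ("proper_sub", "proper_super", "equal", "incomparable")

def relation(x, y):
    """Classify the pair (x, y); exactly one of the four cases applies."""
    if x == y:
        return "equal"
    if is_subword(x, y):
        return "proper_sub"      # x is a proper subword of y
    if is_subword(y, x):
        return "proper_super"    # y is a proper subword of x
    return "incomparable"

def holds_direct(x, universe, delta, s):
    """At least s words y in the universe satisfy delta(x, y)."""
    return sum(1 for y in universe if delta(x, y)) >= s

def holds_split(x, universe, delta, s):
    """Some tuple (s1, s2, s3, s4) from {0..s} with sum s such that, for each
    theta, at least s_theta words y satisfy both x theta y and delta(x, y)."""
    counts = dict.fromkeys(THETAS, 0)
    for y in universe:
        if delta(x, y):
            counts[relation(x, y)] += 1
    return any(
        sum(parts) == s and all(counts[th] >= k for th, k in zip(THETAS, parts))
        for parts in product(range(s + 1), repeat=4)
    )

# Finite stand-in for Sigma^*: all words over {a, b} of length at most 3.
UNIVERSE = ["".join(p) for k in range(4) for p in product("ab", repeat=k)]
```

On this universe, `holds_direct` and `holds_split` agree for every choice of `x`, every threshold `s`, and every predicate `delta`, because the four cases partition the candidate words `y`.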
We now consider one such formula. Since $\delta(x, y)$ is a conjunction of possibly negated formulas from $A$, we can write it as $\alpha(x) \wedge \beta(x, y) \wedge \gamma(y)$ with $\|\alpha\|, \|\gamma\| \le \|\varphi'\|$ and $\beta(x, y)$ a conjunction of formulas of the form $x \sqsubseteq y$, $x \sqsupseteq y$, and their negations. If $x\,\theta\,y$ is inconsistent with $\beta(x, y)$, then the formula $\exists^{\ge s} y\colon x\,\theta\,y \wedge \delta(x, y)$ is equivalent to $\bot$; otherwise, it is equivalent to $\alpha(x) \wedge \exists^{\ge s} y\colon (x\,\theta\,y \wedge \gamma(y))$.
Since the class $\mathrm{PT}(n)$ is closed under Boolean operations, it suffices to show that any such formula defines a piecewise testable language of height $< 2^{c^{2\|\varphi\|}}$. By the induction hypothesis, this is clear for formulas from (1) since $\|\varphi'\| \le \|\varphi\|$.
We consider the language that, by the induction hypothesis, is piecewise testable of height $< 2^{c^{2\|\varphi'\|}}$. Now we have to consider the four possible values of $\theta$ separately.
Thus, we reached our second and final goal.
In summary, we proved that the set of words satisfying $\varphi(x)$ is a Boolean combination of piecewise testable languages of height $< 2^{c^{2\|\varphi\|}}$ and therefore belongs to this class as well.
This finishes the inductive proof of the theorem.
Since piecewise testable languages of bounded height can be defined by quantifier-free formulas from $\mathrm{FO}^2_{\emptyset}$, we obtain the following quantifier-elimination result (which, unlike the theorem above, also applies to formulas with two free variables).

Complexity of the C 2 ptNFA -theory
We now adapt the technique by Ferrante and Rackoff from first-order logic to its extension by threshold counting quantifiers to derive our upper complexity bound from Corollary 4.4. Central to this proof is the following lemma expressing that quantification in formulas can be restricted to words of bounded length. This property is the core of the method by Ferrante and Rackoff [3].
Proof We have to show that, whenever ϕ(u) holds, then there are t short words v such that ψ(u, v) holds (the other implication is trivial).
So assume there are at least $t$ words in the language $L := \{ v \in \Sigma^* \mid \mathcal{S} \models \psi(u, v) \}$. By Corollary 4.4, there exists a quantifier- and automata-free formula $\psi'(x, y) \in \mathrm{FO}^2_{\emptyset}$ equivalent to $\psi(x, y)$ such that $\|\psi'\| < 2^{c^{2\|\psi\|}} < 2^{c^{2\|\varphi\|}} \le N$. Since $|u| < N$, also the norm of the quantifier- and automata-free formula $\psi'(u, y)$ is $< N$. Note that $L$ is defined by this formula. Hence, by Lemma 4.1(2), $L$ is piecewise testable of height $N$. Since $L$ contains at least $t$ words, the definition of the function $g_{|\Sigma|}$ together with the convexity of all equivalence classes implies that $L$ contains mutually distinct words $v_1, \dots, v_t$ of length $< g_{|\Sigma|}(N) + t \le (N + t + 2)^c$ (by Lemma 4.2). We have $|\mathrm{bin}(t)| \le \|\varphi\|$ which implies $t \le N$. Hence, $(N + t + 2)^c \le (2N + 2)^c$ which is smaller than $N^{2c}$ since $N \ge 16$. Thus, we have $|v_i| < N^{2c}$ for all $1 \le i \le t$. Consequently, we found $t$ "short" witnesses for $\psi(u, y)$.
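The final chain of inequalities in this proof is elementary arithmetic and can be spot-checked numerically. The following Python sketch is a sanity check rather than a proof; it assumes that $c$ is a positive integer and tests only a finite range of values, and the function name `witness_bound_holds` is hypothetical.

```python
def witness_bound_holds(N, t, c):
    """Check (N + t + 2)^c <= (2N + 2)^c < N^(2c) for concrete integers.

    Assumptions (not taken from the paper): c is a positive integer,
    N >= 16, and 0 <= t <= N.
    """
    return (N + t + 2) ** c <= (2 * N + 2) ** c < N ** (2 * c)
```

The first inequality only uses $t \le N$; the second follows from $2N + 2 < N^2$, which already holds for all $N \ge 3$.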

Proposition 5.2
There is an alternating algorithm that, on input of a formula $\varphi(x, y) \in \mathrm{C}^2_{\mathrm{ptNFA}}$ and words $u$ and $v$, decides whether $\mathcal{S} \models \varphi(u, v)$. This alternating algorithm runs in time doubly exponential in $|\varphi(u, v)|$ and uses $O(|\varphi|)$ alternations.
Proof Before we come to the actual proof, we explain the idea underlying our approach. First, from $\varphi$, $u$, $v$, and $N$, we could compute a propositional formula (whose atomic propositional formulas are atomic formulas from $\mathrm{C}^2_{\mathrm{ptNFA}}$) that is equivalent to $\varphi(u, v)$. This is possible since, by the previous lemma, we can restrict quantification in $\varphi$ to words of bounded length. To serve as the basis of an alternating algorithm, we need in addition that the propositional formula is in negation normal form (i.e., only atomic formulas are negated).
However, this approach has the following two problems. First, the length bound from Lemma 5.1 is doubly exponential. Hence, the propositional formula for $\exists^{\ge 1} y\colon \psi(u, y)$ is the disjunction over all formulas $\psi(u, v)$ with $v$ a word of doubly exponential length. But computing this formula requires triply exponential time. The solution to this first problem is that the propositional formula is not calculated explicitly. Instead, its evaluation is simulated by a procedure that takes, as arguments, a formula $\alpha(x, y)$, two words $w_x$ and $w_y$, and a natural number $N$ and returns the truth value of $\alpha(w_x, w_y)$ if all quantifications are bounded by values that depend on $N$.
Since we do not compute the propositional formula, we cannot compute its negation normal form afterwards. Nor can we transform the $\mathrm{C}^2_{\mathrm{ptNFA}}$-formula into negation normal form since this logic does not allow universal quantifiers. This problem is solved by considering not just one procedure as above, but two (one for formulas $\alpha$ occurring positively, one for negative occurrences). Formally (and now the actual proof starts), we use the following recursive procedures $\mathrm{check}_P$ and $\mathrm{check}_N$ whose parameters are
- a $\mathrm{C}^2_{\mathrm{ptNFA}}$-formula $\alpha(x, y)$,
- two words $w_x$ and $w_y$, and
- a natural number $N$.
(1) If $\alpha$ is an atomic formula, then decide whether $\alpha(w_x, w_y)$ holds. This can be done by a nondeterministic algorithm in time linear in $|w_x| + |w_y| + |\alpha|$. If so, the procedure $\mathrm{check}_P$ returns true and false otherwise. The procedure $\mathrm{check}_N$ returns the negation of these values.
Now suppose $\alpha = \exists^{\ge t} y\colon \psi$. Then, $\mathrm{check}_P(\alpha, w_x, w_y, N)$ returns true iff, for some set $T$ of $t$ words of length $< N^{2c}$, the call of $\mathrm{check}_P(\psi, w_x, w'_y, N^{2c})$ returns true for all words $w'_y \in T$. Thus, the evaluation of $\mathrm{check}_P$ consists of two phases: an existential phase (in which a set $T$ is guessed, i.e., the computation branches into a sub-computation for each choice of $T$), followed by a universal phase (in which, for each sub-computation, i.e., for each choice of $T$, it is checked whether $T$ is a set of $t$ solutions). Dually, $\mathrm{check}_N(\alpha, w_x, w_y, N)$ returns true iff, for some set $T$ of $t - 1$ words of length $< N^{2c}$, the call of $\mathrm{check}_N(\psi, w_x, w'_y, N^{2c})$ returns true for all words $w'_y$ of length $< N^{2c}$ that do not belong to $T$. As before, the call of $\mathrm{check}_N(\alpha, w_x, w_y, N)$ also consists of an existential phase followed by a universal phase (considering, instead of the guessed set $T$, its complement with respect to the set of words of length $< N^{2c}$).
Now let $\varphi(x, y)$ be a formula from $\mathrm{C}^2_{\mathrm{ptNFA}}$, $u, v \in \Sigma^*$, and $N_0 \in \mathbb{N}$ with $|u|, |v| < N_0$ and $2^{c^{2\|\varphi\|}} \le N_0$. By induction on the size of $\varphi$ and using Lemma 5.1, one obtains that $\mathcal{S} \models \varphi(u, v)$ iff $\mathrm{check}_P(\varphi, u, v, N_0)$ returns true iff $\mathrm{check}_N(\varphi, u, v, N_0)$ returns false. Now let $\psi = \varphi(u, v)$. Then, $\mathcal{S} \models \varphi(u, v)$ iff $\mathrm{check}_P(\psi, \varepsilon, \varepsilon, 2^{c^{2\|\psi\|}})$ returns true.
We now analyse the runtime of an execution of a call of $\mathrm{check}_P(\psi, \varepsilon, \varepsilon, 2^{c^{2\|\psi\|}})$. First, since every quantifier replaces $N$ by $N^{2c}$, the value of the parameter $N$ is bounded by $\bigl(2^{c^{2\|\psi\|}}\bigr)^{(2c)^d} = 2^{c^{2\|\psi\|} \cdot (2c)^d}$ where $d \le \|\psi\|$ is the quantifier depth of $\psi$. Consequently, the recursive execution considers only words of this doubly exponential length. Further, when handling a quantifier $\exists^{\ge t}$, it considers a set of at most $t$ words of this doubly exponential length.
Since t is at most exponential in the size of ψ, the alternating algorithm runs in at most doubly exponential time.
Further note that the execution alternates between universal and existential states only linearly often.
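The interplay of the two procedures can be illustrated by a small deterministic simulation in which existential phases become `any` and universal phases become `all`. The Python sketch below is not the algorithm of the proof: it assumes a fixed finite universe of short words instead of the growing bound $N^{2c}$, it represents formulas as nested tuples, and its treatment of negation and conjunction (swap the procedure for "not", use the dual connective for "and") is an assumption of this sketch. All names are hypothetical.

```python
from itertools import combinations, product

def is_subword(u, x):
    """True iff u embeds into x as a scattered subword."""
    it = iter(x)
    return all(ch in it for ch in u)

def short_words(alphabet, bound):
    """All words over `alphabet` of length < bound (stand-in for the length bound)."""
    return ["".join(p) for k in range(bound) for p in product(alphabet, repeat=k)]

def check(alpha, wx, wy, universe, positive):
    """positive=True plays the role of check_P, positive=False that of check_N.

    Formulas (hypothetical encoding):
      ("atom", f)          with f(wx, wy) -> bool
      ("not", beta)
      ("and", beta, gamma)
      ("exists_y", t, psi) meaning: at least t words y satisfy psi
    """
    kind = alpha[0]
    if kind == "atom":
        value = alpha[1](wx, wy)
        return value if positive else not value
    if kind == "not":
        # assumption of this sketch: negation swaps the two procedures
        return check(alpha[1], wx, wy, universe, not positive)
    if kind == "and":
        results = (check(sub, wx, wy, universe, positive) for sub in alpha[1:])
        return all(results) if positive else any(results)
    if kind == "exists_y":
        _, t, psi = alpha
        if positive:
            # existential phase: choose T with |T| = t;
            # universal phase: every word in T is a witness
            return any(all(check(psi, wx, w, universe, True) for w in T)
                       for T in combinations(universe, t))
        if t == 0:
            return False  # "at least 0 witnesses" can never fail
        # choose T with |T| = t - 1 (capped by the universe size);
        # every word outside T must falsify psi
        size = min(t - 1, len(universe))
        return any(all(check(psi, wx, w, universe, False)
                       for w in universe if w not in T)
                   for T in combinations(universe, size))
    raise ValueError(f"unknown formula kind: {kind}")

def check_P(alpha, wx, wy, universe):
    return check(alpha, wx, wy, universe, True)

def check_N(alpha, wx, wy, universe):
    return check(alpha, wx, wy, universe, False)
```

As a usage example, take the universe `short_words("ab", 3)` and the formula `("exists_y", 4, ("atom", lambda wx, wy: is_subword(wy, wx)))`. For `wx = "ab"`, `check_P` succeeds because `ab` has exactly four subwords, and `check_N` succeeds once the threshold is raised to 5.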

Summary and open question
We considered the extension of first-order logic by threshold-counting quantifiers over the subword order with piecewise testable predicates and constants. We showed that the two-variable fragment of this theory is decidable using doubly exponential space; more precisely, it belongs to $\mathrm{STA}(*, 2^{2^{\mathrm{poly}(n)}}, O(n))$. This extends a result from [10] in two aspects: first, we add threshold counting quantifiers and piecewise testable predicates to first-order logic and, secondly, we improve their upper bound by one exponent (if only considering the space bound). Our proof relies on two independent aspects: the consideration of the height of definable languages (which is a direct continuation from [10]) and an adaptation of Ferrante and Rackoff's method [3].
The work done in this paper can be continued in the following directions:
- Addition of further binary relations: Let $C$ be some collection of binary relations on $\Sigma^*$ such that Boolean combinations of relations from $C \cup \{\sqsubseteq\}$ are effectively rational. This holds, e.g., if $C$ consists of the prefix relation, the relation "have equal length", the cover relation as well as powers thereof (e.g., the relation "$u \sqsubseteq v$ and $|v| - |u| = k$" for fixed $k \in \mathbb{N}$). Then, the proof of [9, Thm. 5.5] can be extended to show the following result: The $\mathrm{FO}^2_{\mathrm{NFA}}$-theory of the extension of the structure $\mathcal{S}$ with the binary relations from $C$ is decidable. If the Boolean combinations are even effectively unambiguous rational, then the $\mathrm{C}^2_{\mathrm{NFA}}$-theory becomes decidable using the arguments from [16] (where the result is demonstrated in case $C$ contains the cover relation, only). It is not clear for which sets $C$ the $\mathrm{C}^2_{\mathrm{ptNFA}}$-theory becomes decidable in elementary space (which is the case for $C = \emptyset$ as demonstrated in this paper). The same question applies already for the $\mathrm{FO}^2_{\emptyset}$-theory.
- Addition of regular predicates: By [16], the $\mathrm{C}^2_{\mathrm{NFA}}$-theory is decidable, but the only known algorithm is non-elementary. On the other hand, the $\mathrm{C}^2_{\mathrm{ptNFA}}$-theory is decidable using elementary space. It is not clear whether there are other classes of nfas $\mathcal{A} \subseteq \mathrm{NFA}$ such that the $\mathrm{C}^2_{\mathcal{A}}$- or $\mathrm{FO}^2_{\mathcal{A}}$-theory is decidable in elementary space.