1 Introduction

Formal definitions of the concepts introduced in this section are given in Section 2. For now we assume that the reader is familiar with basic properties of regular languages and finite automata as covered in [27, 32], for example.

There are two fundamental congruence relations in the theory of regular languages: the Nerode (right) congruence [25], and the Myhill congruence [24]. In both cases, a language is regular if and only if it is a union of congruence classes of a congruence of finite index. The Nerode congruence leads to the definitions of left quotients of a language and the minimal deterministic finite automaton (DFA) recognizing the language, and the Myhill congruence, to the definitions of the syntactic semigroup of the language.

The state complexity of a language is the number of states in a minimal DFA recognizing the language. This concept has been studied extensively; for surveys and references see [2, 33]. The syntactic complexity of a regular language is the cardinality of its syntactic semigroup, which is isomorphic to the transition semigroup of a minimal DFA recognizing the language [29], where the transition semigroup is the semigroup of transformations of the set of states of the DFA induced by non-empty words.

Syntactic complexity does not refine state complexity, for there exist languages with the same syntactic complexity but different state complexities. However, it often helps to distinguish among languages with the same state complexity. For example, the DFAs in Fig. 1 all have the same alphabet, are all minimal, and all have state complexity 3. However, the syntactic complexity of \(\mathcal {D}_{1}\) is 3, that of \(\mathcal {D}_{2}\) is 9, and that of \(\mathcal {D}_{3}\) is 27.

Fig. 1 DFAs with various syntactic complexities

The problem we study in this paper is the following: Given a language belonging to a subclass of the class of regular languages – for example, the subclass of finite languages or prefix-free languages (prefix-codes) – what is the maximal size of the syntactic semigroup of that language? Equivalently, given a minimal DFA of a language in the subclass, what is the maximal size of the transition semigroup of the DFA? A secondary problem is to find the minimal size of a set of generators for the maximal semigroup.

Syntactic complexity has been studied in several subclasses of regular languages other than ideals: prefix-, suffix-, bifix-, and factor-free languages [8, 12]; star-free languages [7, 10]; R- and J-trivial languages [6]; finite/cofinite and reverse definite languages [7]. Determining the maximal syntactic complexity can be quite challenging, depending on the subclass; in the present case it is easy for right ideals but much more difficult for left and two-sided ideals (defined below).

As syntactic complexity bounds the maximal size of the transition semigroup, it provides a natural bound on the time and space complexities of algorithms dealing with transition semigroups. For example, a simple algorithm determining whether the language of a given minimal DFA is star-free [23] requires enumerating all transformations in the transition semigroup and checking that none of them contains a non-trivial cycle. A language is star-free if it can be generated from finite languages by using only Boolean operations and product (concatenation), but not star; equivalently, its syntactic semigroup is group-free, that is, has no non-trivial subgroups.
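The simple star-freeness test described above can be sketched as follows. This is a minimal illustration, not the algorithm of [23]: it assumes the input is the list of letter transformations of a minimal DFA, each given as a tuple `t` with `t[q]` the image of state `q`, and the function names are hypothetical.

```python
def transition_semigroup(generators):
    """All transformations induced by non-empty words over the given letters."""
    semigroup = set(generators)
    frontier = list(semigroup)
    while frontier:
        s = frontier.pop()
        for g in generators:
            sg = tuple(g[q] for q in s)  # apply s first, then g
            if sg not in semigroup:
                semigroup.add(sg)
                frontier.append(sg)
    return semigroup


def has_nontrivial_cycle(t):
    """True if some state returns to itself under t only after 2 or more steps."""
    n = len(t)
    for q in range(n):
        r, steps = t[q], 1
        while r != q and steps < n:
            r, steps = t[r], steps + 1
        if r == q and steps >= 2:
            return True
    return False


def is_star_free(generators):
    """Assumes the generators come from a minimal DFA (not checked here)."""
    return not any(has_nontrivial_cycle(t)
                   for t in transition_semigroup(generators))
```

The running time is governed by the size of the enumerated semigroup, which is exactly the quantity that syntactic complexity bounds.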

Maximal transition semigroups also play an important role in the study of most complex languages [3] belonging to a given subclass. These are languages that meet all the upper bounds on the state complexities of Boolean operations, product, star, and reversal, have maximal syntactic semigroups and most complex atoms [13].

In contrast to the syntactic monoid of the language, the syntactic semigroup may or may not contain the neutral element (the identity transformation). The presence of letters acting as identity is often important in the case of state complexity of binary operations. Moreover, the syntactic semigroup is more suitable to characterize some classes of languages, which have a description in terms of semigroups. For example, in the class of (co)finite languages all transformations must admit a certain linear order of the states [15], and the identity transformation cannot be present; the latter condition would not be distinguished by the syntactic monoid.

In this paper we study the syntactic complexities of right ideals (satisfying the equation \(L = L\Sigma^{*}\)), left ideals (satisfying \(L = \Sigma^{*}L\)), and two-sided ideals (satisfying \(L = \Sigma^{*}L\Sigma^{*}\)). Ideals are fundamental objects in semigroup theory. They appeared in the theoretical computer science literature in 1965 [26] and continue to be of interest. Ideal languages are special cases of convex languages (see e.g. [9]), and they are complements of prefix-, suffix-, factor-, and subword-closed languages. Besides being of theoretical interest, ideals also play a role in algorithms for pattern matching. For this application, a text is represented by a word w over some alphabet Σ. A pattern is a language L over Σ. An occurrence of a pattern represented by L in text w is a triple (u, x, v) such that w = uxv and x is in L. Searching text w for words in L is equivalent to looking for prefixes of w that belong to the language \(\Sigma^{*}L\), which is the left ideal generated by L, or looking for factors of w that belong to \(\Sigma^{*}L\Sigma^{*}\) [16].
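The correspondence between occurrences and prefixes in the left ideal generated by L can be checked by brute force on small inputs. The sketch below is only an illustration: the pattern is passed as a membership predicate `in_L`, and both function names are hypothetical.

```python
def occurrences(text, in_L):
    """All triples (u, x, v) with text == u + x + v and x in L."""
    n = len(text)
    return [(text[:i], text[i:j], text[j:])
            for i in range(n + 1)
            for j in range(i, n + 1)
            if in_L(text[i:j])]


def prefix_hits(text, in_L):
    """Lengths j such that the prefix text[:j] belongs to Sigma* L."""
    hits = []
    for j in range(len(text) + 1):
        prefix = text[:j]
        # prefix is in Sigma* L iff one of its suffixes is in L
        if any(in_L(prefix[i:]) for i in range(j + 1)):
            hits.append(j)
    return hits
```

For every occurrence (u, x, v), the prefix ux of the text lies in the left ideal, and conversely every hit of `prefix_hits` is the end position of some occurrence.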

The state complexity of operations on the classes of ideal languages was studied by Brzozowski, Jirásková and Li [4]. The same problem for the classes of prefix-, suffix-, factor-, and subword-closed languages was studied by Han and K. Salomaa [17], Han, K. Salomaa, and Wood [18], and Brzozowski, Jirásková and Zou [5]. We refer the reader to these papers for a discussion of past work on this topic and additional references.

The set of all \(n^{n}\) transformations of a set \(Q_n\) of n elements is a monoid under composition of transformations, with identity as the unit element. In 1970, Maslov [22] dealt with the generators of the semigroup of all transformations in the setting of finite automata. Holzer and König [19], and independently Krawetz, Lawrence, and Shallit [20] studied the syntactic complexity of unary and binary regular languages. Recently, syntactic complexity has been studied in several subclasses of regular languages other than ideals: prefix-, suffix-, bifix-, and factor-free languages [8, 12]; star-free languages [7, 10]; R- and J-trivial languages [6].

We define our terminology and notation in Section 2, and give some basic properties of syntactic complexity in Section 3. The syntactic complexities of right, left, and two-sided ideals are treated in Sections 4–6, and Section 7 concludes the paper. As mentioned above, closed languages are complements of ideal languages. Since syntactic complexity is preserved under complementation, our proofs are for ideals only. The syntactic complexity of all-sided ideals remains open.

In the proofs of the upper bounds for left and two-sided ideals we use the injective function method, which is generally applicable to other subclasses of regular languages (see [12] for suffix-free and [31] for bifix-free languages). The proofs presented here are the first that apply this method to syntactic complexity.

Some of the results in this paper previously appeared in conference proceedings. In 2011, the syntactic complexity of right ideals was established and lower bounds for the classes of left and two-sided ideals were presented [14]. In 2014, incomplete proofs of the upper bounds for the syntactic complexity of left and two-sided ideals were presented [11].

2 Preliminaries

If Σ is an alphabet (a non-empty finite set), then \(\Sigma^{*}\) is the free monoid generated by Σ, and \(\Sigma^{+}\) is the free semigroup generated by Σ. A word is any element of \(\Sigma^{*}\), and the empty word is ε. The length of a word \(w \in \Sigma^{*}\) is |w|. A language over Σ is any subset of \(\Sigma^{*}\).

If \(w = uxv\) for some \(u, v, x \in \Sigma^{*}\), then u is a prefix of w, v is a suffix of w, and x is a factor of w. A prefix or suffix of w is also a factor of w. If \(w = u_1 v_1 u_2 v_2 \cdots u_k v_k u_{k+1}\), where the \(u_i\) and \(v_i\) are in \(\Sigma^{*}\), then \(v_1 v_2 \cdots v_k\) is a subword of w. A language L is prefix-closed if \(w \in L\) implies that every prefix of w is also in L. In an analogous way, we define suffix-closed, factor-closed, and subword-closed. We refer to all four types as closed languages.

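The shuffle \(u \mathbin{\sqcup\!\sqcup} v\) of two words \(u, v \in \Sigma^{*}\) is defined as follows:

$$u \mathbin{\sqcup\!\sqcup} v = \{u_{1} v_{1} \cdots u_{k} v_{k} \mid u = u_{1} \cdots u_{k},\ v = v_{1} \cdots v_{k},\ u_{1}, \dots, u_{k}, v_{1}, \dots, v_{k} \in {\Sigma}^{*}\}.$$

The shuffle of two languages K and L is defined by

$$K \mathbin{\sqcup\!\sqcup} L = \bigcup_{u \in K,\, v \in L} u \mathbin{\sqcup\!\sqcup} v.$$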
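For finite inputs, the shuffle can be computed directly. The sketch below uses the standard recursive formulation of interleaving (at each step, take the next letter from either word), which is assumed here to agree with the definition above; the function names are illustrative only.

```python
def shuffle_words(u, v):
    """Set of all interleavings of the words u and v."""
    if not u:
        return {v}
    if not v:
        return {u}
    return ({u[0] + w for w in shuffle_words(u[1:], v)} |
            {v[0] + w for w in shuffle_words(u, v[1:])})


def shuffle_languages(K, L):
    """Shuffle of two finite languages, as the union of all word shuffles."""
    return set().union(*(shuffle_words(u, v) for u in K for v in L))
```

Each interleaving keeps the letters of u in their original order and the letters of v in theirs.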

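A language \(L \subseteq \Sigma^{*}\) is a right ideal (respectively, left ideal, two-sided ideal, all-sided ideal) if it is non-empty and satisfies \(L = L\Sigma^{*}\) (respectively, \(L = \Sigma^{*}L\), \(L = \Sigma^{*}L\Sigma^{*}\), \(L = \Sigma^{*} \mathbin{\sqcup\!\sqcup} L\), the shuffle of \(\Sigma^{*}\) and L). We refer to all four of these types as ideal languages or simply ideals.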

Proposition 1

Suppose L is a language over Σ and \(L \neq \Sigma^{*}\). Let \(\overline {L}={\Sigma }^{*}\setminus L\) be the complement of L. Then the following hold:

  • L is prefix-closed if and only if \(\overline {L}\) is a right ideal.

  • L is suffix-closed if and only if \(\overline {L}\) is a left ideal.

  • L is factor-closed if and only if \(\overline {L}\) is a two-sided ideal.

Proof

The claim for factor-closed languages was proved in [21]. The proof for prefix-closed languages [1] parallels the proof in [21], and that for suffix-closed languages follows by the dual argument. □

A transformation of a set \(Q_n\) of n elements is a mapping of \(Q_n\) into itself, whereas a permutation of \(Q_n\) is a mapping of \(Q_n\) onto itself. In this paper we consider only transformations of finite sets, and we assume without loss of generality that \(Q_n = \{0, 1, \dots, n-1\}\). An arbitrary transformation has the form

$$t=\left( \begin{array}{ccccc} 0 & 1 & {\cdots} & n-2 & n-1 \\ q_{0} & q_{1} & {\cdots} & q_{n-2} & q_{n-1} \end{array} \right), $$

where \(q_k \in Q_n\) for \(0 \le k \le n-1\). The image of element q under transformation t is denoted by qt. The identity transformation 1 maps each element to itself. For k ≥ 2, a transformation (permutation) s of a set \(P = \{p_0, p_1, \dots, p_{k-1}\} \subseteq Q_n\) is a k-cycle if \(p_0 s = p_1,\ p_1 s = p_2,\ \dots,\ p_{k-2} s = p_{k-1},\ p_{k-1} s = p_0\). If a transformation t on \(Q_n\) acts on \(P \subseteq Q_n\) like a k-cycle, then t is said to have a k-cycle. A k-cycle is denoted by \((p_0, p_1, \dots, p_{k-1})\) when it is viewed as a transformation of P. If t is a transformation of \(Q_n\), has a k-cycle \((p_0, p_1, \dots, p_{k-1})\) of P, and acts as identity on \(Q_n \setminus P\), then we denote t also by \((p_0, p_1, \dots, p_{k-1})\). A 2-cycle \((p_0, p_1)\) is called a transposition. A transformation is constant if it maps all states to a single state q; it is denoted by \((Q_n \to q)\). A transformation that maps a single state p to q and keeps \(Q_n \setminus \{p\}\) unchanged is denoted by \((p \to q)\). A transformation mapping p to \(q_p\) for \(p = 0, \dots, n-1\) is sometimes denoted by \([q_0, \dots, q_{n-1}]\).
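The notation above maps naturally onto tuples, with position q holding the image of state q (the bracket notation). The helpers below are a hypothetical encoding chosen for illustration; none of the names come from the paper.

```python
def cycle(n, *states):
    """The k-cycle (p0, ..., p_{k-1}) on Q_n, acting as identity elsewhere."""
    t = list(range(n))
    for i, p in enumerate(states):
        t[p] = states[(i + 1) % len(states)]
    return tuple(t)


def point_map(n, p, q):
    """The transformation (p -> q): maps p to q and fixes all other states."""
    t = list(range(n))
    t[p] = q
    return tuple(t)


def constant(n, q):
    """The constant transformation (Q_n -> q)."""
    return (q,) * n


def compose(s, t):
    """Product st, acting on the right: q(st) = (qs)t."""
    return tuple(t[s[q]] for q in range(len(s)))
```

With this encoding, composition applies the left factor first, matching the convention that the image of q under t is written qt.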

The following facts are well-known [28, 30]:

Proposition 2

The complete transformation monoid \(\mathcal {T}_{n}\) of size \(n^{n}\) can be generated by any cyclic permutation of n elements together with a transposition of any two elements adjacent in the cyclic permutation, and a singular (non-invertible) transformation of rank (image size) n − 1. In particular, \(\mathcal {T}_{n}\) can be generated by (0, 1, …, n − 1), (0, 1) and (n − 1 → 0). Moreover, \(\mathcal {T}_{n}\) cannot be generated by fewer than three generators for n ≥ 3.
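For small n, Proposition 2 can be confirmed by closing the three named generators under composition. The following sketch is a brute-force check under the tuple encoding (position q holds the image of q); the function names are hypothetical.

```python
def generated_semigroup(generators):
    """Closure of the generators under composition (non-empty products)."""
    semigroup = set(generators)
    frontier = list(semigroup)
    while frontier:
        s = frontier.pop()
        for g in generators:
            sg = tuple(g[q] for q in s)  # apply s first, then g
            if sg not in semigroup:
                semigroup.add(sg)
                frontier.append(sg)
    return semigroup


def full_monoid_generators(n):
    """The generators (0, 1, ..., n-1), (0, 1) and (n-1 -> 0) for n >= 2."""
    cyc = tuple((q + 1) % n for q in range(n))
    swap = (1, 0) + tuple(range(2, n))
    collapse = tuple(range(n - 1)) + (0,)
    return [cyc, swap, collapse]
```

Dropping the singular generator leaves only the permutations, so the closure of the first two generators has n! elements rather than \(n^{n}\).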

Remark 1

Let \(T^{\prime }_{n}\) be a transformation semigroup that requires at least g generators. Suppose \(T_{n}\) contains \(T^{\prime }_{n}\) as a subsemigroup. If for every \(t \in T^{\prime }_{n}\), no transformation from \(T_{n} \setminus T^{\prime }_{n}\) can be used to generate t, then any set of generators of \(T_{n}\) contains at least g generators from \(T^{\prime }_{n}\).

Proof

Let G be a set of generators of \(T_{n}\). Let \(t \in T^{\prime }_{n}\). Since \(t \in T_{n}\), it is generated by G. Since generators from \(T_{n} \setminus T^{\prime }_{n}\) cannot be used, t is generated by generators from \(G \cap T^{\prime }_{n}\). Thus \(G \cap T^{\prime }_{n}\) generates \(T^{\prime }_{n}\), and so G contains at least g generators from \(T^{\prime }_{n}\). □

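An equivalence relation ∼ on \(\Sigma^{*}\) is a right congruence if, for all \(x, y \in \Sigma^{*}\), \(x \sim y \Rightarrow xv \sim yv\) for all \(v \in \Sigma^{*}\). It is a congruence if \(x \sim y \Rightarrow uxv \sim uyv\) for all \(u, v \in \Sigma^{*}\).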

For any language \(L \subseteq \Sigma^{*}\), define the Nerode right congruence [25] \(\sim_L\) of L by

$$ x \,{{\mathbin{\sim_L}}} \, y \text{ if and only if } xv\in L \Leftrightarrow yv\in L, \text { for all } v\in{\Sigma}^{*}. $$
(1)

The left quotient, or simply quotient, of a language L by a word w is the language \(w^{-1}L = \{x \in \Sigma^{*} \mid wx \in L\}\). Evidently, \(x^{-1}L = y^{-1}L\) if and only if \(x \sim_L y\). Thus, each equivalence class of this right congruence corresponds to a distinct quotient of L. Let \(K = \{K_0, \dots, K_{n-1}\}\) be the set of quotients of a regular language L; by convention, we let \(K_0 = L = \varepsilon^{-1}L\). The number of distinct quotients of L is the quotient complexity κ(L) of L.

The Myhill congruence [24] \(\approx_L\) of L is defined by

$$ x \, {\mathbin{\approx_L}}\, y \text{ if and only if } uxv\in L \Leftrightarrow uyv\in L\text { for all } u,v\in{\Sigma}^{*}. $$
(2)

This congruence is also known as the syntactic congruence of L. The semigroup \(\Sigma^{+}/{\approx_L}\) of equivalence classes of the relation \(\approx_L\) is the syntactic semigroup of L, and \(\Sigma^{*}/{\approx_L}\) is the syntactic monoid of L. The syntactic complexity σ(L) of L is the cardinality of its syntactic semigroup.

A deterministic finite automaton (DFA) is a quintuple \(\mathcal {D}=(Q, {\Sigma }, \delta , q_{0},F)\), where Q is a finite, non-empty set of states, Σ is an alphabet, δ: Q × Σ → Q is the transition function, \(q_0 \in Q\) is the initial state, and \(F \subseteq Q\) is the set of final states. As usual, δ is extended to a function from \(Q \times \Sigma^{*}\) to Q. By the language of a state q of \(\mathcal {D}\) we mean the language \(K_q\) accepted by the automaton (Q, Σ, δ, q, F). States p and q are equivalent if \(K_p = K_q\). A state q is reachable if \(\delta(q_0, w) = q\) for some \(w \in \Sigma^{*}\). A DFA is minimal if every state is reachable and no two states are equivalent. This implies that no DFA recognizing the same language has fewer states.

Each word w of \(\Sigma^{*}\) induces a transformation t as follows: \(qt = \delta(q, w)\) for all \(q \in Q\). The fact that w induces transformation t is denoted by w: t. The transition semigroup of a DFA is the set of transformations \(q \mapsto \delta(q, w)\), \(q \in Q\), induced by words \(w \in \Sigma^{+}\) on the set of states. The transition semigroup of the quotient DFA of L is isomorphic to the syntactic semigroup of L [29].

The quotient automaton of L is \(\mathcal {D}=(K, {\Sigma }, \delta , L,F)\), where \(\delta(K_q, a) = a^{-1}K_q\), and \(F = \{K_q \mid \varepsilon \in K_q\}\). Since the number of distinct quotients of L is precisely the number of states in the quotient automaton, the quotient automaton is always minimal, and so quotient complexity is the same as state complexity.

3 Syntactic Complexity of Languages with Special Quotients

We now present some basic properties of syntactic complexity.

Proposition 3

For any \(L \subseteq \Sigma^{*}\) with κ(L) = n > 1, we have \(n - 1 \le \sigma(L) \le n^{n}\).

Proof

Let \(\mathcal {D}=(K, {\Sigma }, \delta , L,F)\) be the quotient automaton of L. Since every state other than L has to be reachable from the initial state L by a non-empty word, there must be at least n − 1 transformations. If Σ = {a} and \(L = a^{n-1}a^{*}\), then κ(L) = n, and σ(L) = n − 1; so the lower bound n − 1 is achievable. The upper bound is \(n^{n}\), and by Proposition 2 this upper bound is achievable if |Σ| ≥ 3. The upper bound is reachable with |Σ| = 2 for n = 2 by the language (ba aa b), and with |Σ| = 1 for n = 1 by the language \(\Sigma^{*}\). □
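The lower-bound witness can be checked numerically. The sketch below assumes the quotient DFA of the unary language has states 0, …, n − 1 with the letter a sending q to min(q + 1, n − 1), and counts the distinct transformations induced by the non-empty words a, aa, and so on; the function name is hypothetical.

```python
def unary_witness_complexity(n):
    """Number of distinct transformations induced by a, a^2, a^3, ..."""
    a = tuple(min(q + 1, n - 1) for q in range(n))
    powers, t = set(), a
    while t not in powers:
        powers.add(t)
        t = tuple(a[q] for q in t)  # next power of a
    return len(powers)
```

The powers of a stop changing once every state has been pushed to n − 1, which happens after n − 1 steps.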

If one of the quotients of L is ∅ (respectively, {ε}, \(\Sigma^{*}\), \(\Sigma^{+}\)), then we say that L has ∅ (respectively, {ε}, \(\Sigma^{*}\), \(\Sigma^{+}\)). A quotient \(w^{-1}L\) of a language L is uniquely reachable [2] if \(x^{-1}L = w^{-1}L\) implies that x = w. If \((wa)^{-1}L\) is uniquely reachable for a ∈ Σ, then so is \(w^{-1}L\). Thus, if L has a uniquely reachable quotient, then L itself is uniquely reachable by ε, i.e., a minimal automaton of L is non-returning [17].

Theorem 1 (Special Quotients)

Let \(L \subseteq \Sigma^{*}\) and let κ(L) = n ≥ 1.

  1. If L has ∅ or \(\Sigma^{*}\), then \(\sigma(L) \le n^{n-1}\).

  2. If L has {ε} or \(\Sigma^{+}\), then \(\sigma(L) \le n^{n-2}\).

  3. If L is uniquely reachable, then \(\sigma(L) \le (n-1)^{n}\).

  4. If \(w^{-1}L\) is uniquely reachable by \(w \in \Sigma^{*}\) with \(0 \le |w| \le n-1\), then \(\sigma(L) \le |w| + (n-1-|w|)^{n}\).

Moreover, all the bounds shown in Table 1 hold.

Table 1 Upper bounds on syntactic complexity for languages with special quotients

Proof

Suppose that \(L \subseteq \Sigma^{*}\), n ≥ 1, and κ(L) = n.

  1. Since \(a^{-1}\emptyset = \emptyset\) for all a ∈ Σ, there are only n − 1 states in the quotient automaton with which one can distinguish two transformations. Hence there are at most \(n^{n-1}\) transformations. If L has \(\Sigma^{*}\), then \(a^{-1}\Sigma^{*} = \Sigma^{*}\) for all a ∈ Σ, and the same argument applies.

  2. Since \(a^{-1}\{\varepsilon\} = \emptyset\) for all a ∈ Σ, a language L has ∅ if L has {ε}. Now there are two states that do not contribute to distinguishing among different transformations. Dually, \(a^{-1}\Sigma^{+} = \Sigma^{*}\) for all a ∈ Σ, and the same argument applies.

  3. If L is uniquely reachable then \(w^{-1}L = L\) implies w = ε. Thus L does not appear in the image of any transformation by a word in \(\Sigma^{+}\), and there remain only n − 1 choices for each of the n states.

  4. If \(w^{-1}L\) is uniquely reachable, then so is \(x^{-1}L\) for every prefix x of w. Hence for each prefix x of w, \(x^{-1}L\) appears only in one transformation, and there are |w| such transformations. All the other transformations map every quotient \(x^{-1}L\) to \(y^{-1}L\), where y is not a prefix of w. Therefore there can be at most \((n-1-|w|)^{n}\) other transformations.

The remaining entries in Table 1 are easily verified: every transformation fixes ∅ and \(\Sigma^{*}\), maps {ε} to ∅, and maps \(\Sigma^{+}\) to \(\Sigma^{*}\), so these quotients are removed from counting possible mappings for a quotient. □
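To give a feeling for how much these special quotients reduce the general bound, the four bounds of Theorem 1 can be evaluated for n = 4, where the unrestricted bound of Proposition 3 is \(4^{4} = 256\). This is plain arithmetic on the stated formulas, with |w| = 1 chosen arbitrarily for the fourth item:

$$n^{n-1} = 4^{3} = 64, \qquad n^{n-2} = 4^{2} = 16, \qquad (n-1)^{n} = 3^{4} = 81, \qquad |w| + (n-1-|w|)^{n} = 1 + 2^{4} = 17.$$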

4 Right Ideals and Prefix-Closed Languages

In this section we prove that the syntactic complexity of right ideals is \(n^{n-1}\). First we define a witness DFA that meets this bound.

Definition 1 (Witness: Right Ideals)

For n ≥ 3, let \(\mathcal {W}_{n}=(Q_{n}, {\Sigma },\delta _{\mathcal {W}},0, \{n-1\}),\) be the DFA in which Σ = {a, b, c, d}, a: (0, …, n − 2), b: (0, 1), c: (n − 2 → 0), and d: (n − 2 → n − 1). For n = 3 inputs a and b induce the same transformation; hence Σ = {a, c, d} suffices. Furthermore, let \(\mathcal {W}_{2}=(Q_{2},\{a,b\},\delta _{\mathcal {W}},0,\{1\})\), where a: (0 → 1), and b: 1, and let \(\mathcal {W}_{1}=(Q_{1},\{a\},\delta _{\mathcal {W}},0,\{0\})\), where a: 1. Let \(L_{n}=L(\mathcal {W}_{n})\).

The structure of the DFA of Definition 1 is shown in Fig. 2 for n ≥ 3.

Fig. 2 Quotient DFA \(\mathcal {W}_{n}\) of a right ideal with \(n^{n-1}\) transformations

Let \(W_{\mathrm{ri}}\) be the transition semigroup of the witness \(\mathcal {W}_{n}\).

Lemma 1

The DFA \(\mathcal {W}_{n}\) of Definition 1 is minimal, accepts a right ideal, and its transition semigroup \(W_{\mathrm{ri}}\) has size \(n^{n-1}\).

Proof

If n ≤ 2 this is easily verified; here \(L_1 = \Sigma^{*}\) and \(L_2 = \Sigma^{*}a\Sigma^{*}\).

For n ≥ 3, any state q with \(0 \le q \le n-2\) is non-final, accepts \(a^{n-2-q}d\), and no other such state accepts this word. Since n − 1 is final, all states are distinguishable. Since \(\mathcal {W}_{n}\) has exactly one final state and that state accepts \(\Sigma^{*}\), \(L_n\) is a right ideal.

For the syntactic complexity, observe that inputs a, b, and c restricted to \(Q_{n-1}\) can induce any transformation of \(Q_{n-1}\) (Proposition 2); hence all \((n-1)^{n-1}\) transformations of \(Q_n\) that map \(Q_{n-1}\) into itself and fix n − 1 can be performed by \(\mathcal {W}_{n}\). Also observe that any transformation \((q \to n-1)\) for q ∈ {0, …, n − 3} is induced by \(a^{n-2-q}\, d\, a^{q+1}\).

Note that every transformation from the transition semigroup \(W_{\mathrm{ri}}\) fixes state n − 1. Let t be any transformation such that (n − 1)t = n − 1. There are \(n^{n-1}\) such transformations, and we will show that all of them are generated. Let \(\{p_1, \dots, p_k\}\) be the set of all states from \(Q_n \setminus \{n-1\}\) that are mapped by t to n − 1. Then t can be generated by \((p_1 \to n-1)\cdots(p_k \to n-1)\,t'\), where \(t'\) fixes n − 1 and all states \(p_i\), and acts as t on the other states; thus \(t'\) is a transformation of \(Q_{n-1}\) if restricted to \(Q_{n-1}\) and can be generated by a, b, and c. □
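The size claimed in Lemma 1 can be confirmed by brute force for small n. The sketch below builds the four letter transformations of Definition 1 as tuples (position q holds the image of q) and closes them under composition; the helper names are hypothetical and the check is only a sanity test, not a substitute for the proof.

```python
def generated_semigroup(generators):
    """Closure of the generators under composition (non-empty products)."""
    semigroup = set(generators)
    frontier = list(semigroup)
    while frontier:
        s = frontier.pop()
        for g in generators:
            sg = tuple(g[q] for q in s)  # apply s first, then g
            if sg not in semigroup:
                semigroup.add(sg)
                frontier.append(sg)
    return semigroup


def right_ideal_witness(n):
    """Letters a, b, c, d of Definition 1 for n >= 3."""
    ident = list(range(n))
    a = ident[:]
    for q in range(n - 1):
        a[q] = (q + 1) % (n - 1)          # a: (0, ..., n-2)
    b = ident[:]
    b[0], b[1] = 1, 0                     # b: (0, 1)
    c = ident[:]
    c[n - 2] = 0                          # c: (n-2 -> 0)
    d = ident[:]
    d[n - 2] = n - 1                      # d: (n-2 -> n-1)
    return [tuple(a), tuple(b), tuple(c), tuple(d)]
```

For n = 3 the letters a and b coincide, as noted in Definition 1, and the closure still has \(3^{2} = 9\) elements.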

We are now in a position to state our main theorem of this section.

Theorem 2 (Right Ideals and Prefix-Closed Languages)

Suppose that \(L \subseteq \Sigma^{*}\) and κ(L) = n. If L is a right ideal or a prefix-closed language, then \(\sigma(L) \le n^{n-1}\). This bound is tight for n = 1 if |Σ| ≥ 1, for n = 2 if |Σ| ≥ 2, for n = 3 if |Σ| ≥ 3, and for n ≥ 4 if |Σ| ≥ 4. Moreover, the sizes of the alphabet cannot be reduced.

Proof

For n ≥ 4, every transformation in the transition semigroup of a minimal DFA of any right ideal with n quotients must fix state n − 1; hence the size of this semigroup cannot exceed \(n^{n-1}\). By Lemma 1 this bound is tight.

It is easy to verify that the alphabet cannot be smaller if n ≤ 3. Let n ≥ 4. The set of transformations in the largest transition semigroup must contain every transformation t that maps \(Q_{n-1}\) to \(Q_{n-1}\) and fixes n − 1; otherwise, the bound cannot be met. Thus, none of the generators of this semigroup can map a state from \(Q_{n-1}\) to n − 1. When restricted to \(Q_{n-1}\), the transformations in this semigroup must form the full transformation semigroup of \(Q_{n-1}\) with n − 1 ≥ 3 states. So by Remark 1 and Proposition 2, there must be at least three generators of these transformations, say a, b, c. As noted above, none of {a, b, c}, extended to \(Q_n\) by adding the mapping of n − 1 to n − 1, can map a state from \(Q_{n-1}\) to n − 1. So we need at least one more generator, say d, which maps a state from \(Q_{n-1}\) to n − 1. Altogether, at least four generators are needed.

Since prefix-closed languages are complements of right ideals and the syntactic complexity is preserved by complementation, the result is the same for prefix-closed languages. □

Remark 2

A maximal transition semigroup of the quotient DFA of a right ideal contains all transformations of \(Q_n\) that fix state n − 1. Hence there is only one maximal transition semigroup for right ideals, which is \(W_{\mathrm{ri}}\).

5 Left Ideals and Suffix-Closed Languages

5.1 Basic Properties

Let \(\mathcal {D}_{n}=(Q_{n}, {\Sigma }_{\mathcal {D}}, \delta _{\mathcal {D}}, 0,F)\) be a minimal DFA, and let \(T_n\) be its transition semigroup. Consider the sequence \((0, 0t, 0t^{2}, \dots)\) of states obtained by applying a transformation \(t \in T_n\) repeatedly, starting with the initial state. Since \(Q_n\) is finite, there must eventually be a repeated state, that is, there must exist i and j such that \(0, 0t, \dots, 0t^{i}, 0t^{i+1}, \dots, 0t^{j-1}\) are distinct, but \(0t^{j} = 0t^{i}\); the integer j − i is the period of t. If the period is 1, t is said to be initially aperiodic. If t is initially aperiodic, then its sequence is \(0, 0t, \dots, 0t^{j-1} = 0t^{j}\).

Lemma 2

If \(\mathcal {D}_{n}\) is the quotient DFA of a left ideal L, all the transformations in \(T_n\) are initially aperiodic, and the empty set is not a quotient of L.

Proof

Let t be a transformation that is not initially aperiodic. Then there exist i, j such that \(p_i = 0t^{i} = 0t^{j} = p_j\) for some i < j, where j − i ≥ 2. Let w be a word that induces t. Since \(\mathcal {D}_{n}\) is minimal, states \(p_i\) and \(p_{j-1}\) must be distinguishable, say by word \(x \in \Sigma^{*}\). If \(w^{i}x \in L\), then \(w^{j-1}x = w^{i}w^{j-i-1}x = w^{j-i-1}(w^{i}x) \notin L\), contradicting the assumption that L is a left ideal. If \(w^{j-1}x \in L\), then \(w^{j}x = w(w^{j-1}x) \notin L\), again contradicting that L is a left ideal.

For the second claim, we know that a left ideal is non-empty by definition. So suppose that \(w \in L\). If L has the empty quotient, say \(x^{-1}L = \emptyset\), then \(xw \notin L\), which contradicts the assumption that L is a left ideal. □
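The period of a transformation, as defined at the start of this subsection, is easy to compute by following the sequence 0, 0t, 0t², … until a state repeats. The sketch below assumes the tuple encoding in which `t[q]` is the image of q; the function names are illustrative.

```python
def initial_period(t):
    """Period j - i of the sequence 0, 0t, 0t^2, ..., where 0t^j = 0t^i."""
    first_seen = {}
    q, step = 0, 0
    while q not in first_seen:
        first_seen[q] = step
        q, step = t[q], step + 1
    return step - first_seen[q]


def is_initially_aperiodic(t):
    return initial_period(t) == 1
```

By Lemma 2, a transformation for which `is_initially_aperiodic` returns False cannot occur in the transition semigroup of the quotient DFA of a left ideal; the transposition (0, 1) on two states is the simplest such transformation.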

Example 1

Note that the conditions of Lemma 2 are not sufficient. For Σ = {a, b}, the language \(L = b \cup \Sigma^{*}a\) satisfies the conditions, but is not a left ideal because \(b \in L\) but \(ab \notin L\). Its quotient automaton is shown in Fig. 3.

Fig. 3 Quotient DFA of a language that is not a left ideal

If the final state is 2 instead of 1, the language becomes \(L' = \Sigma\Sigma^{*}b = \Sigma^{*}\Sigma b\), which is a left ideal. The languages L and \(L'\) have the same syntactic semigroup, but one is a left ideal while the other is not.

The following remark was proved in [4]:

Remark 3

A language \(L \subseteq \Sigma^{*}\) is a left ideal if and only if for all \(x, y \in \Sigma^{*}\), \(y^{-1}L \subseteq (xy)^{-1}L\). Hence, if \(x^{-1}L \neq L\), then \(L \subset x^{-1}L\) for any \(x \in \Sigma^{+}\).

Proof

If L is a left ideal then for all \(x, y, w \in \Sigma^{*}\), we have \(yw \in L\) implies \(xyw \in L\), that is, \(w \in y^{-1}L\) implies \(w \in (xy)^{-1}L\).

For the other direction, if for some \(x, y \in \Sigma^{*}\) there is \(w \in y^{-1}L\) such that \(w \notin (xy)^{-1}L\), then \(x \neq \varepsilon\) and \(yw \in L\) but \(xyw \notin L\), which contradicts that L is a left ideal. □

It is useful to restate this observation in terms of the states of \(\mathcal {D}_{n}\). For DFA \(\mathcal {D}_{n}\) and states \(p, q \in Q_n\), we write \(p \prec q\) if \(K_p \subset K_q\). Also, we write \(p \preceq q\) if \(K_p \subseteq K_q\).

Remark 4

A DFA \(\mathcal {D}_{n}\) is a minimal DFA of a left ideal if and only if for all \(s, t \in T_n \cup \{1\}\), \(0t \preceq 0st\). Equivalently, since q = 0s for some s, for every \(q \in Q_n \setminus \{0\}\) we have \(0 \prec q\).

Remark 5

In a minimal DFA \(\mathcal {D}_{n}\) of a left ideal, if \(r \in Q_n\) has a t-predecessor, that is, if there exists \(q \in Q_n\) such that qt = r, then \(0t \preceq r\). In particular, if r appears in a cycle of t or is a fixed point of t, then \(0t \preceq r\).

Proof

This follows because \(0 \preceq q\) and so \(0t \preceq qt = r\) by Remark 4. □

We consider chains of the form \(K_{i_{1}}\subset K_{i_{2}}\subset {\dots } \subset K_{i_{h}}\), where the \(K_{i_{j}}\) are quotients of L. If L is a left ideal, the smallest element of any maximal-length chain is always L. Alternatively, we consider chains of states starting from 0 and strictly ordered by ≺.

Proposition 4

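For \(t \in T_n\) and \(p, q \in Q_n\), \(p \prec q\) implies \(pt \preceq qt\). If \(p \prec pt\), then \(p \prec pt \prec \cdots \prec pt^{k} = pt^{k+1}\) for some k ≥ 1. Similarly, \(p \succ q\) implies \(pt \succeq qt\), and \(p \succ pt\) implies \(p \succ pt \succ \cdots \succ pt^{k} = pt^{k+1}\) for some k ≥ 1.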

Proof

Since ⊆ is a partial order on quotients, by definition of ≺, if \(K_p \subset K_q\) then \(w^{-1}K_p \subseteq w^{-1}K_q\), where w is a word inducing t. This applied iteratively yields \(p \prec pt \prec \cdots \prec pt^{k} = pt^{k+1}\) for some k ≥ 1, because there are finitely many quotients (k ≤ n). The same holds dually for ≻. □

5.2 Lower Bound

We now show that the syntactic complexity of the following DFA of a left ideal is \(n^{n-1} + n - 1\).

Definition 2 (Witness: Left Ideals)

For n ≥ 3, let \(\mathcal {W}_{n} =(Q_{n},{\Sigma }_{\mathcal {W}},\delta _{\mathcal {W}},0,\{n-1\})\) be the DFA in which \({\Sigma }_{\mathcal {W}}=\{a,b,c,d,e\}\), a: (1, …, n − 1), b: (1, 2), c: (n − 1 → 1), d: (n − 1 → 0), and e: \((Q_n \to 1)\). For n = 3, a and b coincide, and we can use \({\Sigma }_{\mathcal {W}}=\{a,c,d,e\}\). Also, let \(\mathcal {W}_{2}=(Q_{2},\{a,b,c\},\delta _{\mathcal {W}},0,\{1\})\), where a: (0 → 1), b: 1, and c: \((Q_2 \to 1)\), and let \(\mathcal {W}_{1}=(Q_{1},\{a\},\delta ,0,\{0\})\), where a: 1. Let \(L_{n}=L(\mathcal {W}_{n})\).

The structure of the DFA of Definition 2 is shown in Fig. 4 for n ≥ 3.

Fig. 4 Quotient DFA \(\mathcal {W}_{n}\) of a left ideal with \(n^{n-1} + n - 1\) transformations

Lemma 3

The DFA of Definition 2 is minimal, accepts a left ideal, and has transition semigroup of size \(n^{n-1} + n - 1\) that contains all the transformations fixing 0 and all the constant transformations.

Proof

State 0 does not accept \(a^{i}\) for any i, whereas state i with \(1 \le i \le n-2\) accepts \(a^{n-1-i}\), and no other state j with \(1 \le j \le n-2\) accepts this word. Since n − 1 is the only final state, all states are distinguishable.

To prove that \(L_n\) is a left ideal it suffices to show that for any \(w \in L_n\), we also have \(xw \in L_n\) for every x ∈ Σ. This is obvious if x ∈ Σ ∖ {e}, since these letters fix state 0. If \(w \in L_n\), then w has the form w = uev, where \(\delta _{\mathcal {W}}(0,u)=0\), \(\delta _{\mathcal {W}}(0,ue)=1\), and v is accepted from state 1. But \(\delta _{\mathcal {W}}(0,eue)=1\), and since v is accepted from 1, we have \(euev = ew \in L_n\). Thus \(L_n\) is a left ideal.

In \(\mathcal {W}_{n}\), the transformations induced by a, b, and c restricted to \(Q_n \setminus \{0\}\) generate all the transformations of the last n − 1 states (Proposition 2). Together with the transformation of d, they generate all transformations of \(Q_n\) that fix 0, and the number of such transformations is \(n^{n-1}\). To see this, consider any transformation t that fixes 0. If some states from {1, …, n − 1} are mapped to 0 by t, we can map them first to n − 1 and n − 1 to one of them by the transformations of a, b, and c, and then map n − 1 to 0 by the transformation of d.

Also the words of the form \(ea^{i}\) for i ∈ {0, …, n − 2} induce constant transformations \((Q_n \to i+1)\). Hence the transition semigroup of \(\mathcal {W}_{n}\) contains all the constant transformations of \(Q_n\) (where \((Q_n \to 0)\) has been already counted). Altogether, there are \(n^{n-1} + n - 1\) transformations in the transition semigroup of \(\mathcal {W}_{n}\). □
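As with the right-ideal witness, the count in Lemma 3 can be confirmed by brute force for small n. The sketch below encodes the five letters of Definition 2 as tuples (position q holds the image of q) and splits the generated semigroup into transformations that fix 0 and the remaining ones; the helper names are hypothetical.

```python
def generated_semigroup(generators):
    """Closure of the generators under composition (non-empty products)."""
    semigroup = set(generators)
    frontier = list(semigroup)
    while frontier:
        s = frontier.pop()
        for g in generators:
            sg = tuple(g[q] for q in s)  # apply s first, then g
            if sg not in semigroup:
                semigroup.add(sg)
                frontier.append(sg)
    return semigroup


def left_ideal_witness(n):
    """Letters a, b, c, d, e of Definition 2 for n >= 3."""
    ident = list(range(n))
    a = ident[:]
    for q in range(1, n):
        a[q] = q + 1 if q < n - 1 else 1   # a: (1, ..., n-1)
    b = ident[:]
    b[1], b[2] = 2, 1                      # b: (1, 2)
    c = ident[:]
    c[n - 1] = 1                           # c: (n-1 -> 1)
    d = ident[:]
    d[n - 1] = 0                           # d: (n-1 -> 0)
    e = [1] * n                            # e: (Q_n -> 1)
    return [tuple(x) for x in (a, b, c, d, e)]
```

Every generated transformation that moves state 0 turns out to be constant, which accounts for the extra n − 1 elements beyond the \(n^{n-1}\) transformations fixing 0.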

Example 2

The maximal-length chains of quotients in \(\mathcal {W}_{n}\) have length 2. However, in other left ideals maximal-length chains can be as long as n. For this let n ≥ 2, Σ = {a, b} and \(L = \Sigma^{*}a^{n-1}\); then L has n quotients and a maximal-length chain of length n.

Proof

A maximal-length chain always starts at 0; suppose it ends with q. If there is a \(p \in Q_n \setminus \{0, q\}\) such that \(p \prec q\), then \(K_p \subset K_q\), which contradicts that \(a^{n-1-p} \in K_p\) and \(a^{n-1-p} \notin K_q\).

In \(L = \Sigma^{*}a^{n-1}\), we have the unique maximal-length chain consisting of all quotients:

$${\Sigma}^{*}a^{n-1} \subset {\Sigma}^{*}a^{n-2} \subset {\dots} \subset {\Sigma}^{*}.$$

We will see that the maximal length of chains of quotients is an important structural feature; in particular, to meet the bound for syntactic complexity by both left and two-sided ideals, the maximal length of the chains must be the smallest possible.

5.3 Upper Bound

The derivation of the upper bound \(n^{n-1} + n - 1\) for left ideals is much more difficult than for right ideals. We begin with the easy cases where n ∈ {1, 2}.

Remark 6

If n = 1, the only left ideal is \(\Sigma^{*}\) and the transition semigroup of its minimal DFA satisfies the bound \(1^{0} + 1 - 1 = 1\). If n = 2, there are only three allowed transformations, since the transposition (0, 1) is not initially aperiodic and is ruled out by Lemma 2. Thus the bound \(2^{1} + 2 - 1 = 3\) holds.

Let \(\mathcal {D}_{n}=(Q_{n}, {\Sigma }_{\mathcal {D}}, \delta _{\mathcal {D}}, 0,F)\) be a minimal DFA of an arbitrary left ideal with n quotients and let \(T_n\) be the transition semigroup of \(\mathcal {D}_{n}\). Let \(W_{\mathrm{li}}\) be the transition semigroup of the witness DFA \(\mathcal {W}_{n}\) of Definition 2.

Lemma 4

If n ≥ 3 and a maximal-length chain in \(\mathcal {D}_{n}\) strictly ordered by ≺ has length 2, then \(|T_n| \le n^{n-1} + n - 1\) and \(T_n\) is a subsemigroup of \(W_{\mathrm{li}}\).

Proof

Consider an arbitrary transformation \(t \in T_n\) and let p = 0t. If p = 0, then any state other than 0 can possibly be mapped by t to any one of the n states; hence there are at most \(n^{n-1}\) such transformations. All of these transformations are in \(W_{\mathrm{li}}\) by the proof of Lemma 3.

If p ≠ 0, then \(0 \prec p\). Consider any state \(q \notin \{0, p\}\); by Remark 4, \(0t = p \preceq qt\). If \(p \neq qt\), then \(p \prec qt\). But then we have the chain \(0 \prec p \prec qt\) of length 3, contradicting our assumption. Hence we must have p = qt, and so t is the constant transformation \(t = (Q_n \to p)\). Since p can be any one of the n − 1 states other than 0, we have at most n − 1 such transformations. Since all of these transformations are in \(W_{\mathrm{li}}\) by Lemma 3, \(T_n\) is a subsemigroup of \(W_{\mathrm{li}}\). □

Lemma 5 (Left Ideals, Suffix-Closed Languages)

If n ≥ 3 and L is a left ideal or a suffix-closed language with n quotients, then its syntactic complexity is less than or equal to \(n^{n-1} + n - 1\).

Proof

Our approach is as follows: We consider a minimal DFA \(\mathcal {D}_{n}=(Q_{n}, {\Sigma }_{\mathcal {D}}, \delta _{\mathcal {D}}, 0,F)\) of an arbitrary left ideal with n quotients and let \(T_n\) be the transition semigroup of \(\mathcal {D}_{n}\). We also deal with the witness DFA \(\mathcal {W}_{n} =(Q_{n},{\Sigma }_{\mathcal {W}},\delta _{\mathcal {W}},0,\{n-1\})\) of Definition 2 that has the same state set as \(\mathcal {D}_{n}\) and whose transition semigroup is \(W_{\mathrm{li}}\). We will show that there is an injective mapping \(f : T_n \to W_{\mathrm{li}}\), and this will prove that \(|T_n| \le |W_{\mathrm{li}}|\).

It suffices to prove the result for left ideals, since suffix-closed languages are their complements.

In the proof of this lemma we enumerate the following cases illustrated in Fig. 5:

  • Case 1: \(t \in W_{\mathrm{li}}\).

  • Case 2: \(t \notin W_{\mathrm{li}}\) and \(0t^{2} \ne 0t\).

  • Case 3: \(t \notin W_{\mathrm{li}}\) and \(0t^{2} = 0t\).

    • (a): t has a cycle.

    • (b): t has no cycles and has a fixed point r ≠ p.

    • (c): t has no cycles, has no fixed point r ≠ p, and there is a state r such that p ≺ r with rt = p.

Fig. 5 Map of the cases in the proof of Lemma 5. The transitions of t are represented by solid lines, and the modified transitions of s by dashed red lines

We now proceed to examine each of these cases.

  • Case 1: \(t \in W_{\mathrm{li}}\).

Let f(t) = t; then obviously f restricted to \(W_{\mathrm{li}}\) is injective.

  • Case 2: \(t \notin W_{\mathrm{li}}\) and \(0t^{2} \ne 0t\).

Note that \(t \notin W_{\mathrm{li}}\) implies 0t ≠ 0 by Lemma 3. Let 0t = p. Since \(0t^{2} \ne 0t\), we have p = 0t ≺ 0tt = pt by Remark 4. Let \(p \prec \dots \prec pt^{k} = pt^{k+1}\) be the chain defined from p; this chain is of length at least 2. Let f(t) = s, where s is the transformation defined by

$$0 s = 0, \quad p t^{k} s = p, \quad q s = q t \text{ for the other states } q\in Q_{n}.$$

Transformation s is shown in Fig. 5, Case 2, where the dashed transitions show how s differs from t.

By Lemma 3, \(s \in W_{\mathrm{li}}\). However, \(s \notin T_{n}\), as it contains the cycle \((p, \dots, pt^{k})\) with states strictly ordered by ≺ in DFA \(\mathcal {D}_{n}\), which contradicts Proposition 4. Since \(s \notin T_{n}\), it is distinct from the transformations defined in Case 1.

In going from t to s, we have added one transition (0s = 0) that is a fixed point, and one (\(pt^{k}s = p\)) that is not. Since only one non-fixed-point transition has been added, there can be only one cycle in s with states strictly ordered by ≺. Since 0 cannot appear in this cycle, p is its smallest element with respect to ≺.

Suppose now that t′ ≠ t is another transformation that satisfies Case 2, that is, 0t′ = p′ ≠ 0 and p′t′ ≠ p′; we will show that f(t) ≠ f(t′). Define s′ for t′ as s was defined for t. For a contradiction, assume s = f(t) = f(t′) = s′.

Like s, transformation s′ contains only one cycle strictly ordered by ≺, and p′ is its smallest element. Since we have assumed that s = s′, we must have p = 0t = 0t′ = p′ and the cycles in s and s′ must be identical. In particular, \(pt^{k}t = pt^{k} = p(t')^{k}t' = p(t')^{k}\). For \(q \in Q_{n} \setminus \{0, pt^{k}\}\), we have qt = qs = qs′ = qt′. Hence t = t′ – a contradiction. Therefore t ≠ t′ implies f(t) ≠ f(t′).
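The construction of s from t in Case 2 is easy to state operationally. The sketch below is only an illustration under stated assumptions: a transformation is a tuple whose entry q is qt, states are 0, …, n − 1, and the function name `case2_image` is hypothetical. It follows the chain p, pt, pt², … to its fixed point \(pt^{k}\), then redirects 0 to 0 and \(pt^{k}\) to p. For a transformation of \(T_{n}\) the chain always ends in a fixed point by Proposition 4; for an arbitrary tuple this may fail, so the helper raises an error in that situation.

```python
def case2_image(t):
    """Return s = f(t) following the Case 2 rule: 0s = 0, (p t^k)s = p, qs = qt otherwise."""
    p = t[0]
    if p == 0 or t[p] == p:
        raise ValueError("t does not satisfy the Case 2 conditions 0t != 0 and 0t^2 != 0t")
    # Walk along p, pt, pt^2, ... until a fixed point p t^k is reached.
    q = p
    seen = {q}
    while t[q] != q:
        q = t[q]
        if q in seen:
            # Cannot happen for t in T_n, but an arbitrary tuple may loop.
            raise ValueError("the chain starting at p does not end in a fixed point")
        seen.add(q)
    s = list(t)
    s[0] = 0   # new fixed point at state 0
    s[q] = p   # close the chain into the cycle (p, pt, ..., p t^k)
    return tuple(s)


if __name__ == "__main__":
    t = (1, 2, 3, 3)          # 0 -> 1 -> 2 -> 3, with 3 fixed
    print(case2_image(t))
```

In the example the chain from p = 1 is 1, 2, 3, so the resulting s contains the cycle (1, 2, 3), which is the feature used in the proof to recover p and hence t.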

  • Case 3: \(t \notin W_{\mathrm{li}}\) and \(0t^{2} = 0t\).

As before, let 0t = p. Consider any state q ∉ {0, p}; then 0 ≺ q by Remark 4 and 0t ⪯ qt by Proposition 4. Thus either p ≺ qt, or p = qt. We consider the following sub-cases:

  • (a): t has a cycle.

Since t has a cycle, take a state r from the cycle; then r and rt are not comparable under ⪯ by Proposition 4, and p ≺ r by Remark 5. Let f(t) = s, where s is the transformation shown in Fig. 5, Case 3(a), and defined by

$$0 s = 0, \quad p s = r, \quad q s = q t \text{ for the other states } q \in Q_{n}.$$

By Lemma 3, \(s \in W_{\mathrm{li}}\). Suppose that \(s \in T_{n}\); since p ≺ r, we have r = ps ⪯ rs = rt by the definition of s and Proposition 4; this contradicts that r and rt are not comparable. Hence \(s \notin T_{n}\), and so s is distinct from the transformations of Case 1.

We claim that p is not in a cycle of s; this cycle would have to be

$$p\overset{s }{\rightarrow} r \overset{s }{\rightarrow} rt \overset{s }{\rightarrow} {\dots} \overset{s }{\rightarrow} rt^{k-1} \overset{s}{\rightarrow} p, \text { that is, } p\overset{s }{\rightarrow} r \overset{t }{\rightarrow} rt \overset{t }{\rightarrow} {\dots} \overset{t }{\rightarrow} rt^{k-1} \overset{t}{\rightarrow} p, $$

for some k ≥ 2 because r ≠ p = pt and rt ≠ p. Since p ≺ r we have p = pt ⪯ rt; but then we have a chain \(p \prec rt \prec \dots \prec rt^{k} = p\), contradicting Proposition 4.

Since p is not in a cycle of s, it follows that s does not contain a cycle with states strictly ordered by ≺, as such a cycle would also be in t. So s is distinct from the transformations of Case 2.

We claim there is a unique state q such that (a) 0 ≺ q ≺ qs, (b) \(qs \not\preceq qs^{2}\). First we show that p satisfies these conditions: (a) holds because ps = r and p ≺ r; (b) holds because ps = r, \(ps^{2} = rt\) and r and rt are not comparable. Now suppose that q satisfies the two conditions, but q ≠ p. Note that qs ≠ p, because qs = p implies \(qs = p \prec r = qs^{2}\), contradicting (b). Since q, qs ∉ {0, p}, we have \(qt = qs \not\preceq qs^{2} = qt^{2}\). But Proposition 4 for q ≺ qt implies that \(qt \preceq qt^{2}\) – a contradiction. Thus p is the only state satisfying these conditions.

If t′ ≠ t is another transformation satisfying the conditions of this case, we define s′ like s. Suppose that s = f(t) = f(t′) = s′. Since both s and s′ contain a unique state p satisfying the two conditions above, we have 0t = 0t′ = p and pt = pt′ = p. Since the other states are mapped by s exactly as by t and t′, we have t = t′.

  • (b): t has no cycles and has a fixed point r ≠ p.

Because 0 ≺ r by Remark 4, 0t ⪯ rt by Proposition 4. Since r is a fixed point of t, then p = 0t ⪯ rt = r. Since r ≠ p, we have p ≺ r. Let f(t) = s, where s is the transformation shown in Fig. 5, Case 3(b), and defined by

$$0 s = 0, \quad q s = 0 \text{ for each fixed point } q\notin \{0,p\}, \quad q s = q t \text{ for the other states } q\in Q_{n}.$$

By Lemma 3, \(s \in W_{\mathrm{li}}\). Suppose that \(s \in T_{n}\); because p ≺ r, ps = p, rs = 0, and ps ⪯ rs by Proposition 4, we have p ≺ 0, which is a contradiction. Hence s is not in \(T_{n}\) and so is distinct from the transformations of Case 1. Also, s maps at least one state other than 0 to 0, and so is distinct from the transformations of Case 2 and also from the transformations of Case 3(a).

If t′ ≠ t is another transformation satisfying the conditions of this case, we define s′ like s. Now suppose that s = f(t) = f(t′) = s′. There is only one fixed point of s other than 0 (ps = p), and only one fixed point of s′ other than 0 (p′s′ = p′); hence 0t = p = p′ = 0t′. By the definition of s, for each state q ≠ 0 such that qs = 0, we have qt = q. Similarly, for each state q ≠ 0 such that qs′ = 0, we have qt′ = q. Hence t and t′ agree on these states. Since the remaining states are mapped by s exactly as they are mapped by t and t′, we have t = t′. Thus we have proved that t ≠ t′ implies f(t) ≠ f(t′).

  • (c): t has no cycles, has no fixed point r ≠ p, and there is a state r such that p ≺ r with rt = p.

Let f(t) = s, where s is the transformation shown in Fig. 5, Case 3(c), and defined by

$$0 s = 0, \hspace{.2cm} p s = r, \hspace{.2cm} q s = 0 \text{ for each}\, q \succ p\, \text{such that } q t = p,$$
$$\hspace{.2cm} q s = q t \text{ for the other states } q \in Q_{n}.$$

By Lemma 3, \(s \in W_{\mathrm{li}}\). Suppose that \(s \in T_{n}\); because p ≺ r, ps = r, rs = 0, and r = ps ⪯ rs = 0 by Proposition 4, we have r ≺ 0 – a contradiction. Hence \(s \notin T_{n}\) and s is distinct from the transformations of Case 1.

Because s maps at least one state other than 0 to 0 (rs = 0), it is distinct from the transformations of Cases 2 and 3(a). Also s does not have a fixed point other than 0, while the transformations of Case 3(b) have such a fixed point.

We claim that there is a unique state q such that (a) 0 ≺ q ≺ qs and (b) \(qs^{2} = 0\). First we show that p satisfies these conditions. By assumption 0 ≺ p ≺ r and rt = p; also rs = 0 by the definition of s. Condition (a) holds because 0 ≺ p ≺ r = ps, and (b) holds because \(0 = rs = ps^{2}\).

Now suppose that 0 ≺ q ≺ qs, \(qs^{2} = 0\) and q ≠ p. Since qs ≠ 0, we have qs = qt by the definition of s. Because qt has a t-predecessor, p ⪯ qt by Remark 5. Also qt = qs ≠ p, for qs = p implies \(0 = qs^{2} = ps = r\) – a contradiction. Hence p ≺ qt. From qt = qs and q ≺ qs, we have q ≺ qt. Since \(qs^{2} = 0\) we have (qt)s = 0 and so (qt)t = p, by the definition of s. By Proposition 4, from q ≺ qt we have qt ⪯ (qt)t = p, contradicting p ≺ qt. So q = p.

If t′ ≠ t is another transformation satisfying the conditions of this case, we define s′ like s. Suppose that s = f(t) = f(t′) = s′. Since s and s′ contain a unique state p satisfying the two conditions above, we have 0t = 0t′ = p and pt = pt′ = p. Then r and the states q ≻ p with qt = p are determined by p, since they are precisely the states q ≻ p with qs = 0. Since the other states are mapped by s exactly as by t and t′, we have t = t′, and f is injective restricted to the transformations of this case also.

  • All cases are covered:

We need to ensure that any transformation t fits in at least one case. It is clear that t fits in Case 1 or 2 or 3. Let p = 0t. For Case 3, it is sufficient to show that if (i) \(t \notin W_{\mathrm{li}}\) does not contain a fixed point r ≠ p, and (ii) there is no state r with p ≺ r and rt = p, then t contains a cycle and so fits in Subcase 3(a).

First, if there is no r such that p ≺ r, we claim that t is the constant transformation \((Q_{n} \to p)\), thus it fits in Case 1. Consider any state \(q \in Q_{n}\) such that qt ≠ p. Then p ≺ qt by Remark 4, contradicting that there is no state r = qt such that p ≺ r.

So let t be a transformation that fits in Case 3 and satisfies (i) and (ii), and let r be some state such that p ≺ r. Consider the sequence \(r, rt, rt^{2}, \dots\). By Remark 5, \(p \preceq rt^{i}\) for all i ≥ 0. If \(rt^{k} = p\) for some k ≥ 1, let k be the smallest such number, then \(rt^{k-1} \ne p\); we have \(p \prec rt^{k-1}\) and \((rt^{k-1})t = p\), contradicting (ii). Since p is the only fixed point by (i), we have \(rt^{i} \ne rt^{i-1}\) for all i ≥ 1. Since there are finitely many states, \(rt^{i} = rt^{j}\) for some i and j such that 0 ≤ i < j − 1, and so the states \(rt^{i}, rt^{i+1}, \dots, rt^{j} = rt^{i}\) form a cycle.

We have shown that for every transformation t in \(T_{n}\) there is a corresponding transformation f(t) in \(W_{\mathrm{li}}\), and f is injective. So \(|T_{n}| \le |W_{\mathrm{li}}| = n^{n-1} + n - 1\). □
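The finiteness argument above (iterate r, rt, rt², … until a state repeats) is the usual way to locate a cycle in a transformation. The sketch below illustrates it under stated assumptions: a transformation is a tuple whose entry q is qt, and the function name `find_cycle_from` is hypothetical. A returned list of length 1 is a fixed point; in the setting of the proof, conditions (i) and (ii) exclude that outcome, so the list has length at least 2.

```python
def find_cycle_from(t, r):
    """Iterate r, rt, rt^2, ... and return the states of the cycle that is eventually reached."""
    first_seen = {}
    q, i = r, 0
    # Record the index at which each state first appears in the sequence.
    while q not in first_seen:
        first_seen[q] = i
        q = t[q]
        i += 1
    # q is the first state to repeat, so it lies on the cycle; walk once around it.
    cycle = [q]
    x = t[q]
    while x != q:
        cycle.append(x)
        x = t[x]
    return cycle


if __name__ == "__main__":
    t = (1, 1, 3, 4, 2)       # 0 -> 1 (fixed point), 2 -> 3 -> 4 -> 2
    print(find_cycle_from(t, 2))
    print(find_cycle_from(t, 0))
```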

Next we prove that W li is the only transition semigroup meeting the bound. It follows that minimal DFAs of left ideals with the maximal syntactic complexity have maximal-length chains of length 2.

Theorem 3

If \(T_{n}\) has size \(n^{n-1} + n - 1\), then \(T_{n} = W_{\mathrm{li}}\).

Proof

Consider a maximal-length chain of states strictly ordered by ≺ in \(\mathcal {D}_{n}\). If its length is 2, then by Lemma 4, T n is a subsemigroup of W li. Thus only T n = W li reaches the bound in this case.

Assume now that the length of a maximal-length chain is at least 3. Then there are states p and r such that 0 ≺ p ≺ r. Let R = {q ∣ p ≺ q}. We show that there exists a transformation s that is in \(W_{\mathrm{li}}\) but not in \(f(T_{n})\). To define s we use the constant transformation \(t = (Q_{n} \to p)\) as an auxiliary transformation. Note that t fits in Case 3(c) in the proof of Lemma 5 except that \(t \in W_{\mathrm{li}}\). We define s from t according to the rules of Case 3(c):

$$0 s = 0, \hspace{.2cm} p s = r, \hspace{.2cm} q s = 0 \text{ for each}\ q \in R,$$
$$\hspace{.2cm} q s = q t = p \text{ for the other states } q.$$

By Lemma 3, \(s \in W_{\mathrm{li}}\).

Let f be the injective function from the proof of Lemma 5. It remains to be shown that there is no transformation \(t' \in T_{n}\) such that s = f(t′). The proof that s is different from the transformations f(t′) of Cases 1, 2, 3(a) and 3(b) is exactly the same as the corresponding proof in Case 3(c) following the definition of s.

It remains to verify that there is no \(t' \in T_{n}\) in Case 3(c) such that f(t′) = s. Suppose there is such a t′. Recall that states p and r satisfying 0 ≺ p ≺ r have been fixed by assumption. By the definition of s, state p satisfies the conditions (a) 0 ≺ p ≺ ps and (b) \(ps^{2} = 0\). We claim that p is the only state satisfying these conditions. Indeed, if q ≠ p then either qs = 0, so that q ⊀ qs = 0 and (a) is violated, or qs = p, so that \(qs^{2} = ps = r \ne 0\) and (b) is violated. This observation is used in the proof of Case 3(c) to prove the claim below.

Both t and t′ satisfy the conditions of Case 3(c), except that t fails the condition \(t \notin W_{\mathrm{li}}\). However, that latter condition is not used in the proof that if t′ ≠ t and t′ satisfies the other conditions of Case 3(c), then s′ ≠ s, where s′ is the transformation obtained from t′ by the same rules as s. Thus s is also different from the transformations in \(f(T_{n})\) from Case 3(c).

Because f is injective, \(s \notin f(T_{n})\), \(s \in W_{\mathrm{li}}\) and \(f(T_{n}) \subseteq W_{\mathrm{li}}\), the bound \(n^{n-1} + n - 1\) cannot be reached if the length of the maximal-length chains is not 2. □

Proposition 5

For n ≥ 4, the minimal number of generators of the transition semigroup W li is 5.

Proof

We need a generator, say e, that maps 0 to a state in Q n ∖ {0}. Since all such transformations in W li are constant transformations, e is also constant.

Let U be the set of all transformations that map \(Q_{n} \setminus \{0\}\) to \(Q_{n} \setminus \{0\}\) and fix 0. The transition semigroup \(W_{\mathrm{li}}\) contains U. If a transformation \(t \in U\) were generated using a generator g mapping a state q from \(Q_{n} \setminus \{0\}\) to 0, then g would have to be used together with some constant generator s to map 0 back to a state p in \(Q_{n} \setminus \{0\}\). Then 0t = (0g)s = p, since s is constant; hence t does not fix 0, which is a contradiction. Hence, all the transformations in U must be generated by generators in U.

When restricted to Q n ∖ {0}, the set U forms the full transformation semigroup with n − 1 ≥ 3 states. So by Remark 1, from Proposition 2 we need at least three generators for this semigroup, say a, b, and c.

Finally, \(W_{\mathrm{li}}\) contains transformations mapping some states from \(Q_{n} \setminus \{0\}\) to 0, so we need one more generator, say d, mapping a state from \(Q_{n} \setminus \{0\}\) to 0. □

We are finally in a position to prove our main theorem of this section.

Theorem 4 (Left Ideals, Suffix-Closed Languages)

Suppose that \(L \subseteq {\Sigma}^{*}\) and κ(L) = n. If L is a left ideal or a suffix-closed language, then \(\sigma(L) \le n^{n-1} + n - 1\). This bound is tight for n = 1 if |Σ| ≥ 1, for n = 2 if |Σ| ≥ 3, for n = 3 if |Σ| ≥ 4, and for n ≥ 4 if |Σ| ≥ 5. Moreover, the sizes of the alphabet cannot be reduced.

Proof

If L is a left ideal, then \(\sigma(L) \le n^{n-1} + n - 1\) by Lemma 5. By Lemma 3 the languages of Definition 2 meet this bound. It is easy to verify that the size of the alphabet cannot be reduced if n ≤ 3. For n ≥ 4, by Theorem 3 only languages L whose quotient automata have transition semigroups isomorphic to \(W_{\mathrm{li}}\) meet the bound, and by Proposition 5 \(W_{\mathrm{li}}\) requires 5 generators. □

6 Two-Sided Ideals

If a language L is a right ideal, then \(L = L{\Sigma}^{*}\) and L has exactly one final quotient, namely \({\Sigma}^{*}\); hence this also holds for two-sided ideals. For n ≥ 3, in a two-sided ideal every maximal chain is of length at least 3: it starts with L, every quotient contains L and is contained in \({\Sigma}^{*}\).

6.1 Lower Bound

We now show that the syntactic complexity of the following DFA of a two-sided ideal is \(n^{n-2} + (n-2)2^{n-2} + 1\).

Definition 3 (Witness: Two-Sided Ideals)

For n ≥ 4, define the DFA \(\mathcal {W}_{n} =(Q_{n},{\Sigma }_{\mathcal {W}},\delta _{\mathcal {W}},0,\{n-1\}),\) where \({\Sigma }_{\mathcal {W}}=\{a,b,c,d,e,f\}\), a: (1, …, n − 2), b: (1, 2), c: (n − 2 → 1), d: (n − 2 → 0), e: \((Q_{n-1} \to 1)\), and f: (1 → n − 1). For n = 4, inputs a and b coincide, and we can use \({\Sigma }_{\mathcal {W}}=\{a,c,d,e,f\}\). Also, let \(\mathcal {W}_{3}=(Q_{3},\{a,b,c\},\delta _{\mathcal {W}},0,\{2\})\), where a: (1 → 2)(0 → 1), b: (1 → 0), and c: \(\mathbf{1}\) (the identity transformation), and let \(\mathcal {W}_{2}=(Q_{2},\{a,b\},\delta _{\mathcal {W}},0,\{1\})\), where a: (0 → 1), and b: \(\mathbf{1}\). Finally, let \(L_{n}=L(\mathcal {W}_{n})\).

The structure of the DFA of Definition 3 is shown in Fig. 6 for n ≥ 4.

Fig. 6 Quotient DFA of a two-sided ideal with \(n^{n-2} + (n-2)2^{n-2} + 1\) transformations

Lemma 6

For n ≥ 2, the DFA of Definition 3 is minimal, accepts a two-sided ideal, and its transition semigroup has size \(n^{n-2} + (n-2)2^{n-2} + 1\). In particular, it contains all transformations of \(Q_{n}\) that

  1. fix 0 and n − 1,

  2. map S ∪ {n − 1} to n − 1 and \(Q_{n} \setminus (S \cup \{n-1\})\) to i, for all S ⊆ {1, …, n − 2} and i ∈ {1, …, n − 2},

  3. map \(Q_{n}\) to n − 1.
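The three types listed above are pairwise disjoint, since a transformation of type 1 fixes 0, one of type 2 sends 0 to some i ∈ {1, …, n − 2}, and the one of type 3 sends 0 to n − 1. The sizes therefore add up to the claimed total. The sketch below is an illustration only: it assumes that states are 0, …, n − 1 and that a transformation is a tuple whose entry q is the image of q, and the function name `two_sided_candidates` is hypothetical. It enumerates the three types for small n so that the count can be compared with the formula.

```python
from itertools import combinations, product


def two_sided_candidates(n):
    """Enumerate the transformations of types 1, 2 and 3 on states 0, ..., n-1 (n >= 2)."""
    last = n - 1
    middle = range(1, n - 1)          # the states 1, ..., n-2
    result = set()
    # Type 1: fix 0 and n-1; the n-2 middle states may go anywhere.
    for images in product(range(n), repeat=n - 2):
        result.add((0,) + images + (last,))
    # Type 2: S and n-1 go to n-1, every other state (including 0) goes to i.
    for size in range(n - 1):
        for subset in combinations(middle, size):
            for i in middle:
                result.add(tuple(last if (q in subset or q == last) else i
                                 for q in range(n)))
    # Type 3: the constant transformation onto n-1.
    result.add((last,) * n)
    return result


if __name__ == "__main__":
    for n in range(2, 6):
        print(n, len(two_sided_candidates(n)),
              n ** (n - 2) + (n - 2) * 2 ** (n - 2) + 1)
```

For n = 3 the enumeration yields six tuples, which can be compared with the six transformations listed for \(\mathcal {W}_{3}\) in the proof.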

Proof

For n = 2, the DFA \(\mathcal {W}_{2}\) has only two states 0 and 1, and is obviously minimal. Also, \(L(\mathcal {W}_{2})=\{a,b\}^{*}a \{a,b\}^{*}\) is a two-sided ideal. The set S is empty, and W 2 contains all transformations of types 1 and 3. Finally, \(\mathcal {W}_{2}\) meets the bound 2.

For i = 1, …, n − 2, state i is the only non-final state that accepts \(a^{n-1-i}f\); hence all these states are distinguishable. State 0 is distinguishable from these states, because it does not accept any words in \(a^{*}f\). Hence \(\mathcal {W}_{n}\) is minimal. The proof that \(\mathcal {W}_{n}\) accepts a left ideal is like that in Lemma 3. Since n − 1 is the only final state and it accepts \({\Sigma}^{*}\), \(L_{n}\) is a right ideal. Hence it is two-sided.

For n = 3, \(\mathcal {W}_{3}\) meets the bound 6 with the transition semigroup consisting of the transformations [0, 0, 2], [0, 1, 2], [0, 2, 2], [1, 1, 2], [1, 2, 2], and [2, 2, 2].

From now on we may assume that n ≥ 4. In \(\mathcal {W}_{n}\), the transformations induced by a, b, and c restricted to \(Q_{n} \setminus \{0, n-1\}\) generate all the transformations of the states 1, …, n − 2. When restricted to \(Q_{n} \setminus \{n-1\}\), together with the transformation of d, they generate all \((n-1)^{n-2}\) transformations that fix 0: Let t be such a transformation mapping a subset \(S \subseteq Q_{n} \setminus \{n-1\}\) to 0. First, using a, b, c, we can map S to n − 2 and n − 2 to a state from S (unless n − 2 ∈ S). Then we apply d. Finally, using a, b, c, we can map n − 2 to the original state, and the remaining states as in t.

In the same way, together with the transformation f, we have all \(n^{n-2}\) transformations of \(Q_{n}\) that fix 0 and n − 1.

For any subset S ⊆ {1, …, n − 2}, there is a transformation – induced by a word \(w_{S}\), say – that maps S to n − 1 and fixes \(Q_{n} \setminus S\). Then the words of the form \(w_{S}ea^{i}\), for i ∈ {0, …, n − 3}, induce all transformations that map S ∪ {n − 1} to n − 1 and \(Q_{n} \setminus (S \cup \{n-1\})\) to i + 1. There are \(2^{n-2}\) subsets S, and there are n − 2 possibilities for i. Hence there are \((n-2)2^{n-2}\) transformations of this type. There is also the constant transformation \(ef : (Q_{n} \to n-1)\), which yields the total number claimed. □
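For small n the size of the transition semigroup can also be confirmed mechanically by closing the generators of Definition 3 under composition. The sketch below is an illustration under stated assumptions, not the method of the proof: a transformation is a tuple whose entry q is the image of q, and the names `witness_generators` and `generated_semigroup` are hypothetical. It builds a, …, f for n ≥ 4 and computes every transformation induced by a non-empty word.

```python
def witness_generators(n):
    """Transformations induced by a, b, c, d, e, f of Definition 3, for n >= 4."""
    if n < 4:
        raise ValueError("this sketch covers only n >= 4")
    ident = list(range(n))
    a = ident[:]                       # a: the cycle (1, ..., n-2)
    for q in range(1, n - 2):
        a[q] = q + 1
    a[n - 2] = 1
    b = ident[:]                       # b: the transposition (1, 2)
    b[1], b[2] = 2, 1
    c = ident[:]                       # c: (n-2 -> 1)
    c[n - 2] = 1
    d = ident[:]                       # d: (n-2 -> 0)
    d[n - 2] = 0
    e = [1] * (n - 1) + [n - 1]        # e: (Q_{n-1} -> 1), state n-1 stays fixed
    f = ident[:]                       # f: (1 -> n-1)
    f[1] = n - 1
    return [tuple(g) for g in (a, b, c, d, e, f)]


def generated_semigroup(gens):
    """All transformations induced by non-empty words over the generators."""
    elements = set(gens)
    frontier = list(elements)
    while frontier:
        s = frontier.pop()
        for g in gens:
            st = tuple(g[s[q]] for q in range(len(s)))   # apply s, then g
            if st not in elements:
                elements.add(st)
                frontier.append(st)
    return elements


if __name__ == "__main__":
    for n in (4, 5):
        size = len(generated_semigroup(witness_generators(n)))
        print(n, size, n ** (n - 2) + (n - 2) * 2 ** (n - 2) + 1)
```

Extending each known element by one generator on the right is enough, because every non-empty word is a generator followed by further generators. For n = 4 the tuples for a and b coincide, in line with the remark in Definition 3 that five letters suffice.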

6.2 Upper Bound

We consider a minimal DFA \(\mathcal {D}_{n}=(Q_{n}, {\Sigma }_{\mathcal {D}}, \delta _{\mathcal {D}}, 0,\{n-1\})\) of an arbitrary two-sided ideal with n quotients, and let T n be the transition semigroup of \(\mathcal {D}_{n}\). We also deal with the witness DFA \(\mathcal {W}_{n} =(Q_{n},{\Sigma }_{\mathcal {W}},\delta _{\mathcal {W}},0,\{n-1\})\) of Definition 3 with transition semigroup W 2i.

Lemma 7

If n ≥ 4 and a maximal-length chain in \(\mathcal {D}_{n}\) strictly ordered by ≺ has length 3, then \(|T_{n}| \le n^{n-2} + (n-2)2^{n-2} + 1\), and \(T_{n}\) is a subsemigroup of \(W_{\mathrm{2i}}\).

Proof

Consider an arbitrary transformation \(t \in T_{n}\); then (n − 1)t = n − 1. If 0t = 0, then any state not in {0, n − 1} can possibly be mapped by t to any one of the n states; hence there are at most \(n^{n-2}\) such transformations.

If 0t ≠ 0, then 0 ≺ 0t. Consider any state q ∉ {0, 0t}; since \(\mathcal {D}_{n}\) is minimal, q must be reachable from 0 by some transformation s, that is, q = 0s. If 0st ∉ {0t, n − 1}, then 0t ≺ 0st by Remark 4. But then we have the chain 0 ≺ 0t ≺ 0st ≺ n − 1 of length 4, contradicting our assumption. Hence we must have either 0st = 0t, or 0st = n − 1. For a fixed 0t, a subset of the states in \(Q_{n} \setminus \{0, n-1\}\) can be mapped to 0t and the remaining states in \(Q_{n} \setminus \{0, n-1\}\) to n − 1, thus giving \(2^{n-2}\) transformations. Since there are n − 2 possibilities for 0t, we obtain the second part of the bound. Finally, all states can be mapped to n − 1.

By Lemma 6 all of the above-mentioned transformations are in W 2i. □

Lemma 8 (Two-Sided Ideals, Factor-Closed Languages)

If L is a two-sided ideal or a factor-closed language with n ≥ 2 quotients, then its syntactic complexity is less than or equal to \(n^{n-2} + (n-2)2^{n-2} + 1\).

Proof

It suffices to prove the result for two-sided ideals, since factor-closed languages are their complements.

As we did for left ideals, we show that \(|T_{n}| \le |W_{\mathrm{2i}}|\), by constructing an injective function \(f : T_{n} \to W_{\mathrm{2i}}\).

We have q ⪯ n − 1 for all \(q \in Q_{n}\), and n − 1 is a fixed point of every transformation in \(T_{n}\) and \(W_{\mathrm{2i}}\).

For a transformation \(t \in T_{n}\), consider the cases shown in Fig. 7.

Fig. 7 Map of the cases in the proof of Lemma 8. The transitions of t are represented by solid lines, and the modified transitions of s by dashed red lines

We now prove the lemma for each of these cases.

  • Case 1: \(t \in W_{\mathrm{2i}}\).

The proof is the same as that of Case 1 of Lemma 5.

  • Case 2: \(t \notin W_{\mathrm{2i}}\), and \(0t^{2} \ne 0t\).

Let \(0t = p \prec \dots \prec pt^{k} = pt^{k+1}\) be the chain defined from p.

  • (a): \(pt^{k} \ne n-1\).

The proof is the same as that of Case 2 of Lemma 5.

  • (b): \(pt^{k} = n-1\) and k ≥ 2.

Let f(t) = s, where s is the transformation shown in Fig. 7, Case 2(b), and defined by

$$0 s = 0, \quad p t^{i} s = p t^{i-1} \text{ for }1 \le i \le k-1 , \quad p s = n-1,$$
$$q s = q t \text{ for the other states } q\in Q_{n}.$$

By Lemma 6, \(s \in W_{\mathrm{2i}}\). We have pt ≻ p, pts = p, and ps = n − 1. By Proposition 4, pts ⪰ ps, that is, p ⪰ n − 1, which contradicts the fact that k ≥ 2 (so 0t = p ≠ n − 1), and q ⪯ n − 1 for all \(q \in Q_{n}\). Thus s is not in \(T_{n}\), and so it is different from the transformations of Case 1.

Observe that s does not have a cycle with states strictly ordered by ≺, since no state from \(\{0, p, pt, \dots, pt^{k-1}\}\) can be in a cycle, and t cannot have such a cycle with ordered states by Proposition 4. Hence s is different from the transformations of Case 2(a).

In s, there is a unique state q such that qs = n − 1 and for which there exists a state r such that r ≻ q and rs = q, and this state q must be p. Indeed, if q ≠ p, then qt = qs = n − 1 by the definition of s. From r ≻ q, we have rt ⪰ qt = n − 1; hence rs = rt = n − 1 and rs ≠ q – a contradiction. Hence q = p.

By a similar argument, we show that there exists a unique state q such that q ≻ p, and qs = p, and that this state q must be pt. If q ≠ pt then qs = qt. But q ≻ qt and \(p = qt \succeq qt^{2} = pt\) contradicts that p ≺ pt. Continuing in this way for \(pt^{2}, \dots, pt^{k-1}\) we show that there is a unique chain \(pt^{k-1} \overset {s }{\rightarrow } {\dots } \overset {s }{\rightarrow } pt \overset {s }{\rightarrow } p\).

If t′ ≠ t is another transformation satisfying the conditions of this case, we define s′ like s. Now suppose that s = f(t) = f(t′) = s′. Since we have a unique state p such that ps = n − 1 for which there exists a state r such that r ≻ p and rs = p, we have 0t = 0t′ = p. Also the chain of states \(p, pt, pt^{2}, \dots, pt^{k-1}\) is unique in s and s′ as we have shown above; so \(pt^{i} = p(t')^{i}\) for i = 1, …, k − 1. Since the other states are mapped by s exactly as by t and t′, we have t = t′.

  • (c): pt = n − 1.

Let P = {0, p, n − 1}. We have n ≥ 4, as otherwise \(t \in W_{\mathrm{2i}}\) since it is a transformation of type 2 from Lemma 6. So there must be a state r ∉ P; let r be chosen arbitrarily. If p ≺ r for all r ∉ P, then n − 1 = pt ⪯ rt; hence rt = n − 1 for all such r, and qt ∈ {p, n − 1} for all \(q \in Q_{n}\). By Lemma 6, there is a transformation in \(W_{\mathrm{2i}}\) that maps S ∪ {n − 1} to n − 1, and \(Q_{n} \setminus (S \cup \{n-1\})\) to p for any S ⊆ {1, …, n − 2}. Thus \(t \in W_{\mathrm{2i}}\) – a contradiction.

In view of the above, there must exist a state r ∉ P such that p ⪯̸ r. By Remark 4, we have p ⪯ rt and of course rt ⪯ n − 1. If rt is p or n − 1 for all r ∉ P, we again have the situation described above, showing that \(t \in W_{\mathrm{2i}}\). Hence there must exist an r ∉ P such that p ⪯̸ r and p ≺ rt ≺ n − 1.

Also we claim that t does not have a cycle. Indeed, if p ⪯ q, then q is mapped to n − 1; if p ⪯̸ q, then q is mapped to a state qt ⪰ p and again q cannot be in a cycle since the chain starting with q ends in n − 1.

Let f(t) = s, where s is the transformation shown in Fig. 7, Case 2(c), and defined by

$$0 s = 0, \quad p s = r t, \quad (r t) s = p, \quad r s = 0,$$
$$q s = q t \text{ for the other states } q\in Q_{n}.$$

Since s fixes both 0 and n − 1, it is in \(W_{\mathrm{2i}}\) by Lemma 6. But s is not in \(T_{n}\), as we have the cycle (p, rt) with p ≺ rt, which would contradict Proposition 4. So s is different from the transformations of Case 1. Since s maps a state other than 0 to 0, it is different from the transformations of Cases 2(a) and 2(b).

Observe that t does not map any state to 0; otherwise, if qt = 0 for some q, then 0 ≺ q implies p = 0t ⪯ qt = 0 by Proposition 4, which contradicts that 0 ≺ p from Remark 4. Consequently, in s there is the unique state r ≠ 0 mapped to 0. Also, as t does not contain a cycle, the only cycle in s must be (p, rt).

If t′ ≠ t is another transformation satisfying the conditions of this case, we define s′ like s. Now suppose that s = f(t) = f(t′) = s′. Because both s and s′ have the unique non-fixed point r mapped to 0, r = r′. Also s and s′ contain the unique cycle (p, rt), p ≺ rt. Thus p = p′, pt = p′t′ = n − 1 and rt = r′t′. It follows that 0t = 0t′ = p. Because p ≺ rt = r′t′, we have (rt)t = (r′t′)t′ = n − 1. The other states are mapped by s exactly as by t and t′, and so t = t′.

  • Case 3: \(t \notin W_{\mathrm{2i}}\), 0t = p ≠ 0, and pt = p.

    • (a): t has a cycle.

The case is analogous to that of Case 3(a) in Lemma 5.

Since t has a cycle, take a state r from the cycle; then r and rt are not comparable under ⪯ by Proposition 4, and p ≺ r by Remark 5. Let f(t) = s, where s is the transformation shown in Fig. 7 and defined by

$$0 s = 0, \quad p s = r, \quad q s = q t \text{ for the other states } q \in Q_{n}.$$

The proof that s is different from the transformations of Cases 1 and 2(a), and that there is no t′ ≠ t fitting in this case and yielding the same s, is the same as in Lemma 5.

In s there is the state p with the property that p ≺ ps but ps and \(ps^{2}\) are not comparable under ⪯. Consider a transformation t′ that fits in Case 2(b), and let s′ = f(t′) and p′ = 0t′. Then in s′ every state \(q = p'(t')^{i}\) for 0 ≤ i ≤ k − 1, and q = 0, is such that qs′ is comparable with \(q(s')^{2}\) under ⪯. So if there is such a state in s′, it must be also present in \(t' \in T_{n}\). But then q ≺ qt′ implies \(qt' \preceq q(t')^{2}\) by Proposition 4, so this is not possible. Thus s ≠ s′.

For a distinction from the transformations of Case 2(c) observe that s does not map to 0 any state other than 0.

  • (b): t has no cycles and has a fixed point r ∉ {p, n − 1}.

The case is analogous to that of Case 3(b) in Lemma 5.

Because 0 ≺ r by Remark 4, 0t ⪯ rt by Proposition 4. Since r is a fixed point of t, then p = 0t ⪯ rt = r. Since r ≠ p, we have p ≺ r. Let f(t) = s, where s is the transformation shown in Fig. 7 and defined by

$$0s=0, \quad q s = 0 \text{ for each fixed point } q\notin \{0,p,n-1\},$$
$$q s = q t \text{ for the other states } q\in Q_{n}.$$

The proof that s is different from the transformations of Cases 1, 2(a) and 3(a), and that there is no t′ ≠ t fitting in this case and yielding the same s, is the same as in Lemma 5.

Since s maps to 0 a state other than 0, this case is distinct from Case 2(b). Because t does not have a cycle, and no state q mapped to 0 can be in a cycle in s, it follows that s does not have a cycle. Thus s is different from the transformations of Case 2(c).

  • (c): t has neither a cycle nor a fixed point r ∉ {p, n − 1}, and has a state r ≻ p mapped to p.

The case is analogous to that of Case 3(c) in Lemma 5.

Let f(t) = s, where s is the transformation shown in Fig. 7 and defined by

$$0 s = 0, \hspace{.2cm} p s = r, \hspace{.2cm} q s = 0 \text{ for each}\, q \succ p\, \text{such that } q t = p,$$
$$\hspace{.2cm} q s = q t \text{ for the other states } q \in Q_{n}.$$

The proof that s is different from the transformations of Cases 1, 2(a), 3(a) and 3(b), and that there is no t′ ≠ t fitting in this case and yielding the same s, is the same as in Lemma 5.

Since s maps to 0 a state other than 0, this case is distinct from Case 2(b). In s, 0 cannot be in a cycle, no state q ≻ p mapped to 0 can be in a cycle and p cannot be in a cycle as ps = r and rs = 0. Since the other states are mapped as in t, s does not have a cycle. Thus s is different from the transformations of Case 2(c).

  • (d): t has no cycles, no fixed point r ∉ {p, n − 1}, and no state r ≻ p mapped to p, and has a state r such that p ≺ r ≺ n − 1 that is mapped to n − 1.

Let f(t) = s, where s is the transformation shown in Fig. 7, Case 3(d), and defined by

$$0 s = 0, \quad q s = q \text{ for states } q \text{ such that } q t = n-1, \quad p s = n-1$$
$$qs = qt \text{ for the other states } q\in Q_{n}.$$

By Lemma 6, \(s \in W_{\mathrm{2i}}\). However, s is not in \(T_{n}\), as we have a fixed point r such that p ≺ r ≺ n − 1 and ps = n − 1. So Proposition 4 yields n − 1 = ps ⪯ rs = r – a contradiction. Thus s is different from the transformations of Case 1.

Transformation s does not have any cycles, as t does not have one in this case and fixed points q and p cannot be in a cycle. So s is different from the transformations of Cases 2(a) and 3(a). Also, since p is the unique state mapped to n − 1 and there is no state r ≻ p mapped to p, s is different from the transformations of Case 2(b). For a distinction from the transformations of Cases 2(c), 3(b) and 3(c), observe that s does not map to 0 any state other than 0.

If t′ ≠ t is another transformation satisfying the conditions of this case, we define s′ like s. Now suppose that s = f(t) = f(t′) = s′. Observe that t does not have a fixed point other than p and n − 1. So for every fixed point q ∉ {0, n − 1} of s we have qt = qt′ = n − 1. Also, since p is the unique state mapped to n − 1 in s, 0t = 0t′ = p and pt = pt′ = p. The other states are mapped by s as by t and t′; so t = t′.

  • All cases are covered:

We need to ensure that any transformation t fits in at least one case. It is clear that t fits in Case 1 or 2 or 3. Any transformation from Case 2 fits in Case 2(a) or 2(b) or 2(c). For Case 3, it is sufficient to show that if (i) \(t \notin W_{\mathrm{2i}}\) does not contain a fixed point r ∉ {p, n − 1}, and (ii) there is no state r, p ≺ r ≺ n − 1, mapped to p or n − 1, then t has a cycle.

If there is no state r such that p ≺ r ≺ n − 1, then qt ∈ {p, n − 1} for all \(q \in Q_{n}\), since qt ⪰ p. By the proof of Lemma 6, for any \(S \subseteq Q_{n} \setminus \{n-1\}\), \(W_{\mathrm{2i}}\) contains all transformations that map S ∪ {n − 1} to n − 1, and the other states \(Q_{n} \setminus (S \cup \{n-1\})\) to any state from \(Q_{n}\); thus \(t \in W_{\mathrm{2i}}\) – a contradiction.

So let t be a transformation that fits in Case 3 and satisfies (i) and (ii), and let r be some state such that p ≺ r ≺ n − 1. Consider the sequence \(r, rt, rt^{2}, \dots\). By Remark 5, \(p \preceq rt^{i}\) for all i ≥ 0. If \(rt^{k} \in \{p, n-1\}\) for some k ≥ 1, let k be the smallest such number, then \(rt^{k-1} \notin \{p, n-1\}\); we have \(p \prec rt^{k-1} \prec n-1\) and \((rt^{k-1})t \in \{p, n-1\}\), contradicting (ii).

Since p and n − 1 are the only fixed points by (i), we have \(rt^{i} \ne rt^{i-1}\). Since there are finitely many states, \(rt^{i} = rt^{j}\) for some i and j such that 0 ≤ i < j − 1, and so the states \(rt^{i}, rt^{i+1}, \dots, rt^{j} = rt^{i}\) form a cycle. □

Theorem 5

If \(T_{n}\) has size \(n^{n-2} + (n-2)2^{n-2} + 1\), then \(T_{n} = W_{\mathrm{2i}}\).

Proof

The proof is very similar to that of Theorem 3.

Consider a maximal-length chain of states strictly ordered by ≺ in \(\mathcal {D}_{n}\). If its length is 3, then by Lemma 7 T n is a subsemigroup of W 2i. Thus only T n = W 2i reaches the bound.

If there is a chain of length 4, then there are states p and r such that 0 ≺ p ≺ r ≺ n − 1. Let \(R = \{q \in Q_{n} \setminus \{n-1\} \mid p \prec q\}\). To define s we use the transformation \(t = (Q_{n} \setminus \{n-1\} \to p)\) as an auxiliary transformation. Note that t fits in Case 3(c) in the proof of Lemma 8 except that \(t \in W_{\mathrm{2i}}\). We define s from t according to the rules of Case 3(c):

$$\begin{array}{@{}rcl@{}} 0s=0, \hspace{.2cm} ps=r, \hspace{.2cm} qs=0\, \text{ for each } q\in R,\\ qs = qt = p\, \text{ for the other states } q. \end{array} $$

By Lemma 6 (transformations of type 1), \(s \in W_{\mathrm{2i}}\).

Let f be the injective function from the proof of Lemma 8. It remains to be shown that there is no transformation \(t' \in T_{n}\) such that s = f(t′). The proof that s is different from the transformations f(t′) of Cases 1, 2(a), 2(b), 2(c), 3(a), and 3(b) is exactly the same as the corresponding proof in Case 3(c) following the definition of s. The proof that s is different from the transformations f(t′) with \(t' \in T_{n}\) in Case 3(c) is exactly the same as the corresponding proof in Theorem 3. It remains to show that there is no \(t' \in T_{n}\) in Case 3(d) such that s = f(t′). Indeed, f(t′) from Case 3(d) does not map any state to 0 other than 0, while we have rs = 0. So s is also different from these transformations.

Because f is injective, \(s \notin f(T_{n})\), \(s \in W_{\mathrm{2i}}\) and \(f(T_{n}) \subseteq W_{\mathrm{2i}}\), the bound \(n^{n-2} + (n-2)2^{n-2} + 1\) cannot be reached if the length of the maximal-length chains is not 3. □

Proposition 6

For \(n \ge 5\), the minimal number of generators of the transition semigroup \(W_{\mathrm{2i}}\) is 6.

Proof

The transition semigroup \(W_{\mathrm{2i}}\) contains transformations that map \(Q_{n-1}\) into \(Q_{n-1}\) and fix \(n-1\). Since every transformation in \(W_{\mathrm{2i}}\) fixes \(n-1\), these must be generated only by transformations of the same form, as otherwise a generated transformation would map a state from \(Q_{n-1}\) to \(n-1\). When restricted to \(Q_{n-1}\), these transformations form the largest transformation semigroup of a left ideal, \(W_{\mathrm{li}}\), with \(n-1 \ge 4\) states. So, by Remark 1 and Proposition 5, they require 5 generators. These generators do not map any state from \(Q_{n} \setminus \{n-1\}\) to \(n-1\); hence we need one more generator that maps a state from \(Q_{n-1}\) to \(n-1\). □
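The counting behind this restriction can be checked by brute force for small \(n\). The sketch below rests on an assumption that is not restated in this section: that \(W_{\mathrm{2i}}\) consists of three types of transformations of \(Q_{n} = \{0, \dots, n-1\}\), as described in the code comments. Under that assumption it compares the number of elements with \(n^{n-2} + (n-2)2^{n-2} + 1\), and the number of restrictions to \(Q_{n-1}\) with the left-ideal bound \(m^{m-1} + m - 1\) for \(m = n-1\) states. It checks sizes only, not the semigroup structure, and the function names are illustrative.

```python
from itertools import product


def w2i(n):
    """Enumerate the ASSUMED three types of transformations, as tuples of images."""
    last = n - 1
    elems = set()
    # Type 1 (assumed): fix 0 and n-1; the other states are mapped arbitrarily.
    for img in product(range(n), repeat=n - 2):
        elems.add((0,) + img + (last,))
    # Type 2 (assumed): 0 -> p with p not in {0, n-1}; each state other than
    # 0 and n-1 goes to p or to n-1; n-1 is fixed.
    for p in range(1, last):
        for img in product((p, last), repeat=n - 2):
            elems.add((p,) + img + (last,))
    # Type 3 (assumed): the constant transformation onto n-1.
    elems.add((last,) * n)
    return elems


def restricted(n):
    """Restrictions to {0, ..., n-2} of the elements that map this set into itself."""
    last = n - 1
    return {t[:last] for t in w2i(n) if last not in t[:last]}


for n in range(3, 7):
    m = n - 1
    size_ok = len(w2i(n)) == n ** (n - 2) + (n - 2) * 2 ** (n - 2) + 1
    restriction_ok = len(restricted(n)) == m ** (m - 1) + m - 1
    print(n, size_ok, restriction_ok)
```

Representing transformations as tuples keeps them hashable, so duplicates across the three types would be removed automatically by the set.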

We are now in a position to prove our main theorem of this section.

Theorem 6 (Two-Sided Ideals, Factor-Closed Languages)

Suppose that \(L \subseteq \Sigma^{*}\) and \(\kappa(L) = n > 1\). If \(L\) is a two-sided ideal or a factor-closed language, then \(\sigma(L) \le n^{n-2} + (n-2)2^{n-2} + 1\). This bound is tight for \(n = 2\) if \(|\Sigma| \ge 2\), for \(n = 3\) if \(|\Sigma| \ge 3\), for \(n = 4\) if \(|\Sigma| \ge 5\), and for \(n \ge 5\) if \(|\Sigma| \ge 6\). Moreover, the sizes of the alphabet cannot be reduced.

Proof

This follows from Lemmas 6 and 8. It is easy to verify that the size of the alphabet cannot be reduced if \(n \le 4\). For \(n \ge 5\), by Theorem 5 only languages \(L\) whose quotient automaton has transition semigroup isomorphic to \(W_{\mathrm{2i}}\) meet the bound, and by Proposition 6, the transition semigroup \(W_{\mathrm{2i}}\) requires 6 generators. □

7 Conclusions

We have found tight upper bounds on the syntactic complexity of right, left, and two-sided ideals. We have shown that in each of the three cases the maximal transition semigroup is unique.

In our proofs for left and two-sided ideals we exhibited an injective function from the transition semigroup of a minimal DFA of an arbitrary left or two-sided ideal language to the transition semigroup of the witness DFA attaining the upper bound for these languages. This approach is applicable to other subclasses of regular languages. For example, in [12] we have used this method to establish the upper bound for suffix-free languages.