# Canonical Nondeterministic Automata

## Abstract

For each regular language \(L\) we describe a family of canonical nondeterministic acceptors (nfas). Their construction follows a uniform recipe: build the minimal dfa for \(L\) in a locally finite variety \(\mathcal {V}\), and apply an equivalence between the finite \(\mathcal {V}\)-algebras and a category of finite structured sets and relations. By instantiating this to different varieties we recover three well-studied canonical nfas (the átomaton, the jiromaton and the minimal xor automaton) and obtain a new canonical nfa called the distromaton. We prove that each of these nfas is minimal relative to a suitable measure, and give conditions for state-minimality. Our approach is coalgebraic, exhibiting additional structure and universal properties.

## 1 Introduction

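Every regular language \(L\) is accepted by a unique minimal *deterministic* finite automaton (dfa): its states \(Q_L\) are the derivatives of \(L\), i.e., the languages \(w^{-1}L = \{v \in \varSigma ^* : wv \in L\}\) for \(w \in \varSigma ^*\).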

For *nondeterministic* finite automata (nfas) the situation is significantly more complex: a regular language may have many non-isomorphic state-minimal nfas, and generally there is no way to identify a “canonical” one among them. However, several authors have recently proposed nondeterministic acceptors that *are* in some sense canonical (though not necessarily state-minimal), e.g. the *átomaton* of Brzozowski and Tamm [8], the *jiromaton*^{1} of Denis, Lemay and Terlutte [10], and the *minimal xor automaton* of Vuillemin and Gama [17]. In each case, the respective nfa is formed by closing the set \(Q_L\) of derivatives under certain algebraic operations and taking a minimal set of generators as states. Specifically,

- 1.
the states of the átomaton are the atoms of the boolean algebra generated by \(Q_L\), obtained by closing \(Q_L\) under finite union, finite intersection and complement;

- 2.
the states of the jiromaton are the join-irreducibles of the join-semilattice generated by \(Q_L\), obtained by closing \(Q_L\) under finite union;

- 3.
the states of the minimal xor automaton form a basis for the \(\mathbb {Z}_2\)-vector space generated by \(Q_L\), obtained by closing \(Q_L\) under symmetric difference.

We consider *deterministic* automata interpreted in a locally finite variety \(\mathcal {V}\), where *locally finite* means that finitely generated algebras are finite. A *deterministic* \(\mathcal {V}\)-*automaton* is a coalgebra for the functor \(T_\varSigma =\mathbb {2}\times \mathsf {Id}^\varSigma \) on \(\mathcal {V}\), for a fixed two-element algebra \(\mathbb {2}\). In Sect. 2 we describe a Brzozowski-like construction that yields, for every regular language, the minimal deterministic finite \(\mathcal {V}\)-automaton accepting it. Next, for certain varieties \(\mathcal {V}\) of interest, we derive an equivalence between the full subcategory \(\mathcal {V}_f\) of finite algebras and a suitable category \(\overline{\mathcal {V}}\) of finite structured sets, whose morphisms are relations preserving the structure. In each case, the objects of \(\overline{\mathcal {V}}\) are “small” representations of their counterparts in \(\mathcal {V}_f\), based on specific generators of algebras in \(\mathcal {V}_f\). The equivalence \(\mathcal {V}_f\cong \overline{\mathcal {V}}\) then induces an equivalence between deterministic finite \(\mathcal {V}\)-automata and coalgebras in \(\overline{\mathcal {V}}\), which are *nondeterministic* automata.

This suggests a two-step procedure for constructing a canonical nfa for a given regular language \(L\): (i) form \(L\)’s minimal deterministic \(\mathcal {V}\)-automaton, and (ii) use the equivalence of \(\mathcal {V}_f\) and \(\overline{\mathcal {V}}\) to obtain an equivalent nfa. Applying this to different varieties \(\mathcal {V}\) yields the three canonical nfas mentioned above. For the átomaton one takes \(\mathcal {V}= \mathsf {BA}\) (boolean algebras). Then the minimal deterministic \(\mathsf {BA}\)-automaton for \(L\) arises from the minimal dfa by closing its states \(Q_L\) under boolean operations. The category \(\overline{\mathcal {V}}=\overline{\mathsf {BA}}\) is based on Stone duality: \(\overline{\mathsf {BA}}\) is the dual of the category of finite sets, so it has as objects all finite sets and as morphisms all converse-functional relations, and the equivalence functor \(\mathsf {BA}_f \rightarrow \overline{\mathsf {BA}}\) maps each finite boolean algebra to the set of its atoms. This equivalence applied to the minimal deterministic \(\mathsf {BA}\)-automaton for \(L\) gives precisely \(L\)’s átomaton. Similarly, by taking \(\mathcal {V}\) = join-semilattices and \(\mathcal {V}\) = vector spaces over \(\mathbb {Z}_2\) and describing a suitable equivalence \(\mathcal {V}_f\cong \overline{\mathcal {V}}\), we recover the jiromaton and the minimal xor automaton, respectively. Finally, for \(\mathcal {V}\) = distributive lattices we get a new canonical nfa called the *distromaton*, which bears a close resemblance to the universal automaton [14].

### *Example 1.1*

Consider the language \(L = (a + b)^* b (a + b)^n\) where \(n\in \omega \). Its minimal dfa has \(\ge 2^{n}\) states and its (A) átomaton, (X) minimal xor automaton, (J) jiromaton and (D) distromaton are the nfas with \(\le n+3\) states depicted below (see Sect. 3.3).

The minimal xor automaton accepts \(L\) by \(\mathbb {Z}_2\)-weighted acceptance, which is the usual acceptance in this case. It is a state-minimal nfa, as is the jiromaton. The state-minimality of the latter follows from a general result (Theorem 4.4).
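The \(\mathbb {Z}_2\)-weighted acceptance mentioned above can be sketched in a few lines. The sketch assumes the usual reading for xor automata (a word is accepted iff it has an odd number of accepting runs); the function name `xor_accepts`, the dictionary encoding of transitions and the toy automaton are illustrative choices, not notation from the paper.

```python
def xor_accepts(trans, initial, final, word):
    """Z_2-weighted acceptance: accept iff the number of accepting runs is odd.

    trans maps (state, letter) to a set of successor states (assumed encoding).
    """
    current = set(initial)
    for letter in word:
        nxt = set()
        for s in current:
            # Symmetric difference: a state reached an even number of times cancels.
            nxt ^= trans.get((s, letter), set())
        current = nxt
    return len(current & set(final)) % 2 == 1


# Hypothetical toy automaton: two a-successors of q0 merge again in q3.
XOR_TRANS = {
    ("q0", "a"): {"q1", "q2"},
    ("q1", "a"): {"q3"},
    ("q2", "a"): {"q3"},
}
```

With `q3` as the only final state, the word `aa` has two accepting runs, so it is rejected under xor acceptance although a classical nfa would accept it. When every word has at most one accepting run, the two notions of acceptance coincide.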

- (a)
all four canonical nfas can have exponentially fewer states than the minimal dfa;

- (b)
the minimal xor automaton and jiromaton have no more states than the minimal dfa;

- (c)
the átomaton and distromaton have the same number of states, although their structure can be very different.

We also give conditions for the *state-minimality* of the canonical nfas. That is, there exists a natural class of languages where *canonical* state-minimal nfas exist and can be computed relatively easily.

*Related work.* Our paper unifies the constructions of canonical nfas given in [8, 10, 17] from a coalgebraic perspective. Previously, several authors have studied coalgebraic methods for constructing minimal and canonical representatives of machines, including Adámek, Bonchi, Hülsbusch, König, Milius and Silva [1], Adámek, Milius, Moss and Sousa [2] and Bezhanishvili, Kupke and Panangaden [4]. Only the first of these three papers, however, treats the case of nondeterministic automata explicitly – in particular, there the átomaton is recovered as an instance of projecting coalgebras in a Kleisli category into a reflective subcategory. This approach is methodologically rather different from the present paper where a categorical equivalence (rather than a reflection) is the basis for the construction of nfas.

In [8] the authors propose a surprisingly simple algorithm for constructing the átomaton of a language \(L\): take the minimal dfa for \(L\)’s reversed language, and reverse this dfa. These steps form a fragment of a classical dfa minimization algorithm due to Brzozowski. Recently Bonchi, Bonsangue, Rutten and Silva [6] gave a (co-)algebraic explanation of this procedure, based on the classical duality between observability and reachability of dfas. We provide another explanation in Sect. 3.3.
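The reverse/determinize/reverse recipe can be sketched as follows. This is a minimal illustration, not the authors' implementation: the encoding of automata as `(trans, initial, final)` triples and all function names are assumptions. It relies on Brzozowski's classical result that determinizing the reverse of a reachable dfa yields the minimal dfa of the reversed language. The sink state of the complete dfa is kept, which is an assumption about the intended convention.

```python
def reverse(trans, initial, final):
    """Reverse all transitions and swap initial and final states."""
    rev = {}
    for (s, a), targets in trans.items():
        for t in targets:
            rev.setdefault((t, a), set()).add(s)
    return rev, set(final), set(initial)


def determinize(trans, initial, final, alphabet):
    """Reachable subset construction; the result is a complete dfa in nfa encoding."""
    start = frozenset(initial)
    states, todo, dtrans = {start}, [start], {}
    while todo:
        S = todo.pop()
        for a in alphabet:
            T = frozenset(t for s in S for t in trans.get((s, a), ()))
            dtrans[(S, a)] = {T}
            if T not in states:
                states.add(T)
                todo.append(T)
    return dtrans, {start}, {S for S in states if S & final}


def atomaton(trans, initial, final, alphabet):
    dfa = determinize(trans, set(initial), set(final), alphabet)  # reachable dfa for L
    rev_min = determinize(*reverse(*dfa), alphabet)               # minimal dfa for reversed L
    return reverse(*rev_min)                                      # reverse it again


def accepts(trans, initial, final, word):
    current = set(initial)
    for a in word:
        current = {t for s in current for t in trans.get((s, a), ())}
    return bool(current & set(final))


def states_of(trans, initial, final):
    found = set(initial) | set(final)
    for (s, _), targets in trans.items():
        found |= {s} | set(targets)
    return found


# Hypothetical input: an nfa for (a+b)* b (a+b), i.e. n = 1 in Example 1.1.
NFA = {("q0", "a"): {"q0"}, ("q0", "b"): {"q0", "q1"},
       ("q1", "a"): {"q2"}, ("q1", "b"): {"q2"}}
ATOM = atomaton(NFA, {"q0"}, {"q2"}, "ab")
```

For this input the resulting nfa has four states, in line with the bound \(n+3\) of Example 1.1 for \(n = 1\).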

A coalgebraic treatment of linear weighted automata (of which the xor automata considered here are a special case) appears in [5]; that paper also provides a procedure for computing the minimal linear weighted automaton.

Finally, our work is somewhat related to work on coalgebraic trace semantics [11]. However, while that work considers coalgebras whose carrier is the free algebra of a variety, we consider coalgebras whose carriers are arbitrary algebras from the given variety; this means we consider coalgebras over an Eilenberg-Moore category (cf. [7, 12]).

## 2 Deterministic Automata

We start with recalling the concept of a finite automaton. Throughout this paper let us fix a finite input alphabet \(\varSigma \).

### **Definition 2.1**

- (a)
A *nondeterministic finite automaton* (*nfa*) is a triple \(N = (Z,R_a,F)\) consisting of a finite set \(Z\) of states, transition relations \(R_a \subseteq Z \times Z\) for each \(a \in \varSigma \) and final states \(F \subseteq Z\). Morphisms of nfas are the usual bisimulations, i.e., relations that preserve and reflect transitions and final states. If \(N\) is equipped with initial states \(I \subseteq Z\) we write \(N=(Z,R_a,F,I)\). In this case, \(N\) accepts a language \(\mathcal {L}_N(I) \subseteq \varSigma ^*\) in the usual way.

- (b)
A *deterministic finite automaton* (*dfa*) is an nfa with a single initial state whose transition relations are functions.
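As a concrete reading of Definition 2.1, the following sketch encodes an nfa \((Z,R_a,F,I)\) and computes acceptance "in the usual way" by tracking the set of currently reachable states. The dictionary encoding, the names `nfa_accepts` and `is_dfa`, and the example automaton are illustrative assumptions.

```python
def nfa_accepts(trans, initial, final, word):
    """Accept iff some run from an initial state ends in a final state.

    trans maps (state, letter) to the set R_a[state] of successors.
    """
    current = set(initial)
    for letter in word:
        current = {t for s in current for t in trans.get((s, letter), set())}
    return bool(current & set(final))


def is_dfa(states, trans, initial, alphabet):
    """Definition 2.1(b): one initial state and functional transition relations."""
    return len(initial) == 1 and all(
        len(trans.get((s, a), set())) == 1 for s in states for a in alphabet
    )


# Hypothetical nfa over {a, b}: words whose second-to-last letter is b.
STATES = {"q0", "q1", "q2"}
TRANS = {
    ("q0", "a"): {"q0"},
    ("q0", "b"): {"q0", "q1"},
    ("q1", "a"): {"q2"},
    ("q1", "b"): {"q2"},
}
```

Here `nfa_accepts(TRANS, {"q0"}, {"q2"}, "ba")` holds, while the same call with `"ab"` does not; the automaton is not a dfa because `q0` has two `b`-successors.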

Although the goal of our paper is constructing canonical nondeterministic automata, we first consider deterministic ones from a coalgebraic perspective. Given an endofunctor \(T : \mathcal {V}\rightarrow \mathcal {V}\) of a category \(\mathcal {V}\), a \(T\)-*coalgebra* \((Q,\gamma )\) consists of a \(\mathcal {V}\)-object \(Q\) and a \(\mathcal {V}\)-morphism \(\gamma : Q \rightarrow TQ\). A *coalgebra homomorphism* into another coalgebra \(\gamma ' : Q' \rightarrow TQ'\) is a \(\mathcal {V}\)-morphism \(h : Q \rightarrow Q'\) such that \(Th \circ \gamma = \gamma ' \circ h\). This defines a category \(\mathsf {Coalg}(T)\). If it exists, its terminal object \(\nu T\) is called the *final* \(T\)-*coalgebra*.

### **Assumption 2.2**

From now on \(\mathcal {V}\) is a locally finite variety with a specified two-element algebra \(\mathbb {2}=\{0,1\}\). That is, \(\mathcal {V}\) is the category of algebras for some finitary signature and equations, its morphisms being the usual algebra homomorphisms. That \(\mathcal {V}\) is *locally finite* means its finitely generated algebras are finite, equivalently its finitely generated *free* algebras are finite.

### *Example 2.3*

- (a)
The category \(\mathsf {Set}_\star \) of pointed sets is a locally finite variety, given by the signature with a constant \(0\) and no equations. Let \(\mathbb {2}\in \mathsf {Set}_\star \) have point \(0\).

- (b)
The category \(\mathsf {BA}\) of boolean algebras is a locally finite variety: a boolean algebra on \(n\) generators has at most \(2^{2^n}\) elements. \(\mathbb {2}\) is the \(2\)-chain \(0 < 1\).

- (c)
The category \(\mathsf {Vect}(\mathbb {Z}_2)\) of vector spaces over the binary field \(\mathbb {Z}_2\) is a locally finite variety. Here \(\mathbb {2}= \mathbb {Z}_2\) as a one-dimensional vector space.

- (d)
The category \({\mathsf {JSL}}\) of (join-)semilattices with a least element \(0\) is locally finite: the finite powerset \(\mathcal {P}_\mathsf {f} X\) is the free semilattice on \(X\), so a semilattice on \(n\) generators has at most \(2^n\) elements. \(\mathbb {2}\) is the \(2\)-chain \(0 < 1\).

- (e)
The category \(\mathsf {DL}\) of distributive lattices with a least and largest element \(\bot \) and \(\top \) is locally finite. Again, \(\mathbb {2}\) is the \(2\)-chain \(0<1\).

### **Definition 2.4**

If \(Q\) is a join-semilattice then \(q \in Q\) is *join-irreducible* if (i) \(q \ne 0\) and (ii) \(q = r \vee r'\) implies \(q = r\) or \(q = r'\). The set of join-irreducibles is written \(J(Q) \subseteq Q\).
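Definition 2.4 can be checked mechanically on small examples. The sketch below assumes the semilattice is given as a finite family of sets that contains \(\emptyset \) and is closed under union, so that join is union and \(0 = \emptyset \); the function name is an illustrative choice.

```python
from itertools import combinations_with_replacement


def join_irreducibles(semilattice):
    """Return J(Q) for a finite family of frozensets closed under union.

    Assumes the bottom element 0 is the empty set and join is union.
    """
    elems = set(semilattice)
    result = set()
    for q in elems:
        if not q:  # condition (i): q must differ from 0
            continue
        # condition (ii): q = r v r' forces q = r or q = r'
        if all(q in (r, s)
               for r, s in combinations_with_replacement(elems, 2)
               if r | s == q):
            result.add(q)
    return result


# Hypothetical example semilattice on subsets of {1, 2, 3}.
Q = {frozenset(), frozenset({1}), frozenset({2}),
     frozenset({1, 2}), frozenset({1, 2, 3})}
J = join_irreducibles(Q)
```

In this example \(\{1,2\} = \{1\} \vee \{2\}\) is not join-irreducible, whereas \(\{1\}\), \(\{2\}\) and \(\{1,2,3\}\) are.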

### **Definition 2.5**

A \(T\)-coalgebra \((Q',\gamma ')\) is a *subcoalgebra* of \((Q,\gamma )\) if there exists an injective coalgebra homomorphism \(m : (Q',\gamma ') \rightarrowtail (Q,\gamma )\), and a *quotient coalgebra* of \((Q,\gamma )\) if there exists a surjective coalgebra homomorphism \(e : (Q,\gamma ) \twoheadrightarrow (Q',\gamma ')\).

### **Definition 2.6**

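A *deterministic* \(\mathcal {V}\)-*automaton* is a coalgebra for the functor
$$ T_\varSigma = \mathbb {2}\times \mathsf {Id}^\varSigma : \mathcal {V}\rightarrow \mathcal {V}. $$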

### *Remark 2.7*

Hence, by the universal property of the product, a deterministic \(\mathcal {V}\)-automaton \(Q\rightarrow \mathbb {2}\times Q^\varSigma \) is given by an algebra \(Q\) of states, a \(\mathcal {V}\)-morphism \(\gamma _\epsilon : Q \rightarrow \mathbb {2}\) defining final states via \(\gamma _\epsilon ^{-1}(\{1\})\) and, for each \(a \in \varSigma \), a \(\mathcal {V}\)-morphism \(\gamma _a : Q \rightarrow Q\) representing the \(a\)-transitions. In particular, deterministic \(\mathsf {Set}\)-automata are precisely the classical (possibly infinite) deterministic automata without initial states, shortly *da*’s.

### *Example 2.8*

- (a)
A deterministic \(\mathsf {Set}_\star \)-automaton is a da whose carrier is a pointed set and whose point is a non-final sink state; these are the partial automata of [16].

- (b)
A deterministic \(\mathsf {BA}\)-automaton is a da with a boolean algebra structure on the states \(Q\) such that (i) the final states form an ultrafilter, (ii) \(q \xrightarrow {a} q'\) and \(r \xrightarrow {a} r'\) implies \(q \vee r \xrightarrow {a} q' \vee r'\) and \(\lnot q \xrightarrow {a} \lnot q'\), and (iii) \(\bot \) is a non-final sink state.

- (c)
A deterministic \(\mathsf {Vect}(\mathbb {Z}_2)\)-automaton is a da with a \(\mathbb {Z}_2\)-vector space structure on the states \(Q\) such that (i) the final states \(F \subseteq Q\) satisfy \(0 \notin F\) and also \(q + r \in F\) iff either \(q \in F\) or \(r \in F\) but not both, (ii) \(q \xrightarrow {a} q'\) and \(r \xrightarrow {a} r'\) implies \(q + r \xrightarrow {a} q' + r'\), and (iii) \(0\) is a non-final sink state.

- (d)
A deterministic \({\mathsf {JSL}}\)-automaton is a da with a join-semilattice structure on the states \(Q\) such that (i) the final states form a prime filter, (ii) \(q \xrightarrow {a} q'\) and \(r \xrightarrow {a} r'\) implies \(q + r \xrightarrow {a} q' + r'\), and (iii) \(0\) is a non-final sink state. Recall that a *prime filter* is an upwards closed \(F \subseteq Q\) where \(0 \notin F\) and \(q + q' \in F\) iff \(q \in F\) or \(q' \in F\).

- (e)
A deterministic \(\mathsf {DL}\)-automaton is a da with a distributive lattice structure on the states \(Q\) such that (i) the final states form a prime filter, (ii) \(q \xrightarrow {a} q'\) and \(r \xrightarrow {a} r'\) implies \(q \vee r \xrightarrow {a} q' \vee r'\) and \(q \wedge r \xrightarrow {a} q' \wedge r'\), and (iii) \(\bot \) is a non-final sink state and \(\top \) is a final one.

### *Remark 2.9*

For finitary endofunctors \(T\), Milius [15] introduced the concept of a locally finitely presentable coalgebra: it is a filtered colimit of coalgebras carried by finitely presentable objects. In the present context the finitely presentable objects are precisely the finite algebras in \(\mathcal {V}\), so we speak about *locally finite coalgebras*. A \(T_\varSigma \)-coalgebra is locally finite iff from each state only finitely many states are reachable by transitions.

### *Remark 2.10*

- 1.
The final \(T_\varSigma \)-coalgebra in \(\mathsf {Set}\) is \(\nu T_\varSigma =\mathcal {P}\varSigma ^*\), the set of formal languages over \(\varSigma \), with transitions \(L \xrightarrow {a} a^{-1} L\) for \(a \in \varSigma \) and final states precisely those languages containing \(\epsilon \). Importantly, \(\nu T_\varSigma \) arises as the \(\omega ^{op}\)-limit of \(T_\varSigma \)’s terminal sequence \((T_\varSigma ^n 1)_{n<\omega }\), see [3]. Since for any variety \(\mathcal {V}\) the forgetful functor from \(\mathcal {V}\) to \(\mathsf {Set}\) creates limits, the final \(T_\varSigma \)-coalgebra \(\nu T_\varSigma \) in \(\mathcal {V}\) exists and lifts the one in \(\mathsf {Set}\), so \(\nu T_\varSigma \) has underlying set \(\mathcal {P}\varSigma ^*\) and the transitions and final states are as above.

- 2.
The *final locally finite* \(T_\varSigma \)-coalgebra is denoted by \(\rho T_\varSigma \). In \(\mathcal {V}= \mathsf {Set}\) this is the sub-da of \(\nu T_\varSigma =\mathcal {P}\varSigma ^*\) given by the set of all regular languages over \(\varSigma \). This generalizes to any locally finite variety \(\mathcal {V}\): \(\rho T_\varSigma \) is a subcoalgebra of \(\nu T_\varSigma \) and its underlying set is the set of regular languages.

### *Example 2.11*

- (a)
In \(\mathsf {Set}_\star \) the carrier of the final coalgebra \(\nu T_\varSigma \) has the constant \(\emptyset \), which \(\rho T_\varSigma \) inherits.

- (b)
In \(\mathsf {BA}\), \(\nu T_\varSigma \) has the usual set-theoretic boolean algebra structure. The principal filter \(\mathord {\uparrow } \epsilon \) is an ultrafilter and the transition maps \(L \mapsto a^{-1} L\) are boolean morphisms.

- (c)
In \(\mathsf {Vect}(\mathbb {Z}_2)\) the vector space structure on \(\nu T_\varSigma \) and \(\rho T_\varSigma \) is given by symmetric difference and \(\emptyset \) is the zero vector.

- (d)
In \({\mathsf {JSL}}\) the join-semilattice structure on \(\nu T_\varSigma \) is given by union and \(\emptyset \). The final states form a one-generated upset \(\mathord {\uparrow } \epsilon \) which is a prime filter because the language \(\{\epsilon \}\) is join-irreducible in \(\nu T_\varSigma \). The transition maps are join-semilattice morphisms.

- (e)
In \(\mathsf {DL}\) we have the usual set-theoretic lattice structure on \(\nu T_\varSigma \). The final states form a prime filter and the transition maps are lattice morphisms.

### **Notation 2.12**

### **Definition 2.13**

A *pointed* \(T_\varSigma \)-*coalgebra* \((Q,\gamma ,q_0)\) is a \(T_\varSigma \)-coalgebra \((Q,\gamma )\) with a morphism \(q_0 : V \rightarrow Q\). The latter may be viewed as the initial state \(q_0 (g) \in Q\). The *language accepted by* \((Q,\gamma ,q_0)\) is \(\mathcal {L}_\gamma (q_0)\). We say that \((Q,\gamma ,q_0)\) is

- 1.
*reachable* if it is generated by \(q_0\), i.e., no proper subcoalgebra contains \(q_0\);

- 2.
*simple* if it has no proper quotients, i.e., for every quotient coalgebra \(e : (Q,\gamma ) \twoheadrightarrow (Q',\gamma ')\) the map \(e\) is bijective;

- 3.
*minimal* if it is reachable and simple.

### **Lemma 2.14**

\((Q,\gamma ,q_0)\) is reachable iff the algebra \(Q\) is generated by those \(q \in Q\) reachable from \(q_0\) by transitions. It is simple iff \(\mathcal {L}_\gamma \) is injective.

Brzozowski’s construction of the minimal dfa for a regular language (see Introduction) generalizes to deterministic \(\mathcal {V}\)-automata as follows:

### **Construction 2.15**

- 1.
\(Q_L\) is the subalgebra of \(\nu T_\varSigma =\mathcal {P}\varSigma ^*\) generated by all derivatives \(w^{-1}L\) (\(w\in ~\varSigma ^*\)).

- 2.
The transitions are \(K \xrightarrow {a} a^{-1} K\) for \(a \in \varSigma \) and \(K \in Q_L\).

- 3.
\(K \in Q_L\) is final iff \(\epsilon \in K\).

### **Lemma 2.16**

For every regular language \(L \subseteq \varSigma ^*\), \(A_\mathcal {V}^L\) is a well-defined finite pointed \(T_\varSigma \)-coalgebra.

### *Proof*

### *Example 2.17*

- (a)
In \(\mathsf {Set}_\star \), we have \(Q_L = \{\emptyset \} \cup \{ w^{-1} L : w \in \varSigma ^*\}\).

- (b)
In \(\mathsf {BA}\), \(Q_L\) is the closure of \(\{\emptyset \} \cup \{ w^{-1} L : w \in \varSigma ^*\}\) under union and complement.

- (c)
In \(\mathsf {Vect}(\mathbb {Z}_2)\), \(Q_L\) is the closure of \( \{ w^{-1} L : w \in \varSigma ^*\}\) under symmetric difference.

- (d)
In \({\mathsf {JSL}}\), \(Q_L\) is the closure of \(\{\emptyset \} \cup \{ w^{-1} L : w \in \varSigma ^*\}\) under union.

- (e)
In \(\mathsf {DL}\), \(Q_L\) is the closure of \(\{\emptyset ,\varSigma ^*\} \cup \{ w^{-1} L : w \in \varSigma ^*\}\) under union and intersection.
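To make Example 2.17 concrete, here is a small worked instance. It is not taken from the source: it assumes \(\varSigma = \{a,b\}\) and \(L = \varSigma ^* b\) (the case \(n = 0\) of Example 1.1), and writes \(L' = L \cup \{\epsilon \}\). The derivatives are

$$ a^{-1}L = L, \qquad b^{-1}L = L', \qquad a^{-1}L' = L, \qquad b^{-1}L' = L', $$

so \(L\) has exactly the two derivatives \(L\) and \(L'\). Closing them under the respective operations gives

$$ \begin{array}{ll} \mathsf {Set}_\star : & Q_L = \{\emptyset , L, L'\}, \\ \mathsf {BA}: & Q_L \text { has the atoms } L,\ \{\epsilon \},\ \varSigma ^* a \text { and hence } 8 \text { elements}, \\ \mathsf {Vect}(\mathbb {Z}_2): & Q_L = \{\emptyset , L, L', \{\epsilon \}\} \text { since } L \oplus L' = \{\epsilon \}, \\ {\mathsf {JSL}}: & Q_L = \{\emptyset , L, L'\}, \\ \mathsf {DL}: & Q_L = \{\emptyset , L, L', \varSigma ^*\}. \end{array} $$

Here \(\oplus \) denotes symmetric difference. In this instance the boolean closure is the largest of the five carriers, while the semilattice closure adds only \(\emptyset \) to the derivatives.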

### *Remark 2.18*

The category \(\mathsf {Coalg}(T_\varSigma )\) of \(T_\varSigma \)-coalgebras has a factorization system (surjective homomorphism, injective homomorphism) lifting the usual factorization system (surjective, injective) = (regular epi, mono) in \(\mathcal {V}\).

### **Construction 2.19**

**(see** [2]**).** These factorizations give a two-step minimization of any finite pointed \(T_\varSigma \)-coalgebra \((Q,\gamma ,q_0)\):

- 1.
Construct the reachable subcoalgebra \((R,\delta ) \hookrightarrow (Q,\gamma )\) generated by \(q_0\).

- 2.
Factorize the unique \(T_\varSigma \)-coalgebra homomorphism \(\mathcal {L}_\delta : (R,\delta ) \rightarrow (\rho T_\varSigma ,\gamma _\rho )\) as:
$$ (R,\delta ) \mathop {\twoheadrightarrow }\limits ^{s} (R',\delta ') \mathop {\hookrightarrow }\limits ^{m} (\rho T_\varSigma ,\gamma _\rho ) $$
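For the special case \(\mathcal {V}= \mathsf {Set}\) the two steps amount to classical dfa minimization, which the following sketch illustrates. It is only an illustration under that assumption: for a general variety the reachable part must also be closed under the algebra operations, which this code does not do. The quotient step uses Moore-style partition refinement as a stand-in for the factorization of \(\mathcal {L}_\delta \); all names are hypothetical.

```python
def minimize_dfa(delta, start, final, alphabet):
    """Two-step minimization of a complete dfa (the case V = Set).

    delta maps (state, letter) to a single successor state.
    """
    # Step 1: the subautomaton reachable from the initial state.
    reach, todo = {start}, [start]
    while todo:
        q = todo.pop()
        for a in alphabet:
            r = delta[(q, a)]
            if r not in reach:
                reach.add(r)
                todo.append(r)

    # Step 2: identify states accepting the same language (partition refinement).
    block = {q: int(q in final) for q in reach}
    while True:
        signature = {
            q: (block[q],) + tuple(block[delta[(q, a)]] for a in alphabet)
            for q in reach
        }
        ids = {sig: i for i, sig in enumerate(sorted(set(signature.values())))}
        stable = len(ids) == len(set(block.values()))
        block = {q: ids[signature[q]] for q in reach}
        if stable:
            break

    min_delta = {(block[q], a): block[delta[(q, a)]] for q in reach for a in alphabet}
    min_final = {block[q] for q in reach if q in final}
    return min_delta, block[start], min_final


# Hypothetical dfa for "words ending in b": state 2 duplicates 0, state 3 is unreachable.
DELTA = {(0, "a"): 2, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1,
         (2, "a"): 2, (2, "b"): 1, (3, "a"): 3, (3, "b"): 1}
MIN_DELTA, MIN_START, MIN_FINAL = minimize_dfa(DELTA, 0, {1}, "ab")
```

The refinement loop stops as soon as a round no longer increases the number of blocks; since each round only splits blocks, an unchanged count means the partition is stable.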

### **Theorem 2.20**

Let \(L \subseteq \varSigma ^*\) be a regular language. Then \(A_\mathcal {V}^L\) is (up to isomorphism) the unique minimal pointed \(\mathcal {V}\)-automaton accepting \(L\). It arises from any pointed finite \(\mathcal {V}\)-automaton \((Q,\gamma ,q_0)\) accepting \(L\) by Construction 2.19.

### *Proof*

Viewed as a da, \(A_\mathcal {V}^L\) is a subautomaton of the da \(\rho T_\varSigma \) of regular languages, so the state \(L\) accepts \(L\). It is reachable because every state is a \(\mathcal {V}\)-algebraic combination of those states reachable from \(L\) by transitions, i.e., \(L\)’s derivatives. It is simple because different states accept different languages, so it is minimal.

Now let \((Q,\gamma ,q_0)\) be any pointed \(T_\varSigma \)-coalgebra accepting \(L\) and \((R,\delta ,q_0)\) its reachable subautomaton, so every \(q' \in R\) arises as a \(\mathcal {V}\)-algebraic combination of those states reachable from \(q_0\) by transitions. Now \(\mathcal {L}_\delta : R \rightarrow \rho T_\varSigma \) is an automata morphism, so the languages of states reachable from \(q_0\) are precisely the derivatives of \(L\). Since \(\mathcal {L}_\delta \) is an algebra morphism its image is \(Q_L\). \(\square \)

## 3 From Deterministic to Nondeterministic Automata

We have seen that every regular language \(L\) has *many* canonical deterministic acceptors: one for each locally finite variety \(\mathcal {V}\) containing a two-element algebra \(\mathbb {2}\). However, this canonical acceptor \(A_\mathcal {V}^L\) is generally larger than the minimal dfa in \(\mathsf {Set}\) because one has to close under the \(\mathcal {V}\)-algebraic operations on the regular languages. In this section we will show how these larger *deterministic* machines induce smaller *nondeterministic* ones. Let us outline our approach:

- 1.
We restrict attention to finite da’s in \(\mathcal {V}\), i.e., \(T_\varSigma \)-coalgebras with finite carrier.

- 2.
For each of our varieties \(\mathcal {V}\) of interest, we describe an equivalence \(G\) of categories between the finite algebras \(\mathcal {V}_f\) and another category \(\overline{\mathcal {V}}\) where (i) \(\overline{\mathcal {V}}\)’s objects are “small” representations of their counterparts in \(\mathcal {V}_f\), and (ii) \(\overline{\mathcal {V}}\)’s morphisms are relations, not functions (see Lemmas 3.4, 3.8 and 3.10).

- 3.
From \(G\) we derive equivalences \(\mathbb {G}\) and \(\mathbb {G}_*\) between (pointed) deterministic finite \(\mathcal {V}\)-automata and (pointed) coalgebras in \(\overline{\mathcal {V}}\) which are *nondeterministic* finite automata, see Lemma 3.17.

- 4.
Applying this equivalence to the minimal deterministic \(\mathcal {V}\)-automaton \(A_\mathcal {V}^L\) gives a canonical nondeterministic acceptor for \(L\). This is illustrated in Sect. 3.3.

### 3.1 The Equivalence Between \(\mathcal {V}_f\) and \(\overline{\mathcal {V}}\)

For each of our varieties \(\mathcal {V}\) of interest there is a well-known description of the dual category of \(\mathcal {V}_f\): we have Stone duality (\(\mathsf {BA}_f \cong \mathsf {Set}_f^{op}\)), Priestley duality (\(\mathsf {DL}_f\cong \mathsf {Poset}_f^{op}\)), where \(\mathsf {Poset}_f\) is the category of finite posets and monotone functions, and the self-dualities \({\mathsf {JSL}}_f\cong {\mathsf {JSL}}_f^{op}\) and \(\mathsf {Vect}_f(\mathbb {Z}_2)\cong \mathsf {Vect}_f(\mathbb {Z}_2)^{op}\). We now describe each of these dually equivalent categories as a category \(\overline{\mathcal {V}}\) of finite structured sets and relations. The idea is to represent the finite algebras in \(\mathcal {V}\) in terms of a minimal set of generators.

### *Example 3.1*

- (a)
For any \(Q \in \mathsf {Set}_\star \) the subset \(Q \setminus \{0\}\) generates \(Q\); that means that we can always drop one element.

- (b)
Any finite boolean algebra \(Q \in \mathsf {BA}_f\) is generated by its atoms \(\mathsf {At}(Q)\), these being the join-irreducible elements.

- (c)
Any finite join-semilattice \(Q \in {\mathsf {JSL}}_f\) is generated by its join-irreducibles \(J(Q)\).

- (d)
A finite dimensional vector space \(Q\in \mathsf {Vect}_f(\mathbb {Z}_2)\) is generated by any basis \(B\subseteq Q\), although there is no canonical choice of a basis.

- (e)
Any finite distributive lattice \(Q \in \mathsf {DL}_f\) is generated by its join-irreducibles \(J(Q)\).
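Two of these generating sets are easy to compute when the algebra is given concretely as a family of subsets of a finite universe. The sketch below makes that assumption: `atoms` computes the atoms of the boolean algebra generated by some subsets, and `xor_basis` picks one basis of the \(\mathbb {Z}_2\)-span of sets encoded as integer bitmasks. Both names and encodings are illustrative, and the basis returned depends on the input order, reflecting the lack of a canonical choice.

```python
def atoms(generators, universe):
    """Atoms of the boolean algebra of subsets of `universe` generated by `generators`.

    Two points lie in the same atom iff they belong to exactly the same generators.
    """
    groups = {}
    for x in universe:
        signature = tuple(x in g for g in generators)
        groups.setdefault(signature, set()).add(x)
    return {frozenset(block) for block in groups.values()}


def xor_basis(vectors):
    """A basis of the Z_2-span of `vectors`, each vector being an int bitmask.

    Addition in the vector space is bitwise xor (symmetric difference of sets).
    """
    basis = []
    for v in vectors:
        for b in basis:
            # Xor with b exactly when that clears b's leading bit in v.
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return basis


# Hypothetical data: subsets of {0, 1, 2, 3} and three bitmasks with one dependency.
ATOMS = atoms([{0, 1}, {1, 2}], {0, 1, 2, 3})
BASIS = xor_basis([0b011, 0b101, 0b110])
```

In the second example `0b110` equals `0b011 ^ 0b101`, so it is discarded and the span has dimension 2.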

In the case of \({\mathsf {Set}_\star }_f\), \(\mathsf {BA}_f\) and \(\mathsf {Vect}_f(\mathbb {Z}_2)\) we can replace each algebra by a set of generators and each algebra morphism by a relation between these generators.

### **Definition 3.2**

### **Notation 3.3**

Given a basis \(GQ\) of a vector space \(Q\), for each basis vector \(z \in GQ\) denote by \(\pi _z : Q \rightarrow \{0,1\}\) the projection onto the \(z\)-coordinate.

### **Lemma 3.4**

- 1.
\(G : {\mathsf {Set}_\star }_f \rightarrow \mathsf {Par}_f\) defined by
$$ GQ = Q \setminus \{0\}, \qquad Gf(z) =\left\{ \begin{array}{ll} f(z) & \text {if } f(z) \ne 0, \\ \text {undefined} & \text {otherwise}. \end{array}\right. $$

- 2.
\(G : \mathsf {BA}_f \rightarrow \overline{\mathsf {BA}}\) where \(GQ = \mathsf {At}(Q)\) is the set of atoms and \(Gf = \{ (z,z') \in \mathsf {At}(Q) \times \mathsf {At}(Q') : z' \le _{Q'} f(z) \}\).

- 3.
\(G : \mathsf {Vect}_f(\mathbb {Z}_2) \rightarrow \overline{\mathsf {Vect}(\mathbb {Z}_2)}\) where \(GQ\) chooses a basis and \(Gf = \{ (z,z') \in GQ \times G Q' : \pi _{z'} \circ f(z) = 1 \}\).

Finite join-semilattices are represented using closure spaces:

### **Definition 3.5**

A *closure operator* (shortly, a *closure*) on a set \(X\) is a function \(\mathbf{cl}_X : \mathcal {P}X \rightarrow \mathcal {P}X\) such that for all \(S, S' \subseteq X\):
$$ S \subseteq \mathbf{cl}_X(S), \qquad S \subseteq S' \Rightarrow \mathbf{cl}_X(S) \subseteq \mathbf{cl}_X(S'), \qquad \mathbf{cl}_X(\mathbf{cl}_X(S)) = \mathbf{cl}_X(S). $$
A *closure space* \(X = (X,\mathbf{cl}_X)\) is a set with a closure defined on it. It is *finite* if \(X\) is finite, *strict* if \(\mathbf{cl}_X(\emptyset ) = \emptyset \), *separable* if \(x\ne x'\) implies \(\mathbf{cl}_X(x)\ne \mathbf{cl}_X(x')\), and *topological* if \(\mathbf{cl}_X(A\cup B)=\mathbf{cl}_X(A)\cup \mathbf{cl}_X(B)\) for all \(A,B\subseteq X\). A subset \(S \subseteq X\) is *closed* if \(\mathbf{cl}_X(S) = S\) and *open* if its complement is closed.

Finite posets are well-known to be equivalent to finite \(T_0\) topological spaces, which amount to finite separable topological closures. For finite join-semilattices we instead use *finite strict closures* i.e. we do not require separability or preservation of unions.

### *Example 3.6*

### **Definition 3.7**

A relation \(R \subseteq X \times Y\) between closure spaces \(X\) and \(Y\) is *continuous* if, for all \(x \in X\) and \(S \subseteq X\),

- 1.
\(R[x] \subseteq Y\) is closed, and

- 2.
if \(x \in \mathbf{cl}_X(S)\) then \(R[x] \subseteq \mathbf{cl}_Y (R[S])\).
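The two continuity conditions can be tested by brute force on small finite closure spaces. The sketch below assumes a relation is a set of pairs and a closure is a Python function on frozensets; `is_continuous`, `image` and the two example closures are hypothetical names introduced for illustration.

```python
from itertools import combinations


def powerset(xs):
    xs = list(xs)
    return (frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r))


def image(R, S):
    """R[S]: all y related to some x in S."""
    return frozenset(y for (x, y) in R if x in S)


def is_continuous(R, X, cl_X, cl_Y):
    """Check both conditions of continuity for R, a set of pairs in X x Y."""
    for x in X:
        Rx = image(R, {x})
        if cl_Y(Rx) != Rx:  # condition 1: R[x] is closed
            return False
        for S in powerset(X):  # condition 2: x in cl_X(S) implies R[x] within cl_Y(R[S])
            if x in cl_X(S) and not Rx <= cl_Y(image(R, S)):
                return False
    return True


POINTS = {1, 2, 3}


def cl_identity(S):
    """Identity closure: every subset is closed."""
    return frozenset(S)


def cl_flat(S):
    """Hypothetical strict closure: sets with at most one point are closed, larger ones close to everything."""
    return frozenset(S) if len(S) <= 1 else frozenset(POINTS)


IDENTITY = {(x, x) for x in POINTS}
```

With the identity closure on both sides every relation is continuous, whereas with `cl_flat` the relation `{(1, 2), (1, 3)}` fails the first condition because `{2, 3}` is not closed.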

The following equivalence was derived from a similar one due to Moshier [13].

### **Lemma 3.8**

### *Proof*

### **Definition 3.9**

- 1.
Each \(R[p] \subseteq Q\) is downclosed,

- 2.
If \(p \le _P p'\) then \(R[p] \subseteq R[p']\),

- 3.
\(R\) preserves all intersections of downclosed subsets.

### **Lemma 3.10**

### *Proof*

\(G\) is the restriction of the equivalence \({\mathsf {JSL}}_f\cong \overline{{\mathsf {JSL}}}\) described above. The closure spaces associated to distributive lattices are precisely the separable topological ones, so we can replace them by finite posets. This gives the first two conditions on morphisms, where closed means downwards closed. However, semilattice morphisms between distributive lattices need not preserve meets. This is captured by the third condition. \(\square \)

### 3.2 From Determinism to Nondeterminism

### **Lemma 3.11**

Given a \(\overline{T}_\varSigma \)-coalgebra \(\delta : Z \rightarrow \overline{T}_\varSigma Z = \mathbb {1}\times Z^\varSigma \) we write its component maps as \(\delta _\epsilon : Z \rightarrow \mathbb {1}\) and \(\delta _a : Z \rightarrow Z\) for \(a \in \varSigma \). Notice that these are *relations* rather than functions, so \(\overline{T}_\varSigma \)-coalgebras are *nondeterministic* automata.

### *Example 3.12*

- (a)
When \(\mathcal {V}= \mathsf {Set}_\star \) a \(\overline{T}_\varSigma \)-coalgebra \(\delta : X \rightarrow \overline{T}_\varSigma X\) consists of:

  - 1.
  A finite set \(X\).

  - 2.
  A partial function \(\delta _\epsilon : X \rightarrow \{1\}\) whose domain defines the final states.

  - 3.
  A partial function \(\delta _a : X \rightarrow X\) for each \(a \in \varSigma \), defining the transitions.

  Hence \(\overline{T}_\varSigma \)-coalgebras are *partial dfas*. The equivalence \(\mathbb {G}\) assigns to each deterministic \(\mathsf {Set}_\star \)-automaton \((Q,\gamma )\) the partial dfa \((Q \setminus \{0_Q \},\delta )\) whose final states are the given ones and \(q \xrightarrow {a} q'\) iff \(\gamma _a(q) = q' \ne 0_Q\).

- (b)
When \(\mathcal {V}= \mathsf {BA}\) a \(\overline{T}_\varSigma \)-coalgebra \(\delta : X \rightarrow \overline{T}_\varSigma X\) consists of:

  - 1.
  A finite set \(X\).

  - 2.
  A converse-functional relation \(\delta _\epsilon \subseteq X \times \{1\}\) whose domain defines a single final state.

  - 3.
  Converse-functional relations \(\delta _a \subseteq X \times X\) for \(a \in \varSigma \).

  Hence \(\overline{T}_\varSigma \)-coalgebras are *reverse-deterministic nfas*, i.e., reversing all transitions yields a dfa. The equivalence \(\mathbb {G}\) assigns to each deterministic \(\mathsf {BA}\)-automaton \((Q,\gamma )\) an nfa \((\mathsf {At}(Q),\delta )\) whose states are \(Q\)’s atoms. Moreover, its single final state is the unique atom generating the ultrafilter \(\gamma _\epsilon ^{-1} (\{1\})\) and \(z \xrightarrow {a} z'\) iff \(z' \le _Q \gamma _a(z)\).

- (c)
If \(\mathcal {V}= \mathsf {Vect}(\mathbb {Z}_2)\) then a \(\overline{T}_\varSigma \)-coalgebra \(\delta : X \rightarrow \overline{T}_\varSigma X\) consists of:

  - 1.
  A finite set \(X\).

  - 2.
  An arbitrary relation \(\delta _\epsilon \subseteq X \times \{1\}\), amounting to an arbitrary set of final states by taking the domain.

  - 3.
  Arbitrary relations \(\delta _a \subseteq X \times X\) for each \(a \in \varSigma \).

  Hence \(\overline{T}_\varSigma \)-coalgebras are classical nfas. The equivalence \(\mathbb {G}\) assigns to a deterministic \(\mathsf {Vect}(\mathbb {Z}_2)\)-automaton \((Q,\gamma )\) the nfa \((Z,\delta )\) for some chosen basis \(Z \subseteq Q\). The final states are \(Z \cap \gamma _\epsilon ^{-1}(\{1\})\) and \(z \xrightarrow {a} z'\) iff \(\pi _{z'} \circ \gamma _a(z) = 1\), cf. Notation 3.3.

- (d)
If \(\mathcal {V}= {\mathsf {JSL}}\) then a \(\overline{T}_\varSigma \)-coalgebra \(\delta : Z \rightarrow \overline{T}_\varSigma Z\) consists of:

  - 1.
  A finite strict closure space \(Z = (Z,\mathbf{cl}_Z)\).

  - 2.
  A continuous relation \(\delta _\epsilon \subseteq Z \times \{1\}\), equivalently \(\delta _\epsilon \)’s domain \(F \subseteq Z\) is an open set of final states.

  - 3.
  Continuous relations \(\delta _a \subseteq Z \times Z\).

  We call \(\overline{T}_\varSigma \)-coalgebras *nondeterministic closure automata*. The equivalence \(\mathbb {G}\) assigns to each deterministic \({\mathsf {JSL}}\)-automaton \((Q,\gamma )\) the nondeterministic closure automaton \(((J(Q),\mathbf{cl}_Q),\delta )\) whose states are \(Q\)’s join-irreducibles. The open set of final states is \(J(Q) \cap \gamma _\epsilon ^{-1} (\{1\})\) and \(z \xrightarrow {a} z'\) iff \(z' \le _Q \gamma _a(z)\). Note that every nfa can be turned into a nondeterministic closure automaton by endowing the states with the identity closure, so classical nfas form a proper subclass.

- (e)
If \(\mathcal {V}= \mathsf {DL}\) then a \(\overline{T}_\varSigma \)-coalgebra \(\delta : P \rightarrow \overline{T}_\varSigma P\) consists of:

  - 1.
  A finite poset \(P\).

  - 2.
  A non-empty relation \(\delta _\epsilon \subseteq P \times \{1\}\) whose domain is a filter (i.e., a down-directed upset), these being the final states.

  - 3.
  Transition relations \(\delta _a \subseteq P \times P\) such that:

    - (i)
    \(\delta _a[p]\) is downclosed for each \(p \in P\).

    - (ii)
    \(p \le _P q\) implies \(\delta _a[p] \subseteq \delta _a[q]\).

    - (iii)
    \(\delta _a[\bigcap _I A_i] = \bigcap _I \delta _a[A_i]\) for downclosed \(A_i\).

  Note that reverse-deterministic nfas are the special case where \(P\) is discrete. An important non-discrete example is the *universal automaton* [14]; we recall it after Corollary 3.21. The equivalence \(\mathbb {G}\) assigns to each deterministic \(\mathsf {DL}\)-automaton \((Q,\gamma )\) the \(\overline{T}_\varSigma \)-coalgebra \((J(Q),\delta )\) where \(J(Q)\) is a subposet of \(Q\). The final states form the upwards closed set \(J(Q) \cap \gamma _\epsilon ^{-1} (\{1\})\) and \(z \xrightarrow {a} z'\) iff \(z' \le _Q \gamma _a(z)\).

### *Remark 3.13*

### 3.3 Canonical Nondeterministic Automata

So far we have seen equivalences between deterministic and nondeterministic automata without initial states. Next, for each of our five running examples \(\mathcal {V}=\) \(\mathsf {Set}_*\), \(\mathsf {BA}\), \(\mathsf {Vect}(\mathbb {Z}_2)\), \({\mathsf {JSL}}\), \(\mathsf {DL}\) we will extend \(\mathbb {G}: \mathsf {Coalg}(T_\varSigma ) \rightarrow \mathsf {Coalg}(\overline{T}_\varSigma )\) to an equivalence of pointed coalgebras.

### **Definition 3.14**

\(\mathsf {Coalg}_*(T_\varSigma )\) is the category whose objects are the pointed \(T_\varSigma \)-coalgebras and whose morphisms \(f : (Q,\gamma ,q_0) \rightarrow (Q',\gamma ',q'_0)\) are those \(T_\varSigma \)-coalgebra homomorphisms \(f : (Q,\gamma ) \rightarrow (Q',\gamma ')\) preserving initial states, i.e., \(f \circ q_0 = q'_0\).

Using the equivalence \(G : \mathcal {V}_f \rightarrow \overline{\mathcal {V}}\), a pointed \(\overline{T}_\varSigma \)-coalgebra is a \(\overline{T}_\varSigma \)-coalgebra \((Z,\delta )\) equipped with a \(\overline{\mathcal {V}}\)-morphism \(i : G V \rightarrow Z\). And pointed \(\overline{T}_\varSigma \)-coalgebra homomorphisms are those \(\overline{T}_\varSigma \)-coalgebra homomorphisms \(f\) from \((Z, \delta )\) to \((Z', \delta ')\) such that \(f \circ i = i'\). Just as a morphism \(q_0 : V \rightarrow Q\) corresponds to an initial state \(q_0 (g)\), it turns out that a morphism \(i : GV \rightarrow Z\) corresponds to a *set* of initial states \(I = i[g] \subseteq Z\), as one would expect for nfas.

### *Example 3.15*

- (a)
If \(\mathcal {V}= \mathsf {Set}_\star \) then \(V = \{0,g\}\) and \(G V = \{g\}\). Partial functions \(i : \{g\} \rightarrow Z\) are determined by their codomain \(I = i[g]\). Then \(I\) is either empty or any singleton subset.

- (b)
If \(\mathcal {V}= \mathsf {BA}\) then \(V = \{\bot ,g,\lnot g,\top \}\) and \(G V = \{g, \lnot g\}\). Given \(i \subseteq \{g, \lnot g\} \times Z\) then \(i[g]\), \(i[\lnot g]\) partition \(Z\) so \(i\) is determined by \(I = i[g]\). Then \(I\) is any subset of \(Z\).

- (c)
If \(\mathcal {V}= \mathsf {Vect}(\mathbb {Z}_2)\) then \(V = \{0,g\}\) and \(G V = \{g\}\), so the arbitrary relation \(i \subseteq \{g\} \times Z\) is determined by its codomain \(I = i[g]\). Then \(I\) is any subset of \(Z\).

- (d)
If \(\mathcal {V}= {\mathsf {JSL}}\) then \(V = \{0,g\}\) and \(G V = \{g\}\) with closure \(\mathsf {id}_{\mathcal {P}\{g\}}\). The relation \(i \subseteq \{g\} \times Z\) is determined by \(I = i[g]\). By continuity \(I \subseteq Z\) is any closed subset.

- (e)
If \(\mathcal {V}= \mathsf {DL}\) then \(V = \{\bot ,g,\top \}\) is a \(3\)-chain and \(G V = \{g,\top \}\) a \(2\)-chain. Given \(i \subseteq \{g, \top \} \times Z\) then \(i[g] \subseteq i[\top ]\) and \(i[\{g,\top \}] = Z\) implies \(i[\top ] = Z\), so \(i\) is determined by \(I = i[g]\). Then \(I\) is any downclosed subset of \(Z\).

By reinterpreting point preservation relative to \(I\) we can finally define the category of pointed \(\overline{T}_\varSigma \)-coalgebras.

### **Definition 3.16**

- 1.
If \(\mathcal {V}= \mathsf {Set}_\star \), \(\mathsf {BA}\) or \(\mathsf {DL}\) then \(I' = f[I]\).

- 2.
If \(\mathcal {V}= {\mathsf {JSL}}\) then \(I'\) is the closure of \(f[I]\).

- 3.
If \(\mathcal {V}= \mathsf {Vect}(\mathbb {Z}_2)\) then \(I' = \{ z' \in Z' : |I \cap \breve{f}[z']| \text{ is odd} \}\).

### **Lemma 3.17**

Let us spell out the equivalence \(\mathbb {G}_*\) for each of our varieties \(\mathcal {V}\). For the rest of this section fix a pointed \(T_\varSigma \)-coalgebra \(A=(Q,\gamma ,q_0)\) and a regular language \(L\subseteq \varSigma ^*\). We give an explicit description of the nfa \(\mathbb {G}_* A\) and, in particular, of the canonical nfa for \(L\) obtained by applying \(\mathbb {G}_*\) to \(A_\mathcal {V}^L\) from Construction 2.15.

(a) *The Minimal Partial Dfa*. If \(\mathcal {V}= \mathsf {Set}_\star \) then \(\mathbb {G}_* A\) is the partial dfa \((Q \setminus \{0_Q\},\delta ,I)\) that arises from \(A\) by deleting the state \(0_Q\) along with all in- and outgoing transitions. Hence the initial states are \(I = \{q_0\}\) if \(q_0 \ne 0_Q\) and \(I = \emptyset \) if \(q_0= 0_Q\). Clearly \(\mathbb {G}_* A\) (viewed as an nfa) accepts \(A\)’s language.

*minimal partial dfa* of \(L\). It has states

(b) *The Átomaton*. If \(\mathcal {V}= \mathsf {BA}\) then \(\mathbb {G}_* A\) is the nfa \((\mathsf {At}(Q),\delta ,I)\) with initial states \(I = \{ q \in \mathsf {At}(Q) : q \le _Q q_0 \}\). It accepts \(A\)’s language. In particular, \(\mathbb {G}_* (A_\mathsf {BA}^L)\) is called the *átomaton* of \(L\), see [8]. Its states

- 1.
Construct the minimal dfa for \(L\)’s reversed language.

- 2.
Construct its reversed nfa, i.e., flip initial and final states and reverse all transitions.

*dual* concepts (see Definition 2.13), a \(T'_\varSigma \)-coalgebra is minimal iff its image under \(H\) is minimal, implying the above description.
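The two construction steps listed above can be sketched in Python. This is a minimal sketch under stated assumptions, not the procedure of [8]: step 1 is carried out by determinizing the reversal of a reachable dfa for \(L\), which by Brzozowski's observation yields a minimal dfa for \(\mathsf {rev}(L)\). The names `reverse_determinize`, `reverse_nfa` and `nfa_accepts`, the dictionary encoding of transitions, and the four-state input dfa for \(L = (a + b)^* b (a + b)^n\) with \(n = 1\) are all hypothetical choices made for illustration.

```python
def reverse_determinize(states, alphabet, delta, initial, finals):
    """Subset construction applied to the reversal of a dfa.

    Returns a dfa (states, transitions, start, accepting) for the reversed
    language. Assumes every state of the input dfa is reachable.
    """
    start = frozenset(finals)
    seen, todo, trans = {start}, [start], {}
    while todo:
        S = todo.pop()
        for a in alphabet:
            # All a-predecessors of S in the original dfa.
            T = frozenset(p for p in states if delta[p, a] in S)
            trans[S, a] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    accepting = {S for S in seen if initial in S}
    return seen, trans, start, accepting


def reverse_nfa(trans, start, accepting):
    """Flip initial and final states and reverse every transition."""
    edges = {(T, a, S) for (S, a), T in trans.items()}
    return edges, set(accepting), {start}  # edges, initial states, final states


def nfa_accepts(edges, initial, finals, word):
    """Ordinary nondeterministic acceptance."""
    current = set(initial)
    for a in word:
        current = {t for (s, b, t) in edges if s in current and b == a}
    return bool(current & finals)


# Hypothetical input: a reachable dfa for L = (a+b)* b (a+b)^n with n = 1.
# A state "xy" records the last two letters read, padded with 'a'.
alphabet = "ab"
states = ["aa", "ab", "ba", "bb"]
delta = {(s, c): s[1] + c for s in states for c in alphabet}

rev_states, rev_trans, rev_start, rev_acc = reverse_determinize(
    states, alphabet, delta, "aa", {"ba", "bb"})
edges, initial, finals = reverse_nfa(rev_trans, rev_start, rev_acc)
```

For this input the resulting nfa has four states, which agrees with the \(n + 3\) atoms listed in Example 3.18 for \(n = 1\).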

### *Example 3.18*

- 1.
The átomaton for \(L = (a + b)^* b (a + b)^n\) in Example 1.1 arises by constructing the minimal dfa for the reversed language \(\mathsf {rev}(L)\) and taking the reverse nfa. Its atoms are \(\{(a+b)^* a (a + b)^n,L\} \cup \{(a + b)^j : 0 \le j \le n\}\).

- 2.
The átomaton can have exponentially many more states than the minimal dfa, e.g. for \(L = (a+b)^n b (a + b)^*\) it has at least \(2^{n}\) states.

(c) *The Minimal Xor Automaton*. If \(\mathcal {V}= \mathsf {Vect}(\mathbb {Z}_2)\) then \(\mathbb {G}_* A\) is the nfa \((Z,\delta ,I)\) where \(Z \subseteq Q\) is a basis and \(I = \{ z \in Z : \pi _z (q_0) = 1 \}\), see Notation 3.3. It accepts \(A\)’s language by \(\mathbb {Z}_2\)-*weighted* nondeterministic acceptance: a word \(w\in \varSigma ^*\) is accepted iff its number of accepting paths is odd (this differs from the usual acceptance condition of standard nondeterministic automata).
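The acceptance condition just described can be illustrated with a short Python sketch. The function name `xor_accepts`, the triple encoding of transitions and the two-state example automaton are assumptions made for illustration; the idea is to track, after each letter, the set of states reached by an odd number of paths.

```python
def xor_accepts(edges, initial, finals, word):
    """Z_2-weighted acceptance: accept iff the number of accepting paths is odd.

    `edges` is a list of triples (source, letter, target).
    """
    current = set(initial)  # states reached by an odd number of paths so far
    for a in word:
        nxt = set()
        for (s, b, t) in edges:
            if s in current and b == a:
                nxt ^= {t}  # toggle: two paths into the same state cancel
        current = nxt
    return len(current & set(finals)) % 2 == 1


# Hypothetical automaton: the word a^k has exactly k accepting paths from p to q.
example_edges = [("p", "a", "p"), ("p", "a", "q"), ("q", "a", "q")]
example_initial, example_finals = {"p"}, {"q"}
```

On this hypothetical automaton the word \(aa\) has two accepting paths and is therefore rejected, although an ordinary nfa with the same transitions would accept it.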

*minimal xor automaton* of \(L\), see [17]. Note that its construction depends on the choice of a basis, so the minimal xor automaton is only determined up to isomorphism in the category of pointed \(\overline{T}_\varSigma \)-coalgebras. We provide a new way to construct it:

- 1.
Construct \(L\)’s átomaton \((Z,R_a,F,I)\) and determine the collection \(C \subseteq \mathcal {P}Z\) of all subsets of \(Z\) which are reachable from \(I\).

- 2.
Find any minimal \(\mathcal {Q}\subseteq \mathcal {P}Z\) whose closure under set-theoretic symmetric difference equals \(C\)’s closure.

- 3.
Build the nfa \((\mathcal {Q},R'_a,\mathcal {Q}\cap F,I')\) where \(R'_a(y,y')\) iff \(\pi _{y'} (R_a[y]) = 1\) and \(I' = \{ y \in \mathcal {Q}: \pi _y (I) = 1 \}\).

*the minimal xor automaton is never larger than the minimal dfa of* \(L\), see [17].
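Step 2 of the construction above amounts to choosing a basis of the \(\mathbb {Z}_2\)-vector space spanned by \(C\). A minimal Python sketch follows, assuming subsets of \(Z\) are encoded as integer bitmasks so that symmetric difference is bitwise xor. The names `xor_basis` and `span` and the bit assignment are hypothetical, and the greedy choice below returns only one of many valid \(\mathcal {Q}\) (it always picks members of \(C\)).

```python
def xor_basis(masks):
    """Greedily pick members of `masks` that are independent over GF(2)."""
    pivots = {}  # leading bit position -> reduced vector with that leading bit
    chosen = []
    for m in masks:
        v = m
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v
                chosen.append(m)  # m is not in the span of the earlier picks
                break
            v ^= pivots[top]  # eliminate the leading bit and keep reducing
    return chosen


def span(masks):
    """Closure of `masks` under symmetric difference (including the empty set)."""
    out = {0}
    for m in masks:
        out |= {v ^ m for v in out}
    return out


# Hypothetical encoding for n = 1: bit 0 = x, bit 1 = z_0, bit 2 = z_1, bit 3 = z_2.
# C consists of the subsets that contain z_0 and do not contain x.
C = [0b0010, 0b0110, 0b1010, 0b1110]
Q = xor_basis(C)
```

With this encoding `Q` consists of the masks for \(\{z_0\}\), \(\{z_0,z_1\}\) and \(\{z_0,z_2\}\), and the singletons \(\{z_i\}\) span the same space.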

### *Example 3.19*

Take the átomaton of Example 1.1, with states \(Z = \{x\} \cup \{z_i : 0 \le i \le n + 1\}\) and reachable subsets \(C = \{ S \subseteq Z : x \notin S, z_0 \in S \}\). One can verify that (i) the closure of \(\mathcal {Q}= \{ \{z_i\} : 0 \le i \le n + 1\}\) under symmetric difference is the closure of \(C\) and (ii) \(\mathcal {Q}\) is minimal. The induced nfa is the minimal xor automaton of Example 1.1. Alternatively \(\mathcal {Q}= \{ \{z_0,z_i\} : 0 \le i \le n + 1 \} \subseteq C\) yields a different nfa.

(d) *The Jiromaton*. If \(\mathcal {V}= {\mathsf {JSL}}\) then \(\mathbb {G}_* A\) is the nondeterministic closure automaton \((J(Q),\delta ,I)\) with initial states \(I = \{ z \in J(Q) : z \le _Q q_0 \}\) where \(J(Q)\) is the closure space of Example 3.6. The underlying nfa (forgetting the closure) accepts \(A\)’s language. In particular, \(\mathbb {G}_* (A_{\mathsf {JSL}}^L)\)’s underlying nfa is called the *jiromaton* of \(L\), see [10]. Its states

*prime* derivatives. Therefore, the jiromaton has no more states than the minimal dfa. Its structure is analogous to the átomaton: \(K \in \mathcal {Q}_L\) is initial iff \(K \subseteq L\), final iff \(\epsilon \in K\) and \(K \xrightarrow {a} K'\) iff \(K' \subseteq a^{-1} K\).

An algorithm to construct the jiromaton from any nfa accepting \(L\) is given in [10].
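The description of the jiromaton in terms of prime derivatives can be sketched in Python. This is not the algorithm of [10]: the sketch starts from a minimal dfa, whose states stand for the derivatives of \(L\), and applies the three conditions stated above directly. The names `included`, `is_union`, `jiromaton` and `nfa_accepts`, and the input dfa for \(L = (a + b)^* b (a + b)^n\) with \(n = 1\), are hypothetical.

```python
def included(alphabet, delta, finals, p, q):
    """True iff the language of dfa state p is contained in that of state q."""
    seen, todo = {(p, q)}, [(p, q)]
    while todo:
        s, t = todo.pop()
        if s in finals and t not in finals:
            return False
        for a in alphabet:
            nxt = (delta[s, a], delta[t, a])
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True


def is_union(alphabet, delta, finals, q, parts):
    """True iff the language of q is the union of the languages of `parts`."""
    start = (q, frozenset(parts))
    seen, todo = {start}, [start]
    while todo:
        s, S = todo.pop()
        if (s in finals) != any(p in finals for p in S):
            return False
        for a in alphabet:
            nxt = (delta[s, a], frozenset(delta[p, a] for p in S))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True


def jiromaton(states, alphabet, delta, initial, finals):
    """Build the nfa on prime derivatives. Assumes the input dfa is minimal."""
    leq = lambda p, q: included(alphabet, delta, finals, p, q)
    # In a minimal dfa distinct states have distinct languages, so
    # p != q together with leq(p, q) means strict inclusion.
    primes = [q for q in states
              if not is_union(alphabet, delta, finals, q,
                              [p for p in states if p != q and leq(p, q)])]
    edges = {(q, a, p) for q in primes for a in alphabet
             for p in primes if leq(p, delta[q, a])}   # K' contained in a^{-1}K
    init = {p for p in primes if leq(p, initial)}      # K contained in L
    fin = {p for p in primes if p in finals}           # empty word in K
    return primes, edges, init, fin


def nfa_accepts(edges, initial, finals, word):
    current = set(initial)
    for a in word:
        current = {t for (s, b, t) in edges if s in current and b == a}
    return bool(current & finals)


# Hypothetical input: the minimal dfa for L = (a+b)* b (a+b)^n with n = 1.
alphabet = "ab"
states = ["aa", "ab", "ba", "bb"]
delta = {(s, c): s[1] + c for s in states for c in alphabet}
primes, edges, init, fin = jiromaton(states, alphabet, delta, "aa", {"ba", "bb"})
```

For this input the state `bb` is discarded because its language is the union of the languages of the other three states, leaving \(n + 2 = 3\) prime derivatives.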

### *Example 3.20*

In the jiromaton of Example 1.1, the state \(z_0\) accepts \(L\) and state \(z_i\) accepts \(L + (a + b)^{i-1}\) for each \(i > 0\). These are the prime derivatives of \(L\). The closure is defined by \(\mathbf{cl}_Z(\emptyset ) = \emptyset \) and \(\mathbf{cl}_Z(S) = \{z_0\} \cup S\) for \(S \ne \emptyset \). It is topological: the closed sets are the downsets of the poset where \(z_0 \le _Z z_i\) for all \(0 \le i \le n + 1\).

(e) *The Distromaton*. If \(\mathcal {V}= \mathsf {DL}\) then \(\mathbb {G}_* A=(J(Q),\delta ,I)\) with initial states \(I = \{ z \in J(Q) : z \le _Q q_0 \}\). Forgetting \(J(Q)\)’s poset structure, the underlying nfa accepts \(A\)’s language. We call \(\mathbb {G}_* (A_\mathsf {DL}^L)\) the *distromaton* of \(L\). Its states

- 1.
Take the minimal pointed dfa \((Z,\xrightarrow {a},z_0,F)\) for the reversed language \(\mathsf {rev}(L)\) where \(Z\) is ordered by language-inclusion.

- 2.
Build the pointed \(\overline{T}_\varSigma \)-coalgebra \((Z^{op},\delta ,F)\) with final states \(\downarrow _Z z_0\) and \(z' \in \delta _a[z]\) iff \(z' \xrightarrow {a} y \ge _Z z\).
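The two steps above can be sketched in Python, building only the underlying nfa and not the poset structure. The sketch assumes a minimal dfa for \(\mathsf {rev}(L)\) is already given and computes the order on its states by language inclusion. The names `included`, `distromaton` and `nfa_accepts`, and the four-state input dfa for \(\mathsf {rev}(L)\) with \(L = (a + b)^* b (a + b)^n\) and \(n = 1\), are hypothetical.

```python
def included(alphabet, delta, finals, p, q):
    """True iff the language of dfa state p is contained in that of state q."""
    seen, todo = {(p, q)}, [(p, q)]
    while todo:
        s, t = todo.pop()
        if s in finals and t not in finals:
            return False
        for a in alphabet:
            nxt = (delta[s, a], delta[t, a])
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True


def distromaton(states, alphabet, delta, initial, finals):
    """Underlying nfa for L, built from a dfa accepting rev(L)."""
    leq = lambda p, q: included(alphabet, delta, finals, p, q)
    # z' in delta_a[z] iff z' --a--> y in the dfa for some y above z.
    edges = {(z, a, z2) for z in states for a in alphabet
             for z2 in states if leq(z, delta[z2, a])}
    nfa_initial = set(finals)                           # the set F
    nfa_final = {z for z in states if leq(z, initial)}  # the downset of z_0
    return edges, nfa_initial, nfa_final


def nfa_accepts(edges, initial, finals, word):
    current = set(initial)
    for a in word:
        current = {t for (s, b, t) in edges if s in current and b == a}
    return bool(current & finals)


# Hypothetical input: the minimal dfa for rev(L) = (a+b) b (a+b)* (case n = 1).
alphabet = "ab"
states = ["s0", "s1", "all", "none"]
delta = {
    ("s0", "a"): "s1", ("s0", "b"): "s1",
    ("s1", "a"): "none", ("s1", "b"): "all",
    ("all", "a"): "all", ("all", "b"): "all",
    ("none", "a"): "none", ("none", "b"): "none",
}
edges, nfa_initial, nfa_final = distromaton(states, alphabet, delta, "s0", {"all"})
```

In this example the nfa keeps all four states of the dfa for \(\mathsf {rev}(L)\), and its final states are `s0` and `none`, the two states whose languages are contained in that of the dfa's initial state.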

### **Corollary 3.21**

\(L\)’s átomaton and distromaton have the same number of states, namely, the number of states of the minimal dfa for the reversed language \(\mathsf {rev}(L)\).

### *Example 3.22*

The distromaton in Example 1.1 has order \(z_0 \le _Z z_i\) and \(z_i \le _Z \top \) for all \(0 \le i \le n + 1\). We have the state \(\top \) because \(\varSigma ^*\) is not the union of non-empty intersections of \(L\)’s derivatives, see Example 3.20. It arises from the jiromaton by adding a final sink state, see Corollary 4.6.

*universal automaton* for \(L\) [14]. It is the nfa with states

## 4 State Minimality and Universal Properties

- 1.
We prove \(L\)’s jiromaton is minimal amongst all nondeterministic acceptors of \(L\) relative to a suitable measure (Sect. 4.1).

- 2.
We give a sufficient condition on \(L\) such that the jiromaton is state-minimal and the distromaton and átomaton have at most one more state (Sect. 4.2).

- 3.
We characterize each of our canonical nfas amongst subclasses of nondeterministic acceptors (Sect. 4.3).

### 4.1 The Jiromaton is Minimal

### **Theorem 4.1**

- (1)
\(\mathtt{acc}(J_L) \le \mathtt{acc}(N)\),

- (2) If additionally \(\mathtt{acc}(J_L) = \mathtt{acc}(N)\) then either:
- (a)
\(\left| J_L\right| < \left| N\right| \) or

- (b)
\(\left| J_L\right| = \left| N\right| \) and \(\mathtt{tr}(N) \le \mathtt{tr}(J_L)\).

### *Proof*

Since \(J_L\)’s individual states accept derivatives of \(L\), it follows that \(J_L\) accepts precisely the unions of derivatives of \(L\). Any nfa \(N\) accepting \(L\) accepts these languages, so \(\mathtt{acc}(J_L) \le \mathtt{acc}(N)\). Suppose \(\mathtt{acc}(J_L) = \mathtt{acc}(N)\), so \(N\) accepts precisely the unions of \(L\)’s derivatives. Then each prime derivative has a distinct state in \(N\) accepting it, as it cannot arise as the union of other derivatives, so \(\left| J_L\right| \le \left| N\right| \). Lastly, if \(\mathtt{acc}(J_L) = \mathtt{acc}(N)\) and \(\left| J_L\right| = \left| N\right| \) then there is a language-preserving bijection between \(N\)’s states and the set of prime derivatives \(P_L\), so assume \(N\)’s carrier is \(P_L\). Given \(K \xrightarrow {a} K'\) in \(N\) we must have \(K' \subseteq a^{-1} K\), so there is a corresponding transition in \(J_L\). Hence \(\mathtt{tr}(N) \le \mathtt{tr}(J_L)\) and (2) holds. Moreover, in case \(\mathtt{tr}(N)=\mathtt{tr}(J_L)\) the previous argument shows that \(N\) and \(J_L\) are isomorphic. Thus the conditions (1) and (2) determine \(J_L\) up to isomorphism. \(\square \)

### 4.2 Conditions for Canonical State-Minimality

In the following let \(d_L\) and \(n_L\) be the minimal number of states of a dfa (respectively nfa) accepting the regular language \(L\). For any state-minimal nfa \(N = (n_L,R_a,F)\) accepting \(L\) via \(I \subseteq n_L\), one can construct a simple pointed \(T_\varSigma \)-coalgebra \((\mathcal {Q},\gamma ',L)\) whose equivalent nondeterministic closure automaton is another state-minimal acceptor of \(L\). First view \(N\) as the \(T_\varSigma \)-coalgebra \((\mathcal {P}n_L,\gamma )\) via the subset construction. Factorizing the unique homomorphism \(\mathcal {L}_\gamma \) we obtain \((\mathcal {Q},\gamma ')\) where \(\mathcal {Q}\) is the semilattice of languages accepted by \(N\). Then \((\mathcal {Q},\gamma ')\) is equivalent to a nondeterministic closure automaton accepting \(L\). Since \(\mathcal {P}n_L \twoheadrightarrow \mathcal {Q}\) implies \(n_L=\left| J(\mathcal {P}n_L)\right| \ge \left| J(\mathcal {Q})\right| \), by forgetting the closure we obtain a state-minimal nfa accepting \(L\).

Hence instead of working with state-minimal nfas we may work with simple \(T_\varSigma \)-coalgebras which are *supercoalgebras* of \(A_{\mathsf {JSL}}^L\). This follows because \(A_{\mathsf {JSL}}^L\)’s carrier is the semilattice \(S_L\) of unions of \(L\)’s derivatives, which \(\mathcal {Q}\) necessarily contains. We now provide a condition ensuring that \(\left| J(S_L)\right| \) is the minimal size of an nfa accepting \(L\) and hence \(L\)’s jiromaton is *state-minimal*.

### **Definition 4.2**

A regular language \(L\) is *intersection-closed* if every binary intersection of \(L\)’s derivatives is a union of \(L\)’s derivatives.

### *Example 4.3*

- 1.
\(L = (a + b)^* b (a + b)^n\) where \(n \in \omega \) is intersection-closed.

- 2.
\(\emptyset \), \(\varSigma ^*\) and \(\{w\}\) for \(w \in \varSigma ^*\) are intersection-closed.

- 3.
Fix \(n \in \omega \), \(t \in \mathbb {R}\) and \(k_i \in \mathbb {R}\) (\(1 \le i \le n\)). Then the language \(L=\{w\in 2^n: \sum _ i k_iw_i \ge t\}\) (modeling the behaviour of an artificial neuron) is intersection-closed.

- 4.
Every linear subspace \(L\subseteq \mathbb {Z}_2^n\) (viewed as a language over the alphabet \(\{0,1\}\)) is intersection-closed.
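Definition 4.2 can be checked mechanically on a minimal dfa, whose states stand for the derivatives of \(L\). The following Python sketch is one way to do so, assuming the empty union counts as \(\emptyset \): for each pair of derivatives it compares their intersection with the union of all derivatives contained in it. The names `included` and `intersection_closed`, and both input automata, are hypothetical; the second input models \(\overline{\{aa\}}\) over the one-letter alphabet \(\{a\}\), which is an assumption about the alphabet.

```python
def included(alphabet, delta, finals, p, q):
    """True iff the language of dfa state p is contained in that of state q."""
    seen, todo = {(p, q)}, [(p, q)]
    while todo:
        s, t = todo.pop()
        if s in finals and t not in finals:
            return False
        for a in alphabet:
            nxt = (delta[s, a], delta[t, a])
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True


def intersection_closed(states, alphabet, delta, finals):
    """Check Definition 4.2 on a minimal dfa (states = derivatives of L)."""
    leq = lambda p, q: included(alphabet, delta, finals, p, q)
    for p in states:
        for q in states:
            # All derivatives contained in the intersection of p and q.
            parts = frozenset(r for r in states if leq(r, p) and leq(r, q))
            start = (p, q, parts)
            seen, todo = {start}, [start]
            while todo:
                s, t, S = todo.pop()
                in_meet = s in finals and t in finals
                if in_meet != any(r in finals for r in S):
                    return False  # some word separates the meet from the union
                for a in alphabet:
                    nxt = (delta[s, a], delta[t, a],
                           frozenset(delta[r, a] for r in S))
                    if nxt not in seen:
                        seen.add(nxt)
                        todo.append(nxt)
    return True


# Hypothetical input 1: minimal dfa for L = (a+b)* b (a+b)^n with n = 1.
states1 = ["aa", "ab", "ba", "bb"]
delta1 = {(s, c): s[1] + c for s in states1 for c in "ab"}
closed1 = intersection_closed(states1, "ab", delta1, {"ba", "bb"})

# Hypothetical input 2: minimal dfa for the complement of {aa} over {a}.
states2 = [0, 1, 2, 3]
delta2 = {(i, "a"): min(i + 1, 3) for i in states2}
closed2 = intersection_closed(states2, "a", delta2, {0, 1, 3})
```

The first call agrees with item 1 of Example 4.3, and the second with the language of Remark 4.5 when read over a one-letter alphabet.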

### **Theorem 4.4**

If \(L\) is intersection-closed then its jiromaton is state-minimal.

### *Proof*

By assumption the carrier \(S_L\) of \(A_{\mathsf {JSL}}^L\) is closed under both unions *and* non-empty intersections, so \(D = S_L \cup \{\varSigma ^*\}\) is a distributive lattice of languages. Let \(N\) be any state-minimal nfa accepting \(L\) via initial states \(I\), and \(S \subseteq \mathcal {P}\varSigma ^*\) be the semilattice of languages accepted by \(N\) (by varying \(I\)). The nfa \(N\) must at least accept \(L\)’s derivatives. Since \(S\) is closed under unions we have \(S_L \subseteq S\). By the surjective morphism \(\mathcal {P}n_L \twoheadrightarrow S\) it follows that \(\left| N\right| \ge \left| J(S)\right| \), so it suffices to prove that \(\left| J(S)\right| \ge \left| J(S_L)\right| \). Let \(S_* = S \cup \{ \varSigma ^* \}\) be the semilattice obtained by adding a top element if necessary. We have a \({\mathsf {JSL}}_f\)-morphism \(\iota : D \hookrightarrow S_*\). The meets in \(D\) are also meets in \(S_*\) so the same function defines a \({\mathsf {JSL}}_f\)-morphism \(\iota : D^{op} \hookrightarrow S^{op}_*\). By the self-duality of \({\mathsf {JSL}}_f\) we obtain a surjective morphism \(\iota ' : S_* \twoheadrightarrow D\), hence \(\left| J(S_*)\right| \ge \left| J(D)\right| \). If \(D = S_L\) then \(S_* = S\), so \(\left| J(S)\right| \ge \left| J(S_L)\right| \) and we are done. Otherwise \(\varSigma ^* \notin S_L\) and we now prove \(\varSigma ^* \notin S\). By state minimality \(N\) is reachable, so each state \(q\) accepts a subset of some \(L\)-derivative. Then if \(\varSigma ^* \in S\) we deduce \(\varSigma ^*\) is the union of \(L\)’s derivatives, so \(\varSigma ^* \in S_L\) – a contradiction. Consequently \(\left| J(D)\right| = 1 +\left| J(S_L)\right| \) and \(\left| J(S_*)\right| = 1 +\left| J(S)\right| \) hence \(\left| J(S)\right| \ge \left| J(S_L)\right| \) again. \(\square \)

### *Remark 4.5*

The converse of this theorem is generally false: the language \(L = \overline{\{aa\}}\) is not intersection-closed, but its jiromaton is state-minimal.

### **Corollary 4.6**

If \(L\) is intersection-closed then its átomaton and distromaton have at most one more state than the jiromaton.

### *Proof*

By the above proof the distromaton may only have an additional final sink state – otherwise it has the same transition structure. By Corollary 3.21 the átomaton has the same number of states. \(\square \)

By Corollary 3.21 we further deduce:

### **Corollary 4.7**

If \(L\subseteq \varSigma ^*\) is intersection-closed then any state-minimal nfa accepting \(L\) has (i) \(d_{\mathsf {rev}(L)}\) states if \(\varSigma ^*\) is a union of \(L\)’s derivatives and (ii) \(d_{\mathsf {rev}(L)} - 1\) otherwise.

### **Theorem 4.8**

If \(d_L = 2^{n_L}\) then the jiromaton of \(L\) is state-minimal.

### *Proof*

Let \(N = (n_L,R_a,F)\) be a state-minimal nfa accepting \(L\) via \(I \subseteq n_L\). View it as a pointed \(T_\varSigma \)-coalgebra \(A = (\mathcal {P}n_L,\gamma ,I)\) via the subset-construction. By assumption \(d_L=\left| \mathcal {P}n_L\right| \), so this is a state-minimal dfa accepting \(L\); in particular, it is a reachable pointed \(T_\varSigma \)-coalgebra. Then the surjective morphism \(A \twoheadrightarrow A_{\mathsf {JSL}}^L\) implies that \(A_{\mathsf {JSL}}^L\) has no more than \(n_L\) join-irreducibles, so the jiromaton is state-minimal. \(\square \)

### 4.3 Characterizing the Canonical Nfas

Although the canonical nfas are generally not state-minimal, they are state-minimal amongst certain subclasses of nfas.

### **Theorem 4.9**

The átomaton of a regular language \(L\) is state-minimal amongst all nfas accepting \(L\) whose accepted languages are closed under complement.

### *Proof*

Assume the weaker condition that an nfa \(N\) accepts every language in the boolean algebra \(\mathcal {B}\subseteq _\omega \mathcal {P}\varSigma ^*\) generated by \(L\)’s derivatives. By an earlier argument, \(N\) induces a simple \(T_\varSigma \)-coalgebra \((\mathcal {Q},\gamma )\) whose states are the languages \(N\) accepts and \(\left| N\right| \ge \left| J(\mathcal {Q})\right| \). By assumption \(\mathcal {Q}\supseteq \mathcal {B}\) (a distributive lattice), so \(|J(\mathcal {Q})| \ge |J(\mathcal {B})|\) by the proof of Theorem 4.4. The join-irreducibles of a finite boolean algebra are its atoms, so \(N\) has no fewer states than the átomaton. \(\square \)

The next result is from [17]. It follows because quotients and subspaces of finite-dimensional vector spaces cannot have larger dimension.

### **Theorem 4.10**

([17]). Any canonical xor nfa for \(L\) is state-minimal amongst nfas accepting \(L\) via \(\mathbb {Z}_2\)-weighted acceptance.

We give a mild generalization of a result in [10]. Recall that nfas accepting \(L\) also accept all unions of its derivatives. Then we can conclude from Theorem 4.1:

### **Corollary 4.11**

The jiromaton of a regular language \(L\) is state-minimal amongst nfas accepting precisely the unions of \(L\)’s derivatives.

### *Example 4.12*

Let \(N\) be an nfa accepting \(L\) via initial states \(I\). If every singleton set of states is reachable from \(I\) then \(N\) accepts precisely the unions of \(L\)’s derivatives. Thus, it is no smaller than \(L\)’s jiromaton.

### **Theorem 4.13**

The distromaton of a regular language \(L\) is state-minimal amongst all nfas accepting \(L\) whose accepted languages are closed under intersection.

### *Proof*

Reuse the proof of Theorem 4.9. Again we actually have a stronger result: the distromaton is state-minimal amongst all nfas which can accept every intersection of \(L\)’s derivatives. \(\square \)

## 5 Conclusions and Future Work

It is often claimed in the literature that canonical nondeterministic automata do not exist, usually as a counterpoint to the minimal dfa. On the contrary we have shown that they *do* exist and moreover arise from the minimal dfa interpreted in a locally finite variety. In so doing we have unified previous work from three sources [8, 10, 17] and introduced a new canonical nondeterministic acceptor, the distromaton. We also identified a class of languages where canonical *state-minimal* nfas exist. These results depend heavily on a coalgebraic approach to automata theory, providing not only new structural insights and construction methods but also a new perspective on what a state-minimal acceptor actually is.

In this paper we introduced nondeterministic closure automata, viz. \(\overline{T}_\varSigma \)-coalgebras in the category of closure spaces, mainly as a tool for constructing the jiromaton. However, nondeterministic closure automata bear interesting structural properties themselves, which we did not discuss here in depth. We expect that a proper investigation of these machines will lead to further insights about nondeterminism, in particular additional and more general criteria for the (state-)minimality of nfas.

Another point we aim to investigate in more detail is the algorithmic side of the state-minimization problem for nfas. Although this problem is known to be \(\mathsf {PSPACE}\)-complete in general, the canonicity of our nfas suggests that, at least for certain natural subclasses of nfas, efficient state-minimization procedures may be within reach. We leave the study of such complexity-related issues for future work.

## References

- 1. Adámek, J., Bonchi, F., Hülsbusch, M., König, B., Milius, S., Silva, A.: A coalgebraic perspective on minimization and determinization. In: Birkedal, L. (ed.) FOSSACS 2012. LNCS, vol. 7213, pp. 58–73. Springer, Heidelberg (2012)
- 2. Adámek, J., Milius, S., Moss, L.S., Sousa, L.: Well-pointed coalgebras (extended abstract). In: Birkedal, L. (ed.) FOSSACS 2012. LNCS, vol. 7213, pp. 89–103. Springer, Heidelberg (2012)
- 3. Barr, M.: Terminal coalgebras in well-founded set theory. Theor. Comput. Sci. **114**(2), 299–315 (1993)
- 4. Bezhanishvili, N., Kupke, C., Panangaden, P.: Minimization via duality. In: Ong, L., de Queiroz, R. (eds.) WoLLIC 2012. LNCS, vol. 7456, pp. 191–205. Springer, Heidelberg (2012)
- 5. Bonchi, F., Bonsangue, M.M., Boreale, M., Rutten, J.J.M.M., Silva, A.: A coalgebraic perspective on linear weighted automata. Inform. Comput. **211**, 77–105 (2012)
- 6. Bonchi, F., Bonsangue, M.M., Rutten, J.J.M.M., Silva, A.: Brzozowski’s algorithm (co)algebraically. In: Constable, R.L., Silva, A. (eds.) Logic and Program Semantics, Kozen Festschrift. LNCS, vol. 7230, pp. 12–23. Springer, Heidelberg (2012)
- 7. Bonsangue, M.M., Milius, S., Silva, A.: Sound and complete axiomatizations of coalgebraic language equivalence. ACM Trans. Comput. Log. **14**(1), 7:1–7:52 (2013)
- 8. Brzozowski, J., Tamm, H.: Theory of átomata. In: Mauri, G., Leporati, A. (eds.) DLT 2011. LNCS, vol. 6795, pp. 105–116. Springer, Heidelberg (2011)
- 9. Brzozowski, J.A.: Canonical regular expressions and minimal state graphs for definite events. Mathematical Theory of Automata. MRI Symposia Series, vol. 12, pp. 529–561. Polytechnic Press/Polytechnic Institute of Brooklyn, New York (1962)
- 10. Denis, F., Lemay, A., Terlutte, A.: Residual finite state automata. Fund. Inform. **XX**, 1–30 (2002)
- 11. Hasuo, I., Jacobs, B., Sokolova, A.: Generic trace semantics via coinduction. Log. Methods Comput. Sci. **3**(4:11), 1–36 (2007)
- 12. Jacobs, B., Silva, A., Sokolova, A.: Trace semantics via determinization. In: Pattinson, D., Schröder, L. (eds.) CMCS 2012. LNCS, vol. 7399, pp. 109–129. Springer, Heidelberg (2012)
- 13. Jipsen, P.: Categories of algebraic contexts equivalent to idempotent semirings and domain semirings. In: Kahl, W., Griffin, T.G. (eds.) RAMICS 2012. LNCS, vol. 7560, pp. 195–206. Springer, Heidelberg (2012)
- 14. Lombardy, S., Sakarovitch, J.: The universal automaton. In: Flum, J., Grädel, E., Wilke, T. (eds.) Logic and Automata. Texts in Logic and Games, vol. 2, pp. 457–504. Amsterdam University Press, Amsterdam (2008)
- 15. Milius, S.: A sound and complete calculus for finite stream circuits. In: Proceedings of 25th Annual Symposium on Logic in Computer Science (LICS’10), pp. 449–458. IEEE Computer Society (2010)
- 16. Silva, A., Bonchi, F., Bonsangue, M.M., Rutten, J.J.M.M.: Generalizing determinization from automata to coalgebras. Log. Methods Comput. Sci. **9**(1:9), 23 (2013)
- 17. Vuillemin, J., Gama, N.: Efficient equivalence and minimization for non deterministic Xor automata. Research report, LIENS (May 2010). http://hal.inria.fr/inria-00487031