1 Introduction

A long-standing open question at the intersection of automata theory and computational complexity is whether every two-way nondeterministic finite automaton (\(\textsc{2nfa}\)) with s states can be simulated by a two-way deterministic finite automaton (\(\textsc{2dfa}\)) with a number of states which is only polynomial in s. The well-known Sakoda-Sipser conjecture proposes that the answer is negative [5].

A stronger variant of this conjecture claims that, indeed, a \(\textsc{2dfa}\) needs super-polynomially many states even when the input head of the \(\textsc{2nfa}\) can only move forward, namely even when the \(\textsc{2nfa}\) is really a one-way nondeterministic finite automaton (\(\textsc{1nfa}\)). By [5], we know that this stronger claim holds iff a certain graph-theoretic problem called one-way liveness for height h (\(\textsc{owl}_h\)) cannot be solved by any \(\textsc{2dfa}\) whose number of states is only polynomial in h.

In one of the many approaches to this question, Hromkovič et al. [3] proposed the concept of a reasonable automaton (ra). Intuitively, this is a \(\textsc{2dfa}\) with the following modifications: it works only on inputs of a fixed length n; it is random-access, in the sense that its head can move from any cell of its n-long input to any other in a single step; and, most importantly, it has every state associated with a propositional formula which indicates what the automaton ‘knows’ about its input when it is in that state. In this model, one uses the additional information provided by the formulas to study the different types of propositional reasoning that a \(\textsc{2dfa}\) could possibly employ in trying to solve \(\textsc{owl}_h\) on inputs of length n. Clearly, the standard \(\textsc{2dfa}\) model would be too lean to support such a study.

Not surprisingly, the power of a ra varies as we vary (i) the set of propositional variables that can appear in the formulas and (ii) the rules for how these variables can be combined to create the formulas. By [3, Theorem 2], we know that a sufficiently expressive set of variables allows a ra to simulate any s-state \(\textsc{2dfa}\) on n-long inputs with only O(sn) states; so, the model is powerful enough to represent any \(\textsc{2dfa}\) algorithm for \(\textsc{owl}_h\) on inputs of fixed length, and remains small if both the \(\textsc{2dfa}\) and the length are small. We also know that certain options for (i) and (ii) make it necessary for a ra solving \(\textsc{owl}_h\) to use \(2^{\varOmega (h)}\) states, even when the length is restricted to just \(n=2\) symbols [3, Theorems 3 and 4]; while other options make O(h) states already sufficient for \(n=2\) symbols [6], and \(O(h^2)\) states already sufficient for \(n=3\) and \(n=4\) symbols [3, Theorems 6 and 7].

For example, the linear upper bound for two symbols from [6] was proved for the case where (i) there is one propositional variable \(e_{a,b}\) for every two vertices a, b of the input graph, indicating whether an edge between a and b exists; and (ii) all rules of propositional-formula formation are allowed. That is, if \(\textsc{owl}^2_h\) denotes the restriction of \(\textsc{owl}_h\) to instances of length exactly 2, then O(h) states are enough for solving \(\textsc{owl}^2_h\) by a ra which builds its formulas by arbitrarily applying the logical connectives \(\wedge ,\vee ,\lnot \) to the variables \(e_{a,b}\). In fact, this upper bound is also known to be optimal in this specific case, as Bianchi, Hromkovič, and Kováč [1, Theorem 1] recently proved that such ras for \(\textsc{owl}^2_h\) also need \(\varOmega (h)\) states. Overall, we arrive at the nice conclusion that, with such variables and rules, every smallest ra for \(\textsc{owl}^2_h\) has \(\varTheta (h)\) states.

By inspecting the proof of [1, Theorem 1] for the above linear lower bound, one can observe that it is actually valid not only for ras with the particular variables and rules, but also for all possible ras. In fact, minor technical changes make that proof valid even for arbitrary \(\textsc{2dfa}\)s. That is, every \(\textsc{2dfa}\) that solves \(\textsc{owl}^2_h\) needs \(\varOmega (h)\) states, even if it is not reasonable. At the same time, one easily realizes that the ra that proves the matching upper bound in [6] implies directly that a \(\textsc{2dfa}\), too, can solve \(\textsc{owl}^2_h\) with O(h) states. Overall, we actually know the much stronger fact that every smallest \(\textsc{2dfa}\) for \(\textsc{owl}^2_h\) has \(\varTheta (h)\) states.

At this point, it is interesting to ask what the corresponding fact is for \(\textsc{owl}_h\) on three symbols, or four, or five, and so on. In general, for \(n\ge 2\), we let \(\textsc{owl}^n_h\) denote the restriction of \(\textsc{owl}_h\) to instances of exactly n symbols, and ask:

$$ \textit{How many states does a smallest }\textsc{2dfa}\textit{ for }\textsc{owl}^n_h\textit{ have?} $$
(1)

For \(n=2\), the asymptotic answer to this question is, of course, provided by our discussion above. Note, however, that we are still missing the exact answer, namely a function s(h) such that some \(\textsc{2dfa}\) for \(\textsc{owl}^2_h\) has at most s(h) states and no \(\textsc{2dfa}\) for \(\textsc{owl}^2_h\) has strictly fewer than s(h) states.

For \(n\ge 3\), we know neither the asymptotic nor the exact answer to (1). We only have the asymptotic upper bounds implied by the ras of [3, Theorems 6 and 7], which implement Savitch’s algorithm on n symbols and are easily converted into \(\textsc{2dfa}\)s with (n times more states, and thus with) the same asymptotic size (if n is constant). For the cases \(n=3\) and \(n=4\), those bounds are both quadratic, so we know that every smallest \(\textsc{2dfa}\) for \(\textsc{owl}^3_h\) or \(\textsc{owl}^4_h\) has \(O(h^2)\) states.

In this paper, we study (1) for the cases \(n=2\) and \(n=3\). Our main contribution is the asymptotic answer for \(n=3\) (Sect. 4):

Theorem 1

Every smallest \(\textsc{2dfa}\) for \(\textsc{owl}^3_h\) has \(\varTheta (h^2/\log h)\) states.

This involves a new algorithm for the upper bound (Lemma 8) and a standard argument for the lower bound (Lemma 7). Before that, we also take the time to carefully examine the case \(n=2\) and the known asymptotic answer (Sect. 3):

Theorem 2

Every smallest \(\textsc{2dfa}\) for \(\textsc{owl}^2_h\) has \(\varTheta (h)\) states.

We give a detailed argument for the lower bound (Lemma 6), which mimics that of [1, Theorem 1] but applies to any \(\textsc{2dfa}\) and results in a higher exact value. For completeness, we also give the known algorithm for the upper bound (Lemma 5).

In both cases, we put in the extra effort to find exact values for our bounds, so that one can appreciate the gap, between lower and upper bound, where the actual size of the smallest \(\textsc{2dfa}\) lies. For example, in the case \(n=2\), one sees that the \(\varTheta (h)\) size of the best \(\textsc{2dfa}\) is actually somewhere above \(\frac{1}{2} h + \frac{1}{4}\lg h - \frac{1}{2}\) and below 2h. We also put in the effort to make our intermediate lemmata as general as possible, even if this full generality is not strictly needed in our proofs. For example, the Hybrid Rule (Lemma 2) is proved for all cases, even for the case when the computation on the hybrid string is looping, although we only use that rule once (Lemma 6), in a case where that computation is (accepting, and thus) halting.

2 Preparation

If S is a set, then |S|, \(\overline{S}\), and \(2^S\) are respectively its size, complement, and powerset. If \(\varSigma \) is an alphabet, then \(\varSigma ^*\) is the set of all strings over it and \(\varSigma ^n\) is its subset containing only the strings of length exactly n. If \(z\in \varSigma ^*\) is a string, then |z| and \(z_j\) are respectively its length and its j-th symbol (if \(1\le j\le |z|\)). If \(n\ge 0\), then \([n]:=\{1,2,\dots ,n\}\) is the set of the n smallest positive integers.

Problems. A (promise) problem over \(\varSigma \) is any pair \(\mathfrak {L}=(L,\tilde{L})\) of disjoint subsets of \(\varSigma ^*\). Its positive instances are all \(w\in L\), whereas its negative instances are all \(w\in \tilde{L}\). A machine solves \(\mathfrak {L}\) if it accepts every positive instance but no negative one. If \(\tilde{L}=\overline{L}\), then we call \(\mathfrak {L}\) a language and represent it only by L.

Let \(h\ge 2\). The alphabet \(\varSigma _h\) consists of all two-column directed graphs with h nodes per column and only rightward arrows (Fig. 1a). A string \(w\in \varSigma _h^n\) is naturally viewed as an \((n+1)\)-column graph, where every arrow connects successive columns (Fig. 1b, c); we usually index the columns from 0 to n and, for simplicity, drop the directions of the arrows. If w contains a path from the leftmost to the rightmost column (called a live path), then we say that w is live; otherwise we say that w is dead. The language

$$ \textsc{owl}_h :=\{\, w\in \varSigma _h^* \mid w \text { is live}\,\} $$

represents the computational task of checking that a given string in \(\varSigma _h^*\) contains a live path [5]. If the string is guaranteed to be of a fixed length n, then the task is best represented by the promise problem

$$ \textsc{owl}^n_h :=\bigl (\, \{ w\in \varSigma _h^n \mid w \text { is live}\,\},\; \{ w\in \varSigma _h^n \mid w \text { is dead}\,\} \,\bigr ) \,. $$

Then \(\textsc{owl}^n\) is the family of all such promise problems, for \(h\ge 2\).
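Concretely, liveness is plain reachability across the columns. The following is a minimal sketch, not part of the paper: it encodes each symbol of \(\varSigma _h\) as a Python set of arrow pairs (our own convention) and propagates the set of reachable nodes column by column.

```python
def is_live(w, h):
    """Decide whether the string w (a list of symbols over Sigma_h) is live.

    A symbol is a set of arrows (a, b): node a of its left column is
    connected to node b of its right column, with nodes numbered 1..h.
    """
    reachable = set(range(1, h + 1))  # every node of column 0 starts a path
    for symbol in w:
        reachable = {b for (a, b) in symbol if a in reachable}
    return bool(reachable)  # live iff some node of the last column is reached
```

For instance, with the symbol \(z=\{(1,2),(1,4),(2,5),(4,4)\}\) of Fig. 1a, `is_live([z], 5)` holds, since column 1 is reachable.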

Fig. 1.

(a) Three symbols \(x,y,z\in \varSigma _5\); e.g., \(z=\{(1,2),(1,4),(2,5),(4,4)\}\). (b) The string xy of the first two symbols, simplified and indexed, as an instance of \(\textsc{owl}^2_5\). (c) The string xyz of all symbols, simplified and indexed, as an instance of \(\textsc{owl}^3_5\).

Machines. A two-way deterministic finite automaton (\(\textsc{2dfa}\)) is any tuple of the form \(M=(S,\varSigma ,\delta ,q_s ,q_a )\), where S is a set of states, \(\varSigma \) is an alphabet, \(q_s , q_a \in S\) are respectively the start and accept states, and \( \delta :S\times (\varSigma \cup \{{\vdash }{,}{\dashv }\})\rightharpoonup S\times \{\textsc {l} {,}\textsc {r} \}\) is a (partial) transition function, for \({\vdash }{,}{\dashv }\notin \varSigma \) the left and right endmarkers, and \(\textsc {l} {,}\textsc {r} \) the left and right directions.

An input \(w\in \varSigma ^*\) is presented to M surrounded by the endmarkers, as \({\vdash }w{\dashv }\). The computation starts at \(q_s \) and on \({\vdash }\). In each step, the next state and head move (if any) are derived from \(\delta \) and the current state and symbol. Endmarkers are never violated, except if the next state is \(q_a \); that is, \(\delta ({\,.\,}\,,{\vdash })\) is always \((q_a ,\textsc {l} )\) or of the form \(({\,.\,}\,,\textsc {r} )\); and \(\delta ({\,.\,}\,,\!{\dashv })\) is always \((q_a ,\textsc {r} )\) or of the form \(({\,.\,}\,,\textsc {l} )\). Hence, the computation either loops, if it ever repeats a state on the same input cell; or hangs, if it ever reaches a state and symbol for which \(\delta \) is undefined; or falls off \({\vdash }\) or \({\dashv }\) into \(q_a \), in which case we say that M accepts w.

Formally, for any string z, position i, and state q, the computation of M when started at q on the i-th symbol of z is the unique sequence

$$ \textsc {comp}_{M,q,i}(z) = \bigl ( (q_t,i_t) \bigr )_{0\le t<m} $$

where \((q_0,i_0)=(q,i)\), \(1\le m\le \infty \), every pair is derived from its predecessor via \(\delta \) and z, every pair is within z (\(1\le i_t\le |z|\)) except possibly for the last one, and the last pair is within z iff \(\delta \) is undefined on the corresponding state and symbol. We say m is the length of this computation. If \(m=\infty \), then the computation loops. Otherwise, it hits left into \(q_{m-1}\), if \(i_{m-1}=0\); or hangs, if \(1\le i_{m-1}\le |z|\); or hits right into \(q_{m-1}\), if \(i_{m-1}=|z|{+}1\) (Fig. 2). When \(i=1\) (respectively, \(i=|z|\)) we get the left (right) computation of M from q on z:

$$ \textsc {lcomp}_{M,q}(z) :=\textsc {comp}_{M,q,1}(z) \quad \text {and}\quad \textsc {rcomp}_{M,q}(z) :=\textsc {comp}_{M,q,|z|}(z) \,. $$

The (full) computation of M on z is the typical \(\textsc {comp}_M(z):=\textsc {lcomp}_{M,q_s }({\vdash }z{\dashv })\), so that M accepts z iff \(\textsc {comp}_M(z)\) hits right or left (into \(q_a \)).
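To make these semantics concrete, here is a small simulator sketch. The dictionary encoding of \(\delta \), the result tuples, and the helper names are our own conventions, not the paper's.

```python
def comp(delta, q, i, z):
    """Simulate comp_{M,q,i}(z): run delta from state q on the i-th symbol
    (1-based) of z, a tuple of symbols (endmarkers included, if desired).

    delta partially maps (state, symbol) -> (state, 'L' or 'R').
    Returns ('loops',)          if a (state, cell) pair repeats,
            ('hangs', q, i)     if delta is undefined at the current pair,
            ('hits left', q) or ('hits right', q) if it falls off an end.
    """
    seen = set()
    while True:
        if (q, i) in seen:
            return ('loops',)
        seen.add((q, i))
        if (q, z[i - 1]) not in delta:
            return ('hangs', q, i)
        q, d = delta[(q, z[i - 1])]
        i += -1 if d == 'L' else 1
        if i == 0:
            return ('hits left', q)
        if i == len(z) + 1:
            return ('hits right', q)

def accepts(delta, q_s, q_a, z):
    """Full computation on |-z-| from the start state; accept on falling off."""
    result = comp(delta, q_s, 1, ('|-',) + tuple(z) + ('-|',))
    return result[0] in ('hits left', 'hits right') and result[1] == q_a
```

For example, the one-state rightward scan `delta = {('s','|-'): ('s','R'), ('s','a'): ('s','R'), ('s','-|'): ('acc','R')}` falls off \({\dashv }\) into its accept state on every string of a's.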

Fig. 2.

(a) Cells and boundaries on a 6-long z; a computation that hits left. (b) One that hangs. (c) One that hits right, and its i-th frontier: \(R^c_i\) in circles and \(L^c_i\) in boxes.

Frontiers. Consider a computation \(c = ( (q_t,i_t) )_{0\le t<m}\) over some input z, and the index \(0\le i\le |z|\) of some boundary of z (Fig. 2c). How does c behave over that boundary? One answer to this question is the standard notion of the i-th crossing sequence of c [2], namely the sequence \(q_1,q_2,q_3,\dots \) where \(q_j\) is the state entered by c right after the j-th time that c crosses that boundary.

Another answer, with much less information, is the i-th frontier of c [4], which records only which states are used by the crossings, completely ignoring the order in which they are used. Formally, this is the pair of sets of states \((L_i^c,R_i^c)\), where \(R^c_i\) (respectively, \(L^c_i\)) consists of every state which is entered by c right after some left-to-right (right-to-left) crossing of the i-th boundary of z:

$$ \begin{aligned} R_i^c&:=\{\, q_t \mid 0\le t<m \;\;\&\;\; i_{t-1}=i \;\;\&\;\; i_t=i{+}1 \,\} \,, \\ L_i^c&:=\{\, q_t \mid 0\le t<m \;\;\&\;\; i_{t-1}=i{+}1 \;\;\&\;\; i_t=i \,\} \,. \end{aligned}$$

Here we also assume \(i_{-1}=i_0-1\), so that, if c starts on the cell right after the boundary (\(i_0=i+1\)), then \(R_i^c\) also contains \(q_0\). (This reflects the convention that the initial state \(q_0\) is always the result of an ‘invisible’ left-to-right step.)

If c is a full computation, then it starts (on \({\vdash }\), and thus) on the left side of the i-th boundary of z (since \(i\ge 0\)), and then eventually hangs or loops or accepts. If it hangs or accepts also on the left side of the boundary, then it crosses it from left to right exactly as many times as it crosses it from right to left; hence, \(R^c_i\) and \(L^c_i\) contain the same number of states. By similar reasoning, if c hangs or accepts on the right side of the boundary, then \(R^c_i\) contains one more state than \(L^c_i\). Finally, if c loops, then \(R^c_i\) contains either exactly as many states as \(L^c_i\), if c never crosses the boundary or the latest crossing which does not result in a repeated pair \((q_t,i_t)\) is from right to left; or one more state, otherwise. Overall, we conclude that, if c is full, then \(|R^c_i|\) is always either \(|L^c_i|\) or \(|L^c_i|{+}1\).

With this motivation, we define a frontier of M to be any pair (LR) such that \(L,R\subseteq S\) and either \(|L|=|R|\) or \(|L|{+}1=|R|\). In the former case, the frontier is called balanced; otherwise, it is called unbalanced. Standard counting arguments show that, if M has s states, then it has \(\left( {\begin{array}{c}2s\\ s\end{array}}\right) \) balanced and \({\left( {\begin{array}{c}2s\\ s{+}1\end{array}}\right) }\) unbalanced frontiers, for a total of \(\smash {\left( {\begin{array}{c}2s{+}1\\ s{+}1\end{array}}\right) }\) frontiers overall.
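These counts can be sanity-checked by brute force over all pairs of subsets, for small s. A sketch (`count_frontiers` is our own helper name):

```python
from itertools import combinations
from math import comb

def count_frontiers(s):
    """Count the balanced and unbalanced frontiers (L, R) over s states."""
    subsets = [frozenset(c)
               for k in range(s + 1)
               for c in combinations(range(s), k)]
    balanced = sum(1 for L in subsets for R in subsets
                   if len(L) == len(R))
    unbalanced = sum(1 for L in subsets for R in subsets
                     if len(L) + 1 == len(R))
    return balanced, unbalanced

# Agrees with the binomial counts C(2s, s) and C(2s, s+1):
for s in range(1, 7):
    assert count_frontiers(s) == (comb(2 * s, s), comb(2 * s, s + 1))
```

The agreement is an instance of Vandermonde's identity, \(\sum _k \left( {\begin{array}{c}s\\ k\end{array}}\right) ^2=\left( {\begin{array}{c}2s\\ s\end{array}}\right) \) and \(\sum _k \left( {\begin{array}{c}s\\ k\end{array}}\right) \left( {\begin{array}{c}s\\ k{+}1\end{array}}\right) =\left( {\begin{array}{c}2s\\ s{+}1\end{array}}\right) \).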

The next lemma (Lemma 1) and rule (Lemma 2) are of independent interest. In this paper, we will use them to prove a lower bound for \(\textsc{owl}^2_h\) (Lemma 6).

Lemma 1

(Hybrid Lemma). Let \(c_1,c_2,c\) be respectively the full computations of M on strings \(x_1y_1,x_2y_2\) and their hybrid \(x_1y_2\). Let \((L_1,R_1),(L_2,R_2),(L,R)\) be the respective frontiers on the boundaries \(x_1\)-\(y_1\), \(x_2\)-\(y_2\), and \(x_1\)-\(y_2\). Then

$$ L_1 \supseteq L_2 \;\;\, \& \;\,\; R_1 \subseteq R_2 \;\;\implies \;\; L_1 \supseteq L_2 \supseteq L \;\;\, \& \;\,\; R \subseteq R_1 \subseteq R_2 \,. $$

Proof

Suppose \(L_1\supseteq L_2\) and \(R_1\subseteq R_2\). We must prove four inclusions. The first and last of them are just our assumption (included in the statement only for aesthetics), so we just need to prove that \( L_2\supseteq L\,\;\, \& \;\,\,R \subseteq R_1\).

Let \(0\le k\le \infty \) be the number of times c crosses the \(x_1\)-\(y_2\) boundary. Let \(q_j\) be the state entered by c right after the j-th crossing, for all finite j with \(1\le j\le k\) (Fig. 3). It suffices to prove the following.

Claim

All \(q_j\) for odd j are in \(R_1\), and all \(q_j\) for even j are in \(L_2\).

Indeed, if the claim holds, then every \(q\in R\) is also in \(R_1\), because it is a \(q_j\) for some odd j; and every \(q\in L\) is also in \(L_2\), because it is a \(q_j\) for some even j.

To prove the claim, we first note that it is vacuously true when \(k=0\). So, we assume \(1\le k\le \infty \) and apply induction on j.

Fig. 3.

Computations in the proof of the Hybrid Lemma.

In the base case, \(j=1\) and we must prove \(q_1\in R_1\). Since \(q_1\) is the result of the first crossing of the \(x_1\)-\(y_2\) boundary, \(d_0:=\textsc {lcomp}_{M,q_s }({\vdash }x_1)\) hits right into \(q_1\). But \(d_0\) is clearly a prefix of \(c_1\); therefore the first crossing of the \(x_1\)-\(y_1\) boundary along \(c_1\) results into \(q_1\) as well. Hence, \(q_1\in R_1\).

In the inductive step, we assume the claim for \(j\ge 1\) and prove it for \(j{+}1\le k\).

If j is odd, then \(j{+}1\) is even, and thus \(q_{j{+}1}\) is the result of crossing the \(x_1\)-\(y_2\) boundary from right to left. In particular, \(d_j:=\textsc {lcomp}_{M,q_j}(y_2{\dashv })\) is an infix of c and hits left into \(q_{j{+}1}\). We know \(q_j\in R_1\) (by the inductive hypothesis), and thus \(q_j\in R_2\) (since \(R_1\subseteq R_2\)). Therefore, \(c_2\) produces \(q_j\) in one of its left-to-right crossings of the \(x_2\)-\(y_2\) boundary. Hence, after that crossing, \(c_2\) continues as in \(\textsc {lcomp}_{M,q_j}(y_2{\dashv })\), namely as in \(d_j\), and thus crosses the boundary again, from right to left and into \(q_{j{+}1}\). Consequently, \(q_{j{+}1}\in L_2\).

If j is even, we work symmetrically. Since \(j{+}1\) is odd and \({\ge }\,3\), \(q_{j{+}1}\) results from a left-to-right crossing of the \(x_1\)-\(y_2\) boundary and c contains the infix \(d_j:=\textsc {rcomp}_{M,q_j}({\vdash }x_1)\) which hits right into \(q_{j{+}1}\). But \(q_j\) is in \(L_2\) (by the inductive hypothesis), and thus in \(L_1\) (since \(L_1\supseteq L_2\)), so \(c_1\) produces it in some right-to-left crossing of the \(x_1\)-\(y_1\) boundary. Hence, after that, \(c_1\) continues as in \(\textsc {rcomp}_{M,q_j}({\vdash }x_1)=d_j\), which means that it left-to-right crosses the boundary into \(q_{j{+}1}\), causing \(q_{j{+}1}\in R_1\).    \(\square \)

Fig. 4.

Computations in the proof of the Hybrid Rule. (a) If c never crosses the critical boundary: then \(c,c_1\) decide the same. (b) If c crosses the critical boundary finitely often, and the last crossing is from left to right: then \(c,c_2\) decide the same. (c) If c crosses the critical boundary infinitely often, and the earliest same-side repetition is after a left-to-right crossing: then \(c,c_1\) decide the same.

Lemma 2

(Hybrid Rule). Suppose M decides identically on strings \(x_1y_1,x_2y_2\) but differently on their hybrid \(x_1y_2\). Then the frontiers \((L_1,R_1),(L_2,R_2)\) of the full computations of M on \(x_1y_1,x_2y_2\) on the boundaries \(x_1\)-\(y_1\) and \(x_2\)-\(y_2\) satisfy:

$$ L_1 \not \supseteq L_2 \;\;\;\vee \;\;\; R_1 \not \subseteq R_2 \,. $$

Proof

Let \(c_1,c_2,c\) be the full computations of M on strings \(x_1y_1\), \(x_2y_2\), and their hybrid \(x_1y_2\). Let \((L_1,R_1),(L_2,R_2),(L,R)\) be the frontiers of these computations on the boundaries \(x_1\)-\(y_1\), \(x_2\)-\(y_2\), and \(x_1\)-\(y_2\), respectively. Towards a contradiction, assume \( L_1\supseteq L_2\,\;\, \& \;\,\,R_1\subseteq R_2\). Then, by the Hybrid Lemma,

$$ L_1 \supseteq L_2 \supseteq L \;\;\;\, \& \;\,\;\; R \subseteq R_1 \subseteq R_2 \,. $$

Using this, we will prove that on at least one of \(x_1y_1\) and \(x_2y_2\), the decision of M must be identical to its decision on the hybrid \(x_1y_2\)—a contradiction.

We take cases on how often c crosses the critical boundary \(x_1\)-\(y_2\) (Fig. 4).

If c never crosses the critical boundary, then c lies fully within \({\vdash }x_1\) (Fig. 4a). So, M notices no difference between \(x_1y_2,x_1y_1\), and decides identically on both.

If c crosses the critical boundary finitely often, then there is a last crossing.

Suppose this last crossing is from left to right (Fig. 4b). Let q be the state resulting from it. Then \(d:=\textsc {lcomp}_{M,q}(y_2{\dashv })\) does not hit left (it hangs, or loops, or falls off \({\dashv }\)), and is thus a suffix of c. At the same time, \(q\in R\) (by its selection as the result of a left-to-right crossing), and thus \(q\in R_2\) (since \(R\subseteq R_2\)). Hence, \(c_2\) also contains a left-to-right crossing of the boundary \(x_2\)-\(y_2\) that results in q. Clearly, from that point on, \(c_2\) behaves as in \(\textsc {lcomp}_{M,q}(y_2{\dashv })\), namely as in d, which does not hit left. Therefore \(c_2\) also finishes with d. Overall, c and \(c_2\) have the same suffix d, which implies that M decides identically on \(x_1y_2\) and \(x_2y_2\).

If the last crossing of the \(x_1\)-\(y_2\) boundary along c is from right to left, then a symmetric argument applies: \(q\in L\subseteq L_1\) and \(d:=\textsc {rcomp}_{M,q}({\vdash }x_1)\) is a suffix of both c and \(c_1\), causing M to decide identically on \(x_1y_2\) and \(x_1y_1\).

If c crosses the critical boundary infinitely often, then consider the infinite list \(q_1,q_2,\dots \) where \(q_j\) is the state produced by the j-th crossing (Fig. 4c). Of course, this list contains repetitions: there exist \(1\le j_1<j_2\) such that \(q_{j_1}=q_{j_2}\). Equally clearly, it also contains same-side repetitions: namely, repetitions where \(j_1,j_2\) are either both odd or both even (so that \(q_{j_1},q_{j_2}\) are produced both by left-to-right crossings or both by right-to-left crossings, respectively). Let q be the state in the earliest such repetition (namely \(q=q_{j_1}=q_{j_2}\) in the same-side repetition with the smallest \(j_2\)) and p the state produced by the crossing just before that repetition happened (namely \(p=q_{j_2-1}\)).

Suppose the \(j_1\)-th and \(j_2\)-th crossings are from left to right. Then the \((j_2{-}1)\)-st crossing is from right to left and is followed by \(d:=\textsc {rcomp}_{M,p}({\vdash }x_1)\), which hits right into q. At the same time, \(p\in L\) and \(q\in R\) (by their selection as the results of a right-to-left and a left-to-right crossing, respectively) and thus \(p\in L_1\) and \(q\in R_1\) (as \(L_1\supseteq L\) and \(R\subseteq R_1\)). So, \(c_1\) also contains right-to-left crossings that produce p (to be called “p-crossings”) and left-to-right crossings that produce q (to be called “q-crossings”). The next claim implies that at least one of these two types of crossings repeats, and thus \(c_1\) loops, exactly as c does. Therefore, M decides identically on \(x_1y_2\) and \(x_1y_1\).

Claim

There are at least two p-crossings or at least two q-crossings in \(c_1\).

Proof

Towards a contradiction, assume \(c_1\) contains exactly one p-crossing and exactly one q-crossing. We distinguish cases based on their order inside \(c_1\).

If the q-crossing appears before the p-crossing: We know the p-crossing is followed by \(\textsc {rcomp}_{M,p}({\vdash }x_1)\), namely by d, which we already know hits right into q. So, the p-crossing is followed by a q-crossing. Hence, \(c_1\) contains at least two q-crossings (one before and one after the p-crossing), a contradiction.

If the q-crossing appears after the p-crossing: We first return to c to observe that \(j_1\ne 1\), namely the first of the two crossings that produce q cannot be the very first of all crossings. (Because then \(\textsc {lcomp}_{M,q_s }({\vdash }x_1)\) hits right into q; so the one q-crossing in \(c_1\) is also the very first of all crossings, and thus appears before the p-crossing, a contradiction.)

Hence, we can talk about the \((j_1{-}1)\)-st crossing of the critical boundary in c (from right to left). Let \(p'\) be the state it produces. Then \(d':=\textsc {rcomp}_{M,p'}({\vdash }x_1)\) hits right into q. Note that \(p'\ne p\), or else \(q_{j_1-1}=q_{j_2-1}\), contrary to our selection of \(q=q_{j_1}=q_{j_2}\) as the earliest same-side repetition. Also note that \(p'\in L\) (as the result of a right-to-left crossing), hence \(p'\in L_1\) (since \(L_1\supseteq L\)), hence \(c_1\) also contains a right-to-left crossing that produces \(p'\) (to be called “\(p'\)-crossing”).

It now follows that \(c_1\) contains at least two q-crossings: the one right after the p-crossing (caused by d) and the one right after the \(p'\)-crossing (caused by \(d'\)), which we know are distinct (because \(p'\ne p\)). This is again a contradiction.    \(\boxdot \)

If the \(j_1\)-th and \(j_2\)-th crossings are from right to left, then we argue symmetrically. We know that \(d:=\textsc {lcomp}_{M,p}(y_2{\dashv })\) hits left into q, and that \(p\in R\subseteq R_2\) and \(L_2\supseteq L\ni q\), so \(c_2\) also contains left-to-right crossings that produce p and right-to-left crossings that produce q. As before, at least one of these two types of crossings repeats, hence \(c_2\) loops, causing M to decide on \(x_2y_2\) just as on \(x_1y_2\).

This concludes the third case of our argument and, with it, the full proof.    \(\square \)

Behaviors. Let z be any string. The behavior of M on z is the (partial) function which returns the results of all left and right computations of M on z. Specifically, it is the function \(\gamma _{M,z}:S\times \{\textsc {l} {,}\textsc {r} \}\rightharpoonup S\times \{\textsc {l} {,}\textsc {r} \}\) such that, for all \(p\in S\),

$$ \gamma _{M,z}(p,\textsc {l} ) :={\left\{ \begin{array}{ll} (q,\textsc {l} ) &{}\text {if }\textsc {lcomp}_{M,p }(z)\,\, \text {hits left into }q, \\ \text {undefined} &{}\text {if } \textsc {lcomp}_{M,p }(z) \,\,\text {hangs or loops,}\\ (q,\textsc {r} ) &{}\text {if } \textsc {lcomp}_{M,p }(z)\,\,\text {hits right into }q; \end{array}\right. } $$

and similarly for \(\gamma _{M,z}(p,\textsc {r} )\), using \(\textsc {rcomp}_{M,p}(z)\) instead of \(\textsc {lcomp}_{M,p}(z)\).

With this motivation, we define a behavior of M to be any partial function from \(S\times \{\textsc {l} {,}\textsc {r} \}\) to \(S\times \{\textsc {l} {,}\textsc {r} \}\). Easily, if M has s states, then it has \((2s{+}1)^{2s}\) behaviors. However, if \(|z|=1\), then every left computation is also a right computation (since the first and last cells of z coincide), causing \(\gamma _{M,z}(p,\textsc {l} )=\gamma _{M,z}(p,\textsc {r} )\) for all p; hence, the number of single-symbol behaviors of M is only \((2s{+}1)^s\).
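The count of behaviors can likewise be verified by brute-force enumeration for tiny s (our own helper, for illustration only):

```python
from itertools import product

def count_behaviors(s):
    """Count all partial functions S x {L,R} -> S x {L,R} for |S| = s."""
    domain = [(q, d) for q in range(s) for d in 'LR']
    codomain = [(q, d) for q in range(s) for d in 'LR'] + [None]  # None = undefined
    return sum(1 for _ in product(codomain, repeat=len(domain)))

# Matches (2s+1)^(2s): each of the 2s arguments gets one of 2s values
# or stays undefined.
assert count_behaviors(1) == 3 ** 2
assert count_behaviors(2) == 5 ** 4
```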

A standard fact in the analysis of computations is that, if M exhibits the same behavior on two strings, then each of them can be replaced by the other in any context, without M noticing the difference. Formally, this is captured by the next lemma, which we state without proof.

Lemma 3

(Infix Lemma). If \(\gamma _{M,y_1}=\gamma _{M,y_2}\), then \(\gamma _{M,xy_1z}=\gamma _{M,xy_2z}\).

As a direct consequence of this, M cannot decide differently on two strings which differ only at two infixes that do not change its behavior.

Lemma 4

(Infix Rule). If M decides differently on strings \(xy_1z,xy_2z\), then:

$$ \gamma _{M,y_1}\ne \gamma _{M,y_2} \,. $$

Proof

Towards the contrapositive, assume \(\gamma _{M,y_1}=\gamma _{M,y_2}\). Then, by the Infix Lemma, \(\gamma _{M,{\vdash }xy_1z{\dashv }}=\gamma _{M,{\vdash }xy_2z{\dashv }}\). In particular, the two functions return the same value on \((q_s,\textsc {l} )\), which is either \((q_a ,\textsc {l} )\), or \((q_a ,\textsc {r} )\), or undefined. In all three cases, it follows that M decides identically on \(xy_1z\) and \(xy_2z\).    \(\square \)

3 The Case of Two Symbols

We now prove Theorem 2, that all smallest \(\textsc{2dfa}\)s for \(\textsc{owl}^2_h\) have \(\varTheta (h)\) states. This follows directly from the next two lemmata. The upper bound (Lemma 5) is well-known and also implied by [6]; here, we give a careful construction. The lower bound (Lemma 6) is a tighter variant of [1, Theorem 1] that uses frontiers, as opposed to arbitrary pairs of sets of states. (Without this modification, the lower bound for \(\textsc{owl}^2_h\) by the argument of [1, Theorem 1] is only \(\frac{1}{2}h\).)

Lemma 5

Some \(\textsc{2dfa}\) solves \(\textsc{owl}^2_h\) with \({\le }\,2h\) states.

Proof

Fix \(h\ge 2\) and consider an instance xy of \(\textsc{owl}^2_h\), for \(x,y\in \varSigma _h\) (Fig. 1b). Let \({u}_1,{u}_2,\ldots ,{u}_{h}\) be the nodes of column 1, from top to bottom. We say \(u_i\) is l-live if it has non-zero degree in x; r-live if it has non-zero degree in y; live if it is both l-live and r-live; and dead if it is not live. Clearly, xy is live iff some \(u_i\) is live. So, our \(\textsc{2dfa}\) M simply searches for a live \(u_i\) sequentially, from \(u_1\) to \(u_h\).

The set of states is \(S:=[h]\times \{\textsc {l} {,}\textsc {r} \}\) and state \((1,\textsc {l} )\) serves as both the start and the accept state. Each other state \((i,\textsc {l} )\) is used only on \({\vdash }x\); it assumes that all \(u_j\) above \(u_i\) are dead and that \(u_i\) is r-live, and tries to check if \(u_i\) is also l-live. Symmetrically, every state \((i,\textsc {r} )\) is used only on \(y{\dashv }\); it assumes that all \(u_j\) above \(u_i\) are dead and that \(u_i\) is l-live, and tries to check if \(u_i\) is also r-live.

The transitions between these states are now not hard to see:

From \((1,\textsc {l} )\) on \({\vdash }\), M moves to \((1,\textsc {l} )\) on x. If x contains no edges, then xy is obviously dead, so M just hangs. Otherwise, there exists at least one l-live \(u_i\), so M finds the topmost such \(u_i\) and moves to the corresponding state \((i,\textsc {r} )\) on y.

From a state \((i,\textsc {r} )\) on y, M checks if \(u_i\) is r-live. If so, then xy is live, so M moves to \((i,\textsc {r} )\) on \({\dashv }\), and then off \({\dashv }\) into \((1,\textsc {l} )\) to accept. Otherwise, it checks if any \(u_j\) below \(u_i\) is r-live. If not, then xy is dead, so M just hangs. Otherwise, M finds the topmost r-live \(u_j\) below \(u_i\), and moves to the corresponding state \((j,\textsc {l} )\) on x, to check whether that \(u_j\) is also l-live.

From a state \((i,\textsc {l} )\) on x with \(i\ge 2\), M behaves symmetrically as above: if \(u_i\) is l-live, then M moves to \((i,\textsc {l} )\) on \({\vdash }\), and then off \({\vdash }\) into \((1,\textsc {l} )\) to accept. Otherwise, it either hangs, if no \(u_j\) below \(u_i\) is l-live; or moves to y and into the state \((j,\textsc {r} )\) corresponding to the topmost such \(u_j\).

It should be clear that M works correctly and uses exactly 2h states.    \(\square \)
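Decision-wise, the zig-zag of M collapses into a single downward search for the topmost live \(u_i\). The following sketch mirrors that search logic, not the state-by-state execution of M; it encodes symbols as Python sets of arrow pairs, which is our own convention.

```python
def owl2(x, y, h):
    """Decide the 2-symbol instance xy by sequential search: repeatedly
    take the topmost l-live node below the current one and test whether
    it is also r-live; 'hang' (reject) when no candidate remains."""
    l_live = {b for (a, b) in x}   # column-1 nodes with an edge in x
    r_live = {a for (a, b) in y}   # column-1 nodes with an edge in y
    i = 0                          # invariant: u_1, ..., u_i are all dead
    while True:
        candidates = [j for j in range(i + 1, h + 1) if j in l_live]
        if not candidates:
            return False           # M hangs: xy is dead
        i = candidates[0]          # topmost l-live node below the old u_i
        if i in r_live:
            return True            # u_i is live: M accepts
```

Its answer coincides with the direct test `bool(l_live & r_live)`, i.e. with liveness of xy.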

Lemma 6

Every \(\textsc{2dfa}\) solving \(\textsc{owl}^2_h\) has \({>}\,\frac{1}{2} h + \frac{1}{4}\lg h - \frac{1}{2}\) states.

Proof

Let M be any \(\textsc{2dfa}\) solving \(\textsc{owl}^2_h\). Let S be its set of states and \(s:=|S|\).

For every \(\alpha \subseteq [h]\), let \(x_\alpha :=\{(u,u)\mid u\in \alpha \}\) be the symbol consisting of the “horizontal” edges which correspond to indices in \(\alpha \) (e.g., see the leftmost symbol in Fig. 1a). Clearly, for every \(\alpha ,\beta \subseteq [h]\), the string \(x_{\alpha }x_{\beta }\) is live iff \(\alpha \cap \beta \ne \emptyset \). We also easily verify the following.

Claim 1

For all distinct \(\alpha ,\beta \subseteq [h]\) \(:\)

(i) the strings \(x_\alpha x_{\overline{\alpha }}\) and \(x_\beta x_{\overline{\beta }}\) are dead, but

(ii) at least one of their two hybrids \(x_\alpha x_{\overline{\beta }}\) and \(x_\beta x_{\overline{\alpha }}\) is live.

Proof

Let \(\alpha ,\beta \subseteq [h]\) with \(\alpha \ne \beta \). Then (i) is obvious, since \(\alpha \cap \overline{\alpha }=\beta \cap \overline{\beta }=\emptyset \). For (ii), suppose both hybrids \(x_\alpha x_{\overline{\beta }}\) and \(x_\beta x_{\overline{\alpha }}\) are dead. Then \(\alpha \cap \overline{\beta }=\emptyset \) and \(\beta \cap \overline{\alpha }=\emptyset \); equivalently, \(\alpha \subseteq \beta \) and \(\beta \subseteq \alpha \); hence \(\alpha =\beta \), a contradiction.    \(\boxdot \)
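Since \(x_{\alpha }x_{\beta }\) is live iff \(\alpha \cap \beta \ne \emptyset \), Claim 1 reduces to elementary set algebra and can be checked exhaustively for small h. A sketch (helper names are ours):

```python
from itertools import combinations

def check_claim1(h):
    """Exhaustively verify Claim 1 over all distinct alpha, beta in 2^[h]."""
    universe = frozenset(range(1, h + 1))
    subsets = [frozenset(c) for k in range(h + 1)
               for c in combinations(universe, k)]
    for alpha in subsets:
        for beta in subsets:
            if alpha == beta:
                continue
            # (i) x_alpha x_~alpha and x_beta x_~beta are dead:
            assert not alpha & (universe - alpha)
            assert not beta & (universe - beta)
            # (ii) at least one of the two hybrids is live:
            assert alpha & (universe - beta) or beta & (universe - alpha)
    return True
```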

Now, for each \(\alpha \subseteq [h]\), consider the full computation of M on \(x_\alpha x_{\overline{\alpha }}\) and let \((L_\alpha ,R_\alpha )\) be its frontier over the middle boundary. This effectively defines \(2^h\) frontiers of M, one for each \(\alpha \). We claim that these are all distinct.

Claim 2

For all distinct \(\alpha ,\beta \subseteq [h]\) \(:\)   \((L_\alpha ,R_\alpha )\ne (L_\beta ,R_\beta )\) .

Proof

Let \(\alpha ,\beta \subseteq [h]\) with \(\alpha \ne \beta \). By Claim 1 and since M solves \(\textsc{owl}^2_h\), we know M decides identically on \(x_\alpha x_{\overline{\alpha }}\) and \(x_\beta x_{\overline{\beta }}\) (it does not accept) but differently on at least one of their hybrids \(x_\alpha x_{\overline{\beta }}\) and \(x_\beta x_{\overline{\alpha }}\) (it accepts). Without loss of generality, assume the interesting hybrid is \(x_\alpha x_{\overline{\beta }}\) (otherwise, swap the roles of the two strings). Then, by the Hybrid Rule (Lemma 2), we know \(L_\alpha \not \supseteq L_\beta \) or \(R_\alpha \not \subseteq R_\beta \). Therefore \(L_\alpha \ne L_\beta \) or \(R_\alpha \ne R_\beta \), namely \((L_\alpha ,R_\alpha )\ne (L_\beta ,R_\beta )\).    \(\boxdot \)

Overall, we conclude that M uses distinct frontiers on the \(2^h\) distinct dead strings of the form \(x_{\alpha }x_{\overline{\alpha }}\). Since it has only \(\left( {\begin{array}{c}2s{+}1\\ s{+}1\end{array}}\right) \) frontiers total, it follows that

$$\begin{aligned} \left( {\begin{array}{c}2s+1\\ s+1\end{array}}\right) \ge 2^h \,, \end{aligned}$$
(2)

and it remains to solve for s. Using Stirling’s well-known bounds

$$ \sqrt{2\pi }\sqrt{n}\cdot (\tfrac{n}{e})^n \;\le \; n! \;\le \; e\sqrt{n}\cdot (\tfrac{n}{e})^n $$

for the factorial function, we calculate:

$$ \left( {\begin{array}{c}2s{+}1\\ s{+}1\end{array}}\right) = \frac{(2s{+}1)!}{(s{+}1)!s!} = \frac{2s{+}1}{s{+}1}\cdot \frac{(2s)!}{(s!)^2} < 2\cdot \frac{ e\sqrt{2s}\cdot (2s/e)^{2s} }{ ( \sqrt{2\pi }\sqrt{s}\cdot (s/e)^s)^2 } = \frac{e\sqrt{2}}{\pi }\cdot \frac{2^{2s}}{\sqrt{s}} $$

so that (2) implies \((e\sqrt{2}/\pi )(2^{2s}/\sqrt{s})>2^h\), and thus

$$\begin{aligned} s > \tfrac{1}{2} h + \tfrac{1}{4}\lg s - \tfrac{1}{2}\lg (e\sqrt{2}/\pi )\,. \end{aligned}$$
(3)

Since \(s\ge 1\) and \(\tfrac{1}{2}\lg (e\sqrt{2}/\pi )\le 0.15\), this implies \(s>\tfrac{1}{2}h-0.15\), and thus \(s\ge \tfrac{1}{2}h\) (since s and h are both integers). So, \(\lg s\ge \lg h-1\). Substituting in (3), we get:

$$ s > \tfrac{1}{2} h + \tfrac{1}{4}\lg h - \tfrac{1}{2}\lg (2e/\pi )\,. $$

Finally, we note that \(\lg (2e/\pi )\approx 0.8<1\).    \(\square \)
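The bound just proven can also be confirmed numerically: for each h, the least s satisfying inequality (2) indeed exceeds \(\tfrac{1}{2}h+\tfrac{1}{4}\lg h-1\). A small sketch (the helper name is hypothetical, and the bound is slightly weakened using \(\lg (2e/\pi )<1\)):

```python
import math

def min_states(h):
    """Least s with C(2s+1, s+1) >= 2**h, as forced by inequality (2)."""
    s = 1
    while math.comb(2 * s + 1, s + 1) < 2 ** h:
        s += 1
    return s

for h in range(2, 80):
    # the proven lower bound, slightly weakened (lg(2e/pi) < 1)
    assert min_states(h) > h / 2 + math.log2(h) / 4 - 1
```
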

4 The Case of Three Symbols

To prove Theorem 1, that all smallest 2DFAs for one-way liveness on three symbols have \(\varTheta (h^2/\log h)\) states, we start with the lower bound (Lemma 7), which is simpler; then continue with the upper bound (Lemma 8), which is a bit more involved.

Lemma 7

Every 2DFA solving one-way liveness on three symbols has \({\ge }\,\frac{1}{2}\frac{h^2}{\lg h}\) states.

Proof

Let M be any 2DFA solving one-way liveness on three symbols. Let S be its set of states and \(s:=|S|\).

Let \({y}_1,{y}_2,\ldots ,{y}_{N}\) be a list of all symbols in the input alphabet \(\varSigma _h\). Since every symbol is an arbitrary subset of the \(h^2\) possible edges, we know \(N=2^{h^2}\). For each \(i=1,2,\dots ,N\), let \(\gamma _i:=\gamma _{M,y_i}\) be the behavior of M on \(y_i\).

Claim

For all distinct \(i,j\in [N]\) \(:\)   \(\gamma _i\ne \gamma _j\) .

Proof

Let \(i,j\in [N]\) with \(i\ne j\). Then \(y_i\ne y_j\). Hence, there exists at least one edge \((u,v)\) which appears in one of \(y_i,y_j\), but not the other. Without loss of generality, assume \((u,v)\) appears in \(y_i\) but not in \(y_j\). Let \(x:=\{(u,u)\}\) and \(z:=\{(v,v)\}\) be the symbols containing only the “horizontal” edges corresponding to u and v. Then the strings \(xy_iz\) and \(xy_jz\) are respectively a live and a dead instance of one-way liveness. Hence, M decides differently on them. By the Infix Rule (Lemma 4), it follows that \(\gamma _i\ne \gamma _j\).    \(\boxdot \)
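The construction in this claim is easy to test directly. A sketch, modeling a symbol as a set of edges over \([h]\times [h]\) and a three-symbol string as live iff a path crosses all three symbols (all names hypothetical):

```python
def live3(x, y, z):
    """A three-symbol string xyz is live iff some path crosses all three symbols."""
    col1 = {b for (_, b) in x}                 # column-1 nodes reachable through x
    col2 = {c for (b, c) in y if b in col1}    # column-2 nodes reachable through y
    return any(c in col2 for (c, _) in z)

# two middle symbols differing in the single edge (u, v) = (1, 2)
y_i = {(1, 2), (3, 3)}
y_j = {(3, 3)}
x = {(1, 1)}                                   # keeps only u = 1 alive on the left
z = {(2, 2)}                                   # keeps only v = 2 alive on the right
assert live3(x, y_i, z) and not live3(x, y_j, z)
```
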

Hence, M uses distinct behaviors on the \(\smash {2^{h^2}}\) possible middle symbols. Since it has only \((2s\,{+}\,1)^s\) single-symbol behaviors in total (cf. p. 9), it follows that

$$\begin{aligned} (2s{+}1)^s \ge 2^{h^2} \,, \end{aligned}$$
(4)

which implies the bound in the statement, as follows.

If \(h=2\), then (4) asks that \((2s{+}1)^s\ge 16\). Easily, this holds only if \(s\ge 2\), which matches the bound in the statement: \(\frac{1}{2}(h^2/\lg h)=\tfrac{1}{2}(4/1)=2\).

If \(h=3\), then similarly we need \((2s{+}1)^s\ge 512\), which holds only if \(s\ge 4\), which exceeds the bound in the statement: \(\frac{1}{2}(h^2/\lg h)=\tfrac{1}{2}(9/\lg 3)\approx 2.84\).

If \(h\ge 4\), then we first take logarithms to rewrite (4) as \(s\lg (2s{+}1)\ge h^2\). Towards a contradiction, we assume \(\smash {s<\frac{1}{2}\frac{h^2}{\lg h}}\) and calculate:

$$\begin{aligned} \lg (2s{+}1)< \lg (4s) < \lg \bigl ( 2\tfrac{h^2}{\lg h} \bigr ) = 2\lg h -(\lg \lg h-1) \le 2\lg h \,, \end{aligned}$$

where the first step uses the fact that \(2s{+}1<4s\) (since \(s\ge 1\)); the second step uses the assumption that \(\smash {s<\frac{1}{2}\frac{h^2}{\lg h}}\); and the last step uses the fact that \(\lg \lg h\ge 1\) (since \(h\ge 4\)). Therefore, we can conclude that

$$\begin{aligned} s\lg (2s{+}1) < \bigl ( \tfrac{1}{2}\tfrac{h^2}{\lg h} \bigr ) \bigl ( 2\lg h \bigr ) = h^2 \,, \end{aligned}$$

contrary to the rewriting of (4) above.    \(\square \)
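Again, the bound can be cross-checked numerically against inequality (4). A small sketch (the helper name is hypothetical):

```python
import math

def min_states3(h):
    """Least s with (2s+1)**s >= 2**(h*h), as forced by inequality (4)."""
    s = 1
    while (2 * s + 1) ** s < 2 ** (h * h):
        s += 1
    return s

for h in range(2, 16):
    # the bound of Lemma 7
    assert min_states3(h) >= h * h / (2 * math.log2(h))
```
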

Lemma 8

Some 2DFA solves one-way liveness on three symbols with \({\le }\,4h\lceil \frac{h}{\lfloor \lg h\rfloor }\rceil \) states.

Fig. 5.

The proof of Lemma 8 for the example \(h=32\), \(l=5\), \(m=7\), and \(r=3\). Relative to \(b_r\), the shown vertex \(v_j\) has connectivity \(\{2,3,5\}\).

Proof

Fix \(h\ge 2\) and consider an instance xyz of one-way liveness, for \(x,y,z\in \varSigma _h\) (Fig. 1c). Let \({u}_1,{u}_2,\ldots ,{u}_{h}\) and \({v}_1,{v}_2,\ldots ,{v}_{h}\) be the nodes of columns 1 and 2, respectively, from top to bottom. Similarly to the proof of Lemma 5, we say \(u_i\) is l-live if it has non-zero degree in x; and \(v_i\) is r-live if it has non-zero degree in z.

We first partition column 1 into \(h/\lg h\) blocks of length \(\lg h\) each. More carefully, we let \(l:=\lfloor \lg h\rfloor \) be the desired length, and assign every \(u_i\) to block \(b_{\lceil i/l\rceil }\). Easily, this produces \(m:=\lceil h/l\rceil \) disjoint blocks, \({b}_1,{b}_2,\ldots ,{b}_{m}\subseteq \{{u}_1,{u}_2,\ldots ,{u}_{h}\}\). For example, for \(h=5\), the desired length is \(l=2\) and we get the \(m=3\) blocks \(b_1=\{u_1,u_2\}\), \(b_2=\{u_3,u_4\}\), and \(b_3=\{u_5\}\). Note that, if l does not divide h, then the length of the last block \(b_m\) is not l, but only \(l':=h \bmod l\).
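The partition just described can be sketched directly (the helper name is hypothetical; nodes are identified with their indices):

```python
import math

def blocks(h):
    """Partition indices 1..h into m = ceil(h/l) blocks of length l = floor(lg h);
    the last block may be shorter, of length h mod l."""
    l = math.floor(math.log2(h))
    m = math.ceil(h / l)
    return [list(range((r - 1) * l + 1, min(r * l, h) + 1)) for r in range(1, m + 1)]

assert blocks(5) == [[1, 2], [3, 4], [5]]      # the h = 5 example from the text
assert len(blocks(32)) == 7                    # the Fig. 5 example: l = 5, m = 7
```
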

We say a block \(b_r\) is live if there exists a live path that passes through a node belonging to \(b_r\) (Fig. 5). Clearly, xyz is live iff some \(b_r\) is live. So, our 2DFA M simply searches for a live \(b_r\) sequentially, from \(b_1\) to \(b_m\).

Specifically, M consists of m disjoint sub-automata \({M}_1,{M}_2,\ldots ,{M}_{m}\), where every \(M_r\) uses states of the form \((r,\dots )\) and is responsible for checking whether the corresponding \(b_r\) is live. The start state of M is the start state of \(M_1\). From then on, the iteration works as follows. If the current \(M_r\) confirms that \(b_r\) is live, then it falls off \({\dashv }\) into the designated accept state \(q_a \) of M, causing M to accept, too. Otherwise, \(M_r\) arrives at a state and cell where it has confirmed that \(b_r\) is dead; from there, it either advances the iteration by moving to x and to the start state of \(M_{r+1}\), if \(r<m\); or just hangs, causing M to hang too, if \(r=m\).

Every \(M_r\) works in the same way as every other, except that it focuses on its own \(b_r\). So, all sub-automata are of the same size \(\tilde{s}\), causing M’s total size to be \(s=m\tilde{s}=O(h\tilde{s}/\log h)\). Hence, to achieve the \(O(h^2/\log h)\) size in the statement, we ensure that each \(M_r\) has size O(h). Here is how.

Checking a block. Fix any \(b_r\) and let \({\hat{u}}_1,{\hat{u}}_2,\ldots ,{\hat{u}}_{l}\) be its vertices, from top to bottom. In the case where \(b_r\) is the last block \(b_m\) and has only \(l'<l\) nodes, we pretend that the nodes \(\hat{u}_{l'{+}1},\dots ,\hat{u}_l\) exist but have degree zero in both x and y.

Now consider any \(v_j\) in column 2. (See Fig. 5.) The connectivity of \(v_j\) (relative to \(b_r\)) is the set \(\xi \subseteq [l]\) of the indices of all \(\hat{u}_i\) which connect to \(v_j\):

$$ i\in \xi \iff y~\text {contains the edge}~(\hat{u}_i,v_j) \,. $$

With this motivation, we call connectivity any subset of [l]. Clearly, the number of different connectivities is \(k:=2^l=2^{\lfloor \lg h\rfloor }\le 2^{\lg h}=h\). Let \({\xi }_1,{\xi }_2,\ldots ,{\xi }_{k}\) be any fixed listing of them, so that each \(v_j\) has its connectivity in this list.

We say that, relative to \(b_r\), a connectivity \(\xi _t\) is l-live, if for at least one \(i\in \xi _t\) the corresponding \(\hat{u}_i\) is l-live; r-live, if at least one of the \(v_j\) with connectivity \(\xi _t\) is r-live; live, if it is both l-live and r-live; and dead, if it is not live. Easily:

Claim

Block \(b_r\) is live iff at least one connectivity \(\xi _t\) is live relative to it.

Proof

Suppose \(b_r\) is live. Then some live path passes through it. Let \(\hat{u}_i\) and \(v_j\) be the nodes used by this path in \(b_r\) and in column 2, respectively, and \(\xi _t\) the connectivity of \(v_j\) relative to \(b_r\). Then \(\hat{u}_i\) is l-live and \(v_j\) is r-live (clearly); and \(i\in \xi _t\) (because of the edge from \(\hat{u}_i\) to \(v_j\)). Hence \(\xi _t\) is both l-live (because of \(\hat{u}_i\)) and r-live (because of \(v_j\)). So, \(\xi _t\) is live.

Conversely, suppose some \(\xi _t\) is live relative to \(b_r\). Let \(\hat{u}_i\) and \(v_j\) be witnesses for why it is l-live and r-live, respectively. Then (i) \(\hat{u}_i\) is l-live and (ii) \(i\in \xi _t\); and (iii) \(v_j\) is r-live and (iv) \(v_j\) has connectivity \(\xi _t\) relative to \(b_r\). By (ii) and (iv), we know the edge \((\hat{u}_i,v_j)\) exists. By (i) and (iii), we know this edge is actually on a live path through \(\hat{u}_i\) and \(v_j\). Hence, \(b_r\) is live.    \(\boxdot \)
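This claim, too, can be checked by brute force on random instances. A self-contained sketch (symbols as edge sets, a block as a list of column-1 nodes; all helper names hypothetical):

```python
import random
from itertools import product

def block_live_direct(x, y, z, block, h):
    """b_r is live iff some live path passes through a node of the block."""
    l_live = {b for (_, b) in x}               # l-live nodes of column 1
    r_live = {c for (c, _) in z}               # r-live nodes of column 2
    return any(u in l_live and (u, v) in y and v in r_live
               for u in block for v in range(1, h + 1))

def block_live_via_connectivities(x, y, z, block, h):
    """Claim: b_r is live iff some connectivity is both l-live and r-live."""
    l_live = {b for (_, b) in x}
    r_live = {c for (c, _) in z}
    conn = {v: frozenset(i for i, u in enumerate(block) if (u, v) in y)
            for v in range(1, h + 1)}          # connectivity of each v_j
    return any(any(block[i] in l_live for i in xi)                   # xi is l-live
               and any(v in r_live for v in conn if conn[v] == xi)   # xi is r-live
               for xi in set(conn.values()))

random.seed(0)
h, block = 6, [1, 2]
for _ in range(300):
    x, y, z = [{e for e in product(range(1, h + 1), repeat=2) if random.random() < 0.2}
               for _ in range(3)]
    assert block_live_direct(x, y, z, block, h) == \
           block_live_via_connectivities(x, y, z, block, h)
```
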

Therefore, our sub-automaton \(M_r\) simply searches for a live \(\xi _t\) sequentially, from \(\xi _1\) to \(\xi _k\). Intuitively, \(M_r\) consists of k sub-automata \(M_{r,1},M_{r,2},\dots ,M_{r,k}\), where every \(M_{r,t}\) is responsible for checking whether the corresponding \(\xi _t\) is live. The machine starts on x and in the start state of \(M_{r,1}\). From then on, the iteration works as follows. If the current \(M_{r,t}\) confirms that \(\xi _t\) is live, then it falls off \({\dashv }\) into \(q_a \), causing \(M_r\) to behave exactly as promised above. Otherwise, \(M_{r,t}\) “halts” on x or y, in the sense that it enters a state where it has confirmed that \(\xi _t\) is dead. From there: if \(t<k\), then it advances the iteration over all connectivities by moving to x and to the start state of \(M_{r,t+1}\); otherwise (\(t=k\)), it either moves to x and to the start state of \(M_{r+1}\), if \(r<m\), or just hangs, if \(r=m\), causing \(M_r\) to behave exactly as promised above.

Checking a connectivity. Every \(M_{r,t}\) starts on x in state \((r,t,\textsc {l} )\), trying to decide whether \(\xi _t\) is l-live. For this, it checks if any of the \(\hat{u}_i\) in \(b_r\) with \(i\in \xi _t\) have non-zero degree in x. If not, then \(\xi _t\) is dead, so \(M_{r,t}\) halts (on x). Else, it moves to y in state \((r,t,\textsc {r} )\), to check if \(\xi _t\) is also r-live. Reading y, it computes the connectivities of all \(v_j\) relative to \(b_r\), and focuses only on those \(v_j\) with connectivity \(\xi _t\). If no such \(v_j\) exist, then \(\xi _t\) is dead, so \(M_{r,t}\) again halts (on y). Else, \(M_{r,t}\) iterates over all such \(v_j\), from top to bottom, moving to z to check if any of them is r-live.

Let us first see a naive way to implement this last iteration: From y and in state \((r,t,\textsc {r} )\), \(M_{r,t}\) finds the topmost \(v_j\) with connectivity \(\xi _t\) and moves to z in state \((r,t,j,\top )\). Reading z, it checks whether \(v_j\) is r-live. If so, then it moves to \({\dashv }\) into \(q_a \) and then off \({\dashv }\) into \(q_a \). Otherwise, it moves back to y in a state \((r,t,j,\bot )\). Reading y, it finds the topmost \(v_{j'}\) with \(j'>j\) and connectivity \(\xi _t\). If none exists, then it halts (on y). Else, it moves to z in state \((r,t,j',\top )\), and so on. Of course, this implementation needs 2h states in each \(M_{r,t}\) (two for each \(j=1,\ldots ,h\)), causing the total size of \(M_{r}\) to rise to \(\varTheta (h^2)\), whereas our goal is only O(h).

The smart implementation works similarly to the naive one, except for the following modification. Every time \(M_{r,t}\) moves to z to examine the r-liveness of some \(v_j\), its state is not \((r,t,j,\top )\), but just \((r,j,\top )\). In other words, whenever on z, the automaton “forgets” the connectivity \(\xi _t\) that it is responsible for. Of course, this creates no problem in checking whether \(v_j\) is r-live (since this needs only j). Still, if \(v_j\) is not r-live and the automaton returns to y, then it is unclear how it will continue its iteration over the remaining nodes of connectivity \(\xi _t\) (below \(v_j\)), let alone the iteration over the remaining connectivities (after \(\xi _t\)). How will it manage to do all this, if it has forgotten \(\xi _t\)?

The answer is that the automaton can recover the forgotten \(\xi _t\) from (i) what it still remembers and (ii) the edges in y. Specifically, whenever \(v_j\) is not r-live, the automaton moves from z and state \((r,j,\top )\) back on y in state \((r,j,\bot )\). Reading y, it finds all edges between the nodes of \(b_r\) and \(v_j\); from these, it computes the connectivity of \(v_j\) relative to \(b_r\), which is exactly \(\xi _t\). Hence, having recovered \(\xi _t\), it continues its iteration as if it had never forgotten it.

Put another way, the trick is that \(M_{r,t}\) is not disjoint from the rest of the sub-automata \(M_{r,{\,.\,}}\). Although it uses its own states \((r,t,\textsc {l} )\) and \((r,t,\textsc {r} )\) on x, y, it shares with the rest of the \(M_{r,{\,.\,}}\) the states \((r,j,\bot )\) and \((r,j,\top )\) on y, z.

Overview. Returning to M, we see that it has states of the form \((r,t,\textsc {l} )\), \((r,t,\textsc {r} )\), \((r,j,\bot )\), and \((r,j,\top )\), where \(r\in [m]\), \(t\in [k]\), \(j\in [h]\). Namely, its set of states is

$$ \bigl ( [m]\times [k]\times \{\textsc {l} ,\textsc {r} \} \bigr ) \;\cup \; \bigl ( [m]\times [h]\times \{\bot ,\top \} \bigr ) \,, $$

for a total size of \(2mk+2mh\le 4mh\), as promised in the statement.
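The count \(2mk+2mh\le 4mh\) is easy to check numerically (the helper name is hypothetical):

```python
import math

def size_of_M(h):
    """Total number of states 2mk + 2mh of the machine M of Lemma 8."""
    l = math.floor(math.log2(h))
    m = math.ceil(h / l)
    k = 2 ** l                                 # number of connectivities; k <= h
    return 2 * m * k + 2 * m * h

for h in range(2, 300):
    l = math.floor(math.log2(h))
    m = math.ceil(h / l)
    assert size_of_M(h) <= 4 * h * m           # the bound 4h * ceil(h / floor(lg h))

assert size_of_M(32) == 896                    # Fig. 5's example: l = 5, m = 7, k = 32
```
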

State \((1,1,\textsc {l} )\) is simultaneously the start and accept state: \(q_s =q_a :=(1,1,\textsc {l} )\). When in it and on \({\vdash }\) or \({\dashv }\), the machine stays in the state and moves right (either to x, to start the search for a live block; or off \({\dashv }\), to accept). This is the only state used on \({\vdash }\) and \({\dashv }\). All other states are used only on x, if of the form \((r,t,\textsc {l} )\); or only on y, if of the form \((r,t,\textsc {r} )\) or \((r,j,\bot )\); or only on z, if of the form \((r,j,\top )\).

When in \((r,t,\textsc {l} )\) reading x, the machine checks if some node in \(b_r\) with non-zero degree in x has its index in \(\xi _t\) (i.e., \(\xi _t\) is l-live relative to \(b_r\)). If so, then M moves to y and in \((r,t,\textsc {r} )\) to check if \(\xi _t\) is also r-live relative to \(b_r\). Otherwise, M “continues the search from x”, meaning that: it moves to (stays on) x and enters \((r,t{+}1,\textsc {l} )\) to advance the iteration over connectivities, if \(t<k\); or moves to (stays on) x and enters \((r{+}1,1,\textsc {l} )\) to advance the iteration over blocks, if \(t=k\) and \(r<m\); or hangs immediately to signify rejection, if \(t=k\) and \(r=m\).

When in \((r,t,\textsc {r} )\) reading y, the machine checks if any node of column 2 has connectivity \(\xi _t\) relative to \(b_r\). If so, then M finds the topmost such node \(v_j\) and moves to z and in \((r,j,\top )\) to check if \(v_j\) is r-live. Otherwise (\(\xi _t\) is not r-live), M continues the search from x (as above).

When in \((r,j,\top )\) reading z, the machine checks if node \(v_j\) has non-zero degree in z. If so (\(b_r\) is live), then M moves to \({\dashv }\) and in \(q_a \) to signify acceptance. Else, it moves back to y in \((r,j,\bot )\) to advance the iteration over nodes which share with \(v_j\) the same connectivity relative to \(b_r\).

When in \((r,j,\bot )\) reading y, the machine finds the connectivity \(\xi _t\) of node \(v_j\) relative to \(b_r\) and searches for the topmost \(v_{j'}\) with \(j'>j\) and the same connectivity relative to \(b_r\). If such \(v_{j'}\) exists, then M moves to z and in \((r,j',\top )\) to check if \(v_{j'}\) is r-live. Otherwise, M continues the search from x (as above).

This concludes the definition of the transition function of M. Note that our description uses transitions (out of the states \((r,t,\textsc {l} )\)) which do not move the input head, contrary to our definition of 2DFAs as machines which always move (Sect. 2); but this violation can be easily removed, with a standard, well-known technique that does not change the number of states.

Other than that, it should be clear that M decides one-way liveness on three symbols, as promised.    \(\square \)

5 Conclusion

Motivated by recent work on reasonable automata [1, 3], we initiated a study of the size of arbitrary 2DFAs solving one-way liveness on inputs of constant length. We gave (i) a more detailed proof of the known fact (from [1]) that \(\varTheta (h)\) states are necessary and sufficient for length 2; and (ii) a proof of the new fact that \(\varTheta (h^2/\log h)\) states are necessary and sufficient for length 3. This concludes the discussion for these two lengths and for the asymptotic size.

For longer lengths, the question remains open. For example, we can still ask: What is the size of a smallest 2DFA solving one-way liveness on four symbols? The answer is known to be both \(O(h^2)\) and \(\varOmega (h^2/\log h)\) (by the application of Savitch’s method in [3, Theorems 6 and 7]; and our Lemma 7, which clearly also extends to four symbols). However, the tight asymptotic growth remains elusive.

At the same time, we still do not know the exact sizes of the smallest 2DFAs for two and three symbols. Finding those sizes would also be very interesting.