1 Introduction

Well ordering principles assert that certain (computable) transformations of linear orders preserve well foundedness. Historically, the first example concerns the transformation of a linear order X into the set

$$\begin{aligned} \omega (X):=\{\langle x_0,\ldots ,x_{n-1}\rangle \,|\,x_0,\ldots ,x_{n-1}\in X\text { and }x_{n-1}\le _X\cdots \le _Xx_0\} \end{aligned}$$

of finite non-increasing sequences in X, ordered lexicographically. As shown by Girard [26, Theorem 5.4.1] and Hirst [27], the statement that ‘\(\omega (X)\) is well founded whenever the same holds for X’ is equivalent to a set existence principle known as arithmetical comprehension. The latter is, in turn, equivalent to important mathematical results such as the Arzelà-Ascoli theorem or the infinite Ramsey theorem (for tuples with a fixed number of at least three elements). To make clear that these equivalences are informative, we point out that they are established in a weak base system \(\textsf{RCA}_0\) (‘recursive comprehension axiom’). They are part of a research programme known as ‘reverse mathematics’, developed by Friedman [23] and Simpson (see his textbook [55] for a comprehensive introduction).
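Let us illustrate the order on \(\omega (X)\) in executable form. The following Python sketch is our own addition; it takes \(X={\mathbb {N}}\) with its usual order as a stand-in for an arbitrary linear order, and the function names are hypothetical.

```python
def omega_leq(s, t):
    """Compare two elements of omega(X) lexicographically, where an
    element is a finite non-increasing tuple over X = N with its
    usual order (a stand-in for an arbitrary linear order X)."""
    for a, b in zip(s, t):
        if a != b:
            return a < b
    # one sequence is a prefix of the other: the shorter one is smaller
    return len(s) <= len(t)

def in_omega(s):
    """Membership test for omega(X): s must be non-increasing."""
    return all(s[i] >= s[i + 1] for i in range(len(s) - 1))
```

In ordinal terms, \(\langle x_0,\ldots ,x_{n-1}\rangle \) stands for the sum \(\omega ^{x_0}+\cdots +\omega ^{x_{n-1}}\), and the lexicographic comparison above matches the order on such sums.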

The literature contains many more equivalences between well ordering principles, statements about set existence, and mathematical theorems [2, 37, 45, 47, 48, 50, 58]. At the same time, there is a fundamental limitation: The statement that ‘X is well founded’ has complexity \(\Pi ^1_1\) (one universal quantification over infinite sets). Given a computable transformation D of linear orders, the principle that ‘D(X) is well founded whenever the same holds for X’ will thus be \(\Pi ^1_2\) (‘for all—exists’). It is known that principles of this form cannot be equivalent to more abstract set existence statements, such as the principle of \(\Pi ^1_1\)-comprehension from reverse mathematics or the ‘minimal bad sequence lemma’ of Nash-Williams [39] (see the analysis by Marcone [36]).

To overcome this limitation, one can consider order transformations of higher type, which have other transformations as arguments or values. More precisely, the latter should be dilators in the sense of Girard [24], i. e., particularly uniform transformations \(X\mapsto D(X)\) of well orders (see below for details). In the prime example from the literature, a given dilator D is transformed into a linear order \(\vartheta (D)\) that represents a relativized Bachmann–Howard ordinal (details below). The statement that ‘\(\vartheta (D)\) is well founded for every dilator D’ is equivalent to the principle of \(\Pi ^1_1\)-comprehension, as shown by the first author [10,11,12,13]. For related work by the second author we refer to [46] and to Section 6 of the earlier paper [45]. The equivalence with \(\Pi ^1_1\)-comprehension had been conjectured in A. Montalbán’s list of ‘Open questions in reverse mathematics’ [38].

The cited result on \(\vartheta (D)\) has become the basis for an analysis of the minimal bad sequence lemma in terms of a uniform Kruskal theorem [21], for a new approach to Friedman’s gap condition [14, 19], for another equivalence that involves patterns of resemblance [18] (which resolves a further open question from Montalbán’s list [38]), for work on a functorial version of the fast-growing hierarchy [3], and for a result on inverse Goodstein sequences [59]. These applications show why well ordering principles are relevant: they connect very intricate constructions from proof theory to reverse mathematics, set theory, and core mathematics. The present paper shows that these connections extend far beyond the existing literature. Specifically, we will study iterated \(\Pi ^1_1\)-comprehension or, equivalently, hierarchies of admissible sets. In particular, we will obtain a characterization of \(\Pi ^1_1\)-transfinite recursion, which is equivalent to mathematical results such as the Galvin-Prikry theorem from Ramsey theory (as shown by Tanaka [56]). We will also characterize the statement that ‘every set is contained in a countable \(\beta \)-model of \(\Pi ^1_1\)-comprehension’, which solves an important case of the general Conjecture 6.1 from [45]. Analogously to the applications of [11, 13] that were mentioned at the beginning of this paragraph, the present paper has already been used to prove an equivalence between \(\Pi ^1_1\)-transfinite recursion and a uniform Kruskal-Friedman theorem with gap condition (see [35] for the combinatorial result and [17] for the analysis in reverse mathematics).

Let us recall some terminology that is needed to state our result. We write \(\textsf{LO}\) for the category with linear orders as objects and embeddings (strictly increasing functions) as morphisms. By \([\cdot ]^{<\omega }\) we denote the finite subset functor on the category of sets, with

$$\begin{aligned} {[}X]^{<\omega }&:=\text {`the set of finite subsets of }~X\text {'},\\ [f]^{<\omega }(a)&:=\{f(x)\,|\,x\in a\}\quad \text {(for } f:X\rightarrow Y \text { and } a\in [X]^{<\omega }). \end{aligned}$$

We will suppress the forgetful functor from linear orders to sets. In the following definition, this allows us to view both D and \([\cdot ]^{<\omega }\) as functors from linear orders to sets, so that we can consider a natural transformation between them (i. e., a family of maps \({\text {supp}}_X:D(X)\rightarrow [X]^{<\omega }\) such that \([f]^{<\omega }\circ {\text {supp}}_X={\text {supp}}_Y\circ D(f)\) holds for any embedding \(f:X\rightarrow Y\) of linear orders). By \({\text {rng}}(f)\) we denote the range (in the sense of ‘image’) of a function f.

Definition 1.1

A predilator consists of a functor \(D:\textsf{LO}\rightarrow \textsf{LO}\) and a natural transformation \({\text {supp}}:D\Rightarrow [\cdot ]^{<\omega }\) such that the ‘support condition’

$$\begin{aligned} {\text {rng}}(D(f))=\{\sigma \in D(Y)\,|\,{\text {supp}}_Y(\sigma )\subseteq {\text {rng}}(f)\} \end{aligned}$$

is satisfied for every embedding \(f:X\rightarrow Y\) of linear orders. If D(X) is well founded for any well order X, then D (together with \({\text {supp}}\)) is a dilator.

Girard additionally demands that \(D(f)\le D(g)\) follows from \(f\le g\) (pointwise inequalities between morphisms), which is automatic for dilators but not for predilators (see [24, Proposition 2.3.10] or also [21, Lemma 5.3]). Apart from this, our definition is equivalent to Girard’s, which does not mention supports but demands that D preserves direct limits and pullbacks (see [10, Remark 2.2.2]). Predilators are determined by their restrictions to the category of finite orders, essentially because any linear order is the union of its finite suborders. As observed by Girard, this allows us to treat predilators as sets (rather than proper classes) and to represent them in reverse mathematics (assuming their values on finite orders are countable). To make the present paper more readable, we will not work with representations explicitly. The reader who desires a detailed formalization of our considerations in reverse mathematics will find a blueprint in [13, Section 2].
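Since predilators are determined by their behaviour on finite orders, the support condition of Definition 1.1 can be tested on finite data. The following Python sketch is our own illustration for the transformation \(\omega (\cdot )\) with the supports from Example 1.5; it truncates \(\omega (Y)\) at a fixed sequence length and identifies finite orders with lists of natural numbers, so all names and bounds are assumptions for the sake of the example.

```python
from itertools import product as cartesian

def omega_elems(Y, max_len):
    """Finite fragment of omega(Y): non-increasing sequences over the
    finite order Y (a list of naturals in increasing order), truncated
    at length max_len."""
    out = [()]
    for n in range(1, max_len + 1):
        out += [s for s in cartesian(Y, repeat=n)
                if all(s[i] >= s[i + 1] for i in range(n - 1))]
    return out

def omega_map(f, sigma):
    """Functorial action omega(f) applied to a sequence sigma."""
    return tuple(f[x] for x in sigma)

def supp(sigma):
    """Support of a sequence: the set of its entries."""
    return set(sigma)

# Support condition on a finite fragment: the image of omega(f) consists
# exactly of those sigma in omega(Y) whose support lies in rng(f).
X, Y = [0, 1], [0, 1, 2]
f = {0: 0, 1: 2}  # an embedding of X into Y
image = {omega_map(f, s) for s in omega_elems(X, 3)}
by_support = {s for s in omega_elems(Y, 3)
              if supp(s) <= set(f.values())}
assert image == by_support
```

The truncation is harmless here because \(\omega (f)\) preserves sequence length, so both sides of the support condition are cut off consistently.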

The aforementioned characterization of \(\Pi ^1_1\)-comprehension can now be made more precise. For a subset a and an element y of a linear order X, we write

$$\begin{aligned} a\subseteq _X y\quad :\Leftrightarrow \quad x<_X y\text { for all } x\in a. \end{aligned}$$

This fits with the usual identification of ordinals with their sets of predecessors. The following notion—first defined in [11]—is inspired by Rathjen’s notation system for the Bachmann–Howard ordinal (see [49]).

Definition 1.2

A Bachmann–Howard collapse for a predilator D consists of a linear order X and a function \(\vartheta :D(X)\rightarrow X\) such that

(i) \(\sigma <_{D(X)}\tau \) and \({\text {supp}}_X(\sigma )\subseteq _X\vartheta (\tau )\) entail \(\vartheta (\sigma )<_X\vartheta (\tau )\),
(ii) we have \({\text {supp}}_X(\sigma )\subseteq _X\vartheta (\sigma )\) for all \(\sigma \in D(X)\).

If such a \(\vartheta \) exists, we call X a Bachmann–Howard fixed point of D.
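On finite data, clauses (i) and (ii) of Definition 1.2 amount to a direct test. The following Python sketch is our own illustration with hypothetical names; the orders on X and D(X), the supports, and the candidate collapse \(\vartheta \) are passed as inputs.

```python
def is_bh_collapse(X_lt, DX_lt, supp_X, theta):
    """Test clauses (i) and (ii) of Definition 1.2 on finite data.
    X_lt(x, y) and DX_lt(s, t) are the strict orders on X and D(X),
    supp_X[s] is the finite support of s, theta[s] the candidate value."""
    def below(a, y):  # a subseteq_X y, i.e. x <_X y for all x in a
        return all(X_lt(x, y) for x in a)

    elems = list(theta)
    for s in elems:
        if not below(supp_X[s], theta[s]):  # clause (ii)
            return False
        for t in elems:                     # clause (i)
            if DX_lt(s, t) and below(supp_X[s], theta[t]) \
                    and not X_lt(theta[s], theta[t]):
                return False
    return True
```

For instance, with empty supports a collapse must simply be order preserving, so the identity on a finite order passes the test while a non-injective map fails clause (i).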

In [13, Section 4] it is shown that any predilator D has a minimal Bachmann–Howard fixed point \(\vartheta (D)\), which is computable with a representation of D as oracle. We can now give a precise formulation of the result that was mentioned above.

Theorem 1.3

([11, 13]) The following are equivalent over \(\textsf{RCA}_0\):

(i) \(\Pi ^1_1\)-comprehension,
(ii) any dilator has a well founded Bachmann–Howard fixed point,
(iii) if D is a dilator, then \(\vartheta (D)\) is well founded.

Let us point out that (ii) and (iii) have different virtues. Since \(D\mapsto \vartheta (D)\) is a computable transformation, statement (iii) is a well ordering principle of higher type, as discussed above. The explicit construction of \(\vartheta (D)\) reveals that the strength of (ii) lies in well foundedness, not in the existence of Bachmann–Howard fixed points as linear orders. On the other hand, statement (ii) has the advantage that it is very easy to formulate. This demonstrates another advantage of well ordering principles: they allow us to condense central ideas of ordinal analysis into elegant set theoretic principles. With a grain of salt, we suggest viewing these principles as ‘large cardinal axioms’ in the computable realm.

We now describe how Theorem 1.3 will be generalized in the present paper. The product \(X\times Y\) of linear orders is defined as usual, namely by

$$\begin{aligned} (x,y)<_{X\times Y}(x',y')\quad :\Leftrightarrow \quad x<_X x' \text { or } (x=x' \text { and } y<_Y y'). \end{aligned}$$

Given functions \(f:X\rightarrow X'\) and \(g:Y\rightarrow Y'\), we define \(f\times g:X\times Y\rightarrow X'\times Y'\) by \((f\times g)(x,y):=(f(x),g(y))\). Let us note that we omit one pair of parentheses in the expression \((f\times g)((x,y))\) to improve readability. If f or g is the identity on \(X=X'\) or \(Y=Y'\), respectively, we write \(X\times g\) or \(f\times Y\) rather than \(f\times g\). By Example 1.5, the following generalizes the \(\psi \)-functions of Buchholz [6].

Definition 1.4

Given a well order \(\nu \) and a predilator D, a \(\nu \)-collapse for D consists of a linear order X and an embedding \(\pi :X\rightarrow \nu \times D(X)\) with the following two properties: First, we demand that the relation \(\vartriangleleft \) on X that is given by

$$\begin{aligned} s\vartriangleleft t\quad :\Leftrightarrow \quad s\in {\text {supp}}_X(\tau )\text { for }\pi (t)=(\alpha ,\tau ) \end{aligned}$$

admits a height function \(h:X\rightarrow {\mathbb {N}}\) with \(h(s)<h(t)\) for any \(s\vartriangleleft t\) (think of s as a subterm of t). For \(\gamma <\nu \), we use recursion along \(\vartriangleleft \) to define \(G^D_\gamma :X\rightarrow [D(X)]^{<\omega }\) and simultaneously \(G_\gamma :D(X)\rightarrow [D(X)]^{<\omega }\) by

$$\begin{aligned} G^D_\gamma (t)&:={\left\{ \begin{array}{ll} \{\tau \}\cup G_\gamma (\tau ) &{} \text {if }\pi (t)=(\alpha ,\tau )\text { with }\alpha \ge \gamma ,\\ \emptyset &{} \text {if }\pi (t)=(\alpha ,\tau )\text { with }\alpha <\gamma , \end{array}\right. }\\ G_\gamma (\tau )&:=\bigcup \{G^D_\gamma (s)\,|\,s\in {\text {supp}}_X(\tau )\}. \end{aligned}$$

Secondly, we now demand that \(\pi \) has range

$$\begin{aligned} {\text {rng}}(\pi )=\{(\alpha ,\tau )\in \nu \times D(X)\,|\,G_\alpha (\tau )\subseteq _{D(X)}\tau \}. \end{aligned}$$

If such a \(\pi \) exists, we say that X is a \(\nu \)-fixed point of D.

In the presence of weak Kőnig’s lemma, the existence of our height function h is equivalent to the well foundedness of \(\vartriangleleft \) (since supports are finite). The given formulation of the definition has the advantage that \(G^D_\gamma \) and \(G_\gamma \) can be constructed even over \(\textsf{RCA}_0\) (as kindly pointed out by Patrick Uftring). We will see that the existence of well founded \(\nu \)-fixed points entails principles that are far stronger than Kőnig’s lemma.
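When \(\pi \) and the supports are given as finite data, the recursion for \(G^D_\gamma \) and \(G_\gamma \) is effective. The following Python sketch is our own illustration with hypothetical names; termination of the mutual recursion corresponds precisely to the existence of the height function h.

```python
def make_G(pi, supp_X, gamma):
    """Return G^D_gamma : X -> [D(X)]^{<omega} together with
    G_gamma : D(X) -> [D(X)]^{<omega} as in Definition 1.4.
    pi[t] = (alpha, tau) encodes the collapse, supp_X[tau] is the
    finite support; the recursion descends along the relation s <| t,
    so it terminates whenever a height function h exists."""
    def G_D(t):
        alpha, tau = pi[t]
        if alpha >= gamma:
            return {tau} | G(tau)
        return set()

    def G(tau):
        out = set()
        for s in supp_X[tau]:
            out |= G_D(s)
        return out

    return G_D, G

# A toy collapse with two terms a <| b (all names are hypothetical):
pi = {'a': (0, 'T0'), 'b': (1, 'T1')}
supp_X = {'T0': set(), 'T1': {'a'}}
```

In the toy example, \(G^D_1\) discards the term collapsed at stage 0, while \(G^D_0\) collects both, in accordance with the case distinction on \(\alpha \ge \gamma \).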

Instead of \(\pi \), we will often consider its partial inverse \(\psi :\nu \times D(X)\rightarrow _p X\), which can be seen as a collapsing function in the sense of impredicative ordinal analysis (see the following example). While some readers may prefer to reformulate the definition in terms of \(\psi \), we feel that the use of \(\pi \) has notational advantages. Note that we cannot expect \(\psi \) to be total, because the order type of \(\nu \times D(X)\) will typically exceed the one of X. Very roughly, the condition on \({\text {rng}}(\pi )\) ensures that \(\psi \) has a large domain of definition. Given that \(\pi \), and hence \(\psi \), is order preserving, this means that X must have large order type.

Example 1.5

To turn the transformation \(X\mapsto \omega (X)\) into a dilator, we declare

$$\begin{aligned} \omega (f)(\langle x_0,\ldots ,x_{n-1}\rangle )&:=\langle f(x_0),\ldots ,f(x_{n-1})\rangle ,\\ {\text {supp}}^\omega _X(\langle x_0,\ldots ,x_{n-1}\rangle )&:=\{x_0,\ldots ,x_{n-1}\}. \end{aligned}$$

Consider Buchholz’ order \(\textsf{OT}\) from [6, Section 2], and let \({\textsf{P}}\subseteq \textsf{OT}\) be the suborder of principal terms, which have the form \(D_\alpha t\) with \(\alpha <\omega +1\) and \(t\in \textsf{OT}\). Let us note that a principal term of the indicated form represents a value of a function \(\psi _\alpha \) that ‘collapses’ large ordinals below the \(\alpha \)-th regular uncountable cardinal. All such values are additively principal ordinals (i. e., have the form \(\omega ^\gamma )\), and the remaining terms in \(\textsf{OT}\) represent finite sums of them. Our aim here is to show that \({\textsf{P}}\) is an \((\omega +1)\)-fixed point of the dilator \(\omega (\cdot )\). Up to the obvious isomorphism \(\textsf{OT}\cong \omega ({\textsf{P}})\), we can define \(\pi :{\textsf{P}}\rightarrow (\omega +1)\times \omega ({\textsf{P}})\) by \(\pi (D_\alpha t):=(\alpha ,t)\). Clause (\({\prec }2\)) from the cited paper by Buchholz ensures that \(\pi \) is an embedding. Given \(s\vartriangleleft D_\alpha t\) for a term \(t=\langle t_0,\ldots ,t_{n-1}\rangle \), we invoke the definition of \(\vartriangleleft \) to get

$$\begin{aligned} s\in {\text {supp}}^\omega _{{\textsf{P}}}(t)=\{t_0,\ldots ,t_{n-1}\}. \end{aligned}$$

The latter entails that s is a subterm of \(D_\alpha t\) (in the usual sense), which ensures that \(\vartriangleleft \) is well founded. The isomorphism \(\textsf{OT}\cong \omega ({\textsf{P}})\) identifies \(t\in {\textsf{P}}\subseteq \textsf{OT}\) with the element \(\langle t\rangle \in \omega ({\textsf{P}})\). Up to this identification, the function \(G_\gamma :\omega ({\textsf{P}})\rightarrow [\omega ({\textsf{P}})]^{<\omega }\) from Definition 1.4 is an extension of \(G^\omega _\gamma :{\textsf{P}}\rightarrow [\omega ({\textsf{P}})]^{<\omega }\). Based on this observation, one readily checks that our function \(G_\gamma \) coincides with \(G_\gamma :\textsf{OT}\rightarrow [\textsf{OT}]^{<\omega }\) as defined by Buchholz, still modulo \(\textsf{OT}\cong \omega ({\textsf{P}})\). In view of Buchholz’ clause (\(\textsf{OT}3\)), it follows that \(\pi \) has range as required by Definition 1.4.

In Sect. 2, we explicitly construct a \(\nu \)-fixed point \(\psi _\nu (D)\) of a given predilator D. More precisely, the order \(\psi _\nu (D)\) will be given as a term system that is computable relative to \(\nu \) and D, so that its existence is known in the axiom system \(\textsf{RCA}_0\). We will also show that \(\psi _\nu (D)\) is isomorphic to any other \(\nu \)-fixed point of D, so that \(\nu \)-fixed points are essentially unique. This confirms the significance of Example 1.5. Let us now state our main result, which is further explained below. The proof spans most of our paper and will be completed in Sect. 9.

Theorem 1.6

Provably in \(\textsf{RCA}_0\), the following principles are equivalent for any infinite well order \(\nu \):

(i) \(\Pi ^1_1\)-recursion along \(\nu \),
(ii) any dilator has a well founded \(\nu \)-fixed point,
(iii) if D is a dilator, then \(\psi _\nu (D)\) is well founded.

Over \(\mathsf {ATR_0^{set}}\), statements (i) to (iii) are also equivalent to the following:

(iv) for any set u, there is a sequence of admissible sets \(\textsf{Ad}_\alpha \ni u\) for \(\alpha <\nu \), such that \(\alpha<\beta <\nu \) entails \(\textsf{Ad}_\alpha \in \textsf{Ad}_\beta \) (where we consider \(\nu \) as an ordinal).

The restriction to infinite \(\nu \) is convenient, because it will allow us to reduce to the case where \(\nu \) is of limit type. In \(\textsf{RCA}_0\) one can also prove the equivalence for \(\nu =1\) and hence for each finite \(\nu \) that is fixed externally, as we shall see in Corollary 4.4 (based on Theorem 1.3). What we will not show is that \(\textsf{RCA}_0\) proves the equivalence uniformly for all finite \(\nu \). We believe that this could be established by our methods, but this would seem to require a separate treatment of the successor case, which we were keen to avoid.

Let us now explain statement (i) from Theorem 1.6. Given \(Y\subseteq {\mathbb {N}}\) and \(\alpha <\nu \), we write \(Y_\alpha \) for the set of all \(x\in {\mathbb {N}}\) such that (the Cantor code of) the pair \(\langle \alpha ,x\rangle \) is contained in Y. In other words, we view Y as a representation of the sequence of sets \(Y_\alpha \subseteq {\mathbb {N}}\) with \(\alpha <\nu \). Its initial segments are represented by the sets

$$\begin{aligned} Y_{<\alpha }:=\{\langle \gamma ,x\rangle \in Y\,|\,\gamma <\alpha \}=\{\langle \gamma ,x\rangle \in \alpha \times \mathbb N\,|\,x\in Y_\gamma \}\subseteq {\mathbb {N}}. \end{aligned}$$

For a formula \(\varphi (x,\alpha ,X)\), possibly with further parameters, let \(H_\varphi (Y)\) be given by (the obvious formalization of)

$$\begin{aligned} H_\varphi (Y)\quad :\Leftrightarrow \quad Y_\alpha =\{x\in \mathbb N\,|\,\varphi (x,\alpha ,Y_{<\alpha })\}\text { for all }\alpha <\nu . \end{aligned}$$

More intuitively, this expresses that the sets \(Y_\alpha \subseteq {\mathbb {N}}\) are built by recursion along \(\nu \), where \(\varphi \) determines the recursion step. Let us recall that \(\Pi ^1_1\)-formulas have the form \(\forall X\subseteq \mathbb N.\,\theta \) for a formula \(\theta \) that contains quantifiers \(\forall n\in {\mathbb {N}}\) and \(\exists n\in {\mathbb {N}}\) only. Statement (i) from Theorem 1.6 is the axiom schema that consists of all statements

$$\begin{aligned} \forall x_1,\ldots ,x_m\in {\mathbb {N}}\,\forall X_1,\ldots ,X_n\subseteq {\mathbb {N}}\,\exists Y\subseteq {\mathbb {N}}.\, H_\varphi (Y) \end{aligned}$$

for a \(\Pi ^1_1\)-formula \(\varphi \) with number and set parameters \(x_1,\ldots ,x_m\) and \(X_1,\ldots ,X_n\).
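The coding of recursions along \(\nu \) can be made concrete as follows. In this Python sketch (our own illustration), we represent Y as a set of pairs rather than Cantor codes, the function pair exhibits one standard Cantor coding, and the recursion step is computable, whereas the actual schema allows \(\Pi ^1_1\)-steps far beyond computation.

```python
def pair(a, x):
    """One standard Cantor code of the pair <a, x>."""
    return (a + x) * (a + x + 1) // 2 + x

def slice_at(Y, alpha):
    """Y_alpha = {x | <alpha, x> in Y}."""
    return {x for (g, x) in Y if g == alpha}

def initial_segment(Y, alpha):
    """Y_{<alpha} = {<gamma, x> in Y | gamma < alpha}."""
    return {(g, x) for (g, x) in Y if g < alpha}

def recurse(nu, step):
    """Build Y with Y_alpha = step(alpha, Y_{<alpha}) for alpha < nu,
    an effective analogue of H_phi(Y) for nu = {0, ..., nu - 1}."""
    Y = set()
    for alpha in range(nu):
        Y |= {(alpha, x) for x in step(alpha, initial_segment(Y, alpha))}
    return Y
```

For a computable step such as \(Y_\alpha =\{|Y_{<\alpha }|\}\), the loop builds the slices one stage at a time, exactly as the formalization \(H_\varphi (Y)\) prescribes.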

Before we discuss the axiom system \(\mathsf {ATR_0^{set}}\) and statement (iv) from Theorem 1.6, we consider some instances that are relevant in their own right (see Sect. 9 for proofs). First, the following result was promised in [46], for a projected article with the title ‘A proof-theoretic characterization of \(\beta \)-models of \(\Pi ^1_1\)-comprehension’, which we have incorporated into the present more general paper.

Corollary 1.7

The following are equivalent over \(\textsf{RCA}_0\):

(i) every subset of \({\mathbb {N}}\) is contained in a countable \(\beta \)-model of \(\Pi ^1_1\)-comprehension,
(ii) any dilator has a well founded \(\omega \)-fixed point,
(iii) if D is a dilator, then \(\psi _\omega (D)\) is well founded.

Secondly, the axiom schema and rule of \(\Delta ^1_2\)-comprehension are closely connected to iterations of \(\Pi ^1_1\)-recursion along fixed \(\nu <\varepsilon _0\) and \(\nu <\omega ^\omega \), respectively, as shown by Friedman [22] and Feferman [9] (see also the presentation by Pohlers [41, Section 3.2]). Our Theorem 1.6 yields analogous connections with the well foundedness of \(\nu \)-fixed points. Finally, we obtain the following corollary when we quantify over \(\nu \). To confirm the significance of this result, we recall that \(\Pi ^1_1\)-transfinite recursion is equivalent to the Galvin-Prikry theorem and to the principle of \(\Delta ^0_2\)-determinacy, due to Tanaka [56, 57].

Corollary 1.8

The following are equivalent over \(\textsf{RCA}_0\):

(i) \(\Pi ^1_1\)-transfinite recursion, i. e., the principle that \(\Pi ^1_1\)-recursion is available along any well order \(\nu \),
(ii) any dilator has a well founded \(\nu \)-fixed point for every well order \(\nu \),
(iii) if D is a dilator and \(\nu \) is any well order, then \(\psi _\nu (D)\) is well founded.

Let us now complete our explanation of Theorem 1.6. The axiom system \(\mathsf {ATR_0^{set}}\) is a set theory due to Simpson [54, 55], who showed that it is conservative over the axiom system \(\mathsf {ATR_0}\) (‘arithmetical transfinite recursion’) from reverse mathematics. Its axioms ensure that all primitive recursive set functions (in the sense of Jensen and Karp [34]) are total and that every well order is isomorphic to an ordinal (‘axiom beta’). We also include the axiom that all sets are countable, as in [55] (while [54] marks this axiom as ‘optional’).

We also recall that an admissible set is a transitive model of Kripke-Platek set theory. For \(\nu =1\), the equivalence between (i) and (iv) has been shown by Jäger [31] (see also [10, Section 1.4]). The extension to general \(\nu \) can probably be considered as known, but we will also obtain a new—if rather indirect—proof in the present paper. Indeed, we will work in \(\mathsf {ATR_0^{set}}\) to prove the circle of implications

$$\begin{aligned} (\text {i})\quad \Rightarrow \quad (\text {ii})\quad \Leftrightarrow \quad (\text {iii})\quad \Rightarrow \quad (\text {iv})\quad \Rightarrow \quad (\text {i}) \end{aligned}$$

between the statements from Theorem 1.6. In order to obtain the equivalence of (i), (ii) and (iii) over \(\textsf{RCA}_0\), we will argue that each of these statements entails arithmetical transfinite recursion (consider Theorem 4.2 together with Theorem 1.3 above). Note that (iv) cannot be (directly) considered over \(\textsf{RCA}_0\), as it is a statement of set theory rather than reverse mathematics.

Statements (ii) and (iii) of Theorem 1.6 are equivalent because \(\psi _\nu (D)\) is the unique \(\nu \)-fixed point of D (up to isomorphism), as mentioned above and proved in Sect. 2. The implication from (i) to (ii) is established in Sect. 3, where we relativize Buchholz’ [5] method of ‘distinguished sets’ to a given dilator (cf. the relativization to a single order in [47, Section 12.3.1]). In Sect. 9 we recall the standard proof that (iv) implies (i).

To prove the crucial implication from (ii) to (iv), we will generalize the argument that was given for \(\nu =1\) in [11]. There we developed a notion of \(\beta \)-proof (cf. [25]) that is sound and complete for the class of models \({\mathbb {L}}^u_\alpha \), i. e., the stages of the constructible hierarchy over a transitive \(u=:\mathbb L^u_0\). By completeness, the existence of an admissible set \(\mathbb L^u_\alpha \) (which implies (i) of Theorem 1.3) was reduced to the claim that there is no \(\beta \)-proof of contradiction in Kripke-Platek set theory. This claim is a natural target for ordinal analysis, which specializes in consistency proofs based on large well orders. Specifically, one argues that the height of a given \(\beta \)-proof can be bounded by some dilator D. Based on the well order \(\vartheta (D)\) from (ii) of Theorem 1.3, one can employ Jäger’s ordinal analysis of Kripke-Platek set theory [30] to conclude that the given \(\beta \)-proof does not derive a contradiction.

In the argument from [11] that we have sketched in the previous paragraph, the relevant \(\beta \)-proofs consist of a tree \(S_X\) for each linear order X (see [11, Section 4]). The aforementioned dilator D is essentially given by \(D(X)=S_X\) with the Kleene-Brouwer order. In the present paper, we obtain corresponding trees \(S^R_X\) that depend not only on a linear order X but also on a given embedding \(R:\nu \rightarrow X\), which corresponds to the sequence of admissible sets in (iv) of Theorem 1.6 (see Sect. 5). However, we cannot allow D(X) to depend on R, because (ii) of Theorem 1.6 requires a dilator, i. e., a transformation whose arguments are linear orders without additional structure. This new obstacle is resolved in Sect. 6, which can be seen as the main technical contribution of the present paper. To complete the proof that (ii) implies (iv) in Theorem 1.6, we then adapt the classical ordinal analysis for iterated admissible sets, developed by Jäger and Pohlers [32] and streamlined by Buchholz [7] (see also the earlier work on inductive definitions [8] and the detailed results in [44]). Our ‘abstract’ version of this ordinal analysis is worked out in Sects. 7 and 8. In the final Sect. 9, we combine all previous work into official proofs of Theorem 1.6 and Corollaries 1.7 and 1.8.

2 Existence and uniqueness of \(\nu \)-fixed points

In the present section, we construct a \(\nu \)-fixed point \(\psi _\nu (D)\) of a given predilator D for an arbitrary well order \(\nu \). Before, we show that all \(\nu \)-fixed points of D are isomorphic, which will entail that \(\psi _\nu (D)\) is essentially unique. The following result is central for our uniqueness proof.

Proposition 2.1

For well orders \(\mu \) and \(\nu \), consider a \(\mu \)-collapse \(\pi :X\rightarrow \mu \times D(X)\) and a \(\nu \)-collapse \(\kappa :Y\rightarrow \nu \times D(Y)\) of a predilator D. Given an embedding \(I:\mu \rightarrow \nu \), there is a unique embedding \(f:X\rightarrow Y\) such that

$$\begin{aligned} \begin{array}{ccc} X &{} \overset{\pi }{\longrightarrow } &{} \mu \times D(X)\\ {\scriptstyle f}\downarrow &{} &{} \downarrow {\scriptstyle I\times D(f)}\\ Y &{} \overset{\kappa }{\longrightarrow } &{} \nu \times D(Y) \end{array} \end{aligned}$$

is a commutative diagram.

Proof

Write \(\vartriangleleft \) for the well founded relation on X that is given by Definition 1.4. To prepare the proof of existence, we establish a more general form of uniqueness. For the purpose of this proof, let us say that a (finite or infinite) set \(a\subseteq X\) is closed if \(s\vartriangleleft t\in a\) implies \(s\in a\). We write \(\iota _a:a\hookrightarrow X\) for the inclusion. By the definition of \(\vartriangleleft \) and the support condition from Definition 1.1, any closed a validates

$$\begin{aligned} t\in a\,\Rightarrow \,{\text {supp}}_X(\tau )\subseteq a={\text {rng}}(\iota _a)\,\Rightarrow \,\tau \in {\text {rng}}(D(\iota _a))\qquad \text {for}\qquad \pi (t)=(\alpha ,\tau ). \end{aligned}$$

Given that \(D(\iota _a)\) is an embedding, we get a unique embedding \(\pi _a\) such that

$$\begin{aligned} \begin{array}{ccc} a &{} \overset{\pi _a}{\longrightarrow } &{} \mu \times D(a)\\ {\scriptstyle \iota _a}\downarrow &{} &{} \downarrow {\scriptstyle {\text {Id}}\times D(\iota _a)}\\ X &{} \overset{\pi }{\longrightarrow } &{} \mu \times D(X) \end{array} \end{aligned}$$

commutes. By an a-approximation, we shall mean an embedding \(f_a:a\rightarrow Y\) such that the diagram from the proposition commutes if we replace \(X,\pi ,f\) by \(a,\pi _a,f_a\). When a is the entire order X, then the functions \(\iota _a\) and \(D(\iota _a)\) are the identity on \(a=X\) and \(D(a)=D(X)\), respectively, since D is a functor. In this case, the functions \(\pi _a\) and \(\pi \) will thus coincide, which means that an X-approximation is a function f as in the proposition. Our strong form of uniqueness reads as follows.

Claim

Given any a-approximation \(f_a\) and b-approximation \(f_b\) for closed \(a,b\subseteq X\), we have \(f_a(t)=f_b(t)\) for all \(t\in a\cap b\).

To prove the claim, one checks that \(c:=a\cap b\) is closed and that \(f_a\!\restriction \!c\) and \(f_b\!\restriction \!c\) are c-approximations (write \(f_a\!\restriction \!c=f_a\circ \iota \) with \(\iota :c\hookrightarrow a\)). To conclude, we consider an arbitrary c-approximation f and show that its values are uniquely determined. Given \(t\in c\), write \(\pi _c(t)=(\alpha ,\tau )\) and consider the inclusion \(\iota :{\text {supp}}_c(\tau )\hookrightarrow c\). By the support condition, we can write \(\tau =D(\iota )(\tau _0)\), where \(\tau _0\) is unique since \(D(\iota )\) is an embedding. As f is a c-approximation, we obtain

$$\begin{aligned} \kappa \circ f(t)=(I\times D(f))\circ \pi _c(t)=(I(\alpha ),D(f\circ \iota )(\tau _0)). \end{aligned}$$

Given that \(\kappa \) is an embedding, this means that f(t) is determined by \(f\circ \iota \). We can deduce uniqueness by induction over \(\vartriangleleft \) (or over the heights from Definition 1.4), as \(s\in {\text {rng}}(\iota )\) implies \(s\vartriangleleft t\). To see the latter, note that we have

$$\begin{aligned} \pi (t)=\pi \circ \iota _c(t)=({\text {Id}}\times D(\iota _c))\circ \pi _c(t)=(\alpha ,D(\iota _c)(\tau )), \end{aligned}$$

and that the naturality of \({\text {supp}}:D\Rightarrow [\cdot ]^{<\omega }\) yields

$$\begin{aligned} {\text {rng}}(\iota )={\text {supp}}_c(\tau )=[\iota _c]^{<\omega }\circ {\text {supp}}_c(\tau )={\text {supp}}_X(D(\iota _c)(\tau )). \end{aligned}$$

As a next step towards existence, we show that approximations can be combined:

Claim

Consider a family \(\langle f_i\,|\,i\in I\rangle \) of \(a_i\)-approximations \(f_i\) for closed \(a_i\subseteq X\). The function \(f:a=\bigcup _{i\in I}a_i\rightarrow Y\) with \(f(t)=f_i(t)\) for \(t\in a_i\) is an a-approximation.

Note that a is closed and that f is well defined by the previous claim. To show that f is an a-approximation, we need to consider at most two indices at a time, namely, when we check that f is an order embedding. This means that the claim for general I reduces to the one for \(I=\{0,1\}\). We establish the latter by induction on the cardinality \(|a_0\cup a_1|\in \mathbb N\cup \{\infty \}\). The crucial step is to show

$$\begin{aligned} t_0<_X t_1\,\Rightarrow \,f_0(t_0)<_Y f_1(t_1)\qquad \text {for}\qquad t_i\in a_i. \end{aligned}$$

Let \(a_i'\subseteq a_i\) consist of the predecessors of \(t_i\) in the transitive closure of \(\vartriangleleft \). Then the set \(c:=a_0'\cup a_1'\) is finite and cannot contain both \(t_0\) and \(t_1\), as \(\vartriangleleft \) is well founded. Due to the induction hypothesis, the restrictions \(f_i\!\restriction \!a_i'\) can thus be combined into a c-approximation \(f'\). Put \(\pi _i:=\pi _d\) with \(d=a_i\). As in the proof of uniqueness, we can write \(\pi _i(t_i)=(\alpha _i,D(\iota _i')(\tau _i))\) with \(\iota _i':a_i'\hookrightarrow a_i\). For \(\iota _i:a_i\hookrightarrow X\) we get

$$\begin{aligned} \pi (t_i)=\pi \circ \iota _i(t_i)=({\text {Id}}\times D(\iota _i))\circ \pi _i(t_i)=(\alpha _i,D(\iota _i\circ \iota _i')(\tau _i)). \end{aligned}$$

Let us also consider the inclusions \(\iota _i'':a_i'\hookrightarrow c\) and \(\iota _c:c\hookrightarrow X\). Clearly,

$$\begin{aligned} \begin{array}{ccc} a_i' &{} \overset{\iota _i'}{\longrightarrow } &{} a_i\\ {\scriptstyle \iota _i''}\downarrow &{} &{} \downarrow {\scriptstyle \iota _i}\\ c &{} \overset{\iota _c}{\longrightarrow } &{} X \end{array} \end{aligned}$$

is a commutative diagram. Aiming at the implication above, we now assume \(t_0<t_1\). As \(\pi \) is an embedding, we get either \(\alpha _0<\alpha _1\) or \(\alpha _0=\alpha _1\) and

$$\begin{aligned} D(\iota _c)\circ D(\iota _0'')(\tau _0)=D(\iota _0\circ \iota _0')(\tau _0)<D(\iota _1\circ \iota _1')(\tau _1)=D(\iota _c)\circ D(\iota _1'')(\tau _1), \end{aligned}$$

which entails \(D(\iota _0'')(\tau _0)<D(\iota _1'')(\tau _1)\). By the choice of \(f'\) we have \(f'\!\restriction \!a_i'=f_i\!\restriction \!a_i'\), or equivalently \(f'\circ \iota _i''=f_i\circ \iota _i'\). Hence the last inequality entails

$$\begin{aligned} D(f_0\circ \iota _0')(\tau _0)=D(f')\circ D(\iota _0'')(\tau _0)<D(f')\circ D(\iota _1'')(\tau _1)=D(f_1\circ \iota _1')(\tau _1). \end{aligned}$$

To conclude \(f_0(t_0)<f_1(t_1)\), it is thus enough to observe

$$\begin{aligned} \kappa \circ f_i(t_i)=(I\times D(f_i))\circ \pi _i(t_i)=(I(\alpha _i),D(f_i\circ \iota _i')(\tau _i)). \end{aligned}$$

Now that this second claim is proved, the proposition is reduced to the following:

Claim

Given any \(t\in X\), there is an a-approximation for some finite closed \(a\ni t\).

Arguing by induction on \(\vartriangleleft \), we can use the previous claim to produce a b-approximation f for some finite closed \(b\subseteq X\) that contains all \(s\vartriangleleft t\). As before, we can write \(\pi (t)=(\alpha ,D(\iota _b)(\tau ))\) with \(\iota _b:b\rightarrow X\). To extend f into a function \(f':a\rightarrow Y\) on the closed set \(a:=b\cup \{t\}\), we would like to stipulate \(\kappa \circ f'(t)=(I(\alpha ),D(f)(\tau ))\). For this purpose, we need to show that the right side lies in the range of \(\kappa \). Let us write \(G^{D,Z}_\gamma :Z\rightarrow [D(Z)]^{<\omega }\) and \(G^{Z}_\gamma :D(Z)\rightarrow [D(Z)]^{<\omega }\) for the functions from Definition 1.4, where Z can be X or Y. Analogous functions for \(Z=b\) arise by

$$\begin{aligned} G^{D,b}_\gamma (s)&:={\left\{ \begin{array}{ll} \{\sigma \}\cup G^b_\gamma (\sigma ) &{} \text {if }\pi _b(s)=(\alpha ,\sigma )\text { with }\alpha \ge \gamma ,\\ \emptyset &{} \text {if }\pi _b(s)=(\alpha ,\sigma )\text { with }\alpha <\gamma , \end{array}\right. }\\ G^b_\gamma (\sigma )&:=\bigcup \{G^{D,b}_\gamma (r)\,|\,r\in {\text {supp}}_b(\sigma )\}. \end{aligned}$$

To see that this recursion is well founded, note that \(\pi _b(s)=(\alpha ,\sigma )\) and \(r\in {\text {supp}}_b(\sigma )\) entail \(r\vartriangleleft s\), as in the proof of the first claim. By induction along \(\vartriangleleft \) we get

$$\begin{aligned} \begin{aligned} {[}D(\iota _b)]^{<\omega }\circ G^{D,b}_\gamma&=G^{D,X}_\gamma \circ \iota _b,\\ {[}D(\iota _b)]^{<\omega }\circ G^{b}_\gamma&=G^{X}_\gamma \circ D(\iota _b), \end{aligned}\qquad \begin{aligned} {[}D(f)]^{<\omega }\circ G^{D,b}_\gamma&=G^{D,Y}_{I(\gamma )}\circ f,\\ {[}D(f)]^{<\omega }\circ G^{b}_\gamma&=G^{Y}_{I(\gamma )}\circ D(f). \end{aligned} \end{aligned}$$

For \(t\in X\) with \(\pi (t)=(\alpha ,D(\iota _b)(\tau ))\) as above, we can invoke Definition 1.4 to get

$$\begin{aligned} {[}D(\iota _b)]^{<\omega }\circ G^{b}_\alpha (\tau )=G^{X}_\alpha \circ D(\iota _b)(\tau )\subseteq _{D(X)}D(\iota _b)(\tau ). \end{aligned}$$

The latter entails \(G^{b}_\alpha (\tau )\subseteq _{D(b)}\tau \) and then

$$\begin{aligned} G^{Y}_{I(\alpha )}\circ D(f)(\tau )=[D(f)]^{<\omega }\circ G^{b}_\alpha (\tau )\subseteq _{D(Y)}D(f)(\tau ). \end{aligned}$$

Again by Definition 1.4, it follows that \((I(\alpha ),D(f)(\tau ))\) lies in the range of \(\kappa \). As indicated above we can thus define \(f':a=b\cup \{t\}\rightarrow Y\) by stipulating

$$\begin{aligned} \kappa \circ f'(t)=(I(\alpha ),D(f)(\tau )) \end{aligned}$$

and \(f'\!\restriction \!b=f\). The fact that \(f'\) is order preserving is readily deduced from the following observation: For \(s\in b\) with \(\pi _b(s)=(\beta ,\sigma )\) we have \(\pi (s)=(\beta ,D(\iota _b)(\sigma ))\), and since f is a b-approximation we get

$$\begin{aligned} \kappa \circ f'(s)=\kappa \circ f(s)=(I\times D(f))\circ \pi _b(s)=(I(\beta ),D(f)(\sigma )). \end{aligned}$$

To see that the diagram from the proposition commutes with \(a,\pi _a,f'\) at the place of \(X,\pi ,f\), we note that \(f'\!\restriction \!b=f\) amounts to \(f=f'\circ \iota \) with \(\iota :b\hookrightarrow a\). For \(s\in b\) or \(s=t\), we see that \(\pi (s)=(\beta ,D(\iota _b)(\sigma ))\) yields \(\pi _a(s)=(\beta ,D(\iota )(\sigma ))\) and hence

$$\begin{aligned} (I\times D(f'))\circ \pi _a(s)=(I(\beta ),D(f'\circ \iota )(\sigma ))=(I(\beta ),D(f)(\sigma )), \end{aligned}$$

which coincides with \(\kappa \circ f'(s)\) as computed above. \(\square \)

In terminology from category theory, the proposition shows that any \(\nu \)-fixed point satisfies the universal property of an initial object. As the following proof makes explicit, this entails that \(\nu \)-fixed points are essentially unique. For an application of Proposition 2.1 with \(\mu <\nu \), we refer to Corollary 2.10 below.

Corollary 2.2

All \(\nu \)-fixed points of a given predilator are order isomorphic.

Proof

Consider \(\nu \)-fixed points \(\pi :X\rightarrow \nu \times D(X)\) and \(\kappa :Y\rightarrow \nu \times D(Y)\), and write \(I:\nu \rightarrow \nu \) for the identity. Two applications of the previous proposition (one with X and Y interchanged) yield embeddings \(f:X\rightarrow Y\) and \(g:Y\rightarrow X\) with

$$\begin{aligned} \pi \circ g\circ f=(I\times D(g))\circ \kappa \circ f=(I\times D(g))\circ (I\times D(f))\circ \pi =(I\times D(g\circ f))\circ \pi . \end{aligned}$$

If \({\text {Id}}_X\) is the identity on X, then \(D({\text {Id}}_X)\) is the identity on D(X), as D is a functor. Hence we also have \(\pi \circ {\text {Id}}_X=(I\times D({\text {Id}}_X))\circ \pi \). We can conclude \(g\circ f={\text {Id}}_X\) by the uniqueness part of the previous proposition. The analogous argument shows that \(f\circ g\) is the identity on Y, so that f is indeed an isomorphism. \(\square \)

To prepare the construction of \(\nu \)-fixed points, we recall a notion of normal form that is due to Girard [24]. Where the context suggests it, we identify \(n\in {\mathbb {N}}\) and the finite order \(\{0,\ldots ,n-1\}\) (with the usual order between natural numbers). We also agree to write \(|a|=\{0,\ldots ,|a|-1\}\) for the cardinality of a finite set a.

Definition 2.3

The trace of a predilator D is defined as

$$\begin{aligned} {\text {Tr}}(D):=\{(n,\sigma )\,|\,n\in {\mathbb {N}}\text { and }\sigma \in D(n)\text { with }{\text {supp}}_n(\sigma )=n\}. \end{aligned}$$

We say that \(\sigma \in D(X)\) has normal form \(\sigma \mathrel {=_{{\text {NF}}}}D(e)(\sigma _0)\) with \(e:n\rightarrow X\) for some \(n\in {\mathbb {N}}\) if we have \((n,\sigma _0)\in {\text {Tr}}(D)\) and \(\sigma \) is indeed equal to \(D(e)(\sigma _0)\).
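For a concrete illustration of the trace (ours, not from the paper), take the predilator \(\omega \) from the introduction, which maps an order X to the finite non-increasing sequences with entries in X. The following minimal Python sketch assumes that sequences are encoded as tuples and that X is a finite order \(n=\{0,\ldots ,n-1\}\); the length bound is only there to keep the enumeration finite.

```python
from itertools import product

# Illustration only: the predilator omega sends an order X to the finite
# non-increasing sequences over X.  Here X = {0,...,n-1}.
def omega(n, max_len):
    """All non-increasing sequences over {0,...,n-1} of length <= max_len."""
    seqs = [()]
    for length in range(1, max_len + 1):
        seqs += [s for s in product(range(n), repeat=length)
                 if all(s[i] >= s[i + 1] for i in range(length - 1))]
    return seqs

def supp(sigma):
    """Support of a sequence: the set of entries occurring in it."""
    return set(sigma)

def trace(n, max_len):
    """Pairs (n, sigma) with supp_n(sigma) = {0,...,n-1}, as in Definition 2.3."""
    return [(n, s) for s in omega(n, max_len) if supp(s) == set(range(n))]
```

For instance, trace(2, 3) consists exactly of (2, (1, 0)), (2, (1, 0, 0)) and (2, (1, 1, 0)); the sequence (1, 1) is excluded because its support misses 0.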

Let us recall a standard observation:

Lemma 2.4

Any \(\sigma \in D(X)\) has a unique normal form \(\sigma \mathrel {=_{{\text {NF}}}}D(e)(\sigma _0)\).

Proof

If \(\sigma \) has normal form as given, then e is determined as the unique embedding with domain \(n:=|{\text {supp}}_X(\sigma )|\) and range \({\text {supp}}_X(\sigma )\subseteq X\), as naturality yields

$$\begin{aligned} {\text {supp}}_X(\sigma )={\text {supp}}_X\circ D(e)(\sigma _0)=[e]^{<\omega }\circ {\text {supp}}_n(\sigma _0)=[e]^{<\omega }(n)={\text {rng}}(e). \end{aligned}$$

For existence, consider e as determined in the uniqueness proof. The support condition from Definition 1.1 ensures that \(\sigma =D(e)(\sigma _0)\) holds for some \(\sigma _0\in D(n)\). By the equations above, we see that \({\text {supp}}_X(\sigma )={\text {rng}}(e)\) entails \({\text {supp}}_n(\sigma _0)=n\) and hence \((n,\sigma _0)\in {\text {Tr}}(D)\). \(\square \)
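The normal form computation can be sketched for the predilator \(\omega \) of non-increasing integer sequences (our illustration; the encoding of sequences as tuples is an assumption, not part of the paper): the embedding e enumerates \({\text {supp}}_X(\sigma )\) in increasing order, and \(\sigma _0\) renames each entry of \(\sigma \) to its index along e.

```python
# Illustration only: normal forms in the sense of Lemma 2.4 for the
# predilator omega of non-increasing integer sequences.
def normal_form(sigma):
    support = sorted(set(sigma))                  # rng(e) = supp_X(sigma)
    e = {i: x for i, x in enumerate(support)}     # e : n -> X with n = |supp|
    sigma0 = tuple(support.index(x) for x in sigma)
    return e, sigma0

def apply_embedding(e, sigma0):
    """The functorial action omega(e): rename entries along e."""
    return tuple(e[i] for i in sigma0)
```

For \(\sigma =(7,7,3,2)\) one obtains e with range \(\{2,3,7\}\) and \(\sigma _0=(2,2,1,0)\), whose support is all of \(3=\{0,1,2\}\), so \((3,\sigma _0)\) lies in the trace and \(\omega (e)(\sigma _0)=\sigma \).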

In order to construct a \(\nu \)-fixed point \(\psi _\nu (D)\) of a given predilator D, we shall first build an order \(\psi _\nu ^+(D)\supseteq \psi _\nu (D)\) that admits an order isomorphism

$$\begin{aligned} \psi _\nu ^+(D)\cong \nu \times D\left( \psi _\nu ^+(D)\right) . \end{aligned}$$

We will later show that \(\psi _\nu (D)\) is well founded when D is a dilator (cf. Theorem 1.6). The same cannot hold for \(\psi _\nu ^+(D)\), which explains the auxiliary status of this order. Indeed, when we have \(\nu >1\) and D admits embeddings \(X\hookrightarrow D(X)\), then the order type of \(\nu \times D(X)\) will always exceed that of X (namely, \(\nu \times D(X)\) is isomorphic to \(\beta \cdot \nu >\alpha \) when we have \(\alpha \cong X\hookrightarrow D(X)\cong \beta \) with \(\alpha \le \beta \)).

Definition 2.5

Consider an ordinal \(\nu \) and a predilator D. The set \(\psi _\nu ^+(D)\) of terms is generated by the following recursive clause: Given a finite set \(a\subseteq \psi _\nu ^+(D)\), we add a term \(\psi _\alpha (a,\sigma )\in \psi _\nu ^+(D)\) for each \(\alpha <\nu \) and each \(\sigma \in D(|a|)\) with \((|a|,\sigma )\in {\text {Tr}}(D)\).

Note that \(\psi ^+_\nu (D)\) is non-empty whenever \(\nu >0\) and the same holds for D(0). We recursively put

$$\begin{aligned} l:\psi _\nu ^+(D)\rightarrow {\mathbb {N}}\quad \text {with}\quad l\left( \psi _\alpha (a,\sigma )\right) :=1+\textstyle \sum _{t\in a}2\cdot l(t). \end{aligned}$$

The following definition determines \(s\preceq t\) by recursion on \(l(s)+l(t)\). In particular, the factor 2 in the definition of l allows us to determine the restriction of \(\preceq \) to \(a\cup b\) first, since any \(r,r'\in a\cup b\) satisfy \(l(r)+l(r')<l(s)+l(t)\). We demand that this restriction is linear, to ensure that \(D(a\cup b)\) is defined (as a linear order with the order relation written as \(\le _{D(a\cup b)}\)).

Definition 2.6

In order to define a binary relation \(\preceq \) on \(\psi _\nu ^+(D)\) by recursion, we declare that \(\psi _\alpha (a,\sigma )\preceq \psi _\beta (b,\tau )\) holds precisely if \(a\cup b\) is linearly ordered by \(\preceq \) and

  1. (i)

    either we have \(\alpha <\beta \),

  2. (ii)

    or we have \(\alpha =\beta \) and \(D(e_a)(\sigma )\le _{D(a\cup b)}D(e_b)(\tau )\) for the strictly increasing functions \(e_a:|a|\rightarrow a\cup b\) and \(e_b:|b|\rightarrow a\cup b\) with range a and b, respectively.
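To make Definitions 2.5 and 2.6 concrete, the following Python sketch (ours) instantiates D with the predilator \(\omega \) of non-increasing sequences, ordered lexicographically. A term \(\psi _\alpha (a,\sigma )\) is encoded as a nested tuple \((\alpha ,a,\sigma )\), and the recursion on \(l(s)+l(t)\) is left to the call stack.

```python
from functools import cmp_to_key

# Sketch for D = omega: a term psi_alpha(a, sigma) is the triple
# (alpha, a, sigma), where a is a tuple of terms and sigma is a
# non-increasing sequence over {0,...,|a|-1} covering all these indices.
def cmp_terms(s, t):
    """Definition 2.6: compare psi_alpha(a, sigma) with psi_beta(b, tau)."""
    alpha, a, sigma = s
    beta, b, tau = t
    if alpha != beta:                       # clause (i)
        return -1 if alpha < beta else 1
    key = cmp_to_key(cmp_terms)             # recursion on shorter terms
    union = sorted(set(a) | set(b), key=key)
    # e_a, e_b: strictly increasing enumerations of a and b inside a cup b
    e_a = [union.index(x) for x in sorted(a, key=key)]
    e_b = [union.index(x) for x in sorted(b, key=key)]
    lhs = tuple(e_a[i] for i in sigma)      # D(e_a)(sigma) in D(a cup b)
    rhs = tuple(e_b[i] for i in tau)        # D(e_b)(tau)
    return (lhs > rhs) - (lhs < rhs)        # clause (ii): lexicographic

def length(t):
    """The function l: l(psi_alpha(a, sigma)) = 1 + sum of 2*l(r) for r in a."""
    _, a, _ = t
    return 1 + sum(2 * length(r) for r in a)
```

As clause (ii) prescribes, \(\sigma \) and \(\tau \) are compared inside \(D(a\cup b)\) after being pushed forward along the enumerations of a and b.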

The condition that \(a\cup b\) is linearly ordered is made redundant by the following.

Lemma 2.7

The relation \(\preceq \) is a linear order on \(\psi _\nu ^+(D)\).

Proof

By induction on \(n\in {\mathbb {N}}\), one can simultaneously show

$$\begin{aligned} \begin{array}{cl} t\preceq t&{}\qquad \text {for }l(t)<n,\\ r\preceq s\text { and }s\preceq t\text { imply }r\preceq t&{}\qquad \text {for }l(r)+l(s)+l(t)<n,\\ s\preceq t\text { and }t\preceq s\text { imply }s=t&{} \qquad \text {for }l(s)+l(t)<n,\\ s\preceq t\text { or }t\preceq s\quad &{} \qquad \text {for }l(s)+l(t)<n. \end{array} \end{aligned}$$

Let us establish transitivity for \(r=\psi _\alpha (a,\rho )\), \(s=\psi _\beta (b,\sigma )\) and \(t=\psi _\gamma (c,\tau )\). The induction hypothesis ensures that \(\preceq \) is linear on \(d:=a\cup b\cup c\) (due to the factor 2 in the definition of l and since transitivity is trivial when all three relevant terms are equal). Given \(r\preceq s\) and \(s\preceq t\), the conclusion \(r\preceq t\) is immediate unless we have \(\alpha =\beta =\gamma \) as well as

$$\begin{aligned} D(e_a^{a\cup b})(\rho )\le _{D(a\cup b)} D(e_b^{a\cup b})(\sigma )\quad \text {and}\quad D(e_b^{b\cup c})(\sigma )\le _{D(b\cup c)} D(e_c^{b\cup c})(\tau ), \end{aligned}$$

where \(e_u^v:|u|\rightarrow v\) is strictly increasing with range \(u\subseteq v\). Note that \(\iota _v^w\circ e_u^v=e_u^w\) holds for the inclusion \(\iota _v^w:v\hookrightarrow w\). After composing the previous inequalities with \(D(\iota _{a\cup b}^d)\) and \(D(\iota _{b\cup c}^d)\), respectively, we can invoke transitivity in D(d) to get

$$\begin{aligned} D(\iota _{a\cup c}^d)\circ D(e_a^{a\cup c})(\rho )=D(e_a^d)(\rho )\le _{D(d)} D(e_c^d)(\tau )=D(\iota _{a\cup c}^d)\circ D(e_c^{a\cup c})(\tau ). \end{aligned}$$

We obtain \(D(e_a^{a\cup c})(\rho )\le _{D(a\cup c)}D(e_c^{a\cup c})(\tau )\), so that clause (ii) of Definition 2.6 yields the desired inequality \(r\preceq t\). By similar but easier arguments, we can reduce the reflexivity and linearity of \(\preceq \) to the corresponding properties of orders D(d). To establish antisymmetry, we must show that \(s=t\) follows from

$$\begin{aligned} D(e_b^{b\cup c})(\sigma )=D(e_c^{b\cup c})(\tau ). \end{aligned}$$

The expressions on both sides of this equation are normal forms in the sense of Definition 2.3, as Definition 2.5 ensures that \((|b|,\sigma )\) and \((|c|,\tau )\) lie in \({\text {Tr}}(D)\). Hence Lemma 2.4 allows us to conclude. \(\square \)

To obtain an order isomorphism \(\psi _\nu ^+(D)\cong \nu \times D\left( \psi _\nu ^+(D)\right) \) as promised above, it suffices to map \(\psi _\alpha (a,\sigma )\) to \((\alpha ,D(e_a)(\sigma ))\), where \(e_a:|a|\rightarrow \psi _\nu ^+(D)\) is strictly increasing with range a. This fact will not be used, but a very similar result is shown in the proof of Theorem 2.9 below. We now single out the desired suborder.

Definition 2.8

In the following, let \(e_a:|a|\rightarrow \psi ^+_\nu (D)\) denote the strictly increasing function with range a and the indicated codomain. For each ordinal \(\gamma <\nu \) we define a function \(G^+_\gamma :\psi _\nu ^+(D)\rightarrow [D(\psi _\nu ^+(D))]^{<\omega }\) by recursion over terms, stipulating

$$\begin{aligned} G^+_\gamma (\psi _\alpha (a,\sigma )):={\left\{ \begin{array}{ll} \{D(e_a)(\sigma )\}\cup \bigcup \{G^+_\gamma (r)\,|\,r\in a\} &{} \text {if } \alpha \ge \gamma ,\\ \emptyset &{} \text {if } \alpha <\gamma . \end{array}\right. } \end{aligned}$$

The suborder \(\psi _\nu (D)\subseteq \psi _\nu ^+(D)\) is determined by the recursive clause

$$\begin{aligned} \psi _\alpha (a,\sigma )\in \psi _\nu (D)\quad :\Leftrightarrow \quad a\subseteq \psi _\nu (D)\text { and }\bigcup \{G^+_\alpha (r)\,|\,r\in a\}\subseteq _{D(\psi _\nu ^+(D))}D(e_a)(\sigma ). \end{aligned}$$
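The recursion for \(G^+_\gamma \) can be run on the \(\omega \)-instantiation of Definition 2.5 (our illustration, not from the paper): we encode \(\psi _\alpha (a,\sigma )\) as \((\alpha ,a,\sigma )\) and assume that the tuple a lists its elements in increasing term order, so that \(e_a(i)\) is the positional lookup a[i].

```python
# Sketch for D = omega (non-increasing sequences): G^+_gamma collects the
# values D(e_a)(sigma) of all subterms whose index is >= gamma, exactly as
# in Definition 2.8.  The subterm tuple a is assumed sorted, so e_a(i) = a[i].
def G_plus(gamma, term):
    alpha, a, sigma = term
    if alpha < gamma:
        return set()
    value = tuple(a[i] for i in sigma)   # D(e_a)(sigma): a sequence of terms
    out = {value}
    for r in a:
        out |= G_plus(gamma, r)
    return out
```

The membership test for \(\psi _\nu (D)\) then additionally compares each value collected from the subterms with \(D(e_a)(\sigma )\) in the lexicographic order on sequences of terms.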

Let us now establish the main result of this section.

Theorem 2.9

The order \(\psi _\nu (D)\) is a \(\nu \)-fixed point of a given predilator D.

Proof

Write \(\iota :\psi _\nu (D)\hookrightarrow \psi ^+_\nu (D)\) for the inclusion and \(e'_a:|a|\rightarrow \psi _\nu (D)\) for the strictly increasing function with range a, so that \(e_a=\iota \circ e'_a\) is the same function as in Definition 2.8. Now consider the function

$$\begin{aligned} \pi :\psi _\nu (D)\rightarrow \nu \times D(\psi _\nu (D))\quad \text {with}\quad \pi (\psi _\alpha (a,\sigma )):=(\alpha ,D(e'_a)(\sigma )). \end{aligned}$$

One readily shows that \(\pi (s)\le \pi (t)\) entails \(s\preceq t\) (factorize \(e'_a=\iota _{a\cup b}\circ e_a^{a\cup b}\) with \(\iota _{a\cup b}:a\cup b\hookrightarrow \psi _\nu (D)\) as in the proof of Lemma 2.7). Since the codomain of \(\pi \) is a linear order, it follows that \(\pi \) is an embedding. With \(X:=\psi _\nu (D)\) we compute

$$\begin{aligned} {\text {supp}}_X\left( D(e'_a)(\sigma )\right) =[e'_a]^{<\omega }\left( {\text {supp}}_{|a|}(\sigma )\right) =[e'_a]^{<\omega }(|a|)=a. \end{aligned}$$

Here the first equality holds since \({\text {supp}}:D\Rightarrow [\cdot ]^{<\omega }\) is natural, while the second one relies on \((|a|,\sigma )\in {\text {Tr}}(D)\) according to Definition 2.5. The binary relation \(\vartriangleleft \) that is determined in Definition 1.4 can thus be characterized by

$$\begin{aligned} s\vartriangleleft \psi _\alpha (a,\sigma )\quad \Leftrightarrow \quad s\in a, \end{aligned}$$

which entails that it is well founded with a height function that corresponds to the usual notion of height for terms. Let the functions \(G^D_\gamma :\psi _\nu (D)\rightarrow [D(\psi _\nu (D))]^{<\omega }\) and \(G_\gamma :D(\psi _\nu (D))\rightarrow [D(\psi _\nu (D))]^{<\omega }\) be given as in Definition 1.4. By induction along \(\vartriangleleft \) one readily shows

$$\begin{aligned} G^+_\gamma \circ \iota =[D(\iota )]^{<\omega }\circ G^D_\gamma . \end{aligned}$$

In view of Definition 2.8, we can deduce that \(\psi _\alpha (a,\sigma )\in \psi _\nu (D)\) entails

$$\begin{aligned} {[}D(\iota )]^{<\omega }\circ G_\alpha (D(e'_a)(\sigma ))=\bigcup \{G^+_\alpha (r)\,|\,r\in a\}\subseteq _{D(\psi _\nu ^+(D))} D(\iota )\circ D(e'_a)(\sigma ). \end{aligned}$$

It follows that we have

$$\begin{aligned} {\text {rng}}(\pi )\subseteq \{(\alpha ,\tau )\in \nu \times D(\psi _\nu (D))\,|\,G_\alpha (\tau )\subseteq _{D(\psi _\nu (D))}\tau \}, \end{aligned}$$

as Definition 1.4 demands. To show that the converse of this inclusion holds as well, we consider an arbitrary element \((\alpha ,\tau )\) of the right side. Writing \(X=\psi _\nu (D)\), we put \(a:={\text {supp}}_X(\tau )\). The support condition from Definition 1.1 yields a \(\sigma \in D(|a|)\) with \(\tau =D(e'_a)(\sigma )\). As in the proof of Lemma 2.4 we get \((|a|,\sigma )\in {\text {Tr}}(D)\), which allows us to form the term \(\psi _\alpha (a,\sigma )\in \psi _\nu ^+(D)\). Given \(G_\alpha (\tau )\subseteq \tau \), we get

$$\begin{aligned} \bigcup \{G^+_\alpha (r)\,|\,r\in a\}=[D(\iota )]^{<\omega }\circ G_\alpha (\tau )\subseteq _{D(\psi ^+_\nu (D))} D(\iota )(\tau )=D(e_a)(\sigma ). \end{aligned}$$

This entails that \(\psi _\alpha (a,\sigma )\) lies in \(\psi _\nu (D)\subseteq \psi _\nu ^+(D)\). By construction, we can now conclude that \((\alpha ,\tau )=\pi (\psi _\alpha (a,\sigma ))\) is contained in the range of \(\pi \). \(\square \)

By Corollary 2.2, any \(\nu \)-fixed point of D is isomorphic to \(\psi _\nu (D)\), which confirms that statements (ii) and (iii) from Theorem 1.6 are equivalent. If the equivalence with (i) is to hold, then (iii) must become stronger as \(\nu \) grows. We conclude the section with a direct proof that this is the case.

Corollary 2.10

If \(\psi _\nu (D)\) is well founded, then so is \(\psi _\mu (D)\) for any \(\mu <\nu \).

Proof

Given \(\mu <\nu \), there is an embedding of \(\mu \) into \(\nu \). By Proposition 2.1 (which applies due to Theorem 2.9), we get an embedding of \(\psi _\mu (D)\) into \(\psi _\nu (D)\). \(\square \)

3 A proof of well foundedness

In this section, we prove that (i) implies (ii) in Theorem 1.6, i. e., we use iterated \(\Pi ^1_1\)-comprehension to show that \(\nu \)-fixed points of dilators are well founded. To make the general case more transparent, we provide an argument for \(\nu =1\) first.

Remark 3.1

We show that any 1-fixed point X of a dilator D is well founded. Consider a 1-collapse \(\pi :X\rightarrow D(X)\), where D(X) is identified with \(1\times D(X)\). Up to this identification, Definition 1.4 yields

$$\begin{aligned} s\vartriangleleft t\quad \Leftrightarrow \quad s\in {{\text {supp}}_X}\circ \pi (t), \end{aligned}$$

and the definitions of \(G^D_0:X\rightarrow [D(X)]^{<\omega }\) and \(G_0:D(X)\rightarrow [D(X)]^{<\omega }\) become

$$\begin{aligned} G_0^D(t)=\{\pi (t)\}\cup G_0(\pi (t))\quad \text {and}\quad G_0(\tau )=\bigcup \{G_0^D(s)\,|\,s\in {\text {supp}}_X(\tau )\}. \end{aligned}$$

Furthermore, the condition on the range of \(\pi \) can now be written as

$$\begin{aligned} {\text {rng}}(\pi )=\{\tau \in D(X)\,|\,G_0(\tau )\subseteq _{D(X)}\tau \}. \end{aligned}$$

As a special feature of the case \(\nu =1\), we get

$$\begin{aligned} s\vartriangleleft t\quad \Rightarrow \quad \pi (s)\in G_0^D(s)\subseteq G_0(\pi (t))\subseteq _{D(X)}\pi (t)\quad \Rightarrow \quad s<t. \end{aligned}$$

Assuming \(\Pi ^1_1\)-comprehension, we may form the well founded part W of X, which can be given as the intersection of all sets \(Z\subseteq X\) such that we have \(t\in Z\) whenever \(s\in Z\) holds for all \(s<_X t\). One readily shows that W is well founded with

$$\begin{aligned} t\in W\quad \Leftrightarrow \quad s\in W\text { for all }s\in X\text { with }s<_Xt. \end{aligned}$$

Write \(\iota :W\hookrightarrow X\) for the inclusion. By the previous observations and the support condition from Definition 1.1, we get

$$\begin{aligned} t\in W\quad \Rightarrow \quad {{\text {supp}}_X}\circ \pi (t)\subseteq W={\text {rng}}(\iota )\quad \Rightarrow \quad \pi (t)\in {\text {rng}}(D(\iota )). \end{aligned}$$

It follows that there is a function

$$\begin{aligned} \kappa :W\rightarrow D(W)\quad \text {with}\quad D(\iota )\circ \kappa =\pi \circ \iota . \end{aligned}$$

We will show that \(\kappa \) is a 1-collapse of D. Once this has been achieved, we can invoke Corollary 2.2 to learn that \(X\cong W\) is well founded, as desired. In fact, the existence part of Proposition 2.1 yields an embedding \(f:X\rightarrow W\) with \(\kappa \circ f=(I\times D(f))\circ \pi \), where \(I:\nu \rightarrow \nu \) is the identity. By the uniqueness part of the same proposition, the composition \(\iota \circ f\) must be the identity on X, which forces \(W=X\). It remains to show that \(\kappa \) satisfies the conditions from Definition 1.4. The latter ensures that \(\pi \) is an order embedding, so that the same holds for \(\kappa \). Given \(s,t\in W\), we observe that the naturality of \({\text {supp}}:D\Rightarrow [\cdot ]^{<\omega }\) yields

$$\begin{aligned} {[}\iota ]^{<\omega }\circ {{\text {supp}}_W}\circ \kappa (t)={{\text {supp}}_X}\circ D(\iota )\circ \kappa (t)={{\text {supp}}_X}\circ \pi \circ \iota (t), \end{aligned}$$

so that \(\iota (s)\vartriangleleft \iota (t)\) is equivalent to \(s\in {{\text {supp}}_W}\circ \kappa (t)\). This shows that the restriction of \(\vartriangleleft \) to W coincides with the relation that \(\kappa \) induces according to Definition 1.4. The latter also yields functions \(G_W^D:W\rightarrow [D(W)]^{<\omega }\) and \(G_W:D(W)\rightarrow [D(W)]^{<\omega }\), which are given by

$$\begin{aligned} G_W^D(t)=\{\kappa (t)\}\cup G_W(\kappa (t))\quad \text {and}\quad G_W(\tau )=\bigcup \{G_W^D(s)\,|\,s\in {\text {supp}}_W(\tau )\}. \end{aligned}$$

A straightforward induction along \(\vartriangleleft \) shows that we have

$$\begin{aligned} {[}D(\iota )]^{<\omega }\circ G^D_W=G_0^D\circ \iota \quad \text {and}\quad [D(\iota )]^{<\omega }\circ G_W=G_0\circ D(\iota ). \end{aligned}$$

By the aforementioned condition on the range of \(\pi \), we obtain

$$\begin{aligned} {[}D(\iota )]^{<\omega }\circ G_W\circ \kappa (t)=G_0\circ D(\iota )\circ \kappa (t)=G_0\circ \pi \circ \iota (t)\subseteq _{D(X)}\pi \circ \iota (t)=D(\iota )\circ \kappa (t) \end{aligned}$$

for any \(t\in W\). Since \(D(\iota )\) is an embedding, we can conclude

$$\begin{aligned} {\text {rng}}(\kappa )\subseteq \{\tau \in D(W)\,|\,G_W(\tau )\subseteq _{D(W)}\tau \}. \end{aligned}$$

It remains to establish the converse inclusion. Note that D(W) is well founded, as D is a dilator and W is a well order. We argue by (main) induction on \(\tau \in D(W)\) to prove the crucial implication

$$\begin{aligned} G_W(\tau )\subseteq _{D(W)}\tau \quad \Rightarrow \quad \tau \in {\text {rng}}(\kappa ). \end{aligned}$$

Assuming the premise, we get \(G_0(D(\iota )(\tau ))\subseteq _{D(X)} D(\iota )(\tau )\) as above, which allows us to write \(D(\iota )(\tau )=\pi (t)\) with \(t\in X\). We will show \(t\in W\), so that we obtain

$$\begin{aligned} D(\iota )\circ \kappa (t)=\pi \circ \iota (t)=\pi (t)=D(\iota )(\tau ). \end{aligned}$$

Since \(D(\iota )\) is an embedding, we can conclude \(\tau =\kappa (t)\in {\text {rng}}(\kappa )\) as desired. In order to get \(t\in W\), we establish

$$\begin{aligned} s\in X\text { and }s<_Xt\quad \Rightarrow \quad s\in W \end{aligned}$$

by (side) induction on s in the order \(\vartriangleleft \). For \(r\vartriangleleft s<t\) we get \(r<t\), so that the induction hypothesis yields \(r\in W\). This shows that we have \({{\text {supp}}_X}\circ \pi (s)\subseteq {\text {rng}}(\iota )\). We can thus write \(\pi (s)=D(\iota )(\sigma )\), due to the support condition. As above, the condition on the range of \(\pi \) entails \(G_W(\sigma )\subseteq _{D(W)}\sigma \). Since \(s<t\) implies \(\sigma <\tau \), the main induction hypothesis yields \(\sigma =\kappa (s')\) for some \(s'\in W\). In view of

$$\begin{aligned} \pi (s)=D(\iota )(\sigma )=D(\iota )\circ \kappa (s')=\pi \circ \iota (s') \end{aligned}$$

we get \(s=\iota (s')\in W\), as needed to complete the side induction step.

The previous remark is loosely inspired by [49, Section 10]. Similarly, the following generalization to \(\nu >1\) can be seen as an ‘abstract’ version of [47, Section 12]. For all results up to Theorem 3.12, we fix a \(\nu \)-collapse \(\pi :X\rightarrow \nu \times D(X)\) of a dilator D (note that D preserves well foundedness).

Definition 3.2

For each \(\alpha <\nu \) we put

$$\begin{aligned} X_\alpha :=\{t\in X\,|\,\pi (t)=(\gamma ,\tau )\text { with }\gamma \le \alpha \}. \end{aligned}$$

Furthermore, we define \(E^D_\alpha :X\rightarrow [X_\alpha ]^{<\omega }\) and \(E_\alpha :D(X)\rightarrow [X_\alpha ]^{<\omega }\) by

$$\begin{aligned} E^D_\alpha (t)&:={\left\{ \begin{array}{ll} \{t\} &{} \text {if } t\in X_\alpha ,\\ E_\alpha (\tau ) &{} \text {if } \pi (t)=(\gamma ,\tau ) \text { with } \gamma >\alpha , \end{array}\right. }\\ E_\alpha (\tau )&:=\bigcup \{E^D_\alpha (s)\,|\,s\in {\text {supp}}_X(\tau )\}. \end{aligned}$$

This amounts to a recursion along the well founded relation \(\vartriangleleft \) from Definition 1.4.
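For intuition (our sketch, not part of the paper), the recursion can be run on the term representation from Theorem 2.9, where \(\pi (\psi _\beta (a,\sigma ))\) has first component \(\beta \) and the support of its second component is a: a term whose index is at most \(\alpha \) already lies in \(X_\alpha \) and is returned as is, while otherwise the recursion descends into the support.

```python
# Hedged sketch: E^D_alpha on terms psi_beta(a, sigma), encoded as
# (beta, a, sigma).  This mirrors the recursion along the relation from
# Definition 1.4, which on terms is just descent into subterms.
def E_D(alpha, t):
    beta, a, _ = t
    if beta <= alpha:                    # t already lies in X_alpha
        return {t}
    return set().union(*(E_D(alpha, s) for s in a))

def E(alpha, support):
    """E_alpha(tau), taking the finite set supp_X(tau) as input."""
    return set().union(*(E_D(alpha, s) for s in support))
```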

Note that each set \(X_\alpha \) is an initial segment of X, since \(\pi \) is an embedding.

Definition 3.3

By \(\Pi ^1_1\)-recursion on \(\alpha <\nu \), define \(W_\alpha \) as the well founded part of

$$\begin{aligned} M_\alpha :=\{t\in X_\alpha \,|\,E^D_\gamma (t)\subseteq W_\gamma \text { for all }\gamma <\alpha \}. \end{aligned}$$

Let us also set \(W:=\bigcup \{W_\alpha \,|\,\alpha <\nu \}\).

We point out that the sets \(W_\alpha \) are distinguished (‘ausgezeichnet’) in the sense of Buchholz [5], modulo the fact that we are in a somewhat more abstract setting.

Lemma 3.4

For \(\alpha \le \beta \) we have \(W_\alpha =W_\beta \cap X_\alpha =W\cap X_\alpha \).

Proof

For \(\alpha <\beta \) and \(t\in W_\beta \cap X_\alpha \) we get \(t\in E^D_\alpha (t)\subseteq W_\alpha \) by the definition of \(M_\beta \). To establish \(W_\alpha \subseteq W_\beta \), we argue by induction on \(\beta \). For \(\alpha \le \gamma <\beta \), the induction hypothesis ensures that \(t\in W_\alpha \subseteq X_\gamma \) entails \(E^D_\gamma (t)=\{t\}\subseteq W_\gamma \), so that we get

$$\begin{aligned} W_\alpha \subseteq M_\beta \cap X_\alpha \subseteq M_\alpha . \end{aligned}$$

By definition of the well founded part, \(W_\beta \) is the largest initial segment of \(M_\beta \) that is well founded. The given inclusions entail that \(W_\alpha \) is such a segment and hence contained in \(W_\beta \). More explicitly, induction on \(t\in W_\alpha \) yields \(t\in W_\beta \). \(\square \)

As W is the union of well founded initial segments, we get the following.

Corollary 3.5

The suborder \(W\subseteq X\) is well founded.

In the next lemma, we collect some basic facts for later use.

Lemma 3.6

The following holds for any \(\alpha ,\beta <\nu \), any \(s,t\in X\) and any \(\tau \in D(X)\):

  1. (a)

    Given \(s\in E^D_\beta (t)\) and \(\alpha \le \beta \), we get \(E^D_\alpha (s)\subseteq E^D_\alpha (t)\). The same holds when \(E^D_\beta (t)\) is replaced by \(E_\beta (\tau )\).

  2. (b)

    If \(\pi (t)=(\alpha ,\tau )\), then we have \(E_\alpha (\tau )\subseteq _X t\).

  3. (c)

    From \((\alpha ,\tau )\in {\text {rng}}(\pi )\) we get \((\beta ,\tau )\in {\text {rng}}(\pi )\) for any \(\beta \ge \alpha \).

Proof

(a) We argue by induction on t in the order \(\vartriangleleft \). For \(s=t\), the claim is trivial. In the remaining case, we have \(\pi (t)=(\delta ,\tau )\) with \(\delta >\beta \ge \alpha \). We get \(s\in E^D_\beta (r)\) for some \(r\vartriangleleft t\), so that the induction hypothesis yields

$$\begin{aligned} E^D_\alpha (s)\subseteq E^D_\alpha (r)\subseteq E_\alpha (\tau )=E^D_\alpha (t). \end{aligned}$$

(b) By induction on s in the order \(\vartriangleleft \), we prove the auxiliary claim

$$\begin{aligned} r\in E^D_\alpha (s)\text { and }\pi (r)=(\alpha ,\rho )\quad \Rightarrow \quad \rho \in G^D_\alpha (s). \end{aligned}$$

Assuming the antecedent, we must have \(\pi (s)=(\gamma ,\sigma )\) with \(\gamma \ge \alpha \), so that

$$\begin{aligned} G^D_\alpha (s)=\{\sigma \}\cup \bigcup \{G^D_\alpha (s')\,|\,s'\in {\text {supp}}_X(\sigma )\}. \end{aligned}$$

For \(r=s\) we obtain \(\rho =\sigma \in G^D_\alpha (s)\). In the remaining case we have \(r\in E^D_\alpha (s')\) for some \(s'\vartriangleleft s\), so that the induction hypothesis yields \(\rho \in G^D_\alpha (s')\subseteq G^D_\alpha (s)\). To deduce the lemma, consider an arbitrary \(r\in E_\alpha (\tau )\). Write \(\pi (r)=(\delta ,\rho )\), necessarily with \(\delta \le \alpha \). If we have \(\delta <\alpha \), then we immediately get \(\pi (r)<\pi (t)\) and hence \(r<t\). Now assume \(\delta =\alpha \), and note that we have \(r\in E^D_\alpha (s)\) for some \(s\in {\text {supp}}_X(\tau )\). By the auxiliary claim and the condition on \({\text {rng}}(\pi )\) in Definition 1.4, we get

$$\begin{aligned} \rho \in G^D_\alpha (s)\subseteq G_\alpha (\tau )\subseteq _{D(X)}\tau . \end{aligned}$$

Once again this yields \(\pi (r)<\pi (t)\) and hence \(r<t\), as required for \(E_\alpha (\tau )\subseteq _X t\).

(c) Given \(\alpha \le \beta \), one checks \(G^D_\beta (t)\subseteq G^D_\alpha (t)\) by a straightforward induction on t in the order \(\vartriangleleft \). The same inclusion then holds with \(G_\gamma (\tau )\) at the place of \(G^D_\gamma (t)\). Now it suffices to recall the condition on \({\text {rng}}(\pi )\) from Definition 1.4. \(\square \)

Inspired by [47, Definition 12.64], we introduce the following crucial sets.

Definition 3.7

Let us put

$$\begin{aligned} B&:=\{\tau \in D(X)\,|\,\text {we have } t\in W \text { whenever }~\pi (t)=(\alpha ,\tau ) \text { for some }~\alpha<\nu \},\\ M&:=\{\tau \in D(X)\,|\,\text {we have } E_\gamma (\tau )\subseteq W \text { for all }~\gamma <\nu \}. \end{aligned}$$

All of the following results rely on the standing assumption that D is a dilator. Note that we only use this assumption once, namely in the following proof.

Lemma 3.8

The suborder \(M\subseteq D(X)\) is well founded.

Proof

Given any \(\tau \in M\), pick a \(\gamma <\nu \) such that the finite set \({\text {supp}}_X(\tau )\) is fully contained in \(X_\gamma \). By the definition of M, we obtain

$$\begin{aligned} W\supseteq E_\gamma (\tau )=\bigcup \{E^D_\gamma (s)\,|\,s\in {\text {supp}}_X(\tau )\}={\text {supp}}_X(\tau ). \end{aligned}$$

For the inclusion \(\iota :W\hookrightarrow X\), we get \(\tau \in {\text {rng}}(D(\iota ))\) by the support condition from Definition 1.1. Hence M lies in the range of the embedding \(D(\iota ):D(W)\rightarrow D(X)\). To conclude, note that D(W) is well founded as D is a dilator. \(\square \)

The next result is the technical core of this section.

Proposition 3.9

We have \(M\subseteq B\).

Proof

We argue by (main) induction over the well order M, i. e., we assume \(\tau \in M\) and \(\{\sigma \in M\,|\,\sigma <\tau \}\subseteq B\) to derive \(\tau \in B\). Aiming at the latter, consider an arbitrary \(t\in X\) such that \(\pi (t)=(\alpha ,\tau )\) holds for some \(\alpha \). We need to prove \(t\in W\). Given \(\tau \in M\), we get \(t\in M_\alpha \) via

$$\begin{aligned} E^D_\gamma (t)=E_\gamma (\tau )\subseteq W\cap X_\gamma =W_\gamma \quad \text {for}\quad \gamma <\alpha . \end{aligned}$$

Since \(W_\alpha \) is the well founded part of \(M_\alpha \), we can conclude \(t\in W_\alpha \subseteq W\) once the following is established (cf. [47, Lemma 12.65]):

Claim

Given any \(\alpha <\nu \) and \(t\in X\) with \(\pi (t)=(\alpha ,\tau )\), we obtain \(s\in W_\alpha \) for all elements \(s\in M_\alpha \) with \(s<t\).

To prove this claim, we argue by (side) induction on s in the transitive closure \(\vartriangleleft ^+\) of the well founded relation \(\vartriangleleft \), or alternatively on h(s) for

$$\begin{aligned} h:X\rightarrow {\mathbb {N}}\quad \text {with}\quad h(s):=\max (\{0\}\cup \{h(r)+1\,|\,r\vartriangleleft s\}). \end{aligned}$$

It will be important that the induction hypothesis is available for all \(\alpha \) and hence for various t, while \(\tau \) remains fixed as above. In the side induction step, we first assume that \(s\in X_\gamma \) holds for some \(\gamma <\alpha \). Given \(s\in M_\alpha \), we then get

$$\begin{aligned} s\in E^D_\gamma (s)\subseteq W_\gamma \subseteq W_\alpha . \end{aligned}$$

In the remaining case, we have \(\pi (s)=(\alpha ,\sigma )\) with \(\sigma <\tau \), as \(s<t\) entails \(\pi (s)<\pi (t)\). To use the main induction hypothesis, we want to show \(\sigma \in M\), which amounts to

$$\begin{aligned} E_\gamma (\sigma )\subseteq W\quad \text {for all }\gamma <\nu . \end{aligned}$$

We prove the latter by (auxiliary) induction on \(\gamma \). For \(\gamma <\alpha \) we can invoke \(s\in M_\alpha \) to get \(E_\gamma (\sigma )=E^D_\gamma (s)\subseteq W_\gamma \). In the case of \(\gamma =\alpha \), we use Lemma 3.6(a) to obtain

$$\begin{aligned} E^D_\delta (r)\subseteq E_\delta (\sigma )\subseteq W\cap X_\delta =W_\delta \quad \text {for all }r\in E_\alpha (\sigma )\text { and }\delta <\alpha , \end{aligned}$$

which yields \(E_\alpha (\sigma )\subseteq M_\alpha \). By Lemma 3.6(b), we have \(E_\alpha (\sigma )\subseteq _X s<t\). Furthermore, it is not hard to see that the elements of \(E_\alpha (\sigma )\) lie below s in \(\vartriangleleft ^+\) (alternatively check \(h(r')\le h(r)\) for \(r'\in E_\alpha ^D(r)\) by induction over \(\vartriangleleft \)). We can thus use the side induction hypothesis to get \(E_\alpha (\sigma )\subseteq W_\alpha \). Finally, we consider the case of \(\gamma >\alpha \). The auxiliary induction hypothesis entails \(E_\gamma (\sigma )\subseteq M_\gamma \) as before. By Lemma 3.6(c) we find \(s',t'\in X\) with \(\pi (s')=(\gamma ,\sigma )\) and \(\pi (t')=(\gamma ,\tau )\). In view of \(\sigma <\tau \) we get

$$\begin{aligned} E_\gamma (\sigma )\subseteq _X s'<t'. \end{aligned}$$

Thus the desired inclusion \(E_\gamma (\sigma )\subseteq W_\gamma \) follows from the side induction hypothesis (now with \(\gamma \) and \(t'\) in place of \(\alpha \) and t). This completes the auxiliary induction and hence the proof of \(\sigma \in M\), as noted above. We can now invoke the main induction hypothesis to get \(\sigma \in B\). Given \(\pi (s)=(\alpha ,\sigma )\), this yields \(s\in W\cap X_\alpha =W_\alpha \), which concludes the steps of side induction (claim) and main induction. \(\square \)

In Remark 3.1, we have exploited the fact that \(t\in W\) and \(s\vartriangleleft t\) entail \(s\in W\). The proof that we have given breaks down for \(\nu >1\). However, we get the desired closure property for an inductively generated suborder:

Definition 3.10

Let \(V\subseteq W\) be given by the recursive clause

$$\begin{aligned} t\in V\quad :\Leftrightarrow \quad t\in W\text { and }s\in V\text { for all }s\vartriangleleft t. \end{aligned}$$
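For finite data, the recursive clause can be evaluated directly by recursion over \(\vartriangleleft \). A minimal Python sketch, with an invented chain \(0\vartriangleleft 1\vartriangleleft 2\vartriangleleft 3\vartriangleleft 4\) and an invented set W (neither is taken from the paper):

```python
from functools import lru_cache

# Illustrative finite data: W as a set, and ◁ via predecessor lists.
W = {0, 1, 2, 4}                                     # note: 3 is missing
children = {0: [], 1: [0], 2: [1], 3: [2], 4: [3]}   # r ◁ t iff r in children[t]

@lru_cache(maxsize=None)
def in_V(t):
    # t ∈ V  :⇔  t ∈ W and s ∈ V for all s ◁ t
    return t in W and all(in_V(s) for s in children[t])

# 0, 1, 2 lie in V; 3 does not (it is not in W), hence 4 does not either,
# even though 4 lies in W.
```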

In the following result, the implication \(\Rightarrow \) is the closure property mentioned above. The converse implication encapsulates most previous work of this section.

Corollary 3.11

For \(t\in X\) with \(\pi (t)=(\alpha ,\tau )\) we have

$$\begin{aligned} t\in V\quad \Leftrightarrow \quad {\text {supp}}_X(\tau )\subseteq V. \end{aligned}$$

Proof

Since \(s\vartriangleleft t\) amounts to \(s\in {\text {supp}}_X(\tau )\), it suffices to show that \({\text {supp}}_X(\tau )\subseteq V\) implies \(t\in W\). For \(\gamma <\nu \), a straightforward induction over \(\vartriangleleft \) shows that \(s\in V\) entails \(E^D_\gamma (s)\subseteq V\). Given \({\text {supp}}_X(\tau )\subseteq V\), we thus get

$$\begin{aligned} E_\gamma (\tau )=\bigcup \{E^D_\gamma (s)\,|\,s\in {\text {supp}}_X(\tau )\}\subseteq V\subseteq W. \end{aligned}$$

This shows \(\tau \in M\), so that Proposition 3.9 yields \(\tau \in B\), which entails \(t\in W\). \(\square \)

Finally, we deduce the main result of this section, which shows that (i) implies (ii) in Theorem 1.6. To justify the formulation of the following theorem, we recall that \(\nu \)-fixed points exist and are essentially unique, by Theorem 2.9 and Corollary 2.2.

Theorem 3.12

If \(\Pi ^1_1\)-recursion along \(\nu \) is available, then the \(\nu \)-fixed point of any dilator is well founded.

Proof

Consider a dilator D and a \(\nu \)-fixed point X with collapse \(\pi :X\rightarrow \nu \times D(X)\). Using \(\Pi ^1_1\)-recursion along \(\nu \), we can construct sets \(W_\alpha \) as in Definition 3.3, to obtain suborders \(V\subseteq W\subseteq X\) as in Definition 3.10. Note that V is well founded by Corollary 3.5. We shall show that V is a \(\nu \)-fixed point of D. Once this is achieved, we can use Corollary 2.2 to conclude that \(X\cong V\) is well founded. In fact, we could derive \(X=V\) via Proposition 2.1 (as in Remark 3.1). Write \(\iota :V\rightarrow X\) for the inclusion. By the previous corollary and the support condition from Definition 1.1, we get \(\tau \in {\text {rng}}(D(\iota ))\) whenever we have \(\pi (t)=(\alpha ,\tau )\) with \(t\in V\). We thus obtain an embedding \(\kappa \) so that

[commutative diagram: \(\pi \circ \iota =(I\times D(\iota ))\circ \kappa \)]

commutes. Concerning the constructions from Definition 1.4, we note that \(\kappa \) and \(\pi \) induce the same relation \(\vartriangleleft \) on \(V\subseteq X\), as in Remark 3.1. The cited definition also yields functions \(G^{D,Z}_\gamma :Z\rightarrow [D(Z)]^{<\omega }\) and \(G^Z_\gamma :D(Z)\rightarrow [D(Z)]^{<\omega }\) for \(Z=X\) and for \(Z=V\), which are defined with respect to \(\kappa \) and \(\pi \). As in Remark 3.1, a straightforward induction over \(\vartriangleleft \) shows

$$\begin{aligned} {[}D(\iota )]^{<\omega }\circ G^{D,V}_\gamma&=G^{D,X}_\gamma \circ \iota ,\\ {[}D(\iota )]^{<\omega }\circ G^V_\gamma&=G^X_\gamma \circ D(\iota ). \end{aligned}$$

It remains to establish the crucial condition from Definition 1.4, i. e., the equation

$$\begin{aligned} {\text {rng}}(\kappa )=\{(\alpha ,\tau )\in \nu \times D(V)\,|\,G^V_\alpha (\tau )\subseteq _{D(V)}\tau \}. \end{aligned}$$

We point out that the analogous condition is given for \(\pi \), as the latter is a \(\nu \)-collapse. As in Remark 3.1, one derives the inclusion \(\subseteq \) and shows that \(G^V_\alpha (\tau )\subseteq _{D(V)}\tau \) entails \((\alpha ,D(\iota )(\tau ))=\pi (t)\) for some \(t\in X\). Note that we have

$$\begin{aligned} {{\text {supp}}_X}\circ D(\iota )(\tau )=[\iota ]^{<\omega }\circ {\text {supp}}_V(\tau )\subseteq V, \end{aligned}$$

as \({\text {supp}}:D\Rightarrow [\cdot ]^{<\omega }\) is a natural transformation. Crucially, we can now infer \(t\in V\) by the non-trivial direction of Corollary 3.11. In view of

$$\begin{aligned} (I\times D(\iota ))(\alpha ,\tau )=(\alpha ,D(\iota )(\tau ))=\pi (t)=\pi \circ \iota (t)=(I\times D(\iota ))\circ \kappa (t), \end{aligned}$$

we get \((\alpha ,\tau )=\kappa (t)\in {\text {rng}}(\kappa )\) as desired. \(\square \)

4 Booting up: Bachmann–Howard fixed points and Veblen hierarchy

In the first part of this section, we establish a connection between Bachmann–Howard fixed points and 1-fixed points (cf. Definitions 1.2 and 1.4). This will allow us to use \(\Pi ^1_1\)-comprehension whenever the well foundedness of 1-fixed points is given, due to Theorem 1.3 (proved in [11, 13]). Among other things, \(\Pi ^1_1\)-comprehension secures the Veblen hierarchy of normal functions. In the second part of this section, we discuss a functor \(\Gamma \) that represents this hierarchy. It will be used in our proof that (iii) implies (iv) in Theorem 1.6.

We begin with the easier part of the connection, which will not be needed in this paper but completes the picture in a satisfactory way:

Proposition 4.1

Assume that Z is a Bachmann–Howard fixed point of a given predilator D. Then some suborder \(X\subseteq Z\) is a 1-fixed point of D.

Proof

By assumption, we have a Bachmann–Howard collapse \(\vartheta :D(Z)\rightarrow Z\). To see that \(\vartheta \) is injective, consider an inequality \(\sigma <\tau \) in the linear order D(Z). If we have \({\text {supp}}_Z(\sigma )\subseteq _Z\vartheta (\tau )\), then clause (i) of Definition 1.2 yields \(\vartheta (\sigma )<\vartheta (\tau )\). Otherwise, there is an \(r\in {\text {supp}}_Z(\sigma )\) with \(\vartheta (\tau )\le r<\vartheta (\sigma )\), where the second inequality relies on clause (ii) of the cited definition. We shall assume that \(\vartheta \) is also surjective and that

$$\begin{aligned} s\vartriangleleft \vartheta (\tau )\quad :\Leftrightarrow \quad s\in {\text {supp}}_Z(\tau ) \end{aligned}$$

defines a well founded relation on Z. To justify these assumptions, we point out that they hold when Z is the minimal Bachmann–Howard fixed point \(\vartheta (D)\) that was constructed in [13, Section 4]. In other words, we can replace Z by \(\vartheta (D)\subseteq Z\) to satisfy the additional assumptions. Let us now define \(G^D:Z\rightarrow [D(Z)]^{<\omega }\) and simultaneously \(G:D(Z)\rightarrow [D(Z)]^{<\omega }\) by the recursive clauses

$$\begin{aligned} G^D(\vartheta (\tau )):=\{\tau \}\cup G(\tau )\quad \text {and}\quad G(\tau ):=\bigcup \{G^D(s)\,|\,s\in {\text {supp}}_Z(\tau )\}. \end{aligned}$$

By induction on s in the order \(\vartriangleleft \), we can show

$$\begin{aligned} G^D(s)\subseteq _{D(Z)}\tau \quad \Rightarrow \quad s<\vartheta (\tau ). \end{aligned}$$

Indeed, assume that the premise holds for \(s=\vartheta (\sigma )\). We then have \(\sigma \in G^D(s)\) and hence \(\sigma <\tau \). To conclude by clause (i) of Definition 1.2, we note that \(r\in {\text {supp}}_Z(\sigma )\) entails \(G^D(r)\subseteq G^D(s)\), so that \(r<\vartheta (\tau )\) follows by induction hypothesis. Now set

$$\begin{aligned} Y:=\{s\in Z\,|\,s=\vartheta (\sigma )\text { with }G(\sigma )\subseteq _{D(Z)}\sigma \}. \end{aligned}$$

To generate \(X\subseteq Y\), we inductively declare

$$\begin{aligned} t\in X\quad :\Leftrightarrow \quad t\in Y\text { and }s\in X\text { for all }s\vartriangleleft t. \end{aligned}$$

Write \(\iota :X\hookrightarrow Z\) for the inclusion. For \(\vartheta (\tau )\in X\) we get \({\text {supp}}_Z(\tau )\subseteq X={\text {rng}}(\iota )\). Hence we have \(\tau =D(\iota )(\sigma )\) for a (necessarily unique) element \(\sigma \in D(X)\), by the support condition from Definition 1.1. To define \(\pi :X\rightarrow D(X)\), we now declare that \(\pi (t)=\sigma \) holds for \(t=\vartheta (\tau )\) with \(\tau =D(\iota )(\sigma )\), i. e., we stipulate that

[commutative diagram: \(\vartheta \circ D(\iota )\circ \pi =\iota \)]

is a commutative diagram. Clearly \(\pi \) is injective. To conclude that it is an order embedding, we assume \(\pi (s)<\pi (t)\) and deduce \(s<t\). Given \(s\in X\), we get

$$\begin{aligned} \vartheta \circ D(\iota )\circ \pi (s)=\iota (s)\in X\subseteq Y. \end{aligned}$$

By the definition of Y, this yields \(G(D(\iota )\circ \pi (s))\subseteq _{D(Z)}D(\iota )\circ \pi (s)\) and hence

$$\begin{aligned} G^D(\vartheta \circ D(\iota )\circ \pi (s))=\{D(\iota )\circ \pi (s)\}\cup G(D(\iota )\circ \pi (s))\subseteq _{D(Z)}D(\iota )\circ \pi (t). \end{aligned}$$

Due to the implication that was shown above, one can infer \(s<t\) via

$$\begin{aligned} \iota (s)=\vartheta \circ D(\iota )\circ \pi (s)<\vartheta \circ D(\iota )\circ \pi (t)=\iota (t). \end{aligned}$$

After some straightforward verifications, we can conclude that \(\pi \) is a 1-collapse of the predilator D (where we identify D(X) and \(1\times D(X)\) as in Remark 3.1). \(\square \)

Let D and E be predilators with associated transformations \({\text {supp}}^D:D\Rightarrow [\cdot ]^{<\omega }\) and \({\text {supp}}^E:E\Rightarrow [\cdot ]^{<\omega }\). The predilator \(E\circ D\) consists of the usual composition as functors and the transformation \({\text {supp}}^{E\circ D}:E\circ D\Rightarrow [\cdot ]^{<\omega }\) that is given by

$$\begin{aligned} {\text {supp}}^{E\circ D}_X(\sigma ):=\bigcup \{{\text {supp}}^D_X(\rho )\,|\,\rho \in {\text {supp}}^E_{D(X)}(\sigma )\}. \end{aligned}$$

It is straightforward to check that the conditions from Definition 1.1 are satisfied. In the following theorem, we write \(\omega \) for the predilator from Example 1.5 (see also the beginning of Sect. 1). The result is an abstract version of [49, Corollary 3.1], which provides a similar connection between concrete ordinal notation systems.

Theorem 4.2

Any 1-fixed point of \(\omega \circ D\) is a Bachmann–Howard fixed point of D, where D can be any predilator.

Proof

Consider a 1-collapse \(\pi :X\rightarrow \omega \circ D(X)=:E(X)\), where we identify E(X) and \(1\times E(X)\) as before. Let \(G^E_0:X\rightarrow [E(X)]^{<\omega }\) and \(G_0:E(X)\rightarrow [E(X)]^{<\omega }\) be given as in Definition 1.4 (see also Remark 3.1), so that we have

$$\begin{aligned} {\text {rng}}(\pi )=\{\tau \in \omega \circ D(X)\,|\,G_0(\tau )\subseteq _{\omega \circ D(X)}\tau \}. \end{aligned}$$

We need to define a function \(\vartheta :D(X)\rightarrow X\) that satisfies clauses (i) and (ii) from Definition 1.2. As in the first paragraph of Sect. 1, we write elements of \(\omega \circ D(X)\) in the form \(\langle \sigma _0,\ldots ,\sigma _{n-1}\rangle \), for elements \(\sigma _0\ge \cdots \ge \sigma _{n-1}\) of D(X). In particular, a given \(\sigma \in D(X)\) gives rise to an element \(\langle \sigma \rangle \in \omega \circ D(X)\), which allows us to form

$$\begin{aligned} \sigma ^\star :=\max \big (\{\langle \rangle \}\cup G_0(\langle \sigma \rangle )\big )\in \omega \circ D(X). \end{aligned}$$

Writing \(\sigma ^\star =\langle \sigma _0,\ldots ,\sigma _{n-1}\rangle \), we now set

$$\begin{aligned} \sigma ^+:=\langle \sigma _0,\ldots ,\sigma _{i(\sigma )-1},\sigma \rangle \quad \text {with}\quad i(\sigma ):=\min \big (\{i<n\,|\,\sigma _i<\sigma \}\cup \{n\}\big ). \end{aligned}$$

Note that we have \(\sigma ^+\in \omega \circ D(X)\), as the definition of \(i(\sigma )\) ensures \(\sigma _{i(\sigma )-1}\ge \sigma \). Informally, we point out that the given construction corresponds to \(\sigma ^+=\sigma ^\star +\omega ^\sigma \) in terms of ordinal arithmetic. Let us now show

$$\begin{aligned} G_0(\sigma ^+)\subseteq G_0(\sigma ^\star )\cup G_0(\langle \sigma \rangle )\subseteq G_0(\langle \sigma \rangle )\subseteq _{\omega \circ D(X)}\sigma ^+. \end{aligned}$$

The first inclusion reduces to the analogous inclusions for \({\text {supp}}^{\omega \circ D}_X\) and \({\text {supp}}^\omega _{D(X)}\), which we get by the definition of supports in Example 1.5. Concerning the second inclusion, we note that \(G_0(\langle \rangle )\) is empty, since the same holds for \({\text {supp}}^\omega _{D(X)}(\langle \rangle )\) and hence for \({\text {supp}}^{\omega \circ D}_X(\langle \rangle )\). In the remaining case we have \(\sigma ^\star \in G_0(\langle \sigma \rangle )\). Here we can infer \(G_0(\sigma ^\star )\subseteq G_0(\langle \sigma \rangle )\) from the general fact that \(\rho \in G^E_0(s)\) entails \(G_0(\rho )\subseteq G_0^E(s)\), which is readily verified by induction on s in the order \(\vartriangleleft \) from Definition 1.4. Finally, we see that \(r\in G_0(\langle \sigma \rangle )\) entails \(r\le \sigma ^\star <\sigma ^+\), by the definition of \(\sigma ^\star \) and as we have \(\sigma _{i(\sigma )}<\sigma \) or \(i(\sigma )=n\) (recall that \(\omega \circ D(X)\) is ordered lexicographically). For any \(\sigma \in D(X)\), we have shown \(G_0(\sigma ^+)\subseteq _{\omega \circ D(X)}\sigma ^+\), which entails \(\sigma ^+\in {\text {rng}}(\pi )\). This allows us to form the function

$$\begin{aligned} \vartheta :D(X)\rightarrow X\quad \text {with}\quad \pi \circ \vartheta (\sigma )=\sigma ^+, \end{aligned}$$

which is unique since \(\pi \) is an embedding. To verify clause (ii) of Definition 1.2, we show \(r<\vartheta (\sigma )\) for a given r in the set \({\text {supp}}^D_X(\sigma )\). The latter is equal to \({\text {supp}}^{\omega \circ D}_X(\langle \sigma \rangle )\), as we have \({\text {supp}}^\omega _{D(X)}(\langle \sigma \rangle )=\{\sigma \}\). We thus get

$$\begin{aligned} \pi (r)\in G^{\omega \circ D}_0(r)\subseteq G_0(\langle \sigma \rangle )\quad \text {and hence}\quad \pi (r)\le \sigma ^\star <\sigma ^+=\pi \circ \vartheta (\sigma ), \end{aligned}$$

which yields \(r<\vartheta (\sigma )\) as desired. In order to prepare the remaining verification, we recall that \(s\vartriangleleft t\) entails \(\pi (s)<\pi (t)\), as observed in Remark 3.1. One can derive that \(\rho \in G^{\omega \circ D}_0(t)\) entails \(\rho \le \pi (t)\), by a straightforward induction on t in the order \(\vartriangleleft \). Aiming at clause (i) of Definition 1.2, we now assume

$$\begin{aligned} \sigma <_{D(X)}\tau \quad \text {and}\quad {\text {supp}}^D_X(\sigma )\subseteq _X\vartheta (\tau ). \end{aligned}$$

For an arbitrary \(r\in {\text {supp}}^D_X(\sigma )\) and any \(\rho \in G^{\omega \circ D}_0(r)\), we get

$$\begin{aligned} \rho \le \pi (r)<\pi \circ \vartheta (\tau )=\tau ^+. \end{aligned}$$

In view of \({\text {supp}}^{\omega \circ D}_X(\langle \sigma \rangle )={\text {supp}}^D_X(\sigma )\) from above, this yields

$$\begin{aligned} G_0(\langle \sigma \rangle )=\bigcup \left\{ \left. G^{\omega \circ D}_0(r)\,\right| \,r\in {\text {supp}}^{\omega \circ D}_X(\langle \sigma \rangle )\right\} \subseteq _{\omega \circ D(X)}\tau ^+. \end{aligned}$$

Together with \(0\le \tau ^\star <\tau ^+\), we get \(\sigma ^\star <\tau ^+\). The latter and \(\sigma <\tau \) entail

$$\begin{aligned} \pi \circ \vartheta (\sigma )=\sigma ^+<\tau ^+=\pi \circ \vartheta (\tau ) \end{aligned}$$

and hence \(\vartheta (\sigma )<\vartheta (\tau )\), by basic considerations about the lexicographic order. \(\square \)
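The passage from \(\sigma ^\star \) to \(\sigma ^+\) in the proof above is the analogue of Cantor normal form addition \(\alpha +\omega ^\sigma \): one keeps the entries that are at least \(\sigma \) and appends \(\sigma \). A minimal Python sketch, assuming sequences are represented as lists of comparable 'exponents' in non-increasing order (an illustrative encoding, not from the paper):

```python
def plus_omega(star, sigma):
    """Given star = [s0, ..., s_{n-1}] non-increasing, return the sequence
    for star + omega^sigma: keep the prefix of entries >= sigma, then
    append sigma (cf. the definition of sigma^+ via i(sigma))."""
    i = next((k for k, s in enumerate(star) if s < sigma), len(star))
    return star[:i] + [sigma]

# With natural numbers as exponents:
# plus_omega([3, 2, 1], 2)  ->  [3, 2, 2]
# plus_omega([3, 2, 1], 5)  ->  [5]
```

The result is again non-increasing, since all retained entries are at least \(\sigma \); this corresponds to the observation that \(\sigma _{i(\sigma )-1}\ge \sigma \) ensures \(\sigma ^+\in \omega \circ D(X)\).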

As noted at the beginning of Sect. 1, the statement that ‘\(\omega (X)\) is well founded for any well order X’ is equivalent to arithmetical comprehension and hence unprovable in the theory \(\textsf{RCA}_0\). The latter can prove that \(\omega \) is a predilator but not that it is a dilator. To prepare the use of Theorem 4.2 over \(\textsf{RCA}_0\), we show the following proposition. It is interesting to compare the result with [15, Theorem 2.2], which says that \(\omega (\omega (Y))\) is the minimal Bachmann–Howard fixed point of \(X\mapsto 1+(1+Y)\times X\).

Proposition 4.3

For any linear order Y, the order \(\omega (Y)\) is a 1-fixed point of a predilator D with \(D(X)=1+Y\times X\) (see the proof for a detailed definition of D).

Proof

Recall the notation for products from the paragraph before Definition 1.4. To complete the definition of \(1+Y\times X\), we introduce general notation for the sum of linear orders \(Z_0\) and \(Z_1\), which will also be needed later. The underlying set of our sum is the disjoint union

$$\begin{aligned} Z_0+Z_1:=\{z_0\,|\,z_0\in Z_0\}\cup \{Z_0+z_1\,|\,z_1\in Z_1\}. \end{aligned}$$

To determine the order, we declare that \(z_0\mapsto z_0\) and \(z_1\mapsto Z_0+z_1\) are embeddings of \(Z_0\) and \(Z_1\) into \(Z_0+Z_1\), while \(z_0<Z_0+z_1\) holds for any \(z_i\in Z_i\). Given embeddings \(f_i:Z_i\rightarrow Z_i'\), we define \(f_0+f_1:Z_0+Z_1\rightarrow Z'_0+Z'_1\) by

$$\begin{aligned} (f_0+f_1)(z_0):=f_0(z_0)\quad \text {and}\quad (f_0+f_1)(Z_0+z_1):=Z'_0+f_1(z_1). \end{aligned}$$

If \(f_0\) or \(f_1\) is the identity on \(Z_0=Z'_0\) or \(Z_1=Z'_1\), respectively, we write \(Z_0+f_1\) or \(f_0+Z_1\) rather than \(f_0+f_1\). Let us agree that \(\times \) binds more strongly than \(+\) and that \(1=\{0\}\) denotes the singleton order. For our fixed order Y, this explains the transformations \(X\mapsto D(X):=1+Y\times X\) and \(f\mapsto D(f):=1+Y\times f\) of orders and embeddings. To turn D into a predilator, we define \({\text {supp}}_X:D(X)\rightarrow [X]^{<\omega }\) by

$$\begin{aligned} {\text {supp}}_X(0):=\emptyset \quad \text {and}\quad {\text {supp}}_X(1+(y,x)):=\{x\}. \end{aligned}$$

Let us now consider the embedding \(\pi :\omega (Y)\rightarrow 1+Y\times \omega (Y)\) with

$$\begin{aligned} \pi (\langle \rangle ):=0\quad \text {and}\quad \pi (\langle y_0,\ldots ,y_n\rangle ):=1+(y_0,\langle y_1,\ldots ,y_n\rangle ). \end{aligned}$$

To see that \(\pi \) is a 1-collapse of D, we need to show

$$\begin{aligned} {\text {rng}}(\pi )=\{\tau \in D\circ \omega (Y)\,|\,G_0(\tau )\subseteq _{D\circ \omega (Y)}\tau \}, \end{aligned}$$

with \(G_0:D\circ \omega (Y)\rightarrow [D\circ \omega (Y)]^{<\omega }\) as in Definition 1.4 (see also Remark 3.1). First note that we have \(0\in {\text {rng}}(\pi )\) while \({\text {supp}}_X(0)\) and hence \(G_0(0)\) is empty. Let us now consider \(\tau =1+(y_0,\langle y_1,\ldots ,y_n\rangle )\). We then have \(G_0(\tau )=G^D_0(\langle y_1,\ldots ,y_n\rangle )\), where \(G^D_0:\omega (Y)\rightarrow [D\circ \omega (Y)]^{<\omega }\) is recursively given by \(G^D_0(\langle \rangle )=\{0\}\) and

$$\begin{aligned} G^D_0(\langle z_0,\ldots ,z_m\rangle )=\{1+(z_0,\langle z_1,\ldots ,z_m\rangle )\}\cup G^D_0(\langle z_1,\ldots ,z_m\rangle ). \end{aligned}$$

Let us observe that \(1+(z_0,\langle z_1,\ldots ,z_m\rangle )\) is the largest element of this set, by a straightforward induction on m (note \(z_1\le z_0\) and \(\langle z_2,\ldots ,z_m\rangle <\langle z_1,\ldots ,z_m\rangle \)). If we have \(n=0\) and hence \(\tau =1+(y_0,\langle \rangle )\), then we get \(G_0(\tau )=\{0\}\subseteq _{D\circ \omega (Y)}\tau \) as well as \(\tau =\pi (\langle y_0\rangle )\in {\text {rng}}(\pi )\). In the case of \(n>0\), we need to show

$$\begin{aligned} \tau =1+(y_0,\langle y_1,\ldots ,y_n\rangle )\in {\text {rng}}(\pi )\quad \Leftrightarrow \quad 1+(y_1,\langle y_2,\ldots ,y_n\rangle )<\tau . \end{aligned}$$

Given \(\langle y_1,\ldots ,y_n\rangle \in \omega (Y)\), we see that both sides are equivalent to \(y_1\le y_0\). \(\square \)
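For a concrete impression of the collapse \(\pi \) from the proof above, the following Python sketch checks it on a finite fragment; the encoding of \(\omega (Y)\) and of \(1+Y\times \omega (Y)\) is an illustrative assumption, not the paper's:

```python
from itertools import product

# Illustrative finite fragment: Y = {0, 1, 2}, and omega(Y) consists of
# non-increasing tuples over Y, ordered lexicographically (a proper
# prefix lies below its extensions).
Y = range(3)

def omega_Y(max_len):
    seqs = [()]
    for n in range(1, max_len + 1):
        seqs += [s for s in product(Y, repeat=n)
                 if all(s[i + 1] <= s[i] for i in range(n - 1))]
    return seqs

def less(s, t):
    """Lexicographic comparison on omega(Y)."""
    if s == t:
        return False
    for a, b in zip(s, t):
        if a != b:
            return a < b
    return len(s) < len(t)  # s is a proper prefix of t

def pi(s):
    # pi(<>) = 0 and pi(<y0, y1, ..., yn>) = 1 + (y0, <y1, ..., yn>)
    return 0 if s == () else (1, s[0], s[1:])

def pi_less(a, b):
    """Order on 1 + Y x omega(Y): 0 comes first, then pairs lexicographically."""
    if a == 0:
        return b != 0
    if b == 0:
        return False
    return a[1] < b[1] or (a[1] == b[1] and less(a[2], b[2]))
```

On the fragment `omega_Y(3)` one can verify that \(\pi \) is order preserving, i.e. that `less(s, t)` agrees with `pi_less(pi(s), pi(t))` for all pairs.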

Based on Theorem 1.3, we can now derive that the equivalence from Theorem 1.6 holds for \(\nu =1\). This allows us to use \(\Pi ^1_1\)-comprehension whenever the well foundedness of \(\nu \)-fixed points is given. In view of Proposition 4.1, the following can be seen as a strengthening of Theorem 1.3.

Corollary 4.4

For each fixed \(\nu \in {\mathbb {N}}\backslash \{0\}\), the following are equivalent over \(\textsf{RCA}_0\):

  1. (i)

    \(\Pi ^1_1\)-comprehension,

  2. (ii)

    the \(\nu \)-fixed point of any dilator is well founded,

  3. (iii)

    any dilator has a well founded \(\mu \)-fixed point for some well order \(\mu \ne \emptyset \).

Proof

By iterated applications of (i), we obtain \(\Pi ^1_1\)-recursion along \(\nu \), as the latter is fixed externally. We can then invoke Theorem 3.12 to get (ii), which clearly implies (iii). Assuming the latter, we argue that any given dilator D has a well founded Bachmann–Howard fixed point, to infer (i) via Theorem 1.3. In any application of (iii) we may assume \(\mu =1\), due to Corollary 2.10. If Y is a well order, then the predilator from the previous proposition is a dilator, provably in \(\textsf{RCA}_0\). In the presence of (iii), the essentially unique 1-fixed point of this dilator is well founded, so the same holds for \(\omega (Y)\) by Proposition 4.3. As Y was an arbitrary well order, it follows that \(\omega \) preserves well foundedness, so that \(\omega \circ D\) is a dilator. Using (iii) again, we get a well founded 1-fixed point of \(\omega \circ D\). By Theorem 4.2, this is the desired Bachmann–Howard fixed point of D. \(\square \)

In the rest of this section, we discuss a dilator \(\Gamma \) such that \(\Gamma (X)\) represents the Veblen function \(\varphi \) up to the X-th ordinal \(\alpha \) with \(\varphi (\alpha ,0)=\alpha \) (such \(\alpha \) are called ‘strongly critical’). The Veblen function plays an important role in ordinal analysis (see e. g. [53, Chapters V and VII]) and can also be analysed in terms of computability theory (as done by Marcone and Montalbán [37]). We will use the dilator \(\Gamma \) in our proof that (iii) implies (iv) in Theorem 1.6, where we mimic traditional ordinal analysis in a more abstract setting. To understand the following, it is not indispensable but certainly helpful to know the set theoretic approach to the Veblen function, for which we refer to [42, Section 3].

The next definition is equivalent to [45, Definition 2.5], despite a small difference in clause (ii’). A detailed justification of the recursion is given after the definition. The abbreviations \(\textsf{SC}\) and \({\textsf{H}}\) stand for ‘strongly critical’ ordinals and ‘Hauptzahlen’. The latter is German for (additively) ‘principal numbers’. We write \({\overline{\varphi }}\) in order to save the symbol \(\varphi \) for Definition 4.12 (note that \({\overline{\varphi }}\) is fixed point free by Lemma 4.6).

Definition 4.5

Given a linear order X, we define sets \(\textsf{SC}\subseteq {\textsf{H}}\subseteq \Gamma (X)\) of terms, a binary relation \(<_{\Gamma (X)}\) on \(\Gamma (X)\) and a critical level function \({\textsf{h}}:\Gamma (X)\rightarrow \Gamma (X)\) by simultaneous recursion. The terms are generated as follows (where \(s\le _{\Gamma (X)}t\) expresses that we have \(s<_{\Gamma (X)}t\) or that s and t are the same term):

  1. (i)

    We have terms \(0\in \Gamma (X)\backslash {\textsf{H}}\) and \(\Gamma _x\in \textsf{SC}\subseteq {\textsf{H}}\subseteq \Gamma (X)\) for all \(x\in X\).

  2. (ii)

    Assume that we are given terms \(s,t\in \Gamma (X)\) with \({\textsf{h}}(t)\le _{\Gamma (X)}s\), such that we have \(t\ne 0\) or \(s\notin \textsf{SC}\). We then add a term \({\overline{\varphi }} st\in {\textsf{H}}\backslash \textsf{SC}\subseteq \Gamma (X)\).

  3. (iii)

    Given \(n>1\) terms \(t_0,\ldots ,t_{n-1}\in {\textsf{H}}\) with \(t_{i+1}\le _{\Gamma (X)} t_i\) for \(i<n-1\), we add a term \(\langle t_0,\ldots ,t_{n-1}\rangle \in \Gamma (X)\backslash {\textsf{H}}\).

To determine \(\textsf {h}\), we put \({\textsf{h}}(\Gamma _x):=\Gamma _x\) and \({\textsf{h}}({\overline{\varphi }} st):=s\) as well as \({\textsf{h}}(t):=0\) in the remaining cases. Let us abbreviate \(\langle \rangle :=0\) and \(\langle t\rangle :=t\) for \(t\in {\textsf{H}}\), so that any element of \(\Gamma (X)\) can be uniquely written in the form \(\langle t_0,\ldots ,t_{n-1}\rangle \) with \(n\in {\mathbb {N}}\). We declare that \(<_{\Gamma (X)}\) is the minimal relation with the following closure properties:

  1. (i’)

We have \(r<_{\Gamma (X)}\Gamma _y\) for \(r=\Gamma _x\) with \(x<_X y\), for \(r={\overline{\varphi }} st\) with \(s,t<_{\Gamma (X)}\Gamma _y\), and for \(r=\langle r_0,\ldots ,r_{n-1}\rangle \) with \(n=0\) or \(r_0<_{\Gamma (X)}\Gamma _y\).

  2. (ii’)

    We have \(r<_{\Gamma (X)}{\overline{\varphi }} st\) for \(r=\Gamma _x\) with \(r\le _{\Gamma (X)}s\) or \(r\le _{\Gamma (X)}t\), for a term \(r=\langle r_0,\ldots ,r_{n-1}\rangle \) with \(n=0\) or \(r_0<_{\Gamma (X)}{\overline{\varphi }} st\), and for \(r={\overline{\varphi }} s't'\) such that

    • we have \(s'<_{\Gamma (X)}s\) and \(t'<_{\Gamma (X)}{\overline{\varphi }} st\),

    • or we have \(s=s'\) and \(t<_{\Gamma (X)}t'\),

    • or we have \({\overline{\varphi }} s't'\le _{\Gamma (X)}t\).

  3. (iii’)

    We get \(\langle s_0,\dots ,s_{m-1}\rangle <_{\Gamma (X)}\langle t_0,\ldots ,t_{n-1}\rangle \), not necessarily with \(m,n>1\), if

    • we have \(m<n\) and \(s_i=t_i\) for all \(i<m\),

    • or there is a \(j<\min \{m,n\}\) with \(s_j<_{\Gamma (X)}t_j\) and \(s_i=t_i\) for all \(i<j\).

We will sometimes write < rather than \(<_{\Gamma (X)}\) when no ambiguity arises.

Note that clause (iii’) for \(m=0\) yields \(0<_{\Gamma (X)}t\) when \(t\ne 0\). For \(m=1\) we learn that \(s<_{\Gamma (X)}\langle t_0,\ldots ,t_{n-1}\rangle \) is equivalent to \(s\le _{\Gamma (X)}t_0\) when \(s\in {\textsf{H}}\) and \(n>1\). The reader may wish to reformulate the clause for \(m>1\) and \(n=1\) in a similar way. Also note that \(m=n=1\) makes (iii’) tautological, so that no new inequalities arise. Finally, observe that \(\Gamma (X)\) is isomorphic to \(\omega ({\textsf{H}})\), as defined in Sect. 1.

To justify the simultaneous recursion in Definition 4.5, let \(\Gamma ^+(X)\supseteq \Gamma (X)\) be generated by clauses (i) to (iii) but with all conditions that involve \(<_{\Gamma (X)}\) ignored. Define \({\textsf{h}}:\Gamma ^+(X)\rightarrow \Gamma ^+(X)\) as above, and consider \(L:\Gamma ^+(X)\rightarrow {\mathbb {N}}\) with

$$\begin{aligned} L(0):=L(\Gamma _x):=0,\qquad L({\overline{\varphi }} st):=L(s)+L(t)+1,\\ L(\langle t_0,\ldots ,t_{n-1}\rangle ):=L(t_0)+\ldots +L(t_{n-1})+1\quad \text {(for } n>1). \end{aligned}$$

Note that \(L({\textsf{h}}(t))\le L(t)\) holds for all \(t\in \Gamma ^+(X)\). One can now decide \(r\in \Gamma (X)\) and \(s<_{\Gamma (X)}t\) by simultaneous recursion on L(r) and \(L(s)+L(t)\), respectively. This decision procedure is implicit in part (ii) of [45, Lemma 2.6]. Part (i) of the latter coincides with (b) in the next result, up to the modified formulation of (ii’) above.

Lemma 4.6

The following holds for any linear order X:

  1. (a)

    We have \(s,t<_{\Gamma (X)}{\overline{\varphi }} st\) and \(t_0<_{\Gamma (X)}\langle t_0,\ldots ,t_{n-1}\rangle \) in case \(n>1\).

  2. (b)

    The relation \(<_{\Gamma (X)}\) is a linear order on \(\Gamma (X)\).

Proof

We will first show transitivity, which is part of (b). Based on this, we then establish claim (a). Using the latter, we will finally prove irreflexivity as well as trichotomy, so that the proof of (b) is completed. To show that \(r<s\) and \(s<t\) yield \(r<t\), one employs induction on \(L(r)+L(s)+L(t)\) and a lengthy but straightforward case distinction. Now consider the function \({\text {sub}}:\Gamma (X)\rightarrow [\Gamma (X)]^{<\omega }\) that collects the subterms determined by

$$\begin{aligned} {\text {sub}}(0):={\text {sub}}(\Gamma _x):=\emptyset ,\qquad {\text {sub}}({\overline{\varphi }} st):=\{s,t\}\cup {\text {sub}}(s)\cup {\text {sub}}(t),\\ {\text {sub}}(\langle t_0,\ldots ,t_{n-1}\rangle ):=\textstyle \bigcup _{i<n}\big (\{t_i\}\cup {\text {sub}}(t_i)\big )\quad \text {(for } n>1). \end{aligned}$$

In order to obtain (a), it suffices to show that \(s\in {\text {sub}}(t)\) entails \(s<_{\Gamma (X)}t\). We argue by induction on \(L(s)+L(t)\). In the only interesting case, we are concerned with a term of the form \(t=\langle t_0,\ldots ,t_{n-1}\rangle \). Here the point is that an inductively given inequality \(s\le t_i\) will always yield \(s\le t_0\), as transitivity has already been proved. As indicated, we continue with the proof of (b). To show \(t\not < t\), one argues by induction on L(t). The only non-trivial task is to exclude \(t={\overline{\varphi }} t_0t_1\le t_1\). The latter would imply \(t_1<t_1\) by (a) and transitivity, against the induction hypothesis. An induction on \(L(s)+L(t)\) shows that we always have \(s<t\) or \(s=t\) or \(s>t\). \(\square \)

Concerning the following definition, it is immediate that the range of \(\Gamma (f)\) is contained in \(\Gamma ^+(Y)\supseteq \Gamma (Y)\), as defined in the paragraph before Lemma 4.6. In the proof of Proposition 4.8 below, we show that it is indeed contained in \(\Gamma (Y)\).

Definition 4.7

For an embedding \(f:X\rightarrow Y\), we define \(\Gamma (f):\Gamma (X)\rightarrow \Gamma (Y)\) by

$$\begin{aligned} \Gamma (f)(0)&:=0,\qquad \Gamma (f)(\Gamma _x):=\Gamma _{f(x)},\\ \Gamma (f)({\overline{\varphi }} t_0t_1)&:={\overline{\varphi }} t_0't_1'\text { with }t_i':=\Gamma (f)(t_i),\\ \Gamma (f)(\langle t_0,\ldots ,t_{n-1}\rangle )&:=\langle \Gamma (f)(t_0),\ldots ,\Gamma (f)(t_{n-1})\rangle \quad \text {(for } n>1). \end{aligned}$$

We also define functions \({\text {supp}}^\Gamma _X:\Gamma (X)\rightarrow [X]^{<\omega }\) by stipulating

$$\begin{aligned} {\text {supp}}^\Gamma _X(0):=\emptyset ,\quad {\text {supp}}^\Gamma _X(\Gamma _x):=\{x\},\quad {\text {supp}}^\Gamma _X({\overline{\varphi }} st):={\text {supp}}^\Gamma _X(s)\cup {\text {supp}}^\Gamma _X(t),\\ {\text {supp}}^\Gamma _X(\langle t_0,\ldots ,t_{n-1}\rangle ):=\textstyle \bigcup _{i<n}{\text {supp}}^\Gamma _X(t_i)\quad \text {(for } n>1). \end{aligned}$$

In the following, a stronger metatheory is needed for matters of well foundedness. We rely on \(\Pi ^1_1\)-comprehension, which will be available in our intended application (via Corollary 4.4). The proof shows that a somewhat weaker principle suffices.

Proposition 4.8

The data from Definitions 4.5 and 4.7 constitutes a predilator \(\Gamma \) (provably in \(\textsf{RCA}_0\)), which is in fact a dilator (in the presence of \(\Pi ^1_1\)-comprehension).

Proof

Given an embedding \(f:X\rightarrow Y\), let \(\Gamma (f):\Gamma ^+(X)\rightarrow \Gamma ^+(Y)\) be defined by the clauses from Definition 4.7, applied to the larger sets \(\Gamma ^+(Z)\supseteq \Gamma (Z)\) from the paragraph before Lemma 4.6. For \(r\in \Gamma ^+(X)\) and \(s,t\in \Gamma (X)\) one readily shows

$$\begin{aligned} r\in \Gamma (X)\quad&\Leftrightarrow \quad \Gamma (f)(r)\in \Gamma (Y),\\ s<_{\Gamma (X)}t\quad&\Leftrightarrow \quad \Gamma (f)(s)<_{\Gamma (Y)}\Gamma (f)(t) \end{aligned}$$

by simultaneous induction on L(r) and \(L(s)+L(t)\), respectively. Concerning the first equivalence, we note that \(\Gamma (f)\) commutes with the functions \({\textsf{h}}:\Gamma (Z)\rightarrow \Gamma (Z)\) from Definition 4.5. To establish the second equivalence, it suffices to show the implication from left to right, which yields the second implication in

$$\begin{aligned} s\not<t\quad \Rightarrow \quad t\le s\quad \Rightarrow \quad \Gamma (f)(t)\le \Gamma (f)(s)\quad \Rightarrow \quad \Gamma (f)(s)\not <\Gamma (f)(t). \end{aligned}$$

By a straightforward induction over terms, one checks that \(\Gamma \) is functorial. A similar induction shows that supports are natural, in the sense that we have

$$\begin{aligned} {[}f]^{<\omega }\circ {\text {supp}}^\Gamma _X={{\text {supp}}^\Gamma _Y}\circ \Gamma (f). \end{aligned}$$

To conclude that \(\Gamma \) is a predilator, it remains to prove

$$\begin{aligned} {\text {rng}}(\Gamma (f))=\{t\in \Gamma (Y)\,|\,{\text {supp}}^\Gamma _Y(t)\subseteq {\text {rng}}(f)\}. \end{aligned}$$

The inclusion from left to right follows from naturality, as \(t=\Gamma (f)(s)\) yields

$$\begin{aligned} {\text {supp}}^\Gamma _Y(t)={{\text {supp}}^\Gamma _Y}\circ \Gamma (f)(s)=[f]^{<\omega }\circ {\text {supp}}^\Gamma _X(s)\subseteq {\text {rng}}(f). \end{aligned}$$

In the converse direction, a straightforward induction on the term \(t\in \Gamma (Y)\) shows that \({\text {supp}}^\Gamma _Y(t)\subseteq {\text {rng}}(f)\) entails \(t=\Gamma (f)(s)\) for some \(s\in \Gamma ^+(X)\). To get \(s\in \Gamma (X)\), we invoke the first equivalence from the beginning of this proof. It remains to show that \(\Gamma \) is a dilator in the presence of \(\Pi ^1_1\)-comprehension, i. e., that \(\Gamma (X)\) is well founded for every well order X. If \(\Pi ^1_1\)-comprehension is available, then any subset of \({\mathbb {N}}\) is contained in a countable coded \(\omega \)-model of arithmetical transfinite recursion, by [55, Theorems VII.2.7 and 2.10]. The latter principle is equivalent to the statement that \(\Gamma (X)\) is well founded for any well order X, by [45, Theorem 1.4]. \(\square \)

From [15, Theorem 3.5] we know that \(\Gamma (X)\) is a minimal Bachmann–Howard fixed point of a dilator D with \(D(Y)=1+2\times Y^2+X\). By the first part of the present section, it should not be hard to characterize \(\Gamma (X)\) as a 1-fixed point. Together with Theorem 3.12, this would yield another proof that \(\Gamma \) is a dilator.

Recall that a function f from ordinals to ordinals is normal if it is strictly increasing and continuous, where the latter means that \(f(\lambda )=\sup \{f(\alpha )\,|\,\alpha <\lambda \}\) holds when \(\lambda \) is a limit. In [20] we have combined previous work of Aczel [1] and Girard [24] to define a class of ‘normal dilators’ that induce normal functions on the ordinals. Informally, normal dilators admit internal versions of themselves:

Definition 4.9

For each linear order X, define \(\gamma _X:X\rightarrow \Gamma (X)\) by \(\gamma _X(x):=\Gamma _x\).

The following means that \(\Gamma \) is normal in the sense of [20].

Lemma 4.10

For all \(s\in \Gamma (X)\) and \(x\in X\) we have

$$\begin{aligned} s<_{\Gamma (X)}\gamma _X(x)\quad \Leftrightarrow \quad {\text {supp}}^\Gamma _X(s)\subseteq _X x. \end{aligned}$$

Each function \(\gamma _X\) is an embedding, we have \({\text {supp}}^\Gamma _X(\gamma _X(x))=\{x\}\) for all \(x\in X\), and the naturality property \(\Gamma (f)\circ \gamma _X=\gamma _Y\circ f\) holds for any embedding \(f:X\rightarrow Y\).

Proof

The equivalence is readily established by induction on the term s, while naturality holds by a straightforward computation. \(\square \)

An initial segment of an order Y is a suborder \(Y_0\subseteq Y\) such that \(y<_Y y'\in Y_0\) entails \(y\in Y_0\). Let us record an important consequence of normality.

Corollary 4.11

If the range of \(f:X\rightarrow Y\) is an initial segment of Y, then the range of \(\Gamma (f):\Gamma (X)\rightarrow \Gamma (Y)\) is an initial segment of \(\Gamma (Y)\).

Proof

Consider an inequality \(s<t\in {\text {rng}}(\Gamma (f))\). To get \(s\in {\text {rng}}(\Gamma (f))\) we need only show \({\text {supp}}^\Gamma _Y(s)\subseteq {\text {rng}}(f)\), due to the support condition from Definition 1.1. Aiming at a contradiction, assume that we have an element \(y\in {\text {supp}}^\Gamma _Y(s)\) with \(y\notin {\text {rng}}(f)\). Given that \({\text {rng}}(f)\) is an initial segment, we obtain \(y'<y\) for all \(y'\in {\text {rng}}(f)\). In view of \(t\in {\text {rng}}(\Gamma (f))\) we can write \(t=\Gamma (f)(t_0)\). The naturality of supports yields

$$\begin{aligned} {\text {supp}}^\Gamma _Y(t)={{\text {supp}}^\Gamma _Y}\circ \Gamma (f)(t_0)=[f]^{<\omega }\circ {\text {supp}}^\Gamma _X(t_0)\subseteq _Y y. \end{aligned}$$

Also note that \(y\not <y\) entails \({\text {supp}}^\Gamma _Y(s)\not \subseteq _Y y\). Now the previous lemma allows us to infer \(t<\gamma _Y(y)\le s\), which contradicts the assumption \(s<t\). \(\square \)

We now represent the total Veblen function. In the following, the first two cases do not clash as we have \({\textsf{h}}(0)=0\), and the third case applies precisely when \({\overline{\varphi }} st\) is defined.

Definition 4.12

Let \(\varphi :\Gamma (X)^2\rightarrow \Gamma (X)\) be given by

$$\begin{aligned} \varphi _st:=\varphi st:=\varphi (s,t):={\left\{ \begin{array}{ll} t &{} \text {if } s<_{\Gamma (X)}{\textsf{h}}(t),\\ s &{} \text {if } s\in \textsf{SC}\text { and } t=0,\\ {\overline{\varphi }} st &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

Let us determine the range and fixed points of the Veblen function, its monotonicity properties, and comparisons with terms of the various forms.

Proposition 4.13

We have \({\textsf{H}}=\{\varphi st\,|\,s,t\in \Gamma (X)\}\) and

$$\begin{aligned} \textsf{SC}=\{\Gamma _x\,|\,x\in X\}=\{s\in \Gamma (X)\,|\,\varphi s0=s\}. \end{aligned}$$

Fixed points in the second argument are characterized by

$$\begin{aligned} \varphi st=t\quad \Leftrightarrow \quad s<_{\Gamma (X)}t\in \textsf{SC}\text { or }t=\varphi t_0t_1\text { for some }t_i\in \Gamma (X)\text { with }s<_{\Gamma (X)}t_0. \end{aligned}$$

For all \(s,s',t,t'\in \Gamma (X)\) we have \(s,t\le _{\Gamma (X)}\varphi st\) and

$$\begin{aligned} t'<_{\Gamma (X)}t\,\Rightarrow \,\varphi st'<_{\Gamma (X)}\varphi st\qquad \text {and}\qquad s'<_{\Gamma (X)}s\,\Rightarrow \,\varphi s't\le _{\Gamma (X)}\varphi st. \end{aligned}$$

Finally, we always have

$$\begin{aligned} \varphi st<_{\Gamma (X)}\Gamma _x\quad&\Leftrightarrow \quad s<_{\Gamma (X)}\Gamma _x\text { and }t<_{\Gamma (X)}\Gamma _x,\\ \varphi s't'<_{\Gamma (X)}\varphi st\quad&\Leftrightarrow \quad {\left\{ \begin{array}{ll} s'<_{\Gamma (X)}s\text { and }t'<_{\Gamma (X)}\varphi st,\\ \text {or }s'=s\text { and }t'<_{\Gamma (X)}t,\\ \text {or }s<_{\Gamma (X)}s'\text { and }\varphi s't'<_{\Gamma (X)}t. \end{array}\right. } \end{aligned}$$

Proof

To obtain the characterization of \({\textsf{H}}\), it suffices to note that the first case in Definition 4.12 can only apply when we have \({\textsf{h}}(t)\ne 0\) and hence \(t\in {\textsf{H}}\). The characterization of \(\textsf{SC}\) is immediate. In the first equivalence, the left side amounts to \(s<{\textsf{h}}(t)\), from which the right side is readily inferred. For the other direction, we need only observe that we always have \(t_0\le {\textsf{h}}(\varphi t_0t_1)\). In view of Lemma 4.6(a), the claim that we have \(s,t\le \varphi st\) reduces to the following observation: Due to the same lemma, we always have \({\textsf{h}}(t)\le t\), so that \(s<\mathsf h(t)\) entails \(s<t=\varphi st\). Monotonicity in the second argument is established by a case distinction. In the most interesting case, we have \(s\in \textsf{SC}\) and \(t'=0\), so that we get \(\varphi st'=s\le \varphi st\). Aiming at a contradiction, we assume \(\varphi st=s\). This value cannot arise by the second or third case from Definition 4.12, as \(t'<t\) entails \(t\ne 0\) and since s and \({\overline{\varphi }} st\) are different terms. In the remaining case, we would have \(s<{\textsf{h}}(t)\) and \(\varphi st=t\). But this would yield \(s=t\) and hence \(s<{\textsf{h}}(s)\), against an observation above. A similar case distinction yields weak monotonicity in the first argument (note that \(s'<s\) and \(\varphi st=t\) lead to \(\varphi s't=\varphi s'(\varphi st)=\varphi st\) by the fixed point property). The equivalence that characterizes \(\varphi st<\Gamma _x\) is immediate except when we have \(s<{\textsf{h}}(t)\). In this case, we observe that the left side of the equivalence entails

$$\begin{aligned} s<{\textsf{h}}(t)\le t=\varphi st<\Gamma _x. \end{aligned}$$

In the final equivalence of the proposition, the implication from right to left follows from the fixed point and monotonicity properties, e. g., because we have

$$\begin{aligned} s'<s\text { and }t'<\varphi st\quad \Rightarrow \quad \varphi s't'<\varphi s'(\varphi st)=\varphi st. \end{aligned}$$

Conversely, assume that the right side of the last equivalence in the lemma is false. If we have \(s'<s\) and \(\varphi st=t'\), then we get \(\varphi s't'=\varphi s'(\varphi st)=\varphi st\), so that the left side is false as well. A similar argument applies when we have \(s<s'\) and \(t=\varphi s't'\). If we have \(s'=s\) and \(t'=t\), then the claim is immediate. In all remaining cases, the right side will hold after we interchange s with \(s'\) as well as t with \(t'\). By the direction from right to left, we get \(\varphi st<\varphi s't'\), so that \(\varphi s't'<\varphi st\) fails again. \(\square \)

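In the fragment without terms of the form \(\Gamma _x\), the final equivalence of Proposition 4.13 yields the familiar recursive comparison of binary Veblen terms. The following Python sketch is a hypothetical illustration (not part of the formal development): it represents 0 by the empty tuple and \(\varphi st\) by a triple, and it is correct only under the assumption that all terms are in normal form, so that syntactic equality coincides with equality of values.

```python
# Comparison of binary Veblen terms: a term is () for 0 or a triple
# ('phi', s, t).  Illustrative sketch for the phi-only fragment
# (no terms Gamma_x and no sums); correct for normal-form terms only.
ZERO = ()

def cmp_veblen(a, b):
    """Return -1, 0 or 1 according to a < b, a = b or a > b.

    Implements: phi s' t' < phi s t  iff
      s' < s and t' < phi s t,  or  s' = s and t' < t,
      or  s < s' and phi s' t' < t.
    """
    if a == b:
        return 0
    if a == ZERO:
        return -1
    if b == ZERO:
        return 1
    _, s1, t1 = a
    _, s2, t2 = b
    c = cmp_veblen(s1, s2)
    if c == 0:
        return cmp_veblen(t1, t2)
    if c < 0:  # s1 < s2: compare t1 with the whole term b
        return -1 if cmp_veblen(t1, b) < 0 else 1
    # s1 > s2: compare the whole term a with t2
    return -1 if cmp_veblen(a, t2) < 0 else 1

# 1 = phi 0 0, omega = phi 0 1, epsilon_0 = phi 1 0
ONE = ('phi', ZERO, ZERO)
OMEGA = ('phi', ZERO, ONE)
EPS0 = ('phi', ONE, ZERO)
```

For instance, \(\omega <\varepsilon _0\) is decided via the first disjunct, since \(0<1\) and \(1<\varepsilon _0\).
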
To conclude this section, we discuss some ordinal arithmetic that will be used later. It may help to recall that \(\Gamma (X)\) is isomorphic to the ordered set \(\omega ({\textsf{H}})\) of finite nonincreasing sequences in \({\textsf{H}}\), as observed in the paragraph after Definition 4.5. Indeed, the following corresponds to the usual operation from ordinal arithmetic, if one thinks of \(\langle t_0,\ldots ,t_{n-1}\rangle \) as the Cantor normal form \(\omega ^{t_0}+\ldots +\omega ^{t_{n-1}}\).

Definition 4.14

Let \(+:\Gamma (X)^2\rightarrow \Gamma (X)\) be given by

$$\begin{aligned}{} & {} \langle s_0,\ldots ,s_{m-1}\rangle +\langle t_0,\ldots ,t_{n-1}\rangle :=\langle s_0,\ldots ,s_{i-1},t_0,\ldots ,t_{n-1}\rangle \\{} & {} \quad \text {with}\quad i:={\left\{ \begin{array}{ll} m &{} \text {if } m=0 \text { or } n=0 \text { or } t_0\le _{\Gamma (X)}s_{m-1},\\ \min \{i<m\,|\,s_i<_{\Gamma (X)}t_0\} &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

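Reading sequences as Cantor normal forms, the clause above is ordinary ordinal addition: the summands of the left argument below the leading summand of the right argument are absorbed. A minimal Python sketch (hypothetical; for illustration the exponents are natural numbers rather than elements of \(\Gamma (X)\)):

```python
def cnf_add(s, t):
    """Add nonincreasing exponent sequences, read as Cantor normal
    forms omega^{s_0} + ... + omega^{s_{m-1}}.  The entries of s that
    are >= t[0] survive; the rest are absorbed by omega^{t_0}."""
    if not s or not t:
        return s + t
    # i = min{ j < m | s_j < t_0 }, and i = m if no such j exists
    i = next((j for j in range(len(s)) if s[j] < t[0]), len(s))
    return s[:i] + t
```

For example, `cnf_add([3, 1], [2, 2])` returns `[3, 2, 2]`, i. e., \((\omega ^3+\omega )+(\omega ^2+\omega ^2)=\omega ^3+\omega ^2+\omega ^2\); the properties from the lemma below can be checked on such examples.
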
The following is readily verified and standard (see [53, Chapter V.14.3]).

Lemma 4.15

For all \(r,r',s,t\in \Gamma (X)\) the following holds:

(a) We have \((r+s)+t=r+(s+t)\) and \(t+0=t=0+t\).

(b) Given \(s<_{\Gamma (X)}t\), we get \(r+s<_{\Gamma (X)}r+t\) and \(s+r\le _{\Gamma (X)}t+r\).

(c) If we have \(t\in {\textsf{H}}\), then \(r<_{\Gamma (X)}r'+t\) and \(s<_{\Gamma (X)}t\) entail \(r+s<_{\Gamma (X)}r'+t\).

(d) We have \(r\le _{\Gamma (X)}t\) if, and only if, there is an \(s\in \Gamma (X)\) with \(r+s=t\).

As \({\overline{\varphi }}_00\) is the smallest element of \({\textsf{H}}\subseteq \Gamma (X)\), the map

$$\begin{aligned} {\mathbb {N}}\ni n\mapsto \overline{n}:=\underbrace{\langle {\overline{\varphi }}_00,\ldots ,{\overline{\varphi }}_00\rangle }_{n \text { entries}}\in \Gamma (X) \end{aligned}$$

embeds \({\mathbb {N}}\) as an initial segment of \(\Gamma (X)\). The addition operations on \({\mathbb {N}}\) and \(\Gamma (X)\) are related by

$$\begin{aligned} {\overline{m}}+t={\left\{ \begin{array}{ll} \overline{m+n} &{} \text {if } t={\overline{n}},\\ t &{} \text {if } t\ne {\overline{n}} \text { for all }~n\in {\mathbb {N}}. \end{array}\right. } \end{aligned}$$

In particular, this makes it harmless to write n at the place of \({\overline{n}}\). Instead of a binary multiplication, we use \(t\mapsto 1+t\) to define a unary operation \(t\mapsto \omega \cdot t\) with

$$\begin{aligned}{} & {} \omega \cdot 0:=0,\quad \omega \cdot \Gamma _z:=\Gamma _z,\quad \omega \cdot \langle t_0,\ldots ,t_{n-1}\rangle :=\langle \omega \cdot t_0,\ldots ,\omega \cdot t_{n-1}\rangle ,\\{} & {} \qquad \quad \qquad \omega \cdot {\overline{\varphi }}_0t:={\overline{\varphi }}_0(1+t),\quad \omega \cdot {\overline{\varphi }}_st:={\overline{\varphi }}_st\text { for }s\ne 0. \end{aligned}$$

It is not hard to check that \(t\mapsto \omega \cdot t\) is strictly increasing, that we have \(t\le \omega \cdot t\), and that \(s<\omega \cdot t\) entails \(s+n<\omega \cdot t\) for all \(n\in \mathbb N\). Finally, we record how the ordinal arithmetic interacts with supports. The following is immediate in view of Definitions 4.7 and 4.12.
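
Below \(\omega ^\omega \), where all exponents are finite, the clauses for \(\omega \cdot t\) amount to replacing each exponent e by \(1+e=e+1\). A hypothetical Python sketch for that fragment, continuing the natural-number representation of Cantor normal forms from above:

```python
def omega_times(t):
    """omega * (omega^{e_0} + ... + omega^{e_{n-1}}) as
    omega^{1+e_0} + ... + omega^{1+e_{n-1}}, sketched for natural
    number exponents (so 1 + e is just e + 1); omega * 0 = 0."""
    return [e + 1 for e in t]
```

On this representation the ordinal order is the lexicographic order on lists, so one can check on examples that \(t\mapsto \omega \cdot t\) is strictly increasing with \(t\le \omega \cdot t\).
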

Lemma 4.16

For any \(s,t\in \Gamma (X)\) we have

$$\begin{aligned} {\text {supp}}^\Gamma _X(\varphi st)\cup {\text {supp}}^\Gamma _X(s+t)\cup {\text {supp}}^\Gamma _X(\omega \cdot t)\subseteq {\text {supp}}^\Gamma _X(s)\cup {\text {supp}}^\Gamma _X(t). \end{aligned}$$

5 Hierarchies of admissible sets via search trees

Kurt Schütte’s method of search trees (also known as deduction chains) can be used to prove completeness and to construct models in various settings, including predicate logic and \(\omega \)-logic [51, 53], second order arithmetic [2, 33] and set theory [16]. In the present section, we use search trees to construct hierarchies of admissible sets. This extends the construction of a single admissible set in [11, Section 4].

We will search for admissible sets within the constructible hierarchy. Given a transitive set u, set \({\mathbb {L}}^u_0:=u\), let \({\mathbb {L}}^u_{\alpha +1}\) consist of the \(\Delta _0\)-definable subsets of \({\mathbb {L}}^u_\alpha \), and put \(\mathbb L^u_\lambda :=\bigcup _{\alpha <\lambda }{\mathbb {L}}^u_\alpha \) when \(\lambda \) is a limit. The restriction to \(\Delta _0\)-formulas (in which all quantifiers must be bounded as in \(\forall x\in y\) or \(\exists x\in a\)) is not essential but will have technical advantages.

In many of our arguments, the actual hierarchy \({\mathbb {L}}^u\) will be represented by a functorial variant \({\textbf{L}}^u\). This ensures that we get a dilator, to which the well ordering principle from Definition 1.4 can be applied. The functor \({\textbf{L}}^u\) has been introduced in [11, Section 3], based on the first author’s PhD thesis [10]. Central facts are recalled in the following, but we refer to [11] for full details.

First, each linear order Y gives rise to a set \({\textbf{L}}^u_Y\), which consists of ‘constant symbols’ from u and terms of the form \(L^u_s\) or \(\{x\in L^u_s\,|\,\varphi (x,a_0,\ldots ,a_{n-1})\}\), for an element \(s\in Y\), a \(\Delta _0\)-formula \(\varphi \) in the language of set theory, and previously constructed terms \(a_i\in \textbf{L}^u_Y\) that may only involve elements \(r\in Y\) with \(r<_Y s\) (so that we have \({\text {supp}}^{{\textbf{L}}}_Y(a_i)\subseteq _Ys\) in the notation below). To be more precise about the notion of formula, we declare that the signature is \(\{\in ,=\}\), that there are separate symbols for bounded quantifiers (which are thus distinguished from bounded occurrences of the usual quantifiers), and that formulas are in negation normal form. In view of the latter, negation and implication are defined operations that rely on de Morgan’s rules and delete double negations. As usual, a formula is \(\Delta _0\) or bounded if it only contains bounded quantifiers.

Prior to any functorial considerations, let us point out that we get an interpretation function \(\llbracket \cdot \rrbracket :\textbf{L}^u_\alpha \rightarrow {\mathbb {L}}^u_\alpha \) when \(Y=\alpha \) is an ordinal. Here \({\textbf{L}}^u_\alpha \) is the term system from above, while \(\mathbb L^u_\alpha \) refers to the actual constructible hierarchy. On the functorial side, each order embedding \(f:Y\rightarrow Z\) induces a function \({\textbf{L}}^u_f:{\textbf{L}}^u_Y\rightarrow {\textbf{L}}^u_Z\), which is defined by a straightforward recursion over terms. Another recursion yields support functions \({\text {supp}}^{{\textbf{L}}}_Y:\textbf{L}^u_Y\rightarrow [Y]^{<\omega }\) with

$$\begin{aligned}&{\text {supp}}^{{\textbf{L}}}_Y(w)=\emptyset \quad \text {for each constant symbol }w\in u,\qquad {\text {supp}}^{{\textbf{L}}}_Y(L^u_s)=\{s\},\\&{\text {supp}}^{\textbf{L}}_Y(\{x\in L^u_s\,|\,\varphi (x,a_0,\ldots ,a_{n-1})\})=\{s\}\cup \textstyle \bigcup _{i<n}{\text {supp}}^{\textbf{L}}_Y(a_i). \end{aligned}$$

In the last case, s is the biggest element of the support, due to the aforementioned condition \({\text {supp}}^{{\textbf{L}}}_Y(a_i)\subseteq _Y s\). Assuming that \(u=\{u_i\,|\,i\in \omega \}\) is countable with fixed enumeration, one can define coding and decoding maps

$$\begin{aligned} {\text {en}}^{{\textbf{L}}}_Y:[Y]^{<\omega }\times \omega \rightarrow \textbf{L}^u_Y\quad \text {and}\quad {\text {code}}^{\textbf{L}}_Y:[Y]^{<\omega }\times {\textbf{L}}^u_Y\rightarrow \omega \end{aligned}$$

that are natural in Y and satisfy \({\text {en}}^{\textbf{L}}_Y(y,{\text {code}}^{{\textbf{L}}}_Y(y,a))=a\) when \({\text {supp}}^{{\textbf{L}}}_Y(a)\subseteq y\) (for details see [11, Theorem 3.7]). Using these codes, one can define orders \(<^{{\textbf{L}}}_Y\) on the sets \({\textbf{L}}^u_Y\), which are compatible with the functions \({\textbf{L}}^u_f\). This turns \({\textbf{L}}^u\) into a dilator.

The previous constructions may not be too surprising, because there is little interaction between syntax and semantics. However, semantic aspects of the constructible hierarchy can also be recovered on the syntactic level, as we know from proof theoretic work of Jäger [28, 30] (cf. Schütte’s [52] work on ramified analysis). The relevant considerations are also functorial, as shown in [11, Section 3]: Consider the language that extends \(\{\in ,=\}\) by a constant symbol for each element of \({\textbf{L}}^u_Y\). By an \({\textbf{L}}^u_Y\)-formula we shall mean a formula in this language. The constant symbols that occur in an \({\textbf{L}}^u_Y\)-formula will also be called its parameters. Unless noted otherwise, we assume that \({\textbf{L}}^u_Y\)-formulas are closed. Let us assume \(\{0,1\}\subseteq u\subseteq {\textbf{L}}^u_Y\), in order to have indices for binary connectives. Then [11, Definition 3.12] associates each \({\textbf{L}}^u_Y\)-formula \(\varphi \) with a disjunction or conjunction

$$\begin{aligned} \varphi \simeq \textstyle \bigvee _{a\in \iota (\varphi )}\varphi _a\quad \text {or}\quad \varphi \simeq \textstyle \bigwedge _{a\in \iota (\varphi )}\varphi _a. \end{aligned}$$

Here \(\iota (\varphi )=\iota _Y(\varphi )\) is a subset of \(\textbf{L}^u_Y\) (which may be empty or infinite) and \(\varphi _a\) is an \({\textbf{L}}^u_Y\)-formula for each \(a\in \iota (\varphi )\). For full details we refer to the cited definition. As an example, we recall that \(\varphi =(b\in \{x\in L^u_s\,|\,\theta (x,c)\})\) yields

$$\begin{aligned} \varphi \simeq \textstyle \bigvee _{a\in \iota (\varphi )}\theta (a,c)\wedge a=b\quad \text {with}\quad \iota (\varphi )=\{a\in \textbf{L}^u_Y\,|\,{\text {supp}}^{{\textbf{L}}}_Y(a)\subseteq _Y s\}. \end{aligned}$$

If \(Y=\alpha \) is an ordinal, then we get a well founded relation by declaring that \(\varphi _a\) precedes \(\varphi \) for each \(a\in \iota (\varphi )\). In this case, our disjunctions and conjunctions yield an inductive definition of truth for \(\textbf{L}^u_\alpha \)-formulas. The latter coincides with satisfaction in the actual set \({\mathbb {L}}^u_\alpha \), under the aforementioned interpretation \(\llbracket \cdot \rrbracket :\textbf{L}^u_\alpha \rightarrow {\mathbb {L}}^u_\alpha \). Let us now state the crucial functorial property: For an embedding \(f:Y\rightarrow Z\), let \(\varphi [f]\) be the \({\textbf{L}}^u_Z\)-formula that results from a given \(\textbf{L}^u_Y\)-formula \(\varphi \) when each parameter a is replaced by \({\textbf{L}}^u_f(a)\). Then \(\varphi \) and \(\varphi [f]\) are both disjunctive or both conjunctive, and [11, Theorem 3.15] yields

$$\begin{aligned} \varphi _a[f]=\varphi [f]_{{\textbf{L}}^u_f(a)}\quad \text {when }a\in \iota _Y(\varphi )\text { or equivalently }\textbf{L}^u_f(a)\in \iota _Z(\varphi [f]). \end{aligned}$$

Using the constructions that we have just recalled, we will aim to build a hierarchy of \(\nu \) admissible sets above a transitive set u. The following assumptions will be discharged in the proof of our main theorem. We write \({\text {Ord}}\) for the class of ordinals.

Standing Assumption 5.1

Until the end of Sect. 8, we fix a transitive set u and a limit ordinal \(\nu \), both countable with fixed enumerations \(u=\{u_i\,|\,i\in {\mathbb {N}}\}\) and \(\nu =\{\nu _i\,|\,i\in {\mathbb {N}}\}\) (the enumerations need not be compatible with the order). The height \(o(u):=u\cap {\text {Ord}}\) is assumed to be a successor ordinal \(o(u)>1\). We also assume that \(\Pi ^1_1\)-comprehension holds.

The assumption that u and \(\nu \) are countable is essential for our approach. On the other hand, the assumption about the height of u has technical reasons and can later be discharged. It entails \(\{0,1\}\subseteq u\), which provides the aforementioned indices for binary connectives. Furthermore, it ensures that \(\alpha \) is a limit ordinal whenever the same holds for \(o(\mathbb L^u_\alpha )=o(u)+\alpha \) (otherwise we could have \(\alpha =0\)). In this situation, the set \({\mathbb {L}}^u_\alpha \ni u\) is admissible if it satisfies the following axioms.

Definition 5.2

Let \(\langle {\text {Ax}}_n\,|\,n\ge 1\rangle \) enumerate all instances of \(\Delta _0\)-collection, i. e., all sentences (in the signature \(\{\in ,=\}\) and without parameters) that have the form

$$\begin{aligned} \forall z_1,\ldots , z_k\forall v(\forall x\in v\exists y\,\theta (x,y,z_1,\ldots ,z_k)\rightarrow \exists w\forall x\in v\exists y\in w\,\theta (x,y,z_1,\ldots ,z_k)) \end{aligned}$$

for a \(\Delta _0\)-formula \(\theta \). Furthermore, let \({\text {Ax}}_0\) be the sentence \(\forall x\exists y.\,x\in y\).
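
As a concrete illustration (the enumeration of the \({\text {Ax}}_n\) is left unspecified), the instance of \(\Delta _0\)-collection with \(k=0\) and \(\theta (x,y):=x\in y\) reads

$$\begin{aligned} \forall v(\forall x\in v\exists y\,x\in y\rightarrow \exists w\forall x\in v\exists y\in w\,x\in y). \end{aligned}$$

Together with \({\text {Ax}}_0\), this instance ensures that for every set v of a model there is a set w that contains, for each \(x\in v\), some \(y\in w\) with \(x\in y\).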

Let us write \(Z^{<\omega }\) for the tree of finite sequences with entries in Z. In [11] we have built labelled trees \(S_Y\subseteq ({\textbf{L}}^u_Y)^{<\omega }\) for all linear orders Y, which represent attempted proofs of contradiction from the axioms \({\text {Ax}}_n\) and the rules associated with the infinite disjunctions \(\varphi \simeq \bigvee _{a\in \iota _Y(\varphi )}\varphi _a\) and conjunctions \(\varphi \simeq \bigwedge _{a\in \iota _Y(\varphi )}\varphi _a\) that were mentioned above. By a relativized ordinal analysis, we showed that \(S_Y\) cannot be well founded for all well orders Y, assuming a suitable well ordering principle. This allowed us to conclude that \(S_Y\) has an infinite branch for some well order Y. Analogous to other proofs of completeness, such a branch determined a model of the axioms \({\text {Ax}}_n\), i. e., a single admissible set. The following construction of \(\nu \) admissible sets is similar overall but different in one respect: we will obtain search trees \(S_Y^R\) that depend not only on an order Y but also on an embedding \(R:\nu \rightarrow Y\). The latter determines the heights of the admissible sets in our hierarchy. On an intuitive level, one may think of R as enumerating regular cardinals (cf. [7, Definition 4.1]).

To describe our search trees in detail, we fix some notation and terminology. Given a sequence \(\sigma =\langle \sigma _0,\ldots ,\sigma _{n-1}\rangle \in Z^{<\omega }\), write \({\text {len}}(\sigma ):=n\) for its length and put \(\sigma \!\restriction \!k:=\langle \sigma _0,\ldots ,\sigma _{k-1}\rangle \) for any \(k\le {\text {len}}(\sigma )\). For \(z\in Z\) and \(\sigma \in Z^{<\omega }\) as before, set \(\sigma ^\frown z:=\langle \sigma _0,\ldots ,\sigma _{n-1},z\rangle \). The support functions of \({\textbf{L}}^u\) induce functions

$$\begin{aligned}{} & {} {\text {supp}}^S_Y:({\textbf{L}}^u_Y)^{<\omega }\rightarrow [Y]^{<\omega },\\{} & {} {\text {supp}}^S_Y(\langle \sigma _0,\ldots ,\sigma _{n-1}\rangle ):=\textstyle \bigcup _{i<n}{\text {supp}}^{\textbf{L}}_Y(\sigma _i). \end{aligned}$$

Our search trees will be labelled by \({\textbf{L}}^u_Y\)-sequents, which are defined as finite sequences of \({\textbf{L}}^u_Y\)-formulas. Semantically, one should think of a sequent as the disjunction of its entries. As usual, we use the letters \(\Gamma \) and \(\Delta \) to denote sequents (mind the clash of notation with the Veblen hierarchy from Definition 4.5), and we write \(\varphi _0,\ldots ,\varphi _{n-1}\) and \(\Gamma ,\varphi \) at the place of \(\langle \varphi _0,\ldots ,\varphi _{n-1}\rangle \) and \(\Gamma ^\frown \varphi \). When the order and multiplicity of formulas do not matter, we treat sequents like finite sets and write, for example, \(\varphi \in \Gamma \) to express that \(\varphi \) is an entry of \(\Gamma \). The relativization of an \(\textbf{L}^u_Y\)-formula \(\varphi \) to an element \(a\in {\textbf{L}}^u_Y\) is the \({\textbf{L}}^u_Y\)-formula \(\varphi ^a\) that results from \(\varphi \) when we replace all occurrences \(\forall x.\,\psi \) and \(\exists x.\,\psi \) of unbounded quantifiers by bounded quantifiers \(\forall x\in a.\,\psi \) and \(\exists x\in a.\,\psi \), respectively. We do not relativize quantifiers that are already bounded, as this is superfluous when a is transitive and contains the original bounds. Finally, we can describe our search trees in detail:

Definition 5.3

Consider a linear order Y and a strictly increasing map \(R:\nu \rightarrow Y\). Based on the enumeration \(\nu =\{\nu _i\,|\,i\in {\mathbb {N}}\}\) from Assumption 5.1, we put

$$\begin{aligned} L(i):=L^u_{R(\nu _i)}\in {\textbf{L}}^u_Y. \end{aligned}$$

We define a tree \(S^R_Y\subseteq ({\textbf{L}}^u_Y)^{<\omega }\) and a labelling function \(l_Y:S^R_Y\rightarrow ``{\textbf{L}}^u_Y\text {-sequents''}\) by recursion over sequences in \(({\textbf{L}}^u_Y)^{<\omega }\). Concerning the base case, we declare that we have \(\langle \rangle \in S^R_Y\) and \(l_Y(\langle \rangle )=\langle \rangle \). In the recursion step, it suffices to consider the children of a previously constructed element \(\sigma \in S^R_Y\), as we aim to build a tree. First assume \({\text {len}}(\sigma )=2k\) is even. Assuming that k codes the pair \(\langle n,i\rangle \), we declare

$$\begin{aligned} \sigma ^\frown a\in S^R_Y:\Leftrightarrow \,a=L(i)\quad \text {and}\quad l_Y(\sigma ^\frown L(i)):=l_Y(\sigma ),\lnot {\text {Ax}}_n^{L(i)}. \end{aligned}$$

Here \(a=L(i)\) asserts equality as terms, and the superscript refers to relativization. Now assume that \({\text {len}}(\sigma )=2k+1\) is odd and that k codes the triple \(\langle l,m,n\rangle \). We assume that our coding ensures \(l,m,n\le k\). This entails \(l<{\text {len}}(l_Y(\sigma ))\), as we append a formula at each even stage and do not delete any formulas in the following. Let \(\varphi \) be the l-th formula in \(l_Y(\sigma )\). If \(\varphi \simeq \bigwedge _{a\in \iota _Y(\varphi )}\varphi _a\) is conjunctive, we define

$$\begin{aligned} \sigma ^\frown a\in S^R_Y:\Leftrightarrow \,a\in \iota _Y(\varphi )\quad \text {and}\quad l_Y(\sigma ^\frown a):=l_Y(\sigma ),\varphi _a. \end{aligned}$$

If \(\varphi \simeq \bigvee _{a\in \iota _Y(\varphi )}\varphi _a\) is disjunctive, we put

$$\begin{aligned} b:={\text {en}}^{\textbf{L}}_Y({\text {supp}}^S_Y(\sigma \!\restriction \!m),n)\in {\textbf{L}}^u_Y, \end{aligned}$$

using the function \({\text {en}}_Y^{\textbf{L}}:[Y]^{<\omega }\times \omega \rightarrow {\textbf{L}}^u_Y\) mentioned above (the idea is to generate all potential witnesses b in a functorial way). We then declare

$$\begin{aligned} \sigma ^\frown a\in S^R_Y:\Leftrightarrow \,a=0\quad \text {and}\quad l_Y(\sigma ^\frown 0):={\left\{ \begin{array}{ll} l_Y(\sigma ),\varphi _b &{} \text {if }b\in \iota _Y(\varphi ),\\ l_Y(\sigma ) &{} \text {otherwise}, \end{array}\right. } \end{aligned}$$

for which we recall that \(0\in u\subseteq {\textbf{L}}^u_Y\) holds by Assumption 5.1.

For \(f:{\mathbb {N}}\rightarrow {\textbf{L}}^u_Y\) we write \(f\!\restriction \!k:=\langle f(0),\ldots ,f(k-1)\rangle \) and put

$$\begin{aligned} {\text {supp}}^\infty _Y(f):=\textstyle \bigcup _{k\in \mathbb N}{\text {supp}}^S_Y(f\!\restriction \!k)=\bigcup _{k\in \mathbb N}{\text {supp}}^{{\textbf{L}}}_Y(f(k))\subseteq Y. \end{aligned}$$

Recall that f is a branch of \(S^R_Y\) if \(f\!\restriction \!k\in S^R_Y\) holds for all \(k\in {\mathbb {N}}\). Given \(\alpha <\nu \), pick an \(i\in {\mathbb {N}}\) with \(\alpha =\nu _i\), and let k code a pair \(\langle n,i\rangle \) for some \(n\in {\mathbb {N}}\). Assuming that f is a branch, we must have \(f(2k)=L^u_{R(\alpha )}\), by construction of the search tree. By definition we have \({\text {supp}}^{\textbf{L}}_Y(L^u_{R(\alpha )})=\{R(\alpha )\}\), so that we get

$$\begin{aligned} R(\alpha )\in {\text {supp}}^\infty _Y(f)\quad \text {for all } \alpha <\nu . \end{aligned}$$

If Y is well founded, then so is its suborder \({\text {supp}}^\infty _Y(f)\). In the base theory \(\mathsf {ATR_0^{set}}\) from Theorem 1.6, we can use axiom beta to get a transitive collapse, i. e., an order preserving map from \({\text {supp}}^\infty _Y(f)\) onto an ordinal. This yields the desired admissibles:

Theorem 5.4

Assume that f is a branch in \(S^R_Y\) for a well order Y and a strictly increasing map \(R:\nu \rightarrow Y\). Let \(c:{\text {supp}}^\infty _Y(f)\rightarrow {\text {Ord}}\) be the transitive collapse. Then \({\mathbb {L}}^u_{c(R(\alpha ))}\ni u\) is an admissible set for every \(\alpha <\nu \).

Before we give a proof, we show that our construction of search trees is functorial. This fact will facilitate the proof of our theorem, but its full significance will only become apparent in the next section. Since we work with a functorial version of the constructible hierarchy, an embedding \(g:Y\rightarrow Z\) yields a function \({\textbf{L}}^u_g:{\textbf{L}}^u_Y\rightarrow {\textbf{L}}^u_Z\).

Definition 5.5

Consider an embedding \(g:Y\rightarrow Z\) of linear orders. We define

$$\begin{aligned}{} & {} S_g:({\textbf{L}}^u_Y)^{<\omega }\rightarrow ({\textbf{L}}^u_Z)^{<\omega },\\{} & {} S_g(\langle \sigma _0,\ldots ,\sigma _{n-1}\rangle ):=\langle \textbf{L}^u_g(\sigma _0),\ldots ,{\textbf{L}}^u_g(\sigma _{n-1})\rangle . \end{aligned}$$

Under the assumptions of the following proposition, we also write \(S_g:S^P_Y\rightarrow S^R_Z\) for the restriction with the indicated (co)domain. Furthermore, let us define \(<^S_Y\) as the Kleene-Brouwer order on \(({\textbf{L}}^u_Y)^{<\omega }\) (also called Lusin-Sierpiński order), which is generated by the clauses \(\sigma ^\frown a<^S_Y\sigma \) and \(\sigma ^\frown a<^S_Y\sigma ^\frown b\) for \(a<^{{\textbf{L}}}_Y b\). We also write \(<^S_Y\) for the restriction of this relation to a search tree \(S^P_Y\).
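
The Kleene-Brouwer order admits a direct description: \(\sigma \) precedes \(\tau \) if \(\sigma \) properly extends \(\tau \), or if \(\sigma \) is smaller at the first position where the two sequences differ. A minimal Python sketch (hypothetical, with an arbitrary strict order `lt` on the labels in place of \(<^{{\textbf{L}}}_Y\)):

```python
def kb_less(sigma, tau, lt):
    """Kleene-Brouwer comparison of finite sequences: sigma < tau iff
    sigma properly extends tau, or sigma is lt-smaller at the first
    position where the sequences differ."""
    for a, b in zip(sigma, tau):
        if a != b:
            return lt(a, b)
    # no difference on the common part: the longer sequence is smaller
    return len(sigma) > len(tau)
```

For instance, `kb_less((0, 1), (0,), lt)` holds via the extension clause, matching the generating clause \(\sigma ^\frown a<^S_Y\sigma \).
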

Due to the corresponding properties of \({\textbf{L}}^u\), it is immediate that the definition turns \(Y\mapsto (\textbf{L}^u_Y)^{<\omega }\) into a predilator. In particular, we have the support property

$$\begin{aligned} \{S_g(\sigma )\,|\,\sigma \in ({\textbf{L}}^u_Y)^{<\omega }\}=\{\tau \in ({\textbf{L}}^u_Z)^{<\omega }\,|\,{\text {supp}}^S_Z(\tau )\subseteq {\text {rng}}(g)\}. \end{aligned}$$

Under the assumptions of the following proposition, this equation remains valid when we replace \(({\textbf{L}}^u_Y)^{<\omega }\) and \(({\textbf{L}}^u_Z)^{<\omega }\) by \(S^P_Y\) and \(S^R_Z\), respectively.

Proposition 5.6

Consider linear orders Y and Z with embeddings \(P:\nu \rightarrow Y\) and \(R:\nu \rightarrow Z\). If the embedding \(g:Y\rightarrow Z\) satisfies \(g\circ P=R\), then

$$\begin{aligned} \sigma \in S^P_Y\quad \Leftrightarrow \quad S_g(\sigma )\in S^R_Z \end{aligned}$$

holds for all \(\sigma \in ({\textbf{L}}^u_Y)^{<\omega }\).

Proof

Recall that we have a map \(\varphi \mapsto \varphi [g]\) from \(\textbf{L}^u_Y\)-formulas to \({\textbf{L}}^u_Z\)-formulas. We extend this map to sequents, by setting

$$\begin{aligned} \Gamma [g]:=\varphi _0[g],\ldots ,\varphi _{n-1}[g]\quad \text {for}\quad \Gamma =\varphi _0,\ldots ,\varphi _{n-1}. \end{aligned}$$

By induction over the sequence \(\sigma \), we prove the equivalence from the proposition and simultaneously

$$\begin{aligned} l_Y(\sigma )[g]=l_Z(S_g(\sigma ))\quad \text {when}\quad \sigma \in S^P_Y. \end{aligned}$$

The base case with \(\sigma =\langle \rangle \) is immediate. In the induction step, we may assume that we have \(\sigma \in S^P_Y\) or equivalently \(S_g(\sigma )\in S^R_Z\), as we are concerned with trees. First assume that \({\text {len}}(\sigma )={\text {len}}(S_g(\sigma ))=2k\) is even, where k codes \(\langle n,i\rangle \). Refining the notation from Definition 5.3, we write \(L[Y](i):=L^u_{P(\nu _i)}\) and \(L[Z](i):=L^u_{R(\nu _i)}\). As [11, Definition 3.5] yields \(\textbf{L}^u_g(L^u_a)=L^u_{g(a)}\), we get

$$\begin{aligned} {\textbf{L}}^u_g(L[Y](i))=L^u_{g\circ P(\nu _i)}=L^u_{R(\nu _i)}=L[Z](i). \end{aligned}$$

Since \({\textbf{L}}^u_g\) is injective on terms (recall that it respects \(<^{{\textbf{L}}}\)), we can conclude

$$\begin{aligned} \sigma ^\frown a\in S^P_Y\quad \Leftrightarrow \quad a=L[Y](i)\quad&\Leftrightarrow \quad {\textbf{L}}^u_g(a)=L[Z](i)\\&\Leftrightarrow \quad S_g(\sigma ^\frown a)=S_g(\sigma )^\frown {\textbf{L}}^u_g(a)\in S^R_Z. \end{aligned}$$

In order to see that the desired relation between the sequent labels is preserved, it suffices to observe that we get \({\text {Ax}}_n^{L[Y](i)}[g]={\text {Ax}}_n^{L[Z](i)}\) from the above (since the operation \(\varphi \mapsto \varphi [g]\) replaces any parameter a by \({\textbf{L}}^u_g(a)\)). For the case in which the sequences \(\sigma \) and \(S_g(\sigma )\) have odd length \(2k+1\), we refer to the detailed argument in the proof of [11, Proposition 4.8] (where the tuple \(\langle l,m,n\rangle \) with code k is written as \(\langle \pi _0(n),\pi _1(n),\pi _2(n)\rangle \) with code n). \(\square \)

Let us now establish the theorem that was stated above.

Proof of Theorem 5.4

As preparation, we provide a reduction to the case where the inclusion \({\text {supp}}^\infty _Y(f)\subseteq Y\) is an equality. Let \(g:\kappa \rightarrow Y\) be the increasing enumeration of \({\text {supp}}^\infty _Y(f)\), so that we have \(c(g(\gamma ))=\gamma \) for \(\gamma <\kappa \). Define \(P:\nu \rightarrow \kappa \) by stipulating \(g\circ P=R\), which yields \(P(\alpha )=c(R(\alpha ))\). For each \(k\in \mathbb N\) we have

$$\begin{aligned} {\text {supp}}^{{\textbf{L}}}_Y(f(k))\subseteq {\text {supp}}^\infty _Y(f)={\text {rng}}(g). \end{aligned}$$

By the support property for \({\textbf{L}}^u\) (see [11, Proposition 3.6]), it follows that f(k) lies in the range of \({\textbf{L}}^u_g:\textbf{L}^u_\kappa \rightarrow {\textbf{L}}^u_Y\). We thus get an \(h:{\mathbb {N}}\rightarrow \textbf{L}^u_\kappa \) with \({\textbf{L}}^u_g\circ h=f\). Since

$$\begin{aligned} S_g(h\!\restriction \!k)=\langle {\textbf{L}}^u_g\circ h(0),\ldots ,{\textbf{L}}^u_g\circ h(k-1)\rangle =f\!\restriction \!k\in S^R_Y \end{aligned}$$

holds for all \(k\in {\mathbb {N}}\), we can use Proposition 5.6 to conclude that h is a branch of \(S^P_\kappa \). By the naturality of supports for \(\textbf{L}^u\) (see again [11, Proposition 3.6]), we get

$$\begin{aligned} \{g(\gamma )\,|\,\gamma \in {\text {supp}}^\infty _\kappa (h)\}&=\textstyle \bigcup _{k\in {\mathbb {N}}}[g]^{<\omega }({\text {supp}}^{{\textbf{L}}}_\kappa (h(k)))\\&=\textstyle \bigcup _{k\in {\mathbb {N}}}{\text {supp}}^{{\textbf{L}}}_Y({\textbf{L}}^u_g\circ h(k))={\text {supp}}^\infty _Y(f). \end{aligned}$$

This shows \({\text {supp}}^\infty _\kappa (h)=\kappa \), which was the purpose of our preparatory construction. To formulate the central claim of this proof, we say that an \({\textbf{L}}^u_\kappa \)-formula \(\varphi \) occurs on h if we have \(\varphi \in l_\kappa (h\!\restriction \!k)\) for some \(k\in {\mathbb {N}}\). Let us also recall that we can evaluate \({\textbf{L}}^u_\kappa \)-formulas in \({\mathbb {L}}^u_\kappa \), via the aforementioned interpretation \(\llbracket \cdot \rrbracket :\textbf{L}^u_\kappa \rightarrow {\mathbb {L}}^u_\kappa \). Crucially, we shall show that \({\mathbb {L}}^u_\kappa \) satisfies \(\lnot \varphi \) whenever \(\varphi \) occurs on h. According to [11, Theorem 3.14], this reduces to the following claims:

  (i) if \(\varphi \simeq \bigwedge _{a\in \iota _\kappa (\varphi )}\varphi _a\) occurs on h, then so does \(\varphi _a\) for some \(a\in \iota _\kappa (\varphi )\),

  (ii) if \(\varphi \simeq \bigvee _{a\in \iota _\kappa (\varphi )}\varphi _a\) occurs on h, then so does \(\varphi _a\) for all \(a\in \iota _\kappa (\varphi )\).

Indeed, we get a well founded relation on \(\textbf{L}^u_\kappa \)-formulas by declaring that each \(\varphi _a\) precedes \(\varphi \), as mentioned above. Given (i) and (ii), transfinite induction over this relation shows that each \(\varphi \) on h must fail in \({\mathbb {L}}^u_\kappa \). The proof of [11, Theorem 3.14] shows that this inductive argument goes through in our base theory. Before we establish (i) and (ii), let us explain how to derive the theorem: Given any \(\alpha <\nu \) and \(n\in {\mathbb {N}}\), let k be the code of a pair \(\langle n,i\rangle \) with \(\nu _i=\alpha \). By construction of our search trees, the formula \(\lnot {\text {Ax}}_n^{L(i)}\) occurs in \(l_\kappa (h\!\restriction \!(2k+1))\) and hence on h. In view of [11, Definition 3.2] we have

$$\begin{aligned} \llbracket L(i)\rrbracket =\llbracket L^u_{P(\alpha )}\rrbracket ={\mathbb {L}}^u_{P(\alpha )}. \end{aligned}$$

Hence our central claim entails that \({\mathbb {L}}^u_\kappa \) satisfies the relativization of \({\text {Ax}}_n\) to \({\mathbb {L}}^u_{P(\alpha )}\). But this simply means that \({\mathbb {L}}^u_{P(\alpha )}\) satisfies \({\text {Ax}}_n\). It follows that \({\mathbb {L}}^u_{P(\alpha )}={\mathbb {L}}^u_{c(R(\alpha ))}\) is admissible (cf. the paragraph before Definition 5.2), as required by our theorem. Claims (i) and (ii) are established as in the proof of [11, Theorem 4.6]. However, the fact that we have \({\text {supp}}^\infty _\kappa (h)=\kappa \) does simplify matters. We provide details for the more difficult claim (ii): Assume that the disjunctive formula \(\varphi \) occurs on h, say as the j-th formula in \(l_\kappa (h\!\restriction \!m_0)\). Given an arbitrary \(a\in \iota _\kappa (\varphi )\), we observe

$$\begin{aligned} {\text {supp}}^{\textbf{L}}_\kappa (a)\subseteq \kappa ={\text {supp}}^\infty _\kappa (h)=\textstyle \bigcup _{k\in \mathbb N}{\text {supp}}^S_\kappa (h\!\restriction \!k). \end{aligned}$$

Since the last union is increasing, we may pick a number \(m\ge m_0\) such that the finite set \({\text {supp}}^{{\textbf{L}}}_\kappa (a)\) is contained in \({\text {supp}}^S_\kappa (h\!\restriction \!m)\). We then have

$$\begin{aligned} a={\text {en}}^{\textbf{L}}_\kappa ({\text {supp}}^S_\kappa (h\!\restriction \!m),n)\quad \text {for}\quad n:={\text {code}}^{\textbf{L}}_\kappa ({\text {supp}}^S_\kappa (h\!\restriction \!m),a), \end{aligned}$$

by [11, Theorem 3.7] or the discussion above. Let us now define k as the code of the triple \(\langle j,m,n\rangle \). As in Definition 5.3, we may assume that our coding of tuples ensures \(m\le k\) and hence \(m_0<2k+1\). When we build our search trees, we extend sequents at the end, but we never delete or permute formulas. Thus \(\varphi \) is still the j-th formula in \(l_\kappa (h\!\restriction \!(2k+1))\). By construction we get

$$\begin{aligned} l_\kappa (h\!\restriction \!(2k+2))=l_\kappa (h\!\restriction \!(2k+1)),\varphi _a. \end{aligned}$$

Hence \(\varphi _a\) occurs on h, as desired. \(\square \)

Using methods from ordinal analysis, we will show that the well ordering principle from Definition 1.4 entails that \(S^R_Y\) cannot be well founded for every well order Y, i. e., that some search tree must have a branch. Once this is known, Theorem 5.4 will yield a hierarchy of \(\nu \) admissible sets, as needed for the crucial direction of Theorem 1.6. To conclude, we record a fact that will be needed later (cf. [11, Corollary 4.10]):

Corollary 5.7

Consider a linear order Z and an embedding \(R:\nu \rightarrow Z\). We have

$$\begin{aligned} {\text {supp}}^{\textbf{L}}_Z(b)\subseteq {\text {supp}}^S_Z(\sigma )\cup \{R(\alpha )\,|\,\alpha <\nu \} \end{aligned}$$

for any node \(\sigma \in S^R_Z\) and any parameter b that occurs in some formula of \(l_Z(\sigma )\).

Proof

Let Y be the set on the right of the desired inclusion, considered as a suborder of Z. Write \(\iota :Y\hookrightarrow Z\) for the inclusion, and define \(P:\nu \rightarrow Y\) by \(\iota \circ P=R\). In view of \({\text {supp}}^S_Z(\sigma )\subseteq {\text {rng}}(\iota )\) we obtain \(\sigma =S_\iota (\rho )\) for some node \(\rho \in S^P_Y\), due to Proposition 5.6. By the proof of the latter, we have \(l_Y(\rho )[\iota ]=l_Z(\sigma )\). We can thus write \(b={\textbf{L}}^u_\iota (a)\) with \(a\in {\textbf{L}}^u_Y\), so that

$$\begin{aligned} {\text {supp}}^{{\textbf{L}}}_Z(b)={{\text {supp}}^{{\textbf{L}}}_Z}\circ \textbf{L}^u_\iota (a)=[\iota ]^{<\omega }\circ {\text {supp}}^{\textbf{L}}_Y(a)\subseteq {\text {rng}}(\iota )=Y \end{aligned}$$

follows by the naturality of supports. \(\square \)

6 From search tree to collapsing functions

In this section, we apply the well ordering principle from Definition 1.4 to the search trees \(S^R_Y\) that were constructed in Definition 5.3. The result is an order \({\textbf{O}}\), which is quite close to the relativized ordinal notation system in [46, Definition 6.4] (cf. also [6] and [47, Section 12.2]). We will later use \({\textbf{O}}\) as a basis for the ordinal analysis that proves the implication from (iii) to (iv) in Theorem 1.6.

Recall the dilator \(\Gamma \) and the functions \(\gamma _X:X\rightarrow \Gamma (X)\) from Sect. 4. The desired order \({\textbf{O}}\) will be constructed as part of a system of orders and embeddings, which can be depicted as follows (where a hooked arrow indicates that the range is an initial segment of the codomain, while \(\mathrel {\twoheadrightarrow _p}\) refers to a partial surjective function):

[Diagram: the orders \({\textbf{O}}=\Gamma ({\textbf{K}})\), \({\textbf{K}}\) and \({\textbf{X}}\), with hooked arrows \(I:{\textbf{X}}\hookrightarrow {\textbf{K}}\) and \(\Gamma (I):\Gamma ({\textbf{X}})\hookrightarrow {\textbf{O}}\) and the partial collapsing functions \({\textbf{O}}\mathrel {\twoheadrightarrow _p}{\textbf{X}}\)]

Before we give a formal construction of these objects, let us explain their intuitive meaning. In view of Sect. 4, the order \({\textbf{O}}=\Gamma (\textbf{K})\) is closed under the binary Veblen function and includes the first \({\textbf{K}}\) strongly critical ordinals, which are represented by the elements \(\gamma _{{\textbf{K}}}(z)\in {\textbf{O}}\) with \(z\in {\textbf{K}}\) (we choose \({\textbf{K}}\) for ‘kritisch’). By composing all horizontal arrows, we obtain \(\nu \)-many partial but order preserving ‘collapsing functions’ from \({\textbf{O}}\) to itself. The values of these functions are represented by the elements of a set \({\textbf{X}}\). We have a map I that realizes this set as an initial segment of \({\textbf{K}}\). Since \(\Gamma \) is a functor and normal, we also obtain an identification \(\Gamma (I)\) of the set \(\Gamma ({\textbf{X}})\) with an initial segment of \({\textbf{O}}\) (see Corollary 4.11). This means, first, that the collapsing values form an initial segment of the strongly critical ordinals. Moreover, it means that the ordinals generated from the collapsing values form an initial segment of the full system \({\textbf{O}}\). Both properties are typical for ordinal notation systems (see again the examples in [6, 47]). It is also typical that there are strongly critical ordinals that lie above all collapsing values. In our case, these ‘large’ ordinals correspond to the nodes of a certain search tree \(S^{{\textbf{R}}}_{\Gamma (\textbf{X})}\) (cf. the elements \({\mathfrak {E}}_\sigma \) in [11, Definition 5.2]). For our ordinal analysis, it will be crucial that this search tree is built over the lower part \(\Gamma ({\textbf{X}})\) of the order \({\textbf{O}}\), with respect to a map \({\textbf{R}}:\nu \rightarrow \Gamma ({\textbf{X}})\) that has a meaningful connection to the collapsing functions. 
Concerning the latter, we will obtain \({\textbf{R}}(\alpha )=\gamma _{\textbf{X}}\circ \psi ^{{\textbf{X}}}(\alpha +1,0)\) for \(0\in \Gamma (\textbf{K})={\textbf{O}}\), which evokes \(\psi _{\alpha +1}0=\Omega _{\alpha +1}\in R\) from [6, Lemma 1.7] and [7, Definition 4.1].

We would like to define \(\psi ^{{\textbf{X}}}:\nu \times \textbf{O}\mathrel {\twoheadrightarrow _p}{\textbf{X}}\) as the partial inverse of a function \(\pi \) as in Definition 1.4. Before we can apply the latter, however, we must overcome a significant obstacle. The issue is that Definition 1.4 requires a dilator as input, while the construction of search trees in Definition 5.3 does not provide one, at least not directly: the tree \(S^R_Y\) depends not only on the order Y but also on a given embedding \(R:\nu \rightarrow Y\). This issue will occupy us for most of the present section, and its resolution may at times appear technical. At the same time, we believe that the issue itself is not technical but has real mathematical substance. In particular, it distinguishes the construction of a single admissible set in [11]—where no similar issue arose—from the construction of an infinite hierarchy of admissible sets.

In order to resolve the issue that was mentioned in the previous paragraph, we will precompose the construction of search trees with the order transformation

$$\begin{aligned} X\mapsto J(X):=\nu \times \Gamma (X). \end{aligned}$$

Recall that products were discussed in the paragraph before Definition 1.4, which also explains \(J(f):=\nu \times \Gamma (f)\) for an order embedding f. It is straightforward to check that we get a dilator if we provide supports by

$$\begin{aligned} {\text {supp}}^J_X:J(X)\rightarrow [X]^{<\omega }\quad \text {with}\quad {\text {supp}}^J_X(\alpha ,\sigma ):={\text {supp}}^{\Gamma }_X(\sigma ). \end{aligned}$$
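
To indicate why the support condition holds, we sketch the argument, which only uses the corresponding property of \(\Gamma \): given an embedding \(f:X\rightarrow Y\) and an element \((\alpha ,\sigma )\in J(Y)\) with \({\text {supp}}^J_Y(\alpha ,\sigma )={\text {supp}}^{\Gamma }_Y(\sigma )\subseteq {\text {rng}}(f)\), the support property of \(\Gamma \) yields \(\sigma =\Gamma (f)(\sigma _0)\) for some \(\sigma _0\in \Gamma (X)\), so that we get

$$\begin{aligned} (\alpha ,\sigma )=(\alpha ,\Gamma (f)(\sigma _0))=(\nu \times \Gamma (f))(\alpha ,\sigma _0)=J(f)(\alpha ,\sigma _0)\in {\text {rng}}(J(f)). \end{aligned}$$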

As \(\nu \) is a limit by Assumption 5.1, we may consider the embeddings

$$\begin{aligned} j[X]:\nu \rightarrow J(X)\quad \text {with}\quad j[X](\alpha ):=(\alpha +1,0). \end{aligned}$$

These are natural in the sense that \(J(f)\circ j[X]=j[Y]\) holds for any embedding \(f:X\rightarrow Y\): since Definition 4.7 yields \(\Gamma (f)(0)=0\), we get \(J(f)\circ j[X](\alpha )=(\alpha +1,\Gamma (f)(0))=(\alpha +1,0)=j[Y](\alpha )\). We can now describe the preprocessed search trees that were mentioned above:

Definition 6.1

Consider the order transformation

$$\begin{aligned} X\mapsto {\textbf{S}}_0(X):=S^{j[X]}_{J(X)}, \end{aligned}$$

where the definiens refers to Definitions 5.3 and 5.5. Invoking the latter in conjunction with Proposition 5.6, we map each embedding \(f:X\rightarrow Y\) to the embedding

$$\begin{aligned} {\textbf{S}}_0(f):=S_{J(f)}:{\textbf{S}}_0(X)\rightarrow {\textbf{S}}_0(Y). \end{aligned}$$

Note that the cited proposition can be applied because we have \(J(f)\circ j[X]=j[Y]\), as seen above. Finally, we define functions \({\text {supp}}^0_X:{\textbf{S}}_0(X)\rightarrow [X]^{<\omega }\) by setting

$$\begin{aligned} {\text {supp}}^0_X(\sigma ):=\bigcup \{{\text {supp}}^{J}_X(\rho )\,|\,\rho \in {\text {supp}}^S_{J(X)}(\sigma )\}. \end{aligned}$$

This relies on the definition of \({\text {supp}}^S\) in the paragraph before Definition 5.3.

As we had hoped, our preprocessed search trees form a dilator, at least when statement (iv) from Theorem 1.6 is violated.

Proposition 6.2

The constructions from Definition 6.1 yield a predilator \({\textbf{S}}_0\). The latter is a dilator if there is no sequence of admissible sets \(\textsf{Ad}_\alpha \) with \(u\in \textsf{Ad}_\alpha \in \textsf{Ad}_\beta \) for \(\alpha<\beta <\nu \) (with u and \(\nu \) as fixed in Assumption 5.1).

Proof

Let us observe that the first map in

$$\begin{aligned} X\mapsto ({\textbf{L}}^u_{J(X)})^{<\omega }\supseteq S^{j[X]}_{J(X)}={\textbf{S}}_0(X) \end{aligned}$$

is the composition of predilators and hence a predilator itself, by the paragraph before Proposition 5.6. Using the latter, we can conclude that \({\textbf{S}}_0\) is also a predilator. To provide details for the crucial step, we show that the support property

$$\begin{aligned} {\text {supp}}^0_Y(\sigma )\subseteq {\text {rng}}(f)\quad \Rightarrow \quad \sigma \in {\text {rng}}(\textbf{S}_0(f)) \end{aligned}$$

holds for any embedding \(f:X\rightarrow Y\) and any \(\sigma \in \textbf{S}_0(Y)\). Given the antecedent of our implication, the definition of \({\text {supp}}^0_Y\) and the support property for J yield

$$\begin{aligned} {\text {supp}}^S_{J(Y)}(\sigma )\subseteq {\text {rng}}(J(f)). \end{aligned}$$

This allows us to write

$$\begin{aligned} \sigma =S_{J(f)}(\sigma _0)\quad \text {for some}\quad \sigma _0\in ({\textbf{L}}^u_{J(X)})^{<\omega }, \end{aligned}$$

by the paragraph before Proposition 5.6. Now the latter ensures that \(\sigma \in {\textbf{S}}_0(Y)\) entails \(\sigma _0\in {\textbf{S}}_0(X)\) and hence \(\sigma ={\textbf{S}}_0(f)(\sigma _0)\in {\text {rng}}({\textbf{S}}_0(f))\), as desired. Under the assumption from the proposition, we now show that \({\textbf{S}}_0\) is a dilator. Given a well order X, we must establish that \({\textbf{S}}_0(X)\) is well founded. As \(\Pi ^1_1\)-comprehension is available by Assumption 5.1, we can infer that \(\Gamma (X)\) and J(X) are well orders, by Proposition 4.8 or directly by [45, Theorem 1.4]. According to [11, Lemma 3.10], it follows that \(\textbf{L}^u_{J(X)}\) is well founded (see the beginning of Sect. 5 and compare with the usual constructible hierarchy). Hence \({\textbf{S}}_0(X)\) is well founded (with respect to the Kleene-Brouwer order from Definition 5.5) unless it has a branch. In the latter case, Theorem 5.4 would yield a hierarchy of \(\nu \) admissible sets above u, against the assumption of the present proposition. \(\square \)

Following the informal explanation at the beginning of this section, we now add space for collapsing values below the elements of our search tree. Sums of linear orders and embeddings are defined as in the proof of Proposition 4.3. Recall that elements of \(Z_0+Z_1\) are written as \(z_0\) and \(Z_0+z_1\) with \(z_i\in Z_i\).

Definition 6.3

For each linear order X and each embedding \(f:X\rightarrow Y\), we put \({\textbf{S}}(X):=X+{\textbf{S}}_0(X)\) and define \({\textbf{S}}(f):\textbf{S}(X)\rightarrow {\textbf{S}}(Y)\) by \({\textbf{S}}(f):=f+{\textbf{S}}_0(f)\). By

$$\begin{aligned} {\text {supp}}^{{\textbf{S}}}_X(x):=\{x\}\text { for } x\in X\quad \text {and}\quad {\text {supp}}^{{\textbf{S}}}_X(X+\sigma ):={\text {supp}}^0_X(\sigma )\text { for } \sigma \in {\textbf{S}}_0(X) \end{aligned}$$

we define a family of functions \({\text {supp}}^{{\textbf{S}}}_X:\textbf{S}(X)\rightarrow [X]^{<\omega }\).

The crucial direction (ii)\(\Rightarrow \)(iv) of Theorem 1.6 asserts that the well foundedness of \(\nu \)-fixed points yields a hierarchy of admissible sets. To prove this by contradiction, we will assume that statement (iv) fails. In view of Proposition 6.2, this will have the effect that \({\textbf{S}}_0\) is a dilator. It is easy to conclude that \({\textbf{S}}\) and \(\Gamma \circ {\textbf{S}}\) are dilators as well (recall how composition is defined in the paragraph before Proposition 4.2). We bring in statement (ii) of Theorem 1.6 in the form of the following assumption.

Standing Assumption 6.4

Until the end of Sect. 8, we assume that \(\Gamma \circ {\textbf{S}}\) is a dilator. Furthermore, we assume that we have a fixed well order \({\textbf{Y}}\) and a \(\nu \)-collapse

$$\begin{aligned} \pi _{{\textbf{Y}}}:{\textbf{Y}}\rightarrow \nu \times (\Gamma \circ {\textbf{S}})(\textbf{Y}) \end{aligned}$$

in the sense of Definition 1.4 (with \(\nu \) and the suppressed u as in Assumption 5.1).

The inverse of \(\pi _{{\textbf{Y}}}\) is a partial embedding

$$\begin{aligned} \nu \times (\Gamma \circ {\textbf{S}})(\textbf{Y})=\nu \times \Gamma \left( {\textbf{Y}}+S^{j[{\textbf{Y}}]}_{J(\textbf{Y})}\right) \mathrel {\twoheadrightarrow _p}{\textbf{Y}}. \end{aligned}$$

This looks a lot like the function

$$\begin{aligned} \psi ^{{\textbf{X}}}:\nu \times \Gamma \left( {\textbf{X}}+S^{\textbf{R}}_{\Gamma ({\textbf{X}})}\right) \mathrel {\twoheadrightarrow _p}{\textbf{X}} \end{aligned}$$

that was promised at the beginning of this section. However, one important point remains to be improved: the collapse \(\psi ^{{\textbf{X}}}\) and the embedding \({\textbf{R}}:\nu \rightarrow \Gamma ({\textbf{X}})\) were supposed to be connected in a meaningful way, while the function \(j[{\textbf{Y}}]:\nu \rightarrow J({\textbf{Y}})\) and the order \(J({\textbf{Y}})\) appear rather ad hoc and unrelated to \(\pi _{{\textbf{Y}}}\). Perhaps surprisingly, we can use \(\pi _{{\textbf{Y}}}\) to ‘infuse meaning’ ex post. The following is a preparation.

Lemma 6.5

We have \((\alpha ,0)\in {\text {rng}}(\pi _{{\textbf{Y}}})\) for all \(\alpha <\nu \).

Proof

By Definition 4.7 we have \({\text {supp}}^{\Gamma }_{{\textbf{S}}({\textbf{Y}})}(0)=\emptyset \), which entails

$$\begin{aligned} {\text {supp}}^{\Gamma \circ {\textbf{S}}}_{{\textbf{Y}}}(0)=\bigcup \{{\text {supp}}^{\textbf{S}}_{{\textbf{Y}}}(\rho )\,|\,\rho \in {\text {supp}}^{\Gamma }_{{\textbf{S}}(\textbf{Y})}(0)\}=\emptyset . \end{aligned}$$

In the notation from Definition 1.4, we get

$$\begin{aligned} G_\alpha (0)=\bigcup \{G^{\Gamma \circ \textbf{S}}_\alpha (s)\,|\,s\in {\text {supp}}^{\Gamma \circ {\textbf{S}}}_{\textbf{Y}}(0)\}=\emptyset \subseteq _{\Gamma \circ {\textbf{S}}({\textbf{Y}})}0. \end{aligned}$$

The claim follows by Definition 1.4. \(\square \)

Recall that the normal dilator \(\Gamma \) comes with an embedding \(\gamma _{{\textbf{Y}}}:{\textbf{Y}}\rightarrow \Gamma ({\textbf{Y}})\), which is given by Definition 4.9.

Definition 6.6

In view of the previous lemma, let the embedding \(\textbf{P}_0:\nu \rightarrow {\textbf{Y}}\) be determined by \(\pi _{{\textbf{Y}}}\circ \textbf{P}_0(\alpha )=(\alpha +1,0)\). We also put \({\textbf{P}}:=\gamma _{\textbf{Y}}\circ {\textbf{P}}_0:\nu \rightarrow \Gamma ({\textbf{Y}})\).

Given \(s\in \Gamma ({\textbf{Y}})\), let \(y\in {\textbf{Y}}\) be the maximal element of \({\text {supp}}^\Gamma _{{\textbf{Y}}}(s)\cup \{{\textbf{P}}_0(0)\}\), which is finite and non-empty. Write \(\pi _{\textbf{Y}}(y)=(\alpha ,\sigma )\). Since \(\pi _{{\textbf{Y}}}\) is an embedding, we get

$$\begin{aligned} {\text {supp}}^\Gamma _{{\textbf{Y}}}(s)\subseteq _{{\textbf{Y}}}\textbf{P}_0(\alpha )\quad \text {and thus}\quad s<_{\Gamma (\textbf{Y})}\gamma _{{\textbf{Y}}}\circ {\textbf{P}}_0(\alpha )={\textbf{P}}(\alpha ), \end{aligned}$$

using Lemma 4.10. This observation ensures that the following is well defined.

Definition 6.7

We define \(Y:\Gamma ({\textbf{Y}})\rightarrow J({\textbf{Y}})\) by \(Y(\textbf{P}(\alpha )):=(\alpha +1,0)\) and

$$\begin{aligned} Y(s):=(\alpha ,s)\quad \text {with}\quad \alpha =\min \{\gamma<\nu \,|\,s<_{\Gamma (\textbf{Y})}{\textbf{P}}(\gamma )\} \end{aligned}$$

for any \(s\in \Gamma ({\textbf{Y}})\) that does not lie in the range of \({\textbf{P}}\).

It is not hard to see that Y is an order embedding, and we have \(Y\circ {\textbf{P}}=j[{\textbf{Y}}]\) by construction. We can thus invoke Proposition 5.6 to obtain embeddings

$$\begin{aligned} S_Y&:S^{{\textbf{P}}}_{\Gamma ({\textbf{Y}})}\rightarrow S^{j[{\textbf{Y}}]}_{J({\textbf{Y}})}={\textbf{S}}_0({\textbf{Y}}),\\ {\textbf{Y}}+S_Y&:{\textbf{Y}}+S^{{\textbf{P}}}_{\Gamma ({\textbf{Y}})}\rightarrow {\textbf{Y}}+{\textbf{S}}_0({\textbf{Y}})={\textbf{S}}({\textbf{Y}}). \end{aligned}$$

In contrast to \(j[{\textbf{Y}}]:\nu \rightarrow J({\textbf{Y}})\), the map \(\textbf{P}:\nu \rightarrow \Gamma ({\textbf{Y}})\) has a ‘natural’ codomain and a meaningful connection to \(\pi _{{\textbf{Y}}}\). With respect to the informal discussion at the beginning of this section, it may thus be tempting to define \({\textbf{X}}\) as \({\textbf{Y}}\). The partial function \(\psi ^{{\textbf{X}}}\) from this discussion should then be inverse to the dashed arrow in

the diagram where \(\pi _{{\textbf{Y}}}:{\textbf{Y}}\rightarrow \nu \times (\Gamma \circ {\textbf{S}})({\textbf{Y}})\) is factored through the vertical arrow \(\nu \times \Gamma ({\textbf{Y}}+S_Y):\nu \times \Gamma ({\textbf{Y}}+S^{{\textbf{P}}}_{\Gamma ({\textbf{Y}})})\rightarrow \nu \times (\Gamma \circ {\textbf{S}})({\textbf{Y}})\).

However, it seems that the range of \(\pi _{{\textbf{Y}}}\) need not be contained in the range of the vertical arrow, so that the dashed arrow may not exist. To resolve this issue, we define a suborder that guarantees the desired inclusion in a hereditary way.

Definition 6.8

Let us write \(\vartriangleleft \) for the well founded relation on \({\textbf{Y}}\) that is provided by Definition 1.4, which means that we have

$$\begin{aligned} x\vartriangleleft y\quad \Leftrightarrow \quad x\in {\text {supp}}^{\Gamma \circ \textbf{S}}_{{\textbf{Y}}}(s)\text { for }\pi _{{\textbf{Y}}}(y)=(\alpha ,s). \end{aligned}$$

By recursion over this relation, we define a suborder \(\textbf{X}\subseteq {\textbf{Y}}\) with

$$\begin{aligned} y\in {\textbf{X}}\quad :\Leftrightarrow \quad \pi _{\textbf{Y}}(y)\in {\text {rng}}(\nu \times \Gamma ({\textbf{Y}}+S_Y))\text { and }x\in \textbf{X}\text { for all }x\vartriangleleft y. \end{aligned}$$

We will write \(\iota :{\textbf{X}}\rightarrow {\textbf{Y}}\) for the inclusion.

Let us complement Lemma 6.5 as follows.

Lemma 6.9

If \(\pi _{{\textbf{Y}}}(y)=(\alpha ,0)\) holds for some \(\alpha <\nu \), then we have \(y\in {\textbf{X}}={\text {rng}}(\iota )\).

Proof

It suffices to recall that \(0=\Gamma (f)(0)\in {\text {rng}}(\Gamma (f))\) holds for any embedding f, and that \({\text {supp}}^{\Gamma \circ \textbf{S}}_{{\textbf{Y}}}(0)=\emptyset \) was shown in the proof of Lemma 6.5. \(\square \)

To define the other objects that were promised at the beginning of this section, we repeat some of the previous constructions, but now with \({\textbf{X}}\) at the place of \({\textbf{Y}}\).

Definition 6.10

Determine \({\textbf{R}}_0:\nu \rightarrow {\textbf{X}}\) and \(\textbf{R}:\nu \rightarrow \Gamma ({\textbf{X}})\) by

$$\begin{aligned} \pi _{{\textbf{Y}}}\circ \iota \circ \textbf{R}_0(\alpha )=(\alpha +1,0)\quad \text {and}\quad \textbf{R}:=\gamma _{{\textbf{X}}}\circ {\textbf{R}}_0. \end{aligned}$$

For the order \(S^{{\textbf{R}}}_{\Gamma ({\textbf{X}})}\) given by Definitions 5.3 and 5.5, we now put

$$\begin{aligned} {\textbf{K}}:={\textbf{X}}+S^{{\textbf{R}}}_{\Gamma (\textbf{X})}\quad \text {and}\quad {\textbf{O}}:=\Gamma ({\textbf{K}}). \end{aligned}$$

Note that we have \(\iota \circ {\textbf{R}}_0={\textbf{P}}_0\), as \(\pi _{{\textbf{Y}}}\) is order preserving and hence injective. From Lemma 4.10 we know that \(\gamma \) is natural with respect to \(\iota :{\textbf{X}}\rightarrow {\textbf{Y}}\). We get

$$\begin{aligned} \Gamma (\iota )\circ {\textbf{R}}=\Gamma (\iota )\circ \gamma _{\textbf{X}}\circ {\textbf{R}}_0=\gamma _{{\textbf{Y}}}\circ \iota \circ \textbf{R}_0=\gamma _{{\textbf{Y}}}\circ {\textbf{P}}_0={\textbf{P}}. \end{aligned}$$

Thus Proposition 5.6 yields an embedding \(S_{\Gamma (\iota )}:S^{{\textbf{R}}}_{\Gamma ({\textbf{X}})}\rightarrow S^{{\textbf{P}}}_{\Gamma ({\textbf{Y}})}\). By composing with another map from above, we obtain embeddings

$$\begin{aligned} ({\textbf{Y}}+S_Y)\circ (\iota +S_{\Gamma (\iota )})=\iota +S_{Y\circ \Gamma (\iota )}&:{\textbf{K}}\rightarrow {\textbf{S}}({\textbf{Y}}),\\ \Gamma (\iota +S_{Y\circ \Gamma (\iota )})&:{\textbf{O}}=\Gamma (\textbf{K})\rightarrow \Gamma \circ {\textbf{S}}({\textbf{Y}}). \end{aligned}$$

In particular, we can conclude that \({\textbf{O}}\) is a well order, as \(\Gamma \circ {\textbf{S}}({\textbf{Y}})\) is well founded by Assumption 6.4. The following resolves an issue that was mentioned above. It may help to read the lemma in conjunction with the definition that follows it.

Lemma 6.11

The range of \(\pi _{{\textbf{Y}}}\circ \iota \) is contained in the range of \(\nu \times \Gamma (\iota +S_{Y\circ \Gamma (\iota )})\).

Proof

The crucial step is to show that any \(s\in {\textbf{Y}}+S^{{\textbf{P}}}_{\Gamma ({\textbf{Y}})}\) satisfies

$$\begin{aligned} s\in {\text {rng}}(\iota +S_{\Gamma (\iota )})\quad \Leftrightarrow \quad {{\text {supp}}^{\textbf{S}}_{{\textbf{Y}}}}\circ ({\textbf{Y}}+S_Y)(s)\subseteq \textbf{X}={\text {rng}}(\iota ). \end{aligned}$$

Even though we will not use this fact, we note that the equivalence means that the evident square formed by \(\iota +S_{\Gamma (\iota )}\) and \({\textbf{Y}}+S_Y\) together with their analogues over \({\textbf{X}}\) is a pullback, where \(X:\Gamma ({\textbf{X}})\rightarrow J({\textbf{X}})\) is constructed analogously to Definition 6.7. For \(s=y\in {\textbf{Y}}\subseteq {\textbf{Y}}+S^{{\textbf{P}}}_{\Gamma ({\textbf{Y}})}\) we can invoke Definition 6.3 to get

$$\begin{aligned} {{\text {supp}}^{{\textbf{S}}}_{{\textbf{Y}}}}\circ (\textbf{Y}+S_Y)(s)={\text {supp}}^{{\textbf{S}}}_{{\textbf{Y}}}(y)=\{y\}. \end{aligned}$$

So both sides of our equivalence amount to \(y\in {\text {rng}}(\iota )\). For \(s={\textbf{Y}}+\sigma \) we have

$$\begin{aligned} s\in {\text {rng}}(\iota +S_{\Gamma (\iota )})\quad \Leftrightarrow \quad \sigma \in {\text {rng}}(S_{\Gamma (\iota )})\quad \Leftrightarrow \quad {\text {supp}}^S_{\Gamma ({\textbf{Y}})}(\sigma )\subseteq {\text {rng}}(\Gamma (\iota )), \end{aligned}$$

where the second equivalence holds by Proposition 5.6 and the paragraph before it. On the other hand, Definitions 6.1 and 6.3 yield

$$\begin{aligned} {{\text {supp}}^{{\textbf{S}}}_{{\textbf{Y}}}}\circ ({\textbf{Y}}+S_Y)(s)&={\text {supp}}^{{\textbf{S}}}_{{\textbf{Y}}}({\textbf{Y}}+S_Y(\sigma ))={\text {supp}}^0_{{\textbf{Y}}}(S_Y(\sigma ))\\&=\bigcup \{{\text {supp}}^J_{{\textbf{Y}}}(\rho )\,|\,\rho \in {\text {supp}}^S_{J({\textbf{Y}})}(S_Y(\sigma ))\}=\bigcup \{{\text {supp}}^J_{{\textbf{Y}}}(Y(\tau ))\,|\,\tau \in {\text {supp}}^S_{\Gamma ({\textbf{Y}})}(\sigma )\}. \end{aligned}$$

Here the last equality relies on the fact that \({\text {supp}}^S\) is a natural transformation. By the previous lines of equivalences and equations, the desired equivalence reduces to

$$\begin{aligned} \tau \in {\text {rng}}(\Gamma (\iota ))\quad \Leftrightarrow \quad {\text {supp}}^J_{\textbf{Y}}(Y(\tau ))\subseteq {\textbf{X}}={\text {rng}}(\iota ). \end{aligned}$$

Considering the definition of Y, we distinguish two cases: For \(\tau ={\textbf{P}}(\alpha )\), the paragraph after Definition 6.10 yields \(\tau =\Gamma (\iota )\circ \textbf{R}(\alpha )\in {\text {rng}}(\Gamma (\iota ))\). We also have

$$\begin{aligned} {\text {supp}}^J_{{\textbf{Y}}}(Y(\tau ))={\text {supp}}^J_{\textbf{Y}}(\alpha +1,0)={\text {supp}}^\Gamma _{{\textbf{Y}}}(0)=\emptyset \subseteq \textbf{X}. \end{aligned}$$

If \(\tau \) does not lie in the range of \({\textbf{P}}\), then we have \(Y(\tau )=(\alpha ,\tau )\) for some \(\alpha <\nu \). In this case we get \({\text {supp}}^J_{{\textbf{Y}}}(Y(\tau ))={\text {supp}}^\Gamma _{{\textbf{Y}}}(\tau )\), so that the remaining equivalence coincides with the support property of the dilator \(\Gamma \). Thus the equivalence from the beginning of the proof is established. For \(s\in \Gamma ({\textbf{L}})\) with \({\textbf{L}}:={\textbf{Y}}+S^{{\textbf{P}}}_{\Gamma ({\textbf{Y}})}\) we now observe

$$\begin{aligned} s\in {\text {rng}}(\Gamma (\iota +S_{\Gamma (\iota )}))\quad \Leftrightarrow \quad {\text {supp}}^\Gamma _{\textbf{L}}(s)\subseteq {\text {rng}}(\iota +S_{\Gamma (\iota )}), \end{aligned}$$

also by the support condition for \(\Gamma \). Furthermore, we compute

$$\begin{aligned} {{\text {supp}}^{\Gamma \circ {\textbf{S}}}_{{\textbf{Y}}}}\circ \Gamma ({\textbf{Y}}+S_Y)(s)&=\bigcup \{{\text {supp}}^{{\textbf{S}}}_{{\textbf{Y}}}(\rho )\,|\,\rho \in {{\text {supp}}^\Gamma _{{\textbf{S}}({\textbf{Y}})}}\circ \Gamma ({\textbf{Y}}+S_Y)(s)\}\\&=\bigcup \{{{\text {supp}}^{{\textbf{S}}}_{{\textbf{Y}}}}\circ (\textbf{Y}+S_Y)(\tau )\,|\,\tau \in {\text {supp}}^\Gamma _{{\textbf{L}}}(s)\}. \end{aligned}$$

Using the equivalence from the beginning of the proof, one can now derive

$$\begin{aligned} s\in {\text {rng}}(\Gamma (\iota +S_{\Gamma (\iota )}))\quad \Leftrightarrow \quad {{\text {supp}}^{\Gamma \circ \textbf{S}}_{{\textbf{Y}}}}\circ \Gamma ({\textbf{Y}}+S_Y)(s)\subseteq \textbf{X}={\text {rng}}(\iota ). \end{aligned}$$

Even though we will not use this, we note that this step corresponds to the fact that \(\Gamma \) preserves pullbacks. It is straightforward to derive the lemma: Given \(y\in {\textbf{X}}\), we write \(\pi _{\textbf{Y}}\circ \iota (y)=(\alpha ,t)\). The definition of \(\textbf{X}\subseteq {\textbf{Y}}\) yields \({\text {supp}}^{\Gamma \circ {\textbf{S}}}_{\textbf{Y}}(t)\subseteq {\textbf{X}}\) as well as \(t=\Gamma ({\textbf{Y}}+S_Y)(s)\) for some \(s\in \Gamma ({\textbf{L}})\). By the equivalence above, we can conclude that \(s=\Gamma (\iota +S_{\Gamma (\iota )})(r)\) holds for some \(r\in \Gamma ({\textbf{K}})={\textbf{O}}\). We thus get

$$\begin{aligned} t=\Gamma (\textbf{Y}+S_Y)\circ \Gamma (\iota +S_{\Gamma (\iota )})(r)=\Gamma \left( \iota +S_{Y\circ \Gamma (\iota )}\right) (r). \end{aligned}$$

So \(\pi _{{\textbf{Y}}}\circ \iota (y)=(\alpha ,t)\) is the image of \((\alpha ,r)\) under \(\nu \times \Gamma (\iota +S_{Y\circ \Gamma (\iota )})\). \(\square \)

The following completes the constructions that were sketched at the beginning of the present section. We point out that \(\pi _{{\textbf{X}}}\) is analogous to the dashed arrow in the diagram before Definition 6.8.

Definition 6.12

Invoking Lemma 6.11, let \(\pi _{{\textbf{X}}}\) be the unique embedding such that

figure i

is a commutative diagram. To define a partial function \(\psi ^{{\textbf{X}}}:\nu \times {\textbf{O}}\mathrel {\twoheadrightarrow _p}{\textbf{X}}\) that is surjective and order preserving, we put

$$\begin{aligned} \psi ^{{\textbf{X}}}_\alpha s:=\psi ^{{\textbf{X}}}(\alpha ,s):={\left\{ \begin{array}{ll} x &{} \text {if } \pi _{{\textbf{X}}}(x)=(\alpha ,s),\\ \text {undefined} &{} \text {if } (\alpha ,s)\notin {\text {rng}}(\pi _{\textbf{X}}). \end{array}\right. } \end{aligned}$$

We will write \({\text {dom}}(\psi ^{{\textbf{X}}}):={\text {rng}}(\pi _{{\textbf{X}}})\) for the domain of this partial function. Also, let \(I:\textbf{X}\rightarrow {\textbf{X}}+S^{{\textbf{R}}}_{\Gamma ({\textbf{X}})}={\textbf{K}}\) with \(I(x):=x\) be the map onto the first summand.
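To illustrate Definition 6.12 in concrete terms: the passage from \(\pi _{{\textbf{X}}}\) to \(\psi ^{{\textbf{X}}}\) is simply the formation of a partial inverse of an injection. The following minimal sketch makes this explicit; the finite embedding below is a hypothetical stand-in for \(\pi _{{\textbf{X}}}\), not the actual construction.

```python
# Sketch: the partial collapse psi^X as the partial inverse of the
# injective embedding pi_X. All data here is a hypothetical toy example.

def partial_inverse(embedding):
    """Given an injective map as a dict x -> pi(x), return the partial
    function psi with psi(pi(x)) = x, undefined outside rng(pi)."""
    inverse = {value: x for x, value in embedding.items()}
    def psi(pair):
        if pair not in inverse:
            return None  # plays the role of 'undefined' outside rng(pi_X)
        return inverse[pair]
    return psi

# Hypothetical embedding pi_X of X = {'a', 'b'} into nu x O.
pi = {'a': (0, 1), 'b': (1, 2)}
psi = partial_inverse(pi)
```

Here `None` corresponds to the case \((\alpha ,s)\notin {\text {rng}}(\pi _{\textbf{X}})\) of the definition.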

Crucially, the search tree \(S^{{\textbf{R}}}_{\Gamma ({\textbf{X}})}\) depends on an embedding \({\textbf{R}}:\nu \rightarrow \Gamma ({\textbf{X}})\) that has a meaningful connection to the collapsing function \(\psi ^{\textbf{X}}\).

Lemma 6.13

We have \((\alpha ,0)\in {\text {dom}}(\psi ^{{\textbf{X}}})\) and \(\gamma _{\textbf{X}}(\psi ^{{\textbf{X}}}_{\alpha +1}0)={\textbf{R}}(\alpha )\) for all \(\alpha <\nu \).

Proof

Note that we have distinct elements \(0\in \Gamma ({\textbf{K}})=\textbf{O}\) and \(0\in \Gamma ({\textbf{S}}({\textbf{Y}}))\). In view of Definitions 4.7 and 6.10, we get

$$\begin{aligned} \left( \nu \times \Gamma (\iota +S_{Y\circ \Gamma (\iota )})\right) (\alpha +1,0)&=(\alpha +1,0)=\pi _{{\textbf{Y}}}\circ \iota \circ {\textbf{R}}_0(\alpha )\\ {}&=\left( \nu \times \Gamma (\iota +S_{Y\circ \Gamma (\iota )})\right) \circ \pi _{{\textbf{X}}}\circ \textbf{R}_0(\alpha ). \end{aligned}$$

This entails \((\alpha +1,0)=\pi _{{\textbf{X}}}\circ \textbf{R}_0(\alpha )\in {\text {rng}}(\pi _{{\textbf{X}}})={\text {dom}}(\psi ^{{\textbf{X}}})\) and \(\psi ^{{\textbf{X}}}_{\alpha +1}0={\textbf{R}}_0(\alpha )\), so that we get \(\gamma _{{\textbf{X}}}(\psi ^{{\textbf{X}}}_{\alpha +1}0)=\gamma _{\textbf{X}}\circ {\textbf{R}}_0(\alpha )={\textbf{R}}(\alpha )\). To show \((\alpha ,0)\in {\text {dom}}(\psi ^{{\textbf{X}}})\) with \(\alpha \) at the place of \(\alpha +1\), use Lemmas 6.5 and 6.9 to write \((\alpha ,0)=\pi _{\textbf{Y}}\circ \iota (x)\) with \(x\in {\textbf{X}}\). Then argue as before, with \(\alpha \) and \(x\) at the place of \(\alpha +1\) and \({\textbf{R}}_0(\alpha )\). \(\square \)

In the rest of this section we characterize the range of \(\pi _{{\textbf{X}}}\) or, in other words, the domain of the partial function \(\psi ^{{\textbf{X}}}\). As a first step, we assign supports to the elements of \({\textbf{K}}\) and \({\textbf{O}}\). To avoid misunderstanding, we point out that the following support functions do not belong to a dilator. Let us also recall that \({\text {supp}}^S\) was defined in the paragraph before Definition 5.3.

Definition 6.14

Let \({\text {supp}}^{{\textbf{K}}}:{\textbf{K}}={\textbf{X}}+S^{\textbf{R}}_{\Gamma ({\textbf{X}})}\rightarrow [{\textbf{X}}]^{<\omega }\) be given by

$$\begin{aligned} {\text {supp}}^{{\textbf{K}}}(x)=\{x\},\quad {\text {supp}}^{{\textbf{K}}}(\textbf{X}+\sigma )=\bigcup \{{\text {supp}}^\Gamma _{\textbf{X}}(\rho )\,|\,\rho \in {\text {supp}}^S_{\Gamma (\textbf{X})}(\sigma )\backslash {\text {rng}}({\textbf{R}})\}. \end{aligned}$$

Furthermore, define \({\text {supp}}^{{\textbf{O}}}:{\textbf{O}}=\Gamma (\textbf{K})\rightarrow [{\textbf{X}}]^{<\omega }\) by setting

$$\begin{aligned} {\text {supp}}^{{\textbf{O}}}(s)=\bigcup \{{\text {supp}}^{\textbf{K}}(\rho )\,|\,\rho \in {\text {supp}}^\Gamma _{{\textbf{K}}}(s)\}. \end{aligned}$$

The given definition—and in particular the exclusion of \({\text {rng}}({\textbf{R}})\)—is justified by the following connection with the support functions of our dilators \({\textbf{S}}\) and \(\Gamma \circ {\textbf{S}}\).

Lemma 6.15

Each of the diagrams

figure j

commutes.

Proof

Let us abbreviate \(f:=\iota +S_{Y\circ \Gamma (\iota )}\). Using Definitions 6.1 and 6.3 as well as the naturality of supports, we get

$$\begin{aligned} {\text {supp}}^{{\textbf{S}}}_{{\textbf{Y}}}\circ f({\textbf{X}}+\sigma )&=\bigcup \{{\text {supp}}^J_{{\textbf{Y}}}(Y\circ \Gamma (\iota )(\rho ))\,|\,\rho \in {\text {supp}}^S_{\Gamma ({\textbf{X}})}(\sigma )\},\\ [\iota ]^{<\omega }\circ {\text {supp}}^{{\textbf{K}}}(\textbf{X}+\sigma )&=\bigcup \{{\text {supp}}^\Gamma _{\textbf{Y}}(\Gamma (\iota )(\rho ))\,|\,\rho \in {\text {supp}}^S_{\Gamma (\textbf{X})}(\sigma )\backslash {\text {rng}}({\textbf{R}})\}. \end{aligned}$$

To see why the range of \({\textbf{R}}\) is excluded, note that \(\rho ={\textbf{R}}(\alpha )\) entails \(\Gamma (\iota )(\rho )=\textbf{P}(\alpha )\), so that Definition 6.7 yields \(Y\circ \Gamma (\iota )(\rho )=(\alpha +1,0)\) and thus

$$\begin{aligned} {\text {supp}}^J_{\textbf{Y}}(Y\circ \Gamma (\iota )(\rho ))={\text {supp}}^{\Gamma }_{\textbf{Y}}(0)=\emptyset . \end{aligned}$$

As a straightforward consequence, the left diagram commutes if we have

$$\begin{aligned} {\text {supp}}^J_{{\textbf{Y}}}(Y\circ \Gamma (\iota )(\rho ))={\text {supp}}^\Gamma _{\textbf{Y}}(\Gamma (\iota )(\rho ))\quad \text {for}\quad \rho \in \Gamma (\textbf{X})\backslash {\text {rng}}({\textbf{R}}). \end{aligned}$$

Even though we do not need this fact, it is instructive to observe that the equation fails for \(\rho ={\textbf{R}}(\alpha )=\gamma _{\textbf{X}}(\psi ^{{\textbf{X}}}_{\alpha +1}0)\), where Lemma 4.10 yields

$$\begin{aligned} {\text {supp}}^\Gamma _{\textbf{Y}}(\Gamma (\iota )(\rho ))=[\iota ]^{<\omega }\left( {\text {supp}}^\Gamma _{\textbf{X}}(\gamma _{{\textbf{X}}}(\psi ^{\textbf{X}}_{\alpha +1}0))\right) =[\iota ]^{<\omega }\left( \{\psi ^{\textbf{X}}_{\alpha +1}0\}\right) \ne \emptyset . \end{aligned}$$

On the other hand, \(\rho \notin {\text {rng}}({\textbf{R}})\) entails \(\Gamma (\iota )(\rho )\notin {\text {rng}}({\textbf{P}})\), as we have \(\Gamma (\iota )\circ {\textbf{R}}={\textbf{P}}\) and \(\Gamma (\iota )\) is injective. We then get \(Y\circ \Gamma (\iota )(\rho )=(\alpha ,\Gamma (\iota )(\rho ))\) for some \(\alpha <\nu \). In this case, the desired equality is immediate by the definition of the support for J. The right diagram is readily reduced to the left one. \(\square \)

Our well founded ‘subterm’ relation on \({\textbf{Y}}\) can now be transferred to \({\textbf{X}}\).

Lemma 6.16

For any \(x\in X\) and \((\alpha ,s)\in {\text {dom}}(\psi ^{{\textbf{X}}})\) we have

$$\begin{aligned} \iota (x)\vartriangleleft \iota (\psi ^{{\textbf{X}}}_\alpha s)\quad \Leftrightarrow \quad x\in {\text {supp}}^{{\textbf{O}}}(s), \end{aligned}$$

where \(\vartriangleleft \) is the well founded relation on \({\textbf{Y}}\) that was specified in Definition 6.8.

Proof

When \(\psi ^{{\textbf{X}}}_\alpha s\) is defined, we have \(\pi _{\textbf{X}}(\psi ^{{\textbf{X}}}_\alpha s)=(\alpha ,s)\) and hence

$$\begin{aligned} \pi _{{\textbf{Y}}}\circ \iota (\psi ^{{\textbf{X}}}_\alpha s)=\left( \nu \times \Gamma (\iota +S_{Y\circ \Gamma (\iota )})\right) \circ \pi _{\textbf{X}}(\psi ^{{\textbf{X}}}_\alpha s)=\left( \alpha ,\Gamma (\iota +S_{Y\circ \Gamma (\iota )})(s)\right) . \end{aligned}$$

Together with the previous lemma, it follows that \(\iota (x)\vartriangleleft \iota (\psi ^{{\textbf{X}}}_\alpha s)\) amounts to

$$\begin{aligned} \iota (x)\in&\,{{\text {supp}}^{\Gamma \circ {\textbf{S}}}_{{\textbf{Y}}}}\circ \Gamma (\iota +S_{Y\circ \Gamma (\iota )})(s){}\\ =&\,\bigcup \left\{ {\text {supp}}^{{\textbf{S}}}_{{\textbf{Y}}}(\tau )\,\left| \,\tau \in {{\text {supp}}^\Gamma _{{\textbf{S}}({\textbf{Y}})}}\circ \Gamma (\iota +S_{Y\circ \Gamma (\iota )})(s)\right. \right\} \\ =&\,\bigcup \left\{ {{\text {supp}}^{{\textbf{S}}}_{{\textbf{Y}}}}\circ (\iota +S_{Y\circ \Gamma (\iota )})(\rho )\,\left| \,\rho \in {\text {supp}}^\Gamma _{{\textbf{K}}}(s)\right. \right\} \\ =&\,\bigcup \left\{ [\iota ]^{<\omega }\left( {\text {supp}}^{\textbf{K}}(\rho )\right) \,\left| \,\rho \in {\text {supp}}^\Gamma _{\textbf{K}}(s)\right. \right\} =[\iota ]^{<\omega }\left( {\text {supp}}^{\textbf{O}}(s)\right) , \end{aligned}$$

which is clearly equivalent to \(x\in {\text {supp}}^{{\textbf{O}}}(s)\). \(\square \)

Given that \(\iota :X\rightarrow Y\) is an inclusion map, we will also refer to \(\vartriangleleft \) as a well founded relation on \({\textbf{X}}\). The following definition uses recursion along this relation. It also exploits that any element of \({\textbf{X}}\) can be uniquely written as \(\psi ^{\textbf{X}}_\alpha s\), since the partial function \(\psi ^{\textbf{X}}:\nu \times {\textbf{O}}\mathrel {\twoheadrightarrow _p}{\textbf{X}}\) is surjective and order preserving. When we refer to \(\psi ^{{\textbf{X}}}_\alpha s\) as a given element of \({\textbf{X}}\), we always assume \((\alpha ,s)\in {\text {dom}}(\psi ^{{\textbf{X}}})\).

Definition 6.17

For \(\gamma <\nu \) we define \(K^-_\gamma :{\textbf{X}}\rightarrow [\textbf{O}]^{<\omega }\) and \(K_\gamma :{\textbf{O}}\rightarrow [{\textbf{O}}]^{<\omega }\) by

$$\begin{aligned} K^-_\gamma (\psi ^{{\textbf{X}}}_\alpha s)&:={\left\{ \begin{array}{ll} \{s\}\cup K_\gamma (s) &{} \text {if } \gamma \le \alpha ,\\ \emptyset &{} \text {otherwise}, \end{array}\right. }\\ K_\gamma (s)&:=\bigcup \{K^-_\gamma (x)\,|\,x\in {\text {supp}}^{{\textbf{O}}}(s)\}. \end{aligned}$$
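To make the mutual recursion in Definition 6.17 concrete, the following sketch computes \(K^-_\gamma \) and \(K_\gamma \) over a hypothetical finite support structure. Elements of \({\textbf{X}}\) are modelled as pairs \((\alpha ,s)\), which is legitimate since every element of \({\textbf{X}}\) has a unique representation \(\psi ^{{\textbf{X}}}_\alpha s\); the support data below is an illustrative assumption, not the actual \({\text {supp}}^{{\textbf{O}}}\).

```python
# Illustrative sketch of Definition 6.17 (mutual recursion for K^-_gamma
# and K_gamma). Elements of X are modelled as pairs (alpha, s); supp_O is
# a hypothetical toy stand-in for the support function supp^O.

def K_minus(gamma, x, supp_O):
    """K^-_gamma(psi^X_alpha s), where x stands for psi^X_alpha s."""
    alpha, s = x
    if gamma <= alpha:
        return {s} | K(gamma, s, supp_O)
    return set()

def K(gamma, s, supp_O):
    """K_gamma(s): union of K^-_gamma(x) over x in supp^O(s)."""
    result = set()
    for x in supp_O(s):
        result |= K_minus(gamma, x, supp_O)
    return result

# Hypothetical support data; the recursion terminates because supp^O
# induces a well founded relation (cf. Lemma 6.16).
supp_data = {'s0': [], 's1': [(2, 's0')], 's2': [(0, 's1'), (3, 's0')]}
supp_O_toy = lambda s: supp_data[s]
```

Note that the condition \(\gamma \le \alpha \) cuts off the recursion, so that \(K_\gamma (s)\) only collects terms from 'high' stages.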

As promised, we can now characterize the domain of our collapsing function.

Proposition 6.18

For any \(\gamma <\nu \) and \(s\in {\textbf{O}}\) we have

$$\begin{aligned} (\gamma ,s)\in {\text {dom}}(\psi ^{{\textbf{X}}})={\text {rng}}(\pi _{\textbf{X}})\quad \Leftrightarrow \quad K_\gamma (s)\subseteq _{{\textbf{O}}} s. \end{aligned}$$

Proof

Let \(G_\gamma \) and \(G^{\Gamma \circ {\textbf{S}}}_\gamma \) be the maps that arise from Definition 1.4 in conjunction with Assumption 6.4. We abbreviate \(f:=\iota +S_{Y\circ \Gamma (\iota )}:{\textbf{K}}\rightarrow {\textbf{S}}({\textbf{Y}})\) and show that

figure k

is commutative. To prove that the left square commutes, we employ induction over the well founded relation from Lemma 6.16. For the induction step, recall that the proof of Lemma 6.16 yields \(\pi _{{\textbf{Y}}}\circ \iota (\psi ^{\textbf{X}}_\alpha s)=(\alpha ,\Gamma (f)(s))\). By Definition 1.4 we get

$$\begin{aligned} G^{\Gamma \circ {\textbf{S}}}_\gamma \circ \iota (\psi ^{{\textbf{X}}}_\alpha s)={\left\{ \begin{array}{ll} \{\Gamma (f)(s)\}\cup G_\gamma \circ \Gamma (f)(s) &{} \text {if } \alpha \ge \gamma ,\\ \emptyset &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

To complete the induction step, use the hypothesis and Lemma 6.15 to compute

$$\begin{aligned} {[}\Gamma (f)]^{<\omega }\circ K_\gamma (s)&=\bigcup \{[\Gamma (f)]^{<\omega }\circ K^-_\gamma (x)\,|\,x\in {\text {supp}}^{{\textbf{O}}}(s)\}\\&=\bigcup \{G^{\Gamma \circ {\textbf{S}}}_\gamma (x)\,|\,x\in [\iota ]^{<\omega }\circ {\text {supp}}^{{\textbf{O}}}(s)\}\\&=\bigcup \{G^{\Gamma \circ {\textbf{S}}}_\gamma (x)\,|\,x\in {{\text {supp}}^{\Gamma \circ {\textbf{S}}}_{\textbf{Y}}}\circ \Gamma (f)(s)\}=G_\gamma \circ \Gamma (f)(s). \end{aligned}$$

Note that this proves that the right square commutes. Definition 1.4 now yields

$$\begin{aligned} K_\gamma (s)\subseteq _{{\textbf{O}}} s\quad&\Leftrightarrow \quad G_\gamma \circ \Gamma (f)(s)=[\Gamma (f)]^{<\omega }\circ K_\gamma (s)\subseteq _{\Gamma \circ {\textbf{S}}({\textbf{Y}})}\Gamma (f)(s)\\ {}&\Leftrightarrow \quad (\gamma ,\Gamma (f)(s))\in {\text {rng}}(\pi _{\textbf{Y}}). \end{aligned}$$

To complete the proof, we show that \((\gamma ,\Gamma (f)(s))\in {\text {rng}}(\pi _{{\textbf{Y}}})\) and \((\gamma ,s)\in {\text {rng}}(\pi _{{\textbf{X}}})\) are equivalent, which means that the diagram from Definition 6.12 is a pullback. Concerning the easier direction, we note that \((\gamma ,s)=\pi _{{\textbf{X}}}(x)\) entails

$$\begin{aligned} (\gamma ,\Gamma (f)(s))=(\nu \times \Gamma (f))\circ \pi _X(x)=\pi _{\textbf{Y}}\circ \iota (x)\in {\text {rng}}(\pi _{{\textbf{Y}}}). \end{aligned}$$

To prove the converse, we assume \((\gamma ,\Gamma (f)(s))=\pi _{\textbf{Y}}(y)\) and derive \(y\in {\textbf{X}}={\text {rng}}(\iota )\). In view of \(f=({\textbf{Y}}+S_Y)\circ (\iota +S_{\Gamma (\iota )})\) we set \(t:=\Gamma (\iota +S_{\Gamma (\iota )})(s)\) to obtain

$$\begin{aligned} \pi _{{\textbf{Y}}}(y)=(\gamma ,\Gamma (\textbf{Y}+S_Y)(t))\in {\text {rng}}(\nu \times \Gamma ({\textbf{Y}}+S_Y)). \end{aligned}$$

The proof of Lemma 6.11 shows that \(t\in {\text {rng}}(\Gamma (\iota +S_{\Gamma (\iota )}))\) entails

$$\begin{aligned} {\text {supp}}^{\Gamma \circ {\textbf{S}}}_{{\textbf{Y}}}\left( \Gamma (\textbf{Y}+S_Y)(t)\right) \subseteq {\textbf{X}}. \end{aligned}$$

By Definition 6.8 we now get \(y\in {\textbf{X}}\), as desired. \(\square \)

Let us also record a basic observation that will be needed later:

Lemma 6.19

We have \({{\text {supp}}^{{\textbf{O}}}}\circ \Gamma (I)={\text {supp}}^\Gamma _{\textbf{X}}\).

Proof

First recall that \({{\text {supp}}^\Gamma _{\textbf{K}}}\circ \Gamma (I)=[I]^{<\omega }\circ {\text {supp}}^\Gamma _{{\textbf{X}}}\) holds by naturality. In view of Definition 6.14 we have \({{\text {supp}}^{{\textbf{K}}}}\circ I(x)=\{x\}\) and thus

$$\begin{aligned} {{\text {supp}}^{{\textbf{O}}}}\circ \Gamma (I)(\rho )&=\bigcup \{{\text {supp}}^{{\textbf{K}}}(\tau )\,|\,\tau \in {{\text {supp}}^\Gamma _{{\textbf{K}}}}\circ \Gamma (I)(\rho )\}\\ {}&=\bigcup \{{{\text {supp}}^{{\textbf{K}}}}\circ I(x)\,|\,x\in {\text {supp}}^\Gamma _{{\textbf{X}}}(\rho )\}={\text {supp}}^\Gamma _{\textbf{X}}(\rho ), \end{aligned}$$

as desired. \(\square \)

We conclude this section with an observation about the order on \({\textbf{O}}\).

Lemma 6.20

We have \(\Gamma (I)(s)<_{{\textbf{O}}}\Gamma _{{\textbf{X}}+\sigma }\) for all \(s\in \Gamma ({\textbf{X}})\) and \(\sigma \in S^{{\textbf{R}}}_{\Gamma (\textbf{X})}\).

Proof

As I maps into the first summand of \({\textbf{X}}+ S^{\textbf{R}}_{\Gamma ({\textbf{X}})}\), we see that \(\Gamma _{{\textbf{X}}+\sigma }\) lies outside the range of \(\Gamma (I)\). But the latter is an initial segment of \({\textbf{O}}\), by Corollary 4.11. \(\square \)

7 Operator control and infinite proofs

From the previous section we have a function

$$\begin{aligned} \psi ^{{\textbf{X}}}:\nu \times {\textbf{O}}=\nu \times \Gamma \left( \textbf{X}+S^{{\textbf{R}}}_{\Gamma ({\textbf{X}})}\right) \mathrel {\twoheadrightarrow _p}{\textbf{X}} \end{aligned}$$

that is surjective and order preserving but partial, i. e., not always defined. In the present section, we transform \(\psi ^{\textbf{X}}\) into a function \(\psi :\nu \times {\textbf{O}}\rightarrow {\textbf{O}}\) that is total but not always order preserving. We then define an abstract variant of the operator controlled proofs that have been introduced by Buchholz [7]. Finally, we construct an operator controlled proof that embeds the search tree \(S^{{\textbf{R}}}_{\Gamma ({\textbf{X}})}\) from Sect. 5.

As a first step, we transform \(\psi ^{{\textbf{X}}}\) into a function \(\psi ^+\) that remains partial but has codomain \(\textbf{O}\). Note that the following definition composes arrows from the diagram at the beginning of Sect. 6. This diagram commutes by Lemma 4.10, which means that \(\gamma _{{\textbf{K}}}\circ I\) equals \(\Gamma (I)\circ \gamma _{{\textbf{X}}}\). The maps \(\gamma _Z:Z\rightarrow \Gamma (Z)\) and \(I:{\textbf{X}}\rightarrow {\textbf{K}}\) are given by Definitions 4.9 and 6.12, while \({\text {supp}}^{{\textbf{O}}}:{\textbf{O}}\rightarrow [{\textbf{X}}]^{<\omega }\) comes from Definition 6.14.

Definition 7.1

The partial function \(\psi ^+:\nu \times {\textbf{O}}\rightarrow _p{\textbf{O}}\) is given by

$$\begin{aligned} \psi ^+_\alpha s:=\psi ^+(\alpha ,s):={\left\{ \begin{array}{ll} \gamma _{{\textbf{K}}}\circ I(\psi ^{{\textbf{X}}}_\alpha s) &{} \text {if } (\alpha ,s)\in {\text {dom}}(\psi ^{{\textbf{X}}})=:{\text {dom}}(\psi ^+),\\ \text {undefined} &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

To define \({\text {supp}}^+:{\textbf{O}}\rightarrow [{\textbf{O}}]^{<\omega }\), we set \({\text {supp}}^+:=[\gamma _{{\textbf{K}}}\circ I]^{<\omega }\circ {\text {supp}}^{\textbf{O}}\).

For an arbitrary dilator D, no family of embeddings \(Z\rightarrow D(Z)\) needs to exist. This explains why Definition 1.4 involves two families of functions \(G^D_\gamma \) and \(G_\gamma \) with domain X and D(X), respectively. In Definition 6.17 we have constructed corresponding functions \(K^-_\gamma :\textbf{X}\rightarrow [{\textbf{O}}]^{<\omega }\) and \(K_\gamma :{\textbf{O}}\rightarrow [\textbf{O}]^{<\omega }\). In the present case, however, we do have an embedding \(\gamma _{{\textbf{K}}}\circ I:{\textbf{X}}\rightarrow {\textbf{O}}\) (in particular because of the maps \(\gamma _Z:Z\rightarrow \Gamma (Z)\) that make \(\Gamma \) normal). As the following shows, this allows us to eliminate \(K^-_\gamma \) in favour of \(K_\gamma \). Similarly, the functions \(G^D_\gamma \) and \(G_\gamma \) are unified in traditional ordinal notation systems, as we have seen in Example 1.5.

Proposition 7.2

For any \(\gamma <\nu \) and \((\alpha ,s)\in {\text {dom}}(\psi ^+)\) we have

$$\begin{aligned} K_\gamma (\psi ^+_\alpha s)={\left\{ \begin{array}{ll} \{s\}\cup K_\gamma (s) &{}\text {if } \gamma \le \alpha ,\\ \emptyset &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

Furthermore, we have \(K_\gamma (t)=\bigcup \{K_\gamma (r)\,|\,r\in {\text {supp}}^+(t)\}\) for any \(t\in {\textbf{O}}\).

Proof

The naturality of supports and Lemma 4.10 yield

$$\begin{aligned} {{\text {supp}}^\Gamma _{{\textbf{K}}}}\circ \Gamma (I)\circ \gamma _{\textbf{X}}(\psi ^{{\textbf{X}}}_\alpha s)=[I]^{<\omega }\circ {{\text {supp}}^\Gamma _{{\textbf{X}}}}\circ \gamma _{\textbf{X}}(\psi ^{{\textbf{X}}}_\alpha s)=\{I(\psi ^{{\textbf{X}}}_\alpha s)\}. \end{aligned}$$

In view of Definition 6.14, we can derive

$$\begin{aligned}{} & {} {\text {supp}}^{{\textbf{O}}}(\psi ^+_\alpha s)={{\text {supp}}^{{\textbf{O}}}}\circ \gamma _{{\textbf{K}}}\circ I(\psi ^{{\textbf{X}}}_\alpha s)={{\text {supp}}^{{\textbf{O}}}}\circ \Gamma (I)\circ \gamma _{{\textbf{X}}}(\psi ^{{\textbf{X}}}_\alpha s)\\{} & {} \quad =\bigcup \{{\text {supp}}^{\textbf{K}}(\rho )\,|\,\rho \in {{\text {supp}}^\Gamma _{\textbf{K}}}\circ \Gamma (I)\circ \gamma _{{\textbf{X}}}(\psi ^{{\textbf{X}}}_\alpha s)\}={{\text {supp}}^{{\textbf{K}}}}\circ I(\psi ^{{\textbf{X}}}_\alpha s)=\{\psi ^{{\textbf{X}}}_\alpha s\}. \end{aligned}$$

For later reference, we record that this entails

$$\begin{aligned} {\text {supp}}^+(\psi ^+_\alpha s)=\{\gamma _{{\textbf{K}}}\circ I(\psi ^{\textbf{X}}_\alpha s)\}=\{\psi ^+_\alpha s\}. \end{aligned}$$

Due to Definition 6.17 we obtain

$$\begin{aligned} K_\gamma (\psi ^+_\alpha s)=\bigcup \{K^-_\gamma (x)\,|\,x\in {\text {supp}}^{{\textbf{O}}}(\psi ^+_\alpha s)\}=K_\gamma ^-(\psi ^{{\textbf{X}}}_\alpha s). \end{aligned}$$

The first claim of the proposition is now immediate by Definition 6.17. In the paragraph before this definition, we have observed that any element \(x\in {\textbf{X}}\) can be written as \(x=\psi ^{{\textbf{X}}}_\alpha s\) for some \((\alpha ,s)\in {\text {dom}}(\psi ^{{\textbf{X}}})\). We get \(\gamma _{\textbf{K}}\circ I(x)=\psi ^+_\alpha s\), which means that the previous observation can be reformulated as

$$\begin{aligned} K_\gamma \circ \gamma _{{\textbf{K}}}\circ I=K^-_\gamma . \end{aligned}$$

In view of Definition 6.17, we can deduce

$$\begin{aligned} K_\gamma (t)=\bigcup \{K_\gamma ^-(x)\,|\,x\in {\text {supp}}^{\textbf{O}}(t)\}=\bigcup \{K_\gamma (r)\,|\,r\in [\gamma _{{\textbf{K}}}\circ I]^{<\omega }\circ {\text {supp}}^{{\textbf{O}}}(t)\}. \end{aligned}$$

Considering the definition of \({\text {supp}}^+\), this coincides with the remaining claim. \(\square \)

The following result will be used to extend \(\psi ^+\) into a total function.

Proposition 7.3

Given any \(\alpha <\nu \) and \(s\in {\textbf{O}}\), we get \((\alpha ,t)\in {\text {dom}}(\psi ^+)\) for some element \(t\in \{s\}\cup K_\alpha (s)\) with \(s\le _{{\textbf{O}}}t\).

Proof

The main task will be to show that \(r\in K_\alpha (s)\) entails \(r\notin K_\alpha (r)\subseteq K_\alpha (s)\). Once this is achieved, we can conclude by induction on the cardinality of the finite set \(K_\alpha (s)\). Indeed, for \(K_\alpha (s)\subseteq _{{\textbf{O}}} s\) we get \((\alpha ,s)\in {\text {dom}}(\psi ^+)\) by Proposition 6.18, so we can take \(t=s\). If \(K_\alpha (s)\subseteq _{{\textbf{O}}} s\) fails, we can pick an \(r\in K_\alpha (s)\) with \(s\le r\). By the initial claim, \(K_\alpha (r)\) has fewer elements than \(K_\alpha (s)\). Inductively, we thus get \((\alpha ,t)\in {\text {dom}}(\psi ^+)\) for some \(t\in \{r\}\cup K_\alpha (r)\subseteq K_\alpha (s)\) with \(s\le r\le t\). To prove the initial claim, recall that Lemma 6.16 provides a well founded relation \(\vartriangleleft \) on \({\textbf{X}}\subseteq {\textbf{Y}}\). It will be convenient to consider the associated height function \(h:\textbf{X}\rightarrow {\mathbb {N}}\) with

$$\begin{aligned} h\left( \psi ^{{\textbf{X}}}_\gamma t\right) =\max \left( \{0\}\cup \{h(z)+1\,|\,z\in {\text {supp}}^{\textbf{O}}(t)\}\right) . \end{aligned}$$

Aiming at \(r\notin K_\alpha (r)\), we fix an arbitrary element \(x\in {\text {supp}}^{{\textbf{O}}}(r)\). We use induction on \(h(y)\le h(x)\) to prove \(r\notin K^-_\alpha (y)\). Writing \(y=\psi ^{{\textbf{X}}}_\gamma t\), we note that \(h(x)\ge h(y)\) forces \(x\notin {\text {supp}}^{\textbf{O}}(t)\) and hence \(r\ne t\). With the induction hypothesis, this yields

$$\begin{aligned} r\notin \{t\}\cup \bigcup \{K^-_\alpha (z)\,|\,z\in {\text {supp}}^{\textbf{O}}(t)\}=\{t\}\cup K_\alpha (t)\supseteq K^-_\alpha (\psi ^{\textbf{X}}_\gamma t)=K^-_\alpha (y). \end{aligned}$$

Since \(x\in {\text {supp}}^{{\textbf{O}}}(r)\) was arbitrary, we get

$$\begin{aligned} r\notin \bigcup \{K^- _\alpha (x)\,|\,x\in {\text {supp}}^{\textbf{O}}(r)\}=K_\alpha (r). \end{aligned}$$

Another induction on h(x) shows that \(r\in K^-_\alpha (x)\) entails \(K_\alpha (r)\subseteq K^-_\alpha (x)\). It is straightforward to conclude that \(r\in K_\alpha (s)\) entails \(K_\alpha (r)\subseteq K_\alpha (s)\). \(\square \)
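The height function \(h\) from the proof above can be computed by a direct recursion on the well founded support relation. A minimal sketch, with elements of \({\textbf{X}}\) modelled as pairs \((\gamma ,t)\) and hypothetical support data:

```python
# Illustrative sketch of the height function h from the proof of
# Proposition 7.3: h(psi^X_gamma t) = max({0} ∪ {h(z)+1 : z in supp^O(t)}).
# Elements of X are modelled as pairs (gamma, t); the support data below
# is a hypothetical toy example.

def height(x, supp_O):
    gamma, t = x
    return max([0] + [height(z, supp_O) + 1 for z in supp_O(t)])

h_data = {'t0': [], 't1': [(0, 't0')], 't2': [(1, 't1'), (2, 't0')]}
h_supp = lambda t: h_data[t]
```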

We can now define the total extension of \(\psi ^{{\textbf{X}}}\) that was promised above.

Definition 7.4

To obtain a total function \(\psi :\nu \times {\textbf{O}}\rightarrow {\textbf{O}}\), we put

$$\begin{aligned}{} & {} \psi _\alpha s:=\psi (\alpha ,s):=\psi ^+_\alpha t\quad \text {for the } <_{{\textbf{O}}}\text {-minimal } t\in \{s\}\cup K_\alpha (s) \text { with}\\{} & {} \quad s\le _{{\textbf{O}}}t \text { and } (\alpha ,t)\in {\text {dom}}(\psi ^+). \end{aligned}$$

Let us also define \(C_\alpha (t):=\{s\in \textbf{O}\,|\,K_\alpha (s)\subseteq _{{\textbf{O}}}t\}\) for all \(\alpha <\nu \) and \(t\in {\textbf{O}}\).

Note that we immediately get \(\psi _\alpha s=\psi ^+_\alpha s\) for \((\alpha ,s)\in {\text {dom}}(\psi ^+)\). The sets \(C_\alpha (t)\) and the following proposition evoke traditional constructions of ordinal notation systems in terms of set theory (see e. g. [7, Definition 4.2]). In contrast to these constructions, our functions \(\psi _\alpha \) do not seem to be weakly increasing. Indeed, if we have \(t<r<t'\) with \((\alpha ,r)\in {\text {dom}}(\psi ^+)\) but \(\psi _\alpha t=\psi ^+_\alpha t'\) due to \(r\notin K_\alpha (t)\), then we get \(\psi _\alpha r=\psi ^+_\alpha r<\psi ^+_\alpha t'=\psi _\alpha t\). At the same time, Corollary 7.6 will ensure that the order is preserved in relevant cases.
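The minimality clause in Definition 7.4 admits a direct algorithmic reading: given \((\alpha ,s)\), one searches the finite set \(\{s\}\cup K_\alpha (s)\) for the least admissible argument. The sketch below uses hypothetical toy data for \(K_\alpha \), \({\text {dom}}(\psi ^+)\) and \(\psi ^+\), with natural numbers in place of \({\textbf{O}}\):

```python
# Illustrative sketch of Definition 7.4: psi_alpha(s) equals psi^+_alpha(t)
# for the minimal t in {s} ∪ K_alpha(s) with s <= t and (alpha, t) in
# dom(psi^+). K_toy, dom_toy and psi_plus_toy are hypothetical data.

def psi_total(alpha, s, K_alpha, dom, psi_plus):
    candidates = [t for t in {s} | K_alpha(s)
                  if s <= t and (alpha, t) in dom]
    # Proposition 7.3 guarantees that there is at least one candidate.
    return psi_plus(alpha, min(candidates))

K_toy = lambda s: {s + 1, s + 3}
dom_toy = {(0, 2), (0, 4)}
psi_plus_toy = lambda a, t: ('psi+', a, t)
```

Proposition 7.3 is precisely the statement that the candidate list is never empty, so the minimum always exists.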

Proposition 7.5

The following holds for all \(\alpha <\nu \) and \(s,t\in {\textbf{O}}\):

  1. (a)

    Given \(s\in C_\alpha (t)\) with \(s<t\), we get \(\psi _\gamma s\in C_\alpha (t)\) for any \(\gamma <\nu \).

  2. (b)

    If we have \(s<\psi _{\alpha +1}0\), then \(s\in C_\alpha (t)\) implies \(s<\psi _\alpha t\).

  3. (c)

    If we have \(t\in C_\alpha (t)\), then \(s<\psi _\alpha t\) implies \(s\in C_\alpha (t)\).

Proof

(a) For \(\gamma <\alpha \) we have \(K_\alpha (\psi _\gamma s)=\emptyset \), so that \(\psi _\gamma s\in C_\alpha (t)\) is immediate. Let us now assume \(\gamma \ge \alpha \). With \(h:{\textbf{X}}\rightarrow {\mathbb {N}}\) as in the proof of Proposition 7.3, an easy induction on h(x) yields \(K^-_\gamma (x)\subseteq K^-_\alpha (x)\) and simultaneously \(K_\gamma (s)\subseteq K_\alpha (s)\). We note that this entails \(C_\alpha (t)\subseteq C_\gamma (t)\). Given that we have \(s\in C_\alpha (t)\) and \(s<t\), we learn that \(\psi _\gamma s=\psi ^+_\gamma t'\) holds for some

$$\begin{aligned} t'\in \{s\}\cup K_\gamma (s)\subseteq \{s\}\cup K_\alpha (s)\subseteq _{{\textbf{O}}} t. \end{aligned}$$

As in the proof of Proposition 7.3, we get \(K_\alpha (t')\subseteq K_\alpha (s)\) and hence

$$\begin{aligned} K_\alpha (\psi _\gamma s)=K_\alpha (\psi ^+_\gamma t')=\{t'\}\cup K_\alpha (t')\subseteq \{t'\}\cup K_\alpha (s)\subseteq _{{\textbf{O}}} t. \end{aligned}$$

This amounts to \(\psi _\gamma s\in C_\alpha (t)\), as desired.

(b) We use induction on the build-up of \(s\in \Gamma ({\textbf{K}})\) according to Definition 4.5. In the crucial case, we have \(s=\Gamma _z\) for some \(z\in {\textbf{K}}={\textbf{X}}+S^{\textbf{R}}_{\Gamma ({\textbf{X}})}\). As Lemma 6.13 ensures \((\alpha +1,0)\in {\text {dom}}(\psi ^+)\), the assumption \(s<\psi _{\alpha +1}0\) yields

$$\begin{aligned} \gamma _{{\textbf{K}}}(z)=s<_{\textbf{O}}\psi _{\alpha +1}0=\psi ^+_{\alpha +1}0=\gamma _{{\textbf{K}}}\circ I(\psi ^{{\textbf{X}}}_{\alpha +1}0). \end{aligned}$$

The range of \(I:{\textbf{X}}\rightarrow {\textbf{K}}\) is an initial segment, so \(z=I(x)\) holds for some \(x<\psi ^{{\textbf{X}}}_{\alpha +1}0\). Like any other element of \({\textbf{X}}\), the latter can be written in the form \(x=\psi ^{{\textbf{X}}}_\gamma r\), which yields \(s=\psi ^+_\gamma r\). We must have \(\gamma \le \alpha \), as \(\psi ^{{\textbf{X}}}\) is order preserving. If we have \(\gamma <\alpha \), then \(s<\psi _\alpha t\) is immediate. Let us now assume \(\gamma =\alpha \). We then have \(r\in K_\alpha (s)\), so that \(s\in C_\alpha (t)\) yields \(r<t\). For the appropriate \(t'\ge t\) we get

$$\begin{aligned} s=\psi ^+_\alpha r<\psi ^+_\alpha t'=\psi _\alpha t. \end{aligned}$$

In the case of a term \(s={\overline{\varphi }} s_0s_1\), we recall that Definition 4.7 yields

$$\begin{aligned} {\text {supp}}^\Gamma _{{\textbf{K}}}(s)={\text {supp}}^\Gamma _{\textbf{K}}(s_0)\cup {\text {supp}}^\Gamma _{{\textbf{K}}}(s_1). \end{aligned}$$

The equality remains valid when we replace \({\text {supp}}^\Gamma _{\textbf{K}}\) by \({\text {supp}}^{{\textbf{O}}}\) or \({\text {supp}}^+\) or \(K_\alpha \), due to Definition 6.14 and Proposition 7.2. So \(s\in C_\alpha (t)\) is equivalent to \(s_0,s_1\in C_\alpha (t)\). Also note that \(s<\psi _{\alpha +1}0\) and \(s_0,s_1<\psi _{\alpha +1}0\) are equivalent by Definition 4.5, as

$$\begin{aligned} \psi _{\alpha +1}0\in {\text {rng}}(\gamma _{\textbf{K}})=\{\Gamma _z\,|\,z\in {\textbf{K}}\} \end{aligned}$$

is strongly critical. We can thus invoke the induction hypothesis to get \(s_0,s_1<\psi _\alpha t\). The latter entails \(s<\psi _\alpha t\), because \(\psi _\alpha t\) is strongly critical as well. For a term of the form \(s=\langle s_0,\ldots ,s_{n-1}\rangle \), the argument is similar.

(c) As in the proof of (b), we argue by induction on the build-up of \(s\in \Gamma ({\textbf{K}})\). Let us first assume that we have \(s=\Gamma _z\) for some \(z\in {\textbf{K}}\). Given \(s<\psi _\alpha t<\psi _{\alpha +1}0\), we can once again write \(s=\psi ^+_\gamma r\) with \(\gamma \le \alpha \). If the last inequality is strict, we obtain \(K_\alpha (s)=\emptyset \), so that \(s\in C_\alpha (t)\) is immediate. Now assume \(\gamma =\alpha \) and recall that \((\alpha ,r)\in {\text {dom}}(\psi ^+)\) entails \(K_\alpha (r)\subseteq _{\textbf{O}}r\). Given \(t\in C_\alpha (t)\), we have \(\psi _\alpha t=\psi ^+_\alpha t\), so that \(s<\psi _\alpha t\) entails \(r<t\). Together we get

$$\begin{aligned} K_\alpha (s)=\{r\}\cup K_\alpha (r)\subseteq _{{\textbf{O}}}t \end{aligned}$$

and hence \(s\in C_\alpha (t)\), as desired. Let us also consider a term \(s={\overline{\varphi }} s_0s_1<\psi _\alpha t\). For each \(i\le 1\) we get \(s_i<\psi _\alpha t\), so that the induction hypothesis yields \(s_i\in C_\alpha (t)\). We can conclude \(s\in C_\alpha (t)\), as noted in the proof of (b). An analogous argument applies in the case of a term \(s=\langle s_0,\ldots ,s_{n-1}\rangle \) with \(n>1\). For \(s=0\), it suffices to observe that \(K_\alpha (0)\) is empty, since the same holds for \({\text {supp}}^\Gamma _{{\textbf{K}}}(0)\). \(\square \)

As observed in part (b) of the previous proof, all values \(\psi _\alpha t\) are strongly critical. The next result provides inequalities between different values of \(\psi \).

Corollary 7.6

The following holds for all \(s,t\in {\textbf{O}}\):

  1. (a)

    For \(t\ne 0\) we have \(\psi _\alpha 0<\psi _\alpha t<\psi _{\alpha +1} 0=\Gamma (I)\circ {\textbf{R}}(\alpha )\).

  2. (b)

    If we have \(s\in C_\alpha (t)\), then \(s<t\) implies \(\psi _\alpha s<\psi _\alpha t\).

Proof

Concerning part (a), let us first observe that Lemmas 4.10 and 6.13 yield

$$\begin{aligned} \Gamma (I)\circ {\textbf{R}}(\alpha )=\Gamma (I)\circ \gamma _{\textbf{X}}(\psi ^{{\textbf{X}}}_{\alpha +1}0)=\gamma _{{\textbf{K}}}\circ I(\psi ^{{\textbf{X}}}_{\alpha +1}0)=\psi ^+_{\alpha +1}0=\psi _{\alpha +1}0. \end{aligned}$$

The second inequality in part (a) is immediate, while the first one reduces to (b), as \({\text {supp}}^\Gamma _{{\textbf{K}}}(0)=\emptyset \) entails \(K_\alpha (0)=\emptyset \) and hence \(0\in C_\alpha (t)\). Let us now establish part (b). Given \(s\in C_\alpha (t)\) and \(s<t\), we get \(\psi _\alpha s\in C_\alpha (t)\) by part (a) of the previous proposition. Part (b) of the latter yields \(\psi _\alpha s<\psi _\alpha t\), as we have \(\psi _\alpha s<\psi _{\alpha +1}0\). \(\square \)

With the sets \(C_\alpha (t)\) at hand, we can recover the operators \({\mathcal {H}}_s\) of Buchholz [7].

Definition 7.7

For \(s\in {\textbf{O}}\) and \(a\in [{\textbf{O}}]^{<\omega }\) we set

$$\begin{aligned} {\mathcal {H}}_s(a):=\bigcap \{C_\alpha (t)\,|\,\alpha<\nu \text { and }t\in {\textbf{O}}\text { with }s<_{{\textbf{O}}}t\text { and }a\subseteq C_\alpha (t)\}\subseteq {\textbf{O}}. \end{aligned}$$

Note that the intersection is taken over a non-empty family, because \(a\subseteq C_\alpha (t)\) amounts to \(b\subseteq _{{\textbf{O}}}t\) for the finite set \(b=\bigcup _{r\in a}K_\alpha (r)\). The following is immediate.

Lemma 7.8

The following holds for all \(s,t\in {\textbf{O}}\) and \(a,b\in [\textbf{O}]^{<\omega }\):

  1. (a)

    We have \(a\subseteq {\mathcal {H}}_s(a)\).

  2. (b)

    Given \(a\subseteq {\mathcal {H}}_s(b)\), we get \({\mathcal {H}}_s(a)\subseteq {\mathcal {H}}_s(b)\).

  3. (c)

    For \(s<t\) we have \({\mathcal {H}}_s(a)\subseteq {\mathcal {H}}_t(a)\).

Parts (a) and (b) express that \({\mathcal {H}}_s\) is a closure operator. Together, they ensure that \(a\subseteq b\) implies \(\mathcal H_s(a)\subseteq {\mathcal {H}}_s(b)\). As we will see, the following is an abstract way to say that \({\mathcal {H}}_s\) is nice in the sense of [7, Definition 3.5].
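Over a finite toy order one can test the closure properties of Lemma 7.8 directly. In the sketch below, \({\textbf{O}}\) is replaced by \(\{0,\ldots ,4\}\), \(\nu \) by 2, and \(K_\alpha \) by hypothetical data; \(C_\alpha (t)\) and \({\mathcal {H}}_s\) are then computed exactly as in Definitions 7.4 and 7.7, where \(K_\alpha (s)\subseteq _{{\textbf{O}}}t\) means that every element of \(K_\alpha (s)\) lies strictly below t.

```python
# Illustrative check of Lemma 7.8 over a finite toy order: O = {0,...,4},
# nu = 2, and K_alpha given by hypothetical data.

O = range(5)
NU = 2

def K_toy(alpha, s):
    # hypothetical data: at stage alpha = 0, the term s 'mentions' s - 1
    return {s - 1} if alpha == 0 and s > 0 else set()

def C(alpha, t):
    """C_alpha(t) = {s in O : K_alpha(s) lies strictly below t}."""
    return {s for s in O if all(r < t for r in K_toy(alpha, s))}

def H(s, a):
    """Intersection of all C_alpha(t) with s < t and a ⊆ C_alpha(t)."""
    result = set(O)
    for alpha in range(NU):
        for t in O:
            if s < t and a <= C(alpha, t):
                result &= C(alpha, t)
    return result
```

On this data one can verify \(a\subseteq {\mathcal {H}}_s(a)\) as well as \({\mathcal {H}}_s(a)\subseteq {\mathcal {H}}_t(a)\) for \(s<t\), in accordance with Lemma 7.8.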

Proposition 7.9

For all \(s,t\in {\textbf{O}}\) and \(a\in [{\textbf{O}}]^{<\omega }\) we have

$$\begin{aligned} s\in \mathcal H_t(a)\quad \Leftrightarrow \quad {\text {supp}}^+(s)\subseteq {\mathcal {H}}_t(a). \end{aligned}$$

Proof

For each \(\alpha <\nu \), Proposition 7.2 yields

$$\begin{aligned} K_\alpha (s)\subseteq _{{\textbf{O}}}t\quad \Leftrightarrow \quad K_\alpha (r)\subseteq _{{\textbf{O}}}t\text { for all }r\in {\text {supp}}^+(s). \end{aligned}$$

In view of Definition 7.4, the equivalence from the proposition thus holds with \(C_\alpha (t)\) at the place of \(\mathcal H_t(a)\). This pointwise version is stronger than the claim itself. \(\square \)

The corollary below encapsulates various closure properties, such as

$$\begin{aligned} {\overline{\varphi }} r_0r_1\in \mathcal H_r(a)\quad \Leftrightarrow \quad \{r_0,r_1\}\subseteq {\mathcal {H}}_r(a). \end{aligned}$$

In view of Definition 4.7, the direction from right to left follows from the corollary for \(s_i=r_i\) and \(t_0={\overline{\varphi }} r_0r_1\) (with \(m=2\) and \(n=1\)). The converse direction follows when we take \(s_0={\overline{\varphi }} r_0r_1\) and \(t_i=r_i\). We get an analogous equivalence for terms of the form \(\langle r_0,\ldots ,r_{k-1}\rangle \). Due to Lemma 4.16, we also learn that \(\{s_0,s_1\}\subseteq {\mathcal {H}}_r(a)\) entails \(s_0+s_1\in \mathcal H_r(a)\) and \(\varphi s_0s_1\in {\mathcal {H}}_r(a)\), where \(\varphi \) is our total extension of \({\overline{\varphi }}\). One can also take \(m=0\) to obtain \(0,1\in {\mathcal {H}}_r(a)\) from \({\text {supp}}^\Gamma _{\textbf{K}}(0)=\emptyset \).

Corollary 7.10

Consider any \(s_0,\ldots ,s_{m-1}\) and \(t_0,\ldots ,t_{n-1}\) in \({\textbf{O}}\). If we have

$$\begin{aligned} \textstyle \bigcup _{i<m}{\text {supp}}^\Gamma _{\textbf{K}}(s_i)\supseteq \textstyle \bigcup _{j<n}{\text {supp}}^\Gamma _{\textbf{K}}(t_j), \end{aligned}$$

then \(\{s_0,\ldots ,s_{m-1}\}\subseteq {\mathcal {H}}_r(a)\) implies \(\{t_0,\ldots ,t_{n-1}\}\subseteq {\mathcal {H}}_r(a)\).

Proof

As in the proof of Proposition 7.5, the given inclusion remains valid when we replace \({\text {supp}}^\Gamma _{{\textbf{K}}}\) by \({\text {supp}}^+\). We can conclude by the previous proposition. \(\square \)

The following result on collapsing functions (cf. [7, Lemma 4.6]) completes our list of closure properties. In particular, it yields \(\psi _\alpha 0\in {\mathcal {H}}_t(a)\) for all \(\alpha <\nu \).

Corollary 7.11

Given \(s\in {\mathcal {H}}_t(a)\) with \(s\le _{{\textbf{O}}}t\), we get \(\psi _\alpha s\in {\mathcal {H}}_t(a)\) for all \(\alpha <\nu \).

Proof

To obtain \(\psi _\alpha s\in {\mathcal {H}}_t(a)\), we need to establish \(\psi _\alpha s\in C_\beta (t')\) for arbitrary \(\beta <\nu \) and \(t'>t\) with \(a\subseteq C_\beta (t')\). The assumption \(s\in \mathcal H_t(a)\) ensures \(s\in C_\beta (t')\). Given that we have \(s\le t<t'\), Proposition 7.5 yields \(\psi _\alpha s\in C_\beta (t')\), as required. \(\square \)

The rest of this section concerns a notion of infinite proof that is heavily inspired by work of Buchholz [7]. As preparation, we introduce notation that relates to the parameters and the rank of formulas. In Sect. 5 and Definition 7.1, we have explained \({\text {supp}}^{\textbf{L}}_{\Gamma ({\textbf{X}})}(a)\in [\Gamma ({\textbf{X}})]^{<\omega }\) and \({\text {supp}}^+(s)\in [{\textbf{O}}]^{<\omega }\) for \(a\in \textbf{L}^u_{\Gamma ({\textbf{X}})}\) and \(s\in {\textbf{O}}\), respectively. The following definition overloads this notation by admitting arguments of different types. To interpret the notation correctly, one will need to infer the type of the argument from the context.

Definition 7.12

For an \({\textbf{L}}^u_{\Gamma ({\textbf{X}})}\)-formula \(\varphi \) and an \({\textbf{L}}^u_{\Gamma ({\textbf{X}})}\)-sequent \(\Gamma \), we put

$$\begin{aligned} {\text {supp}}^{{\textbf{L}}}_{\Gamma ({\textbf{X}})}(\varphi )&:=\textstyle \bigcup \{{\text {supp}}^{{\textbf{L}}}_{\Gamma ({\textbf{X}})}(a)\,|\,a\in {\textbf{L}}^u_{\Gamma ({\textbf{X}})}\text { is a parameter of }\varphi \},\\ {\text {supp}}^{{\textbf{L}}}_{\Gamma (\textbf{X})}(\Gamma )&:=\textstyle \bigcup _{i<n}{\text {supp}}^{\textbf{L}}_{\Gamma (\textbf{X})}(\varphi _i)\quad \text {for}\quad \Gamma =\varphi _0,\ldots ,\varphi _{n-1}. \end{aligned}$$

When \(\sigma \) is an element of \({\textbf{L}}^u_{\Gamma ({\textbf{X}})}\), an \({\textbf{L}}^u_{\Gamma ({\textbf{X}})}\)-formula or an \(\textbf{L}^u_{\Gamma ({\textbf{X}})}\)-sequent, we define

$$\begin{aligned} {\text {supp}}^+(\sigma ):=[\Gamma (I)]^{<\omega }\circ {\text {supp}}^{\textbf{L}}_{\Gamma ({\textbf{X}})}(\sigma )\in [{\textbf{O}}]^{<\omega }. \end{aligned}$$

For \(\alpha <\nu \), an \({\textbf{L}}^u_{\Gamma (\textbf{X})}\)-formula \(\varphi \) is called a \(\Sigma (\alpha )\)-formula if all universal quantifiers in \(\varphi \) are bounded and we have

$$\begin{aligned} {\text {supp}}^{{\textbf{L}}}_{\Gamma (\textbf{X})}(\varphi )\subseteq _{\Gamma ({\textbf{X}})}{\textbf{R}}(\alpha ). \end{aligned}$$

Let us also agree to abbreviate \(L[\alpha ]:=L^u_{\textbf{R}(\alpha )}\in {\textbf{L}}^u_{\Gamma ({\textbf{X}})}\) for \(\alpha <\nu \).

To motivate the new notation, we recall that Definition 5.3 involves relativized axioms \({\text {Ax}}_n^{L(i)}\) with \(L(i)=L[\alpha ]\) for \(\alpha =\nu _i\). We are particularly interested in the case of \(\Delta _0\)-collection, where \({\text {Ax}}_n^{L[\alpha ]}\) has instances of the form

$$\begin{aligned} \forall x\!\in \! a_0\exists y\!\in \! L[\alpha ]\,\theta (x,y,a_1,\ldots ,a_n)\!\rightarrow \!\exists w\!\in \! L[\alpha ]\forall x\!\in \!a_0\exists y\!\in \!w\,\theta (x,y,a_1,\ldots ,a_n). \end{aligned}$$

In the relevant cases, we will have \(a_i\in \textbf{L}^u_{\Gamma ({\textbf{X}})}\) and \({\text {supp}}^{{\textbf{L}}}_{\Gamma (\textbf{X})}(a_i)\subseteq _{\Gamma ({\textbf{X}})}{\textbf{R}}(\alpha )\). On an intuitive level, this means that the parameters come from the \({\textbf{R}}(\alpha )\)-th stage of the constructible hierarchy, i. e., from \(L[\alpha ]\). The given condition ensures that

$$\begin{aligned} \varphi :=\forall x\in a_0\exists y\,\theta (x,y,a_1,\ldots ,a_n) \end{aligned}$$

is a \(\Sigma (\alpha )\)-formula. Our instance of \(\Delta _0\)-collection can now be written as

$$\begin{aligned} \varphi ^{L[\alpha ]}\rightarrow \exists w\in L[\alpha ].\,\varphi ^w. \end{aligned}$$

For an arbitrary \(\Sigma (\alpha )\)-formula, this implication can be deduced from \(\Delta _0\)-collection in \(L[\alpha ]\), at least for the actual constructible hierarchy (see [4, Theorem I.4.3]). This fact will not be used in the following, but it does explain the role of \(\Sigma (\alpha )\)-formulas.

As a final ingredient for our infinite proofs, we assign formula ranks that will be used to control cut inferences. In order to explain the following definition, we recall that \(\textbf{L}^u_{\Gamma ({\textbf{X}})}\) is built over a set \(u\ni 0\) of urelements (fixed in Assumption 5.1). According to Sect. 5, our \({\textbf{L}}^u_{\Gamma (\textbf{X})}\)-formulas are closed (unless noted otherwise) and in negation normal form. The required ordinal arithmetic on \(\textbf{O}=\Gamma ({\textbf{K}})\) was discussed at the end of Sect. 4. Let us point out that \(\Gamma (I):\Gamma ({\textbf{X}})\rightarrow {\textbf{O}}\) commutes with basic ordinal arithmetic. It follows that all ranks lie in the range of \(\Gamma (I)\). For notational reasons, it will still be convenient to have ranks in \({\textbf{O}}\) rather than \(\Gamma ({\textbf{X}})\).

Definition 7.13

The function \({\text {rk}}:{\textbf{L}}^u_{\Gamma ({\textbf{X}})}\rightarrow {\textbf{O}}\) is given by

$$\begin{aligned} {\text {rk}}(w):=0\text { for }w\in u,\qquad {\text {rk}}(L^u_s):=\omega \cdot (1+\Gamma (I)(s)),\\ {\text {rk}}(\{x\in L^u_s\,|\,\varphi (x,a_1,\ldots ,a_n)\}):={\text {rk}}(L^u_s)+1. \end{aligned}$$

To each bounded \({\textbf{L}}^u_{\Gamma ({\textbf{X}})}\)-formula \(\varphi \), we assign a rank \({\text {rk}}(\varphi )\in {\textbf{O}}\) by setting

$$\begin{aligned} {\text {rk}}(a\in b)&:={\text {rk}}(\lnot \, a\in b):=\max \{{\text {rk}}(a)+6,{\text {rk}}(b)+1\},\\ {\text {rk}}(a=b)&:={\text {rk}}(\lnot \, a=b):=\max \{{\text {rk}}(a),{\text {rk}}(b),5\}+4,\\ {\text {rk}}(\varphi _0\vee \varphi _1)&:={\text {rk}}(\varphi _0\wedge \varphi _1):=\max \{{\text {rk}}(\varphi _0),{\text {rk}}(\varphi _1)\}+1,\\ {\text {rk}}(\exists x\in a.\,\varphi (x))&:={\text {rk}}(\forall x\in a.\,\varphi (x)):=\max \{{\text {rk}}(a),{\text {rk}}(\varphi (0))+2\}. \end{aligned}$$

Note that we get \({\text {rk}}(\varphi )={\text {rk}}(\lnot \varphi )\) for any bounded \({\textbf{L}}^u_{\Gamma ({\textbf{X}})}\)-formula \(\varphi \), because of our treatment of negation as a defined operation. Let us record a basic property:
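To make the recursion of Definition 7.13 concrete, the following toy model computes ranks for a drastically simplified term language. The simplifying assumptions (all ours, not the paper's): stages \(s\) are natural numbers with \(\Gamma (I)\) the identity, so all ranks live below \(\omega ^2\) and can be coded as pairs \((m,n)\sim \omega \cdot m+n\) compared lexicographically; terms and formulas are nested tuples.

```python
# Toy model of Definition 7.13: ordinals below ω² as pairs (m, n) ~ ω·m + n.

def plus(o, n):                           # ordinal plus a finite number
    return (o[0], o[1] + n)

def rk_term(t):
    kind = t[0]
    if kind == 'ur':                      # urelement: rank 0
        return (0, 0)
    if kind == 'L':                       # L^u_s has rank ω·(1+s)
        return (1 + t[1], 0)
    if kind == 'sep':                     # {x ∈ L^u_s | φ}: rank rk(L^u_s) + 1
        return plus(rk_term(('L', t[1])), 1)

def rk_formula(f):
    kind = f[0]
    if kind in ('in', 'notin'):           # a ∈ b and its negation
        return max(plus(rk_term(f[1]), 6), plus(rk_term(f[2]), 1))
    if kind in ('eq', 'neq'):             # a = b and its negation
        return plus(max(rk_term(f[1]), rk_term(f[2]), (0, 5)), 4)
    if kind in ('or', 'and'):
        return plus(max(rk_formula(f[1]), rk_formula(f[2])), 1)
    if kind in ('ex', 'all'):             # ∃x∈a.φ(x) / ∀x∈a.φ(x); f[2] codes φ(0)
        return max(rk_term(f[1]), plus(rk_formula(f[2]), 2))

zero, L2 = ('ur', 0), ('L', 2)
phi = ('ex', L2, ('in', zero, L2))        # ∃x ∈ L_2 . φ(x), instantiated at 0
neg_phi = ('all', L2, ('notin', zero, L2))
assert rk_formula(phi) == rk_formula(neg_phi) == (3, 3)   # rk(φ) = rk(¬φ)
```

Since negation in normal form swaps clauses that are assigned identical ranks, the equality \({\text {rk}}(\varphi )={\text {rk}}(\lnot \varphi )\) is visible directly in the code: dual connectives share a branch.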

Lemma 7.14

For all \(b\in {\textbf{L}}^u_{\Gamma ({\textbf{X}})}\) and \(t\in {\textbf{O}}\) we have \({\text {rk}}(b)\in {\mathcal {H}}_0({\text {supp}}^+(b))\) and

$$\begin{aligned} {\text {supp}}^+(b)\subseteq _{\textbf{O}}t\quad \Leftrightarrow \quad {\text {rk}}(b)<_{{\textbf{O}}}\omega \cdot (1+t). \end{aligned}$$

Both properties remain valid when we replace b by a bounded \({\textbf{L}}^u_{\Gamma ({\textbf{X}})}\)-formula \(\varphi \).

Proof

For \(b\in u\) it suffices to observe \({\text {rk}}(b)=0\) and \({\text {supp}}^+(b)=\emptyset \). In the remaining cases, the equivalence holds since we have \({\text {rk}}(b)=\omega \cdot (1+\Gamma (I)(s))+i\) for some \(i\le 1\), where s is the largest element of \({\text {supp}}^{\textbf{L}}_{\Gamma ({\textbf{X}})}(b)\). We also get

$$\begin{aligned} \Gamma (I)(s)\in {\text {supp}}^+(b)\subseteq {\mathcal {H}}_0({\text {supp}}^+(b)). \end{aligned}$$

In view of \(1=\varphi _00\), Lemma 4.16 yields

$$\begin{aligned} {\text {supp}}^\Gamma _{{\textbf{K}}}({\text {rk}}(b))={\text {supp}}^\Gamma _{\textbf{K}}(\omega \cdot (1+\Gamma (I)(s))+i)\subseteq {\text {supp}}^\Gamma _{\textbf{K}}(\Gamma (I)(s)). \end{aligned}$$

Thus \({\text {rk}}(b)\in {\mathcal {H}}_0({\text {supp}}^+(b))\) follows by Corollary 7.10. A straightforward induction over formulas shows that we can write \({\text {rk}}(\varphi )={\text {rk}}(b)+n\) with \(n\in {\mathbb {N}}\), where b is a parameter of \(\varphi \) or equal to \(0\in u\subseteq {\textbf{L}}^u_{\Gamma ({\textbf{X}})}\). In both cases we get

$$\begin{aligned} {\text {rk}}(\varphi )={\text {rk}}(b)+n\in {\mathcal {H}}_0({\text {supp}}^+(b))\subseteq \mathcal H_0({\text {supp}}^+(\varphi )) \end{aligned}$$

due to Corollary 7.10 and Lemma 7.8. By another induction over formulas, we see that \({\text {rk}}(b)\le {\text {rk}}(\varphi )\) holds for any parameter b of the formula \(\varphi \). Given that \(r<\omega \cdot s\) entails \(r+n<\omega \cdot s\), this ensures that the equivalence remains valid. \(\square \)

To justify the focus on bounded formulas, we recall that any \({\textbf{L}}^u_{\Gamma ({\textbf{X}})}\)-formula \(\varphi \) is associated with a disjunction \(\bigvee _{a\in \iota (\varphi )}\varphi _a\) or conjunction \(\bigwedge _{a\in \iota (\varphi )}\varphi _a\), as explained in Sect. 5. If \(\varphi \) is bounded, so is \(\varphi _a\) for every \(a\in \iota (\varphi )\), due to [11, Definition 3.12]. Thus all formulas in Definition 5.3 are bounded, and the same will hold for the formulas in our infinite proofs. We say that an \(\textbf{L}^u_{\Gamma ({\textbf{X}})}\)-sequent is bounded if it consists of bounded formulas only. The assignment of ranks is designed to validate the following, which is shown in the proof of [11, Theorem 3.14] (see also [7, Lemma 3]).

Lemma 7.15

Given any bounded \({\textbf{L}}^u_{\Gamma ({\textbf{X}})}\)-formula \(\varphi \), we have

$$\begin{aligned} {\text {rk}}(\varphi _a)<_{{\textbf{O}}}{\text {rk}}(\varphi )\quad \text {for all } a\in \iota (\varphi )=\iota _{\Gamma ({\textbf{X}})}(\varphi ). \end{aligned}$$

In the paragraph before Lemma 6.11, we have observed that \({\textbf{O}}\) is well founded, which justifies the following recursion. Intuitively, we have \((r,a)\vdash _s^t\Gamma \) if the sequent \(\Gamma \) has an infinite proof with height at most t, where \({\mathcal {H}}_r(a)\) and s control relevant parameters and cuts. The given definition is inspired by [7, Theorem 3.8].

Definition 7.16

By recursion on t, we declare that the relation

$$\begin{aligned} (r,a)\vdash ^t_s\Gamma \end{aligned}$$

between elements \(r,s,t\in {\textbf{O}}\), \(a\in [{\textbf{O}}]^{<\omega }\) and a bounded \({\textbf{L}}^u_{\Gamma ({\textbf{X}})}\)-sequent \(\Gamma \) holds precisely if we have

$$\begin{aligned} \{t\}\cup {\text {supp}}^+(\Gamma )\subseteq {\mathcal {H}}_r(a) \end{aligned}$$

and one of the following clauses applies:

  1. (i)

    for some conjunctive \(\varphi \simeq \bigwedge _{b\in \iota (\varphi )}\varphi _b\in \Gamma \) and every \(b\in \iota (\varphi )\subseteq {\textbf{L}}^u_{\Gamma ({\textbf{X}})}\), there is a \(t(b)<t\) such that we have \((r,a\cup {\text {supp}}^+(b))\vdash ^{t(b)}_s\Gamma ,\varphi _b\),

  2. (ii)

    for some disjunctive \(\varphi \simeq \bigvee _{b\in \iota (\varphi )}\varphi _b\in \Gamma \) and some \(b\in \iota (\varphi )\subseteq {\textbf{L}}^u_{\Gamma ({\textbf{X}})}\) such that we have \({\text {supp}}^+(b)\subseteq _{{\textbf{O}}} t\), there is a \(t(0)<t\) with \((r,a)\vdash ^{t(0)}_s\Gamma ,\varphi _b\),

  3. (iii)

    for some bounded \({\textbf{L}}^u_{\Gamma ({\textbf{X}})}\)-formula \(\psi \) with \({\text {rk}}(\psi )<s\), there is a \(t(0)<t\) such that we have \((r,a)\vdash ^{t(0)}_s\Gamma ,\psi \) and \((r,a)\vdash ^{t(0)}_s\Gamma ,\lnot \psi \),

  4. (iv)

    for some \(\alpha <\nu \) and some \(\Sigma (\alpha )\)-formula \(\varphi \) with \(\exists z\in L[\alpha ].\,\varphi ^z\in \Gamma \), there is an element \(t(0)<t\) with \((r,a)\vdash ^{t(0)}_s\Gamma ,\varphi ^{L[\alpha ]}\).

Sometimes one wants to apply the given clauses in a modified form, e. g., to derive \((r,a)\vdash _s^t\Gamma _0,\Gamma _1\) from \((r,a)\vdash _s^{t(0)}\Gamma _0,\psi \) and \((r,a)\vdash _s^{t(1)}\Gamma _1,\lnot \psi \) with \(t(0)\ne t(1)<t\). This is possible due to the following standard result (cf. [7, Lemma 3.9(a)]).

Lemma 7.17

(Weakening) Given \(r\le r',s\le s',t\le t'\) and \(a\subseteq \mathcal H_{r'}(a')\), we have

$$\begin{aligned} (r,a)\vdash ^t_s\Gamma \quad \text {and}\quad \{t'\}\cup {\text {supp}}^+(\Delta )\subseteq \mathcal H_{r'}(a')\qquad \Rightarrow \qquad (r',a')\vdash ^{t'}_{s'}\Delta ,\Gamma . \end{aligned}$$

Proof

One argues by induction on \(t\in {\textbf{O}}\) and distinguishes cases that correspond to the clauses from Definition 7.16. In each case, one uses the induction hypothesis and reapplies the same clause. This is possible because Lemma 7.8 yields

$$\begin{aligned} a\cup c\subseteq {\mathcal {H}}_r(a\cup c)\subseteq {\mathcal {H}}_{r'}(a'\cup c), \end{aligned}$$

where one takes \(c={\text {supp}}^+(b)\) for clause (i) and \(c=\emptyset \) in the other cases. \(\square \)

We always refer to the lemma as ‘weakening’, even when \(a'\) is a proper subset of a, where we get an apparent strengthening. In the following result, the bound \(\omega \cdot {\text {rk}}(\varphi )\) could be improved to \(2\cdot {\text {rk}}(\varphi )\). We keep the suboptimal bound because only \(t\mapsto \omega \cdot t\) has been defined in the present paper.

Lemma 7.18

For any bounded \({\textbf{L}}^u_{\Gamma ({\textbf{X}})}\)-formula \(\varphi \) and any \(a\in {\textbf{L}}^u_{\Gamma ({\textbf{X}})}\) we have

$$\begin{aligned} (0,{\text {supp}}^+(\varphi ))\vdash ^{\omega \cdot {\text {rk}}(\varphi )}_0\varphi ,\lnot \varphi \qquad \text {and}\qquad (0,{\text {supp}}^+(a))\vdash ^{\omega \cdot {\text {rk}}(a)+2}_0 a=a. \end{aligned}$$

Proof

To establish the first claim, we argue by induction on \({\text {rk}}(\varphi )\). First observe that \(\mathcal H_0({\text {supp}}^+(\varphi ))\) contains \({\text {rk}}(\varphi )\) and hence also \(\omega \cdot {\text {rk}}(\varphi )\), due to Lemma 7.14 and its proof. As disjunction and conjunction are dual (see [11, Definition 3.12]), we may assume \(\varphi \simeq \bigvee _{b\in \iota (\varphi )}\varphi _b\) to get \(\lnot \varphi \simeq \bigwedge _{b\in \iota (\varphi )}\lnot \varphi _b\), or in other words \(\iota (\lnot \varphi )=\iota (\varphi )\) and \(\lnot (\varphi _b)=(\lnot \varphi )_b\). In view of Lemma 7.15, we use the induction hypothesis to get

$$\begin{aligned} (0,{\text {supp}}^+(\varphi _b))\vdash ^{\omega \cdot {\text {rk}}(\varphi _b)}_0\varphi _b,\lnot \varphi _b\qquad \text {for each }b\in \iota (\varphi ). \end{aligned}$$

To prepare an application of weakening, we observe that [11, Definition 3.12] yields

$$\begin{aligned} {\text {supp}}^{{\textbf{L}}}_{\Gamma (\textbf{X})}(\varphi _b)\subseteq {\text {supp}}^{{\textbf{L}}}_{\Gamma (\textbf{X})}(\varphi )\cup {\text {supp}}^{{\textbf{L}}}_{\Gamma ({\textbf{X}})}(b). \end{aligned}$$

This inclusion remains valid when we apply \([\Gamma (I)]^{<\omega }\) to both sides, i. e., when we replace \({\text {supp}}^{\textbf{L}}_{\Gamma ({\textbf{X}})}\) by \({\text {supp}}^+\). For each \(b\in \iota (\varphi )\), we can use Lemma 7.14 to derive

$$\begin{aligned} t(b):=\max \{{\text {rk}}(\varphi _b),{\text {rk}}(b)\}\in \mathcal H_0({\text {supp}}^+(\varphi )\cup {\text {supp}}^+(b)). \end{aligned}$$

As announced, we now apply weakening to get

$$\begin{aligned} (0,{\text {supp}}^+(\varphi )\cup {\text {supp}}^+(b))\vdash _0^{\omega \cdot t(b)}\varphi ,\lnot \varphi ,\varphi _b,\lnot \varphi _b. \end{aligned}$$

The choice of t(b) and Lemma 7.14 ensure \({\text {supp}}^+(b)\subseteq _{{\textbf{O}}}\omega \cdot t(b)+1\), as required in clause (ii) of Definition 7.16. By the latter, we thus obtain

$$\begin{aligned} (0,{\text {supp}}^+(\varphi )\cup {\text {supp}}^+(b))\vdash _0^{\omega \cdot t(b)+1}\varphi ,\lnot \varphi ,\lnot \varphi _b\qquad \text {for each }b\in \iota (\varphi )=\iota (\lnot \varphi ). \end{aligned}$$

Based on [11, Definition 3.12] and Lemma 7.14, it is not hard to check that \(b\in \iota (\varphi )\) entails \({\text {rk}}(b)<{\text {rk}}(\varphi )\), so that we get \(t(b)<{\text {rk}}(\varphi )\) by Lemma 7.15. We can thus apply clause (i) of Definition 7.16, in order to complete the proof of the first claim from the lemma. To derive the second claim, we show

$$\begin{aligned} (0,{\text {supp}}^+(a))\vdash _0^{\omega \cdot {\text {rk}}(a)+1}\forall x\in a.\,x\in a \end{aligned}$$

by induction on \({\text {rk}}(a)\). Let us consider a term of the form \(a=\{x\in L^u_s\,|\,\theta (x,{\textbf{d}})\}\). For \(a\in u\) and \(a=L^u_s\) the argument is easier (but note that \(a\in u\) leads to the bound \(\omega \cdot {\text {rk}}(a)+1\) rather than \(\omega \cdot {\text {rk}}(a)\)). By [11, Definition 3.12] we have

$$\begin{aligned} \forall x\in a.\,x\in a\simeq \textstyle \bigwedge _{b\in \iota }\lnot \theta (b,{\textbf{d}})\vee b\in a\quad \text {and}\quad b\in a\simeq \textstyle \bigvee _{c\in \iota }\theta (c,{\textbf{d}})\wedge c=b\\ \text {with }\iota =\{b\in {\textbf{L}}^u_{\Gamma (\textbf{X})}\,|\,{\text {supp}}^{{\textbf{L}}}_{\Gamma (\textbf{X})}(b)\subseteq _{\Gamma ({\textbf{X}})}s\}. \end{aligned}$$

In the clause for \(b\in a\), we will take c to be the same term as b. To derive \(b=b\), we recall the general clause

$$\begin{aligned} (b_0=b_1)\simeq \textstyle \bigwedge _{i\in \{0,1\}}\forall x\in b_i.\,x\in b_{1-i}. \end{aligned}$$

When \(b_0\) and \(b_1\) are the same term b, then the two conjuncts coincide, but we still need a step to introduce the conjunction. So the induction hypothesis and clause (i) of Definition 7.16 yield

$$\begin{aligned} (0,{\text {supp}}^+(b))\vdash _0^{\omega \cdot {\text {rk}}(b)+2}b=b. \end{aligned}$$

This shows the second claim of the lemma, once the present induction is completed. We have \({\text {supp}}^+(\theta (b,\textbf{d}))\subseteq {\text {supp}}^+(a)\cup {\text {supp}}^+(b)\), and Lemma 7.14 provides

$$\begin{aligned} s(b)<{\text {rk}}(a)\quad \text {for}\quad s(b):=\max \{{\text {rk}}(b)+1,{\text {rk}}(\theta (b,{\textbf{d}}))\}. \end{aligned}$$

Using the first part of the present lemma, we can thus derive

$$\begin{aligned} (0,{\text {supp}}^+(a)\cup {\text {supp}}^+(b))\vdash ^{\omega \cdot s(b)+1}_0\lnot \theta (b,{\textbf{d}}),\theta (b,{\textbf{d}})\wedge b=b. \end{aligned}$$

We now use clause (ii) of Definition 7.16 three times, once to get \(b\in a\) and twice to combine the disjuncts, so that we obtain

$$\begin{aligned} (0,{\text {supp}}^+(a)\cup {\text {supp}}^+(b))\vdash ^{\omega \cdot s(b)+4}_0\lnot \theta (b,{\textbf{d}})\vee b\in a. \end{aligned}$$

To complete the induction step, one applies clause (i) of the same definition. \(\square \)
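The induction in the first claim of Lemma 7.18 has a familiar propositional shadow: for any formula in negation normal form, one builds a cut-free derivation of \(\varphi ,\lnot \varphi \) by recursion on the formula, pairing each disjunct with its dual conjunct. The sketch below (our simplification: plain propositional formulas, rank replaced by formula height, and only the proof height is tracked, not the supports and controls of Definition 7.16) mirrors the use of clauses (ii) and then (i):

```python
def neg(f):
    """Negation as a defined operation on negation normal form."""
    if f[0] == 'lit':
        return ('lit', f[1], not f[2])
    return ({'and': 'or', 'or': 'and'}[f[0]], neg(f[1]), neg(f[2]))

def derive_excluded_middle(f):
    """Height of a cut-free derivation of the sequent  f, ¬f."""
    if f[0] == 'lit':
        return 0                          # axiom: a literal beside its dual
    h = max(derive_excluded_middle(f[1]), derive_excluded_middle(f[2]))
    # one step for the disjunct (clause (ii)), one for the conjunction (clause (i))
    return h + 2

p, q = ('lit', 'p', True), ('lit', 'q', True)
f = ('or', ('and', p, q), neg(q))
assert derive_excluded_middle(f) == derive_excluded_middle(neg(f)) == 4
```

In the paper the height bound is \(\omega \cdot {\text {rk}}(\varphi )\) rather than \(2\cdot {\text {rk}}(\varphi )\), solely because only \(t\mapsto \omega \cdot t\) is available; the recursion pattern is the same.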

In the rest of this section, we show how the search tree \(S^{\textbf{R}}_{\Gamma ({\textbf{X}})}\) from Definition 5.3 can be transformed into an infinite proof. We begin with the crucial axioms.

Proposition 7.19

For each of the \(\Delta _0\)-collection axioms \({\text {Ax}}_{1+n}\) from Definition 5.2 and any \(\alpha <\nu \), we have

$$\begin{aligned} (0,\emptyset )\vdash _0^t{\text {Ax}}_{1+n}^{L[\alpha ]}\quad \text {with}\quad t:=\psi _{\alpha +1}0+\omega \cdot 3. \end{aligned}$$

Proof

Corollaries 7.10 and 7.11 provide \(\psi _{\alpha +1}0+\omega \cdot m+n\in {\mathcal {H}}_0(\emptyset )\) for \(m,n\in {\mathbb {N}}\). We recall \(L[\alpha ]=L^u_{{\textbf{R}}(\alpha )}\) and \(\psi _{\alpha +1}0=\Gamma (I)\circ {\textbf{R}}(\alpha )\) as well as \({\text {supp}}^{{\textbf{L}}}_{\Gamma ({\textbf{X}})}(L^u_s)=\{s\}\). The initial condition from Definition 7.16 can now be derived as

$$\begin{aligned} {\text {supp}}^+({\text {Ax}}_{1+n}^{L[\alpha ]})=[\Gamma (I)]^{<\omega }\circ {\text {supp}}^{\textbf{L}}_{\Gamma ({\textbf{X}})}(L^u_{\textbf{R}(\alpha )})=\{\psi _{\alpha +1}0\}\subseteq {\mathcal {H}}_0(\emptyset ). \end{aligned}$$

As in the paragraph that follows Definition 7.12, we write collection in the form

$$\begin{aligned} {\text {Ax}}_{1+n}=\forall {\textbf{z}}\forall v\,(\psi \rightarrow \exists w\,\psi ^w)\quad \text {with}\quad \psi (v,{\textbf{z}}):=\forall x\in v\exists y\,\theta (x,y,{\textbf{z}}), \end{aligned}$$

for a \(\Delta _0\)-formula \(\theta \) and variables \(\textbf{z}=z_1,\ldots ,z_k\). Note that we get

$$\begin{aligned} {\text {Ax}}_{1+n}^{L[\alpha ]}=\forall z_1\in L[\alpha ]\ldots \forall z_k\in L[\alpha ]\forall v\in L[\alpha ]\,(\psi ^{L[\alpha ]}\rightarrow \exists w\in L[\alpha ].\,\psi ^w). \end{aligned}$$

Let us now recall that [11, Definition 3.12] yields

$$\begin{aligned} \forall y\in L[\alpha ].\,\varphi (y)\,\simeq \,\textstyle \bigwedge _{a\in \iota }\varphi (a)\quad \text {for}\quad \iota :=\{a\in \textbf{L}^u_{\Gamma ({\textbf{X}})}\,|\,{\text {supp}}^{{\textbf{L}}}_{\Gamma (\textbf{X})}(a)\subseteq _{\Gamma ({\textbf{X}})}{\textbf{R}}(\alpha )\}. \end{aligned}$$

To conclude by \(k+1\) applications of clause (i) from Definition 7.16, we shall thus show the following: For \(s:=\psi _{\alpha +1}0+\omega \cdot 2+3\) and arbitrary \(a_0,\ldots ,a_k\in \iota \), we have

$$\begin{aligned} (0,{\text {supp}}^+(\varphi ))\vdash ^s_0 \varphi ^{L[\alpha ]}\rightarrow \exists w\in L[\alpha ].\,\varphi ^w\quad \text {with}\quad \varphi :=\psi (a_0,\ldots ,a_k). \end{aligned}$$

In the proof of Proposition 7.5 we have observed that \(\psi _{\alpha +1}0\) is strongly critical. This justifies the last step in the computation

$$\begin{aligned} {\text {rk}}(L[\alpha ])=\omega \cdot (1+\Gamma (I)\circ \textbf{R}(\alpha ))=\omega \cdot (1+\psi _{\alpha +1}0)=\psi _{\alpha +1}0. \end{aligned}$$

By Definition 7.13 in conjunction with Lemma 7.14, we get \({\text {rk}}(\varphi ^{L[\alpha ]})=\psi _{\alpha +1}0+2\). We can thus use Lemma 7.18 to obtain

$$\begin{aligned} (0,{\text {supp}}^+(\varphi ^{L[\alpha ]}))\vdash _0^r\lnot \varphi ^{L[\alpha ]},\varphi ^{L[\alpha ]}\quad \text {with}\quad r:=\omega \cdot (\psi _{\alpha +1}0+2)=\psi _{\alpha +1}0+\omega \cdot 2. \end{aligned}$$

Weakening allows us to replace \({\text {supp}}^+(\varphi ^{L[\alpha ]})\) by \({\text {supp}}^+(\varphi )\), as we have

$$\begin{aligned} {\text {supp}}^+(\varphi ^{L[\alpha ]})\subseteq {\text {supp}}^+(\varphi )\cup {\text {supp}}^+(L[\alpha ])\subseteq \mathcal H_0({\text {supp}}^+(\varphi )). \end{aligned}$$

Now \(\varphi \) is a \(\Sigma (\alpha )\)-formula, due to \(a_i\in \iota \). Thus clause (iv) of Definition 7.16 yields

$$\begin{aligned} (0,{\text {supp}}^+(\varphi ))\vdash _0^{r+1}\lnot \varphi ^{L[\alpha ]},\exists w\in L[\alpha ].\,\varphi ^w. \end{aligned}$$

From Sect. 5 we recall that \(\varphi ^{L[\alpha ]}\rightarrow \exists w\in L[\alpha ].\,\varphi ^w\) and \(\lnot \varphi ^{L[\alpha ]}\vee \exists w\in L[\alpha ].\,\varphi ^w\) denote the same formula in negation normal form. We can thus conclude by two applications of clause (ii) from Definition 7.16. \(\square \)

On an intuitive level, the following holds because the stage \({\textbf{R}}(\alpha )\) of \(L[\alpha ]\) is a limit (in fact \(\Gamma (I)\circ {\textbf{R}}(\alpha )=\psi _{\alpha +1}0\) is strongly critical).

Proposition 7.20

Consider the axiom \({\text {Ax}}_0=\forall x\exists y.\, x\in y\) from Definition 5.2. For any \(\alpha <\nu \) we have \((0,\emptyset )\vdash ^t_0{\text {Ax}}_0^{L[\alpha ]}\) with \(t:=\psi _{\alpha +1}0\).

Proof

First note that we have

$$\begin{aligned} \{t\}\cup {\text {supp}}^+({\text {Ax}}_0^{L[\alpha ]})=\{\psi _{\alpha +1}0\}\subseteq \mathcal H_0(\emptyset ), \end{aligned}$$

as in the previous proof. To conclude by clauses (i) and (ii) of Definition 7.16, we write \(\iota =\{a\in {\textbf{L}}^u_{\Gamma ({\textbf{X}})}\,|\,{\text {supp}}^{\textbf{L}}_{\Gamma ({\textbf{X}})}(a)\subseteq _{\Gamma ({\textbf{X}})}\textbf{R}(\alpha )\}\) and observe

$$\begin{aligned} {\text {Ax}}_0^{L[\alpha ]}=\forall x\in L[\alpha ]\exists y\in L[\alpha ].\,x\in y\,&\simeq \,\textstyle \bigwedge _{a\in \iota }\exists y\in L[\alpha ].\,a\in y,\\ \exists y\in L[\alpha ].\,a\in y\,&\simeq \,\textstyle \bigvee _{b\in \iota } a\in b. \end{aligned}$$

Given an arbitrary \(a\in \iota \), we must thus derive \(a\in b\) for a suitable \(b\in \iota \). Let us set

$$\begin{aligned} b:=L^u_r\quad \text {with}\quad r:={\left\{ \begin{array}{ll} 0 &{} \text {if }a\in u,\\ s+1 &{} \text {if }a=L^u_s\text { or }a=\{x\in L^u_s\,|\,\theta (x,{\textbf{c}})\}. \end{array}\right. } \end{aligned}$$

In the more interesting second case, we note that \(a\in \iota \) and \(s\in {\text {supp}}^{{\textbf{L}}}_{\Gamma ({\textbf{X}})}(a)\) entail

$$\begin{aligned} s<_{\Gamma ({\textbf{X}})}{\textbf{R}}(\alpha )\in {\text {rng}}(\gamma _{\textbf{X}})=\{\Gamma _x\,|\,x\in {\textbf{X}}\}. \end{aligned}$$

We can infer \(s+1<{\textbf{R}}(\alpha )\) by Lemma 4.15 (recall \(1={\overline{\varphi }}_00\)). Let us rewrite this as

$$\begin{aligned} {\text {supp}}^{{\textbf{L}}}_{\Gamma ({\textbf{X}})}(b)\subseteq _{\Gamma (\textbf{X})}{\textbf{R}}(\alpha ), \end{aligned}$$

which also holds when we have \(a\in u\) and hence \(r=0\). As in the previous proof, we use Lemma 7.14 to conclude that \(\omega \cdot {\text {rk}}(b)+n<\psi _{\alpha +1}0\) holds for all \(n\in {\mathbb {N}}\). Given that Definition 4.7 yields \(\Gamma (I)(0)=0\) and \(\Gamma (I)(s+1)=\Gamma (I)(s)+1\), we can employ Corollary 7.10 to get \({\text {supp}}^+(b)=\{\Gamma (I)(r)\}\subseteq {\mathcal {H}}_0({\text {supp}}^+(a))\) and hence

$$\begin{aligned} {\text {rk}}(b)\in {\mathcal {H}}_0({\text {supp}}^+(b))\subseteq {\mathcal {H}}_0({\text {supp}}^+(a)). \end{aligned}$$

Let us now recall that [11, Definition 3.12] yields

$$\begin{aligned} a\in b\,\simeq \,\textstyle \bigvee _{c\in \kappa }c=a\quad \text {with}\quad \kappa =\{c\in \textbf{L}^u_{\Gamma ({\textbf{X}})}\,|\,{\text {supp}}^{{\textbf{L}}}_{\Gamma (\textbf{X})}(c)\subseteq _{\Gamma ({\textbf{X}})} r\}. \end{aligned}$$

As Lemma 7.18 provides a derivation of \(a=a\), we take c to be the term a. Note that the choice of r ensures \(a\in \kappa \) and \({\text {supp}}^+(a)\subseteq _{\textbf{O}}\Gamma (I)(r)\le \omega \cdot {\text {rk}}(b)\). We may thus apply clause (ii) of Definition 7.16, to get

$$\begin{aligned} (0,{\text {supp}}^+(a))\vdash ^{\omega \cdot {\text {rk}}(b)}_0 a\in b. \end{aligned}$$

In view of \(b\in \iota \) and \({\text {supp}}^+(b)\subseteq _{\textbf{O}}\omega \cdot {\text {rk}}(b)+1\), the same clause now yields

$$\begin{aligned} (0,{\text {supp}}^+(a))\vdash ^{\omega \cdot {\text {rk}}(b)+1}_0 \exists y\in L[\alpha ].\,a\in y. \end{aligned}$$

Since \(a\in \iota \) was arbitrary and we always have \(\omega \cdot {\text {rk}}(b)+1<\omega \cdot \psi _{\alpha +1}0=\psi _{\alpha +1}0\), we can conclude by clause (i) of Definition 7.16. \(\square \)

To conclude this section, we show that the search tree \(S^{\textbf{R}}_{\Gamma ({\textbf{X}})}\) from Definition 5.3 can be converted into an infinite proof. We are particularly interested in the root node \(\langle \rangle \in S^{{\textbf{R}}}_{\Gamma ({\textbf{X}})}\), which gives rise to elements

$$\begin{aligned} {\textbf{X}}+\langle \rangle \in {\textbf{X}}+S^{{\textbf{R}}}_{\Gamma (\textbf{X})}={\textbf{K}}\qquad \text {and}\qquad \Gamma _{\textbf{X}+\langle \rangle }\in \Gamma ({\textbf{K}})={\textbf{O}}. \end{aligned}$$

The label \(l_{\Gamma ({\textbf{X}})}(\langle \rangle )\) at the root is the empty sequent, which we denote by \(\langle \rangle \) as well. Let us recall that the empty sequent stands for the empty disjunction, which is false and should thus not be provable. To reconcile this remark with the following result, we recall that the latter is part of an argument by contradiction (see the paragraph before Assumption 6.4).

Theorem 7.21

(Embedding) We have \((0,\emptyset )\vdash ^t_t\langle \rangle \) for \(t=\Gamma _{{\textbf{X}}+\langle \rangle }\in {\textbf{O}}\).

Proof

For \(\sigma =\langle \sigma _0,\ldots ,\sigma _{n-1}\rangle \in S^{\textbf{R}}_{\Gamma ({\textbf{X}})}\subseteq ({\textbf{L}}^u_{\Gamma (\textbf{X})})^{<\omega }\) we extend Definition 7.12 by

$$\begin{aligned} {\text {supp}}^+(\sigma ):=[\Gamma (I)]^{<\omega }\circ {\text {supp}}^S_{\Gamma (\textbf{X})}(\sigma )=\textstyle \bigcup _{i<n}{\text {supp}}^+(\sigma _i)\in [\textbf{O}]^{<\omega }, \end{aligned}$$

where the second equality uses \({\text {supp}}^S_{\Gamma (\textbf{X})}(\sigma )=\bigcup _{i<n}{\text {supp}}^{{\textbf{L}}}_{\Gamma (\textbf{X})}(\sigma _i)\) from Sect. 5. Let us write \(l(\sigma )=l_{\Gamma ({\textbf{X}})}(\sigma )\) for the sequent label from Definition 5.3. We will show

$$\begin{aligned} (0,{\text {supp}}^+(\sigma ))\vdash ^s_s l(\sigma )\quad \text {with}\quad s:=\Gamma _{{\textbf{X}}+\sigma } \end{aligned}$$

by induction on \(\sigma \in S^{{\textbf{R}}}_{\Gamma ({\textbf{X}})}\) in the Kleene-Brouwer order, which is well founded due to the embedding \(\sigma \mapsto \Gamma _{{\textbf{X}}+\sigma }\) into the well order \({\textbf{O}}\). Note that the theorem is the case of the root \(\sigma =\langle \rangle \). Considering Definition 7.16, we first show

$$\begin{aligned} \{\Gamma _{{\textbf{X}}+\sigma }\}\cup {\text {supp}}^+(l(\sigma ))\subseteq \mathcal H_0({\text {supp}}^+(\sigma )). \end{aligned}$$

In view of \(\Gamma (I)\circ \textbf{R}(\alpha )=\psi _{\alpha +1}0\in {\mathcal {H}}_0(\emptyset )\), the claim about \({\text {supp}}^+(l(\sigma ))\) reduces to Corollary 5.7. To conclude via Proposition 7.9, we assume \(r\in {\text {supp}}^+(\Gamma _{{\textbf{X}}+\sigma })\) and derive \(r\in \mathcal H_0({\text {supp}}^+(\sigma ))\). Definitions 6.14 and 7.1 yield \(r=\gamma _{{\textbf{K}}}\circ I(x)\) for some

$$\begin{aligned} x\in {\text {supp}}^{{\textbf{O}}}(\Gamma _{{\textbf{X}}+\sigma })&=\bigcup \{{\text {supp}}^{{\textbf{K}}}(\rho )\,|\,\rho \in {\text {supp}}^{\Gamma }_{{\textbf{K}}}(\Gamma _{{\textbf{X}}+\sigma })\}\\ {}&={\text {supp}}^{{\textbf{K}}}(\textbf{X}+\sigma )=\bigcup \{{\text {supp}}^\Gamma _{\textbf{X}}(\rho )\,|\,\rho \in {\text {supp}}^S_{\Gamma (\textbf{X})}(\sigma )\backslash {\text {rng}}({\textbf{R}})\}. \end{aligned}$$

We thus get \(x\in {\text {supp}}^\Gamma _{{\textbf{X}}}(\rho )\) with \(\rho \in {\text {supp}}^S_{\Gamma ({\textbf{X}})}(\sigma )\) and hence \(\Gamma (I)(\rho )\in {\text {supp}}^+(\sigma )\). By Lemma 6.19 and the other direction of Proposition 7.9, we obtain

$$\begin{aligned} r\in [\gamma _{{\textbf{K}}}\circ I]^{<\omega }\circ {\text {supp}}^\Gamma _{{\textbf{X}}}(\rho )= & {} [\gamma _{{\textbf{K}}}\circ I]^{<\omega }\circ {\text {supp}}^{{\textbf{O}}}(\Gamma (I)(\rho ))\\= & {} {\text {supp}}^+(\Gamma (I)(\rho ))\subseteq {\mathcal {H}}_0({\text {supp}}^+(\sigma )). \end{aligned}$$

In our induction along the Kleene-Brouwer order, we distinguish cases according to Definition 5.3. Let us first assume that \(\sigma \) has even length 2k, where k codes a pair \(\langle n,i\rangle \). For \(\alpha =\nu _i\), the cited definition provides \(\sigma ^\frown L[\alpha ]\in S^{{\textbf{R}}}_{\Gamma ({\textbf{X}})}\), and the induction hypothesis yields

$$\begin{aligned} (0,{\text {supp}}^+(\sigma )\cup {\text {supp}}^+(L[\alpha ]))\vdash ^r_r l(\sigma ),\lnot {\text {Ax}}_n^{L[\alpha ]}\quad \text {with}\quad r=\Gamma _{{\textbf{X}}+\sigma ^\frown L[\alpha ]}<\Gamma _{\textbf{X}+\sigma }. \end{aligned}$$

Here we can omit \({\text {supp}}^+(L[\alpha ])\subseteq \mathcal H_0(\emptyset )\) by ‘weakening’. From Lemma 6.20 we get

$$\begin{aligned} \psi _{\alpha +1}0=\Gamma (I)\circ {\textbf{R}}(\alpha )<_{\textbf{O}}\Gamma _{{\textbf{X}}+\sigma ^\frown L[\alpha ]}=r. \end{aligned}$$

Due to Propositions 7.19 (for \(n>0\)) and 7.20 (for \(n=0\)), we thus have

$$\begin{aligned} (0,{\text {supp}}^+(\sigma ))\vdash ^r_r l(\sigma ),{\text {Ax}}_n^{L[\alpha ]}. \end{aligned}$$

As \(\psi _{\alpha +1}0<s=\Gamma _{{\textbf{X}}+\sigma }\) entails \({\text {rk}}({\text {Ax}}_n^{L[\alpha ]})<s\), we can complete the induction step by clause (iii) of Definition 7.16 (‘cut rule’). The other cases from Definition 5.3 correspond directly to clauses (i) and (ii). Concerning the disjunctive case, we note that \({\text {supp}}^+(b)\subseteq {\text {rng}}(\Gamma (I))\) entails \({\text {supp}}^+(b)\subseteq _{{\textbf{O}}}\Gamma _{{\textbf{X}}+\sigma }\), again by Lemma 6.20. \(\square \)

8 An abstract ordinal analysis

In this section, we show cut elimination and collapsing results that entail the consistency of our infinitary proof system. On the one hand, these results resemble the known ordinal analysis of iterated admissibility [7, 29, 40, 44]. On the other hand, our setting here is more abstract, since we work relative to the given dilator \(\Gamma \circ {\textbf{S}}\) from Assumption 6.4 (recall that \({\textbf{S}}\) arises from the search trees of Definition 5.3). Once consistency is available, it will be straightforward to deduce the main result of our paper, as we shall see in the next section. We begin with a standard ingredient for cut elimination (cf. [7, Lemma 3.13]):

Lemma 8.1

(Inversion) If \(\varphi \simeq \bigwedge _{b\in \iota (\varphi )}\varphi _b\) is conjunctive, then we have

$$\begin{aligned} (r,a)\vdash _s^t\Gamma ,\varphi \qquad \Rightarrow \qquad (r,a\cup {\text {supp}}^+(b))\vdash _s^t\Gamma ,\varphi _b\quad \text {for any}\quad b\in \iota (\varphi ). \end{aligned}$$

Proof

Due to the initial condition from Definition 7.16, the premise of the desired implication entails \({\text {supp}}^+(\varphi )\subseteq {\mathcal {H}}_r(a)\). As in the proof of Lemma 7.18 we get

$$\begin{aligned} {\text {supp}}^+(\varphi _b)\subseteq {\text {supp}}^+(\varphi )\cup {\text {supp}}^+(b)\subseteq \mathcal H_r(a\cup {\text {supp}}^+(b)), \end{aligned}$$

which ensures that the same initial condition holds for the conclusion. We now argue by induction on \(t\in {\textbf{O}}\). In the crucial case, clause (i) of Definition 7.16 was applied to the distinguished formula \(\varphi \), so that we have

$$\begin{aligned} (r,a\cup {\text {supp}}^+(b))\vdash _s^{t(b)}\Gamma ,\varphi ,\varphi _b \end{aligned}$$

for some \(t(b)<t\). Here we can omit \(\varphi \) due to the induction hypothesis. Weakening (Lemma 7.17) allows us to increase t(b) to t, which yields the desired conclusion. In all other cases, one uses the induction hypothesis and reapplies the same clause. The latter is possible because clauses (ii) and (iv) concern formulas that are disjunctive and hence different from \(\varphi \). \(\square \)
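As a concrete instance of the lemma (it is verified in the proof of Lemma 8.5 below that [11, Definition 3.12] makes \(\varphi =\forall x\in L[\alpha ].\,\theta \) conjunctive with \(\varphi _b=\theta (b)\)), inversion specializes to bounded universal quantifiers as follows:

$$\begin{aligned} (r,a)\vdash ^t_s\Gamma ,\forall x\in L[\alpha ].\,\theta \qquad \Rightarrow \qquad (r,a\cup {\text {supp}}^+(b))\vdash ^t_s\Gamma ,\theta (b)\quad \text {for any}\quad b\in \iota (\varphi ). \end{aligned}$$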

The following (cf. [7, Lemma 3.14]) shows how certain applications of the cut rule can be avoided. Note that the result is no immediate consequence of clause (iii) from Definition 7.16, since the latter would require \({\text {rk}}(\psi )={\text {rk}}(\lnot \psi )<{\text {rk}}(\psi )\).

Lemma 8.2

(Reduction) For disjunctive \(\psi \) with \({\text {rk}}(\psi )\notin \{\psi _{\alpha +1}0\,|\,\alpha <\nu \}\) we have

$$\begin{aligned} (r,a)\vdash _{{\text {rk}}(\psi )}^{t(0)}\Gamma ,\lnot \psi \quad \text {and}\quad (r,a)\vdash _{{\text {rk}}(\psi )}^{t(1)}\Gamma ,\psi \qquad \Rightarrow \qquad (r,a)\vdash _{{\text {rk}}(\psi )}^{t(0)+t(1)}\Gamma . \end{aligned}$$

Proof

The premise of the desired implication entails \(t(i)\in \mathcal H_r(a)\) for \(i\in \{0,1\}\), as in the previous proof. Thus \(t(0)+t(1)\in {\mathcal {H}}_r(a)\) holds by Corollary 7.10 in conjunction with Lemma 4.16. We now argue by induction on t(1) and distinguish cases according to the clause of Definition 7.16 that was used to derive \(\Gamma ,\psi \). In the crucial case, the formula \(\psi \simeq \bigvee _{b\in \iota (\psi )}\psi _b\) itself was derived by clause (ii), which means that we have

$$\begin{aligned} (r,a)\vdash _{{\text {rk}}(\psi )}^{s}\Gamma ,\psi ,\psi _b\quad \text {for some }b\in \iota (\psi )\text { and }s<_{{\textbf{O}}}t(1). \end{aligned}$$

In particular, this means that we have \({\text {supp}}^+(\psi _b)\subseteq {\mathcal {H}}_r(a)\), by the initial condition from Definition 7.16. We may also assume \({\text {supp}}^+(b)\subseteq {\text {supp}}^+(\psi _b)\). Indeed, this is immediate if b occurs in \(\psi _b\). If it does not, then we have \(\psi _b=\psi _i\) for some index \(i\in \{0,1\}\subseteq \iota (\psi )\), as a glance at [11, Definition 3.12] reveals. In this case we may thus redefine \(b:=i\in u\subseteq \textbf{L}^u_{\Gamma ({\textbf{X}})}\) to get \({\text {supp}}^+(b)=\emptyset \). Let us now apply weakening to the given derivation of \(\Gamma ,\lnot \psi \), so that we obtain

$$\begin{aligned} (r,a)\vdash _{{\text {rk}}(\psi )}^{t(0)}\Gamma ,\lnot \psi ,\psi _b. \end{aligned}$$

By the induction hypothesis, we can then infer

$$\begin{aligned} (r,a)\vdash _{{\text {rk}}(\psi )}^{t(0)+s}\Gamma ,\psi _b. \end{aligned}$$

From [11, Definition 3.12] we know that \(\lnot \psi \) is conjunctive with \((\lnot \psi )_b=\lnot (\psi _b)\) for all \(b\in \iota (\lnot \psi )=\iota (\psi )\). We may thus apply inversion (Lemma 8.1) to the given derivation of \(\Gamma ,\lnot \psi \), in order to get

$$\begin{aligned} (r,a\cup {\text {supp}}^+(b))\vdash ^{t(0)}_{{\text {rk}}(\psi )}\Gamma ,\lnot \psi _b. \end{aligned}$$

For b as above, we may omit \({\text {supp}}^+(b)\subseteq {\mathcal {H}}_r(a)\) by weakening. As Lemma 7.15 ensures \({\text {rk}}(\psi _b)<{\text {rk}}(\psi )\), we can conclude by clause (iii) of Definition 7.16. In all other cases, one uses the induction hypothesis and reapplies the same clause. Here it is crucial to observe that clause (iv) cannot be applied with \(\psi =(\exists z\in L[\alpha ].\varphi ^z)\). Indeed, given that \(\varphi \) is a \(\Sigma (\alpha )\)-formula, we have

$$\begin{aligned} {\text {supp}}^+(\varphi )\subseteq _{{\textbf{O}}}\Gamma (I)\circ \textbf{R}(\alpha )=\psi _{\alpha +1}0. \end{aligned}$$

We may replace \(\varphi \) by the ‘trivial’ relativization \(\varphi ^0\), since we have \({\text {supp}}^+(0)=\emptyset \). As \(\psi _{\alpha +1}0\) is strongly critical (cf. the proof of Proposition 7.5), Lemma 7.14 yields

$$\begin{aligned} {\text {rk}}(\varphi ^0)+2<_{\textbf{O}}\omega \cdot (1+\psi _{\alpha +1}0)=\psi _{\alpha +1}0. \end{aligned}$$

Similarly, we get \({\text {rk}}(L[\alpha ])={\text {rk}}(L^u_{\textbf{R}(\alpha )})=\psi _{\alpha +1}0\) and then

$$\begin{aligned} {\text {rk}}(\exists z\in L[\alpha ].\,\varphi ^z)=\max \{{\text {rk}}(L[\alpha ]),{\text {rk}}(\varphi ^0)+2\}=\psi _{\alpha +1}0\ne {\text {rk}}(\psi ). \end{aligned}$$

The inequality holds by an assumption in the lemma, which thus excludes an obstructive application of clause (iv) from Definition 7.16. \(\square \)
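The equality \(\omega \cdot (1+\psi _{\alpha +1}0)=\psi _{\alpha +1}0\) used above can also be checked by standard ordinal arithmetic, using only that the strongly critical value \(\psi _{\alpha +1}0\) is, in particular, an \(\varepsilon \)-number \(\xi =\omega ^\xi \):

$$\begin{aligned} \omega \cdot (1+\xi )=\omega \cdot \xi =\omega \cdot \omega ^\xi =\omega ^{1+\xi }=\omega ^\xi =\xi . \end{aligned}$$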

By the next result (cf. [7, Theorem 3.16]), the cut rank can be reduced when no critical value \(\psi _{\alpha +1}0\) is involved. To remove this last restriction, we will later prove a collapsing result that complements cut elimination. Let us point out that \(\varphi \) refers to the Veblen function from Definition 4.12.

Proposition 8.3

(Predicative cut elimination) Consider elements \(p,q\in {\textbf{O}}\) such that \(p\le \psi _{\alpha +1}0<p+\varphi (0,q)\) fails for all \(\alpha <\nu \). We then have

$$\begin{aligned} (r,a)\vdash ^t_{p+\varphi (0,q)}\Gamma \quad \text {and}\quad q\in {\mathcal {H}}_r(a)\qquad \Rightarrow \qquad (r,a)\vdash ^{\varphi (q,t)}_p\Gamma . \end{aligned}$$

Proof

The assumption of the desired implication entails \(q,t\in \mathcal H_r(a)\), due to the initial condition from Definition 7.16. We get \(\varphi (q,t)\in {\mathcal {H}}_r(a)\) by Corollary 7.10 in conjunction with Lemma 4.16. Let us now argue by main induction on q and side induction on t (where p may vary during the induction). In the crucial case, we are concerned with clause (iii) of Definition 7.16, so that we have

$$\begin{aligned} (r,a)\vdash ^s_{p+\varphi (0,q)}\Gamma ,\psi \qquad \text {and}\qquad (r,a)\vdash ^s_{p+\varphi (0,q)}\Gamma ,\lnot \psi \end{aligned}$$

with \({\text {rk}}(\psi )<p+\varphi (0,q)\) and \(s<t\). For later use we record \({\text {supp}}^+(\psi )\subseteq {\mathcal {H}}_r(a)\). The side induction hypothesis yields

$$\begin{aligned} (r,a)\vdash ^{\varphi (q,s)}_p\Gamma ,\psi \qquad \text {and}\qquad (r,a)\vdash ^{\varphi (q,s)}_p\Gamma ,\lnot \psi . \end{aligned}$$

If we have \({\text {rk}}(\psi )<p\), then we can conclude by clause (iii) of Definition 7.16, since Proposition 4.13 yields \(\varphi (q,s)<\varphi (q,t)\). By the same proposition and Lemma 4.15, we even have \(\varphi (q,s)+\varphi (q,s)<\varphi (q,t)\). Now assume \(p\le {\text {rk}}(\psi )\) and note that this entails \({\text {rk}}(\lnot \psi )={\text {rk}}(\psi )\notin \{\psi _{\alpha +1}0\,|\,\alpha <\nu \}\). Let us recall that \(\lnot \lnot \psi \) is syntactically equal to \(\psi \), since we treat negation as a defined operation on formulas in negation normal form. As seen in [11, Definition 3.12], one of the two formulas \(\psi =\lnot \lnot \psi \) and \(\lnot \psi \) is disjunctive. We can thus use reduction (Lemma 8.2) and weakening to get

$$\begin{aligned} (r,a)\vdash ^{\varphi (q,t)}_{{\text {rk}}(\psi )}\Gamma . \end{aligned}$$

Lemma 4.15 yields \({\text {rk}}(\psi )=p+s\) for some \(s\in {\textbf{O}}=\Gamma ({\textbf{K}})\). By Definition 4.5 we may write \(s=\langle s_0,\ldots ,s_{n-1}\rangle \) with \(s_i\in {\textsf{H}}\) (not necessarily with \(n>1\)). We thus get

$$\begin{aligned} p=p+s(0)\quad \text {and}\quad {\text {rk}}(\psi )=p+s(n)\quad \text {for}\quad s(i):=\langle s_0,\ldots ,s_{i-1}\rangle . \end{aligned}$$

By an auxiliary induction from \(i=n\) down to \(i=0\), we now show

$$\begin{aligned} (r,a)\vdash ^{\varphi (q,t)}_{p+s(i)}\Gamma . \end{aligned}$$

In the induction step, we use Proposition 4.13 to write \(s_i\in {\textsf{H}}\) in the form \(\varphi (p_i,q_i)\). Let us set \(q(i):=q_i\) when \(p_i=0\) and \(q(i):=s_i\) when \(0<p_i\). In the second case, Proposition 4.13 yields \(\varphi (0,s_i)=s_i\). So we always get

$$\begin{aligned} s(i+1)=s(i)+s_i=s(i)+\varphi (0,q(i)). \end{aligned}$$

Let us observe that we have

$$\begin{aligned} {\text {supp}}^{{\textbf{L}}}_{\Gamma ({\textbf{X}})}(q(i))={\text {supp}}^{\textbf{L}}_{\Gamma ({\textbf{X}})}(s_i)\subseteq {\text {supp}}^{\textbf{L}}_{\Gamma ({\textbf{X}})}({\text {rk}}(\psi )). \end{aligned}$$

As Lemma 7.14 provides \({\text {rk}}(\psi )\in \mathcal H_0({\text {supp}}^+(\psi ))\subseteq {\mathcal {H}}_r(a)\), we obtain \(q(i)\in {\mathcal {H}}_r(a)\) by Corollary 7.10. Furthermore, we have \(q(i)<q\) due to

$$\begin{aligned} p+s(i)+\varphi (0,q(i))=p+s(i+1)\le {\text {rk}}(\psi )<p+\varphi (0,q)\le p+s(i)+\varphi (0,q). \end{aligned}$$

Given the auxiliary induction hypothesis (with \(i+1\) at the place of i), we use the main induction hypothesis (with \(p+s(i)\) and q(i) at the place of p and q) to get

$$\begin{aligned} (r,a)\vdash ^{\varphi (q(i),\varphi (q,t))}_{p+s(i)}\Gamma . \end{aligned}$$

From Proposition 4.13 we know that \(q(i)<q\) entails \(\varphi (q(i),\varphi (q,t))=\varphi (q,t)\). So the step of the auxiliary induction is completed. Taking \(i=0\) completes the present case of the side and main induction step. The remaining cases are straightforward. \(\square \)
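To see the auxiliary induction at work, consider a hypothetical instance in which the relevant terms behave like the standard Veblen function with \(\varphi (0,q)=\omega ^q\): say \({\text {rk}}(\psi )=p+\omega ^\omega +\omega \) and \(q=\omega ^2\), so that \(s=\langle \omega ^\omega ,\omega \rangle \) with \(s_0=\varphi (0,\omega )\) and \(s_1=\varphi (0,1)\), i.e., \(q(0)=\omega \) and \(q(1)=1\). The cut rank is then lowered in two steps,

$$\begin{aligned} p+\omega ^\omega +\omega \quad \rightsquigarrow \quad p+\omega ^\omega \quad \rightsquigarrow \quad p, \end{aligned}$$

while the bound on the derivation remains \(\varphi (q,t)\), since \(q(1),q(0)<q\) yields \(\varphi (1,\varphi (q,t))=\varphi (q,t)\) and \(\varphi (\omega ,\varphi (q,t))=\varphi (q,t)\).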

So far, the notation \(\varphi ^a\) for relativization has been introduced for \(a\in {\textbf{L}}^u_{\Gamma ({\textbf{X}})}\) only. We now use the embedding \(\Gamma (I):\Gamma ({\textbf{X}})\rightarrow \Gamma (\textbf{K})={\textbf{O}}\) to overload the notation.

Definition 8.4

Given an \({\textbf{L}}^u_{\Gamma ({\textbf{X}})}\)-formula \(\varphi \) and an element \(t\in {\text {rng}}(\Gamma (I))\subseteq {\textbf{O}}\), we set \(\varphi ^t:=\varphi ^a\) with \(a:=L^u_s\in {\textbf{L}}^u_{\Gamma (\textbf{X})}\) for the unique \(s\in \Gamma ({\textbf{X}})\) with \(\Gamma (I)(s)=t\).

It is instructive to recall \(\psi _{\alpha +1}0=\Gamma (I)\circ \textbf{R}(\alpha )\) and \(L[\alpha ]=L^u_{{\textbf{R}}(\alpha )}\), which yields

$$\begin{aligned} \varphi ^t=\varphi ^{L[\alpha ]}\quad \text {for}\quad t=\psi _{\alpha +1}0\in {\textbf{O}}. \end{aligned}$$

By Corollary 4.11 and Definition 6.12, the range of \(\Gamma (I)\) is an initial segment of \({\textbf{O}}\). It follows that \(\varphi ^t\) is defined whenever \(t\le \psi _{\alpha +1}0\) holds for some \(\alpha <\nu \). We will later need the following variant of inversion (cf. Lemma 8.1).

Lemma 8.5

Given \(q\in {\mathcal {H}}_r(a)\) with \(q\le \psi _{\alpha +1}0\), we get

$$\begin{aligned} (r,a)\vdash ^t_s\Gamma ,\forall x\in L[\alpha ].\,\theta \qquad \Rightarrow \qquad (r,a)\vdash ^t_s\Gamma ,(\forall x.\theta )^q, \end{aligned}$$

for any bounded \({\textbf{L}}^u_{\Gamma (\textbf{X})}\)-formula \(\theta =\theta (x)\).

Proof

Write \(\varphi :=\forall x\in L[\alpha ].\,\theta \) and \(\varphi ':=(\forall x.\theta )^q=\forall x\in L^u_p.\,\theta \) with \(\Gamma (I)(p)=q\), and note that \(q\le \psi _{\alpha +1}0\) entails \(p\le {\textbf{R}}(\alpha )\). The initial condition of Definition 7.16 is preserved as we have \({\text {supp}}^+(\varphi ')\subseteq {\text {supp}}^+(\varphi )\cup \{q\}\). In view of [11, Definition 3.12], the formulas \(\varphi \) and \(\varphi '\) are both conjunctive, and we have \(\varphi _b=\theta (b)=\varphi '_b\) for any

$$\begin{aligned} b\in \iota (\varphi ')=\{a\in {\textbf{L}}^u_{\Gamma (\textbf{X})}\,|\,{\text {supp}}^{{\textbf{L}}}_{\Gamma (\textbf{X})}(a)\subseteq _{\Gamma ({\textbf{X}})}p\}\subseteq \iota (\varphi ). \end{aligned}$$

So whenever clause (i) of Definition 7.16 is used to derive \(\varphi \), it can also derive \(\varphi '\). Based on this observation, the claim is readily established by induction on t. \(\square \)

Let us also record how relativization interacts with our assignment of a disjunction \(\varphi \simeq \bigvee _{b\in \iota (\varphi )}\varphi _b\) or conjunction \(\varphi \simeq \bigwedge _{b\in \iota (\varphi )}\varphi _b\) to each formula \(\varphi \).

Lemma 8.6

The following holds for any \({\textbf{L}}^u_{\Gamma (\textbf{X})}\)-formula \(\varphi \) and any \(t\in {\text {rng}}(\Gamma (I))\):

(a)

    The formula \(\varphi ^t\) is conjunctive or disjunctive, respectively, if and only if the same holds for \(\varphi \).

(b)

    We have \((\varphi ^t)_b=(\varphi _b)^t\) for any \(b\in \iota (\varphi ^t)\subseteq \iota (\varphi )\).

(c)

    For any \(b\in \iota (\varphi )\) with \({\text {supp}}^+(b)\subseteq _{{\textbf{O}}}t\), we have \(b\in \iota (\varphi ^t)\).

(d)

    If \(\varphi \) is a \(\Sigma (\alpha )\)-formula, then so is \(\varphi _b\) for any b in the set

    $$\begin{aligned} \iota (\varphi ^{L[\alpha ]})=\{b\in \iota (\varphi )\,|\,{\text {supp}}^+(b)\subseteq _{\textbf{O}}\psi _{\alpha +1}0\}. \end{aligned}$$
(e)

    Assume \(\varphi \) is a conjunctive \(\Sigma (\alpha )\)-formula. We then have \(\iota (\varphi ^t)=\iota (\varphi )\). Also, there is an \(s\in {\text {supp}}^+(\varphi )\cup \{0\}\) with \({\text {supp}}^+(b)\subseteq _{{\textbf{O}}}s\) for all \(b\in \iota (\varphi )\).

In part (e), we get \(s<\psi _{\alpha +1}0\) due to the definition of \(\Sigma (\alpha )\)-formulas. So when \(\varphi \) is conjunctive, part (d) applies to any element \(b\in \iota (\varphi )\).

Proof

All claims can be verified explicitly, based on [11, Definition 3.12]. Details for a representative case are given in the proof of [11, Lemma 9.1]. Concerning (d), we note that \({\text {supp}}^+(b)\subseteq _{{\textbf{O}}}\psi _{\alpha +1}0\) is equivalent to \({\text {supp}}^{{\textbf{L}}}_{\Gamma (\textbf{X})}(b)\subseteq _{\Gamma ({\textbf{X}})}{\textbf{R}}(\alpha )\), which relates to Definition 7.12. In part (e), the crucial point is that \(\varphi \) cannot begin with an unbounded quantifier. \(\square \)

Clause (iv) of Definition 7.16 (which allows us to infer \(\exists z\in L[\alpha ].\,\varphi ^z\) from \(\varphi ^{L[\alpha ]}\)) is an obstruction to cut elimination, as we have seen in the proof of Lemma 8.2. The following result (cf. [7, Lemma 3.17]) will allow us to circumvent this clause, since \(\varphi ^t\) with \(t<\psi _{\alpha +1}0\) entails \(\exists z\in L[\alpha ].\,\varphi ^z\).

Proposition 8.7

(Boundedness) For each \(\Sigma (\alpha )\)-formula \(\varphi \) with \(\alpha <\nu \) we have

$$\begin{aligned} (r,a)\vdash _q^s\Gamma ,\varphi ^{L[\alpha ]}\text { and }s\le t<\psi _{\alpha +1}0\text { with }t\in \mathcal H_r(a)\quad \Rightarrow \quad (r,a)\vdash _q^s\Gamma ,\varphi ^t. \end{aligned}$$

Proof

First note that the antecedent of the desired implication entails

$$\begin{aligned} {\text {supp}}^+(\varphi ^t)\subseteq {\text {supp}}^+(\varphi ^{L[\alpha ]})\cup \{t\}\subseteq \mathcal H_r(a), \end{aligned}$$

so that the initial condition from Definition 7.16 is preserved. We now argue by induction on s. When the relevant clause from Definition 7.16 does not refer to \(\varphi ^{L[\alpha ]}\), it is straightforward to reduce to the induction hypothesis. In case clause (i) applies to \(\varphi ^{L[\alpha ]}\), the latter is conjunctive and we have

$$\begin{aligned} (r,a\cup {\text {supp}}^+(b))\vdash ^{s(b)}_q\Gamma ,\varphi ^{L[\alpha ]},(\varphi ^{L[\alpha ]})_b\quad \text {with }s(b)<s\text { for all }b\in \iota (\varphi ^{L[\alpha ]}). \end{aligned}$$

The previous lemma ensures that \(\varphi _b\) is a \(\Sigma (\alpha )\)-formula with \((\varphi _b)^{L[\alpha ]}=(\varphi ^{L[\alpha ]})_b\), for any \(b\in \iota (\varphi )=\iota (\varphi ^{L[\alpha ]})\). Thus two applications of the induction hypothesis yield

$$\begin{aligned} (r,a\cup {\text {supp}}^+(b))\vdash ^{s(b)}_q\Gamma ,\varphi ^t,(\varphi _b)^t\quad \text {for all }b\in \iota (\varphi ). \end{aligned}$$

Using the previous lemma once again, we learn that \(\varphi \) and hence \(\varphi ^t\) is conjunctive with \((\varphi ^t)_b=(\varphi _b)^t\) for all \(b\in \iota (\varphi ^t)\subseteq \iota (\varphi )\). In order to conclude the present case of the induction step, we can thus reapply clause (i). A similar argument covers clause (ii), as the previous lemma ensures the following: for any \(b\in \iota (\varphi ^{L[\alpha ]})\subseteq \iota (\varphi )\) with \({\text {supp}}^+(b)\subseteq _{{\textbf{O}}}s\le t<\psi _{\alpha +1}0\), we have \(b\in \iota (\varphi ^t)\) and \(\varphi _b\) is a \(\Sigma (\alpha )\)-formula. Finally, we consider an application of clause (iv) for a \(\Sigma (\beta )\)-formula \(\theta \) with \(\beta <\nu \) and \((\exists z\in L[\beta ].\,\theta ^z)=\varphi ^{L[\alpha ]}\), where we have

$$\begin{aligned} (r,a)\vdash ^{s(0)}_q\Gamma ,\varphi ^{L[\alpha ]},\theta ^{L[\beta ]}\quad \text {for some }s(0)<s. \end{aligned}$$

If we have \(\alpha \ne \beta \), then \(L[\beta ]\) occurs in \(\varphi \), and the definition of \(\Sigma (\alpha )\)-formulas yields

$$\begin{aligned} {\textbf{R}}(\beta )\in {\text {supp}}^{{\textbf{L}}}_{\Gamma (\textbf{X})}(L[\beta ])\subseteq {\text {supp}}^{{\textbf{L}}}_{\Gamma (\textbf{X})}(\varphi )\subseteq _{\Gamma ({\textbf{X}})}{\textbf{R}}(\alpha ). \end{aligned}$$

So in any case we have \(\beta \le \alpha \). By a similar argument, it follows that \(L[\alpha ]\) cannot occur in the \(\Sigma (\beta )\)-formula \(\theta \). In case \(\beta <\alpha \) we thus get \(\varphi ^{L[\alpha ]}=\varphi =\varphi ^t\), which makes the claim trivial. Now assume \(\beta =\alpha \) and note that this forces \(\varphi =\exists z.\,\theta ^z\). We apply the induction hypothesis twice (once with s(0) at the place of t), to get

$$\begin{aligned} (r,a)\vdash ^{s(0)}_q\Gamma ,\varphi ^t,\theta ^{s(0)}. \end{aligned}$$

For \(p\in \Gamma ({\textbf{X}})\) with \(\Gamma (I)(p)=t\) we have

$$\begin{aligned} \varphi ^t=\exists z\in L^u_p.\,\theta ^z\simeq \textstyle \bigvee _{b\in \iota (\varphi ^t)}\theta ^b\quad \text {with}\quad \iota (\varphi ^t)=\{b\in \textbf{L}^u_{\Gamma ({\textbf{X}})}\,|\,{\text {supp}}^{{\textbf{L}}}_{\Gamma (\textbf{X})}(b)\subseteq _{\Gamma ({\textbf{X}})}p\}. \end{aligned}$$

Now \(\theta ^{s(0)}=\theta ^b\) holds for \(b:=L^u_{p(0)}\) with \(\Gamma (I)(p(0))=s(0)<s\le t\), which yields

$$\begin{aligned} {\text {supp}}^{{\textbf{L}}}_{\Gamma (\textbf{X})}(b)=\{p(0)\}\subseteq _{\Gamma (\textbf{X})}p\quad \text {and}\quad {\text {supp}}^+(b)=\{s(0)\}\subseteq _{{\textbf{O}}}s. \end{aligned}$$

We can thus conclude by an application of clause (ii) from Definition 7.16. \(\square \)

The following definition adapts notation from [7, Section 4], which will be used for the crucial result on collapsing and impredicative cut elimination. The reader may wish to recall Definitions 7.4 and 7.7 as well as the paragraph before Theorem 7.21.

Definition 8.8

For \(\alpha <\nu \) and \(r,s\in {\textbf{O}}\) and \(a\in [\textbf{O}]^{<\omega }\), we abbreviate

$$\begin{aligned} {\mathcal {A}}(a;r,\alpha ,s)\quad :\Leftrightarrow \quad r,s\in \mathcal H_r(a)\text { and }a\subseteq \textstyle \bigcap _{\beta \ge \alpha } C_\beta (r+1). \end{aligned}$$

Let us also put \({\overline{K}}:=\{\Omega (\alpha )\,|\,\alpha \le \nu \}\) with

$$\begin{aligned} \begin{aligned} \Omega (0)&:=0,\qquad&\Omega (\alpha +1)&:=(\psi _{\alpha +1}0)+1,\\ \Omega (\nu )&:=\Gamma _{{\textbf{X}}+\langle \rangle },\qquad&\Omega (\lambda )&:=\psi _\lambda 0\quad \text {for each limit }\lambda <\nu . \end{aligned} \end{aligned}$$

Note that we have \(\Omega (\alpha )\in {\mathcal {H}}_0(\emptyset )\) for all \(\alpha \le \nu \), as a consequence of Theorem 7.21 and Corollary 7.11. For \(\alpha <\nu \), the following result characterizes \(\Omega (\alpha )\in {\text {rng}}(\Gamma (I))\) as a supremum.

Lemma 8.9

For any \(\alpha \le \nu \) and \(s\in {\text {rng}}(\Gamma (I))\subseteq {\textbf{O}}\) we have

$$\begin{aligned} s<_{{\textbf{O}}}\Omega (\alpha )\quad \Leftrightarrow \quad s\le _{\textbf{O}}\psi _{\beta +1}0\text { for some }\beta <\alpha . \end{aligned}$$

Proof

The claim is immediate when \(\alpha \) is zero or a successor. Let us now assume that \(\alpha <\nu \) is a limit. The non-trivial task is to show that \(s<\psi _\alpha 0\) entails \(s\le \psi _{\beta +1}0\) for some \(\beta <\alpha \). Invoking Definitions 7.1 and 7.4 as well as Lemma 4.10, we see that any \(\delta <\nu \) validates

$$\begin{aligned} s<_{{\textbf{O}}}\psi _\delta 0=\gamma _{{\textbf{K}}}\circ I(\psi ^{\textbf{X}}_\delta 0)\quad \Leftrightarrow \quad {\text {supp}}^\Gamma _{\textbf{K}}(s)\subseteq _{{\textbf{K}}}I(\psi ^{{\textbf{X}}}_\delta 0). \end{aligned}$$

Assume that these equivalent statements hold for \(\delta =\alpha \). We need to find a \(\beta <\alpha \) such that they hold for \(\delta =\beta +1\) as well. Let us recall that the range of \(I:{\textbf{X}}\rightarrow {\textbf{K}}\) is an initial segment. The maximal element of the finite set \({\text {supp}}^\Gamma _{{\textbf{K}}}(s)\) can thus be written as I(x), except in the trivial case where the support is empty. Due to Definition 6.12 we get \(x=\psi ^{{\textbf{X}}}_\beta t\) for some \(\beta <\nu \) and \(t\in {\textbf{O}}\). Clearly, the right side above holds for \(\delta =\beta +1\). Also, the right side for \(\delta =\alpha \) entails \(I(x)<I(\psi ^{{\textbf{X}}}_\alpha 0)\) and hence \(\beta <\alpha \), as desired. Now consider the case of \(\alpha =\nu \). For any \(s\in {\text {rng}}(\Gamma (I))\), Definition 1.1 and Lemma 6.20 yield \(s<\Omega (\nu )\) and \({\text {supp}}^\Gamma _{\textbf{K}}(s)\subseteq {\text {rng}}(I)\). Due to the latter, we can find a \(\beta <\nu \) with \(s<\psi _{\beta +1}0\), as in the limit case. \(\square \)

The following transfers [7, Lemma 4.7] into our setting. Note that no assumption such as \(p'\in {\mathcal {H}}_r(a)\) is needed in part (b).

Lemma 8.10

If we have \({\mathcal {A}}(a;r,\alpha ,s)\) and \(t\in {\mathcal {H}}_r(a)\), then the following holds:

  1. (a)

    Given \(\alpha \le \beta \) and \(t<\psi _{\beta +1}0\), we get \(t<\psi _\beta (r+1)\).

  2. (b)

    For \(p:=r+\varphi _0(s+t)\) we have \(p\in {\mathcal {H}}_r(a)\) as well as \(\psi _\alpha p\in {\mathcal {H}}_p(a)\), and \(p<p'\) entails \(\psi _\alpha p<\psi _\alpha p'\).

Proof

  1. (a)

    In view of Definition 7.7, the assumptions entail \(t\in C_\beta (r+1)\). Now the conclusion follows by Proposition 7.5.

  2. (b)

    In view of \(r\le p\), we can use Corollary 7.10 to get \(p\in {\mathcal {H}}_r(a)\subseteq {\mathcal {H}}_{p}(a)\), which entails \(\psi _\alpha p\in {\mathcal {H}}_{p}(a)\) by Corollary 7.11. Given \(p<p'\), we now obtain \(\psi _\alpha p\in C_\alpha (p')\), as \({\mathcal {A}}(a;r,\alpha ,s)\) provides \(a\subseteq C_\alpha (r+1)\subseteq C_\alpha (p')\). In order to conclude \(\psi _\alpha p<\psi _\alpha p'\), it suffices to invoke Proposition 7.5 once again.

\(\square \)

Our abstract ordinal analysis culminates in the following (cf. [7, Theorem 4.8]).

Theorem 8.11

(Collapsing and impredicative cut elimination) For \(\alpha <\nu \), assume

$$\begin{aligned} (r,a)\vdash ^t_s\Gamma \quad \text {with}\quad \mathcal A(a;r,\alpha ,s)\quad \text {and}\quad s\in {\overline{K}}, \end{aligned}$$

where all elements of \(\Gamma \) have the form \(\varphi ^{L[\alpha ]}\) for a \(\Sigma (\alpha )\)-formula \(\varphi \). We then get

$$\begin{aligned} (p,a)\vdash ^q_q\Gamma \quad \text {with}\quad p=r+\varphi _0(s+t)\quad \text {and}\quad q=\psi _\alpha p. \end{aligned}$$

Proof

We argue by main induction on s and side induction on t (where \(\alpha \) and the other parameters may vary in the induction). The previous lemma secures the initial condition from Definition 7.16. In clause (i) of the latter, we are concerned with a conjunctive formula \(\varphi ^{L[\alpha ]}\in \Gamma \) such that we have

$$\begin{aligned} (r,a\cup {\text {supp}}^+(b))\vdash ^{t(b)}_s\Gamma ,\varphi ^{L[\alpha ]}_b\quad \text {with }t(b)<t\text { for all }b\in \iota (\varphi ^{L[\alpha ]}). \end{aligned}$$

Here we write \(\varphi ^{L[\alpha ]}_b\) for \((\varphi ^{L[\alpha ]})_b\), which coincides with \((\varphi _b)^{L[\alpha ]}\) due to Lemma 8.6. The latter also yields a \(t'\in {\text {supp}}^+(\varphi )\cup \{0\}\subseteq {\mathcal {H}}_r(a)\) with \({\text {supp}}^+(b)\subseteq _{{\textbf{O}}}t'<\psi _{\alpha +1}0\) for all \(b\in \iota (\varphi ^{L[\alpha ]})\). To establish \(\mathcal A(a\cup {\text {supp}}^+(b);r,\alpha ,s)\) for any such b, we consider an arbitrary \(\beta \ge \alpha \). By the previous lemma we get \(t'<\psi _\beta (r+1)\). Let us also note that \(r+1\in \mathcal H_r(a)\subseteq C_\beta (r+1)\) holds due to \(\mathcal A(a;r,\alpha ,s)\) and Definition 7.7. Thus Proposition 7.5 yields \({\text {supp}}^+(b)\subseteq C_\beta (r+1)\), as required. We may now use the side induction hypothesis to infer

$$\begin{aligned} (p(b),a\cup {\text {supp}}^+(b))\vdash ^{q(b)}_{q(b)}\Gamma ,\varphi ^{L[\alpha ]}_b\quad \text {with}\quad p(b)=r+\varphi _0(s+t(b))\quad \text {and}\quad q(b)=\psi _\alpha p(b), \end{aligned}$$

for any \(b\in \iota (\varphi ^{L[\alpha ]})\). With p and q as in the theorem, we see that \(t(b)<t\) entails \(p(b)<p\) and then \(q(b)<q\), by Lemma 8.10 with \(a\cup {\text {supp}}^+(b)\) at the place of a. To conclude the present case of the induction step, we use weakening and reapply clause (i) of Definition 7.16. Now consider clause (ii) for a disjunctive \(\varphi ^{L[\alpha ]}\in \Gamma \) with

$$\begin{aligned} (r,a)\vdash ^{t(0)}_s\Gamma ,\varphi ^{L[\alpha ]}_b\quad \text {for some}\quad t(0)<t\quad \text {and}\quad b\in \iota (\varphi ^{L[\alpha ]}). \end{aligned}$$

As in the proof of Lemma 8.2, we may assume \({\text {supp}}^+(b)\subseteq {\text {supp}}^+(\varphi ^{L[\alpha ]}_b)\). The latter entails \({\text {supp}}^+(b)\subseteq {\mathcal {H}}_r(a)\), by the initial condition from Definition 7.16. Since we also have \({\text {supp}}^+(b)\subseteq _{{\textbf{O}}}\psi _{\alpha +1}0\) due to Lemma 8.6, we can use Lemma 8.10 to get

$$\begin{aligned} {\text {supp}}^+(b)\subseteq _{{\textbf{O}}}\psi _\alpha (r+1)\le _{\textbf{O}}\psi _\alpha p(0)=:q(0)\quad \text {with}\quad p(0):=r+\varphi _0(s+t(0)). \end{aligned}$$

Let us recall that our version of \(\psi \) is not even weakly increasing. To secure the weak inequality above, one invokes Lemma 8.10 with \(s=0=t\). The given bound on \({\text {supp}}^+(b)\) allows us to reapply clause (ii) after the side induction hypothesis has been used. Before we come to the crucial clause (iii), let us consider an application of (iv), where \(\Gamma \) contains \(\exists z\in L[\beta ].\,\varphi ^z\) for some \(\Sigma (\beta )\)-formula \(\varphi \). As in the proof of Proposition 8.7, we necessarily have \(\beta \le \alpha \). To conclude by the side induction hypothesis, we need only observe that \(\varphi ^{L[\beta ]}=\psi ^{L[\alpha ]}\) holds for some \(\Sigma (\alpha )\)-formula \(\psi \). We can take \(\psi :=\varphi \) for \(\beta =\alpha \) and \(\psi :=\varphi ^{L[\beta ]}=(\varphi ^{L[\beta ]})^{L[\alpha ]}\) for \(\beta <\alpha \). As preparation for clause (iv), we establish the following claim (which is adapted from the proof by Buchholz [7]). The quantities that appear in the theorem should be considered as fixed (for the induction step), while p(0), q(0) and \(\varphi \) can be arbitrary.

Claim

Assume that we have \(r\le p(0)<p\) and \(p(0)\in \mathcal H_{p(0)}(a)\), and that there exists a \(\beta <\nu \) with \(s(0):=\max \{q(0),{\text {rk}}(\varphi )\}<\psi _{\beta +1}0\le s\). We then get

$$\begin{aligned} (p(0),a)\vdash ^{q(0)}_{q(0)}\Gamma ,\varphi \quad \text {and}\quad (p(0),a)\vdash ^{q(0)}_{q(0)}\Gamma ,\lnot \varphi \qquad \Rightarrow \qquad (p,a)\vdash ^q_q\Gamma . \end{aligned}$$

To establish the claim, we first note that clause (iii) of Definition 7.16 yields

$$\begin{aligned} (p(0),a)\vdash ^{q(0)+1}_{s(0)+1}\Gamma . \end{aligned}$$

For any \(\beta \) as in the claim, we have \(s(1):=\Omega (\beta )+\varphi _0(s(0)+1)<\psi _{\beta +1}0\), since the bound is strongly critical (cf. the proof of Proposition 7.5). So there is no \(\gamma <\nu \) with \(\Omega (\beta )\le \psi _{\gamma +1}0<s(1)\). We can thus use Proposition 8.3 (predicative cut elimination) to get

$$\begin{aligned} (p(0),a)\vdash ^{t(0)}_{\Omega (\beta )}\Gamma \quad \text {with}\quad t(0):=\varphi (s(0)+1,q(0)+1). \end{aligned}$$

It is straightforward to check that we have \(\mathcal A(a;p(0),\alpha ,\Omega (\beta ))\). We can now use the main induction hypothesis to infer

$$\begin{aligned} (p(1),a)\vdash ^{q(1)}_{q(1)}\Gamma \quad \text {with}\quad p(1)=p(0)+\varphi _0(\Omega (\beta )+t(0))\quad \text {and}\quad q(1)=\psi _\alpha p(1). \end{aligned}$$

We have \(p(0)<p=r+\varphi _0(s+t)\) by assumption, and the above yields

$$\begin{aligned} \varphi _0(\Omega (\beta )+t(0))<\varphi _0(s+t)\in \mathsf H\subseteq \Gamma ({\textbf{X}}). \end{aligned}$$

Using Lemmas 4.15 and 8.10, we obtain \(p(1)<p\) and then \(q(1)<q\). An application of weakening (Lemma 7.17) concludes the proof of the claim.

Let us now consider an application of clause (iii) from Definition 7.16, where we have

$$\begin{aligned} (r,a)\vdash ^{t(0)}_s\Gamma ,\varphi \quad \text {and}\quad (r,a)\vdash ^{t(0)}_s\Gamma ,\lnot \varphi \end{aligned}$$

for some \(t(0)<t\) and some bounded \({\textbf{L}}^u_{\Gamma (\textbf{X})}\)-formula \(\varphi \) with \({\text {rk}}(\varphi )<s\). First assume

$$\begin{aligned} {\text {rk}}(\varphi )<_{\textbf{O}}\psi _{\alpha +1}0=\omega \cdot (1+\psi _{\alpha +1}0), \end{aligned}$$

where the equality holds because \(\psi _{\alpha +1}0\) is strongly critical. From Lemma 7.14 we learn that \(\varphi \) and \(\lnot \varphi \) are \(\Sigma (\alpha )\)-formulas. Given that any bounded formula \(\theta \) is equal to \(\theta ^{L[\alpha ]}\), the side induction hypothesis provides

$$\begin{aligned} (p(0),a)\vdash ^{q(0)}_{q(0)}\Gamma ,\varphi \quad \text {and}\quad (p(0),a)\vdash ^{q(0)}_{q(0)}\Gamma ,\lnot \varphi \qquad (\star ) \end{aligned}$$

with \(p(0)=r+\varphi _0(s+t(0))\) and \(q(0)=\psi _\alpha p(0)\). Also by Lemma 7.14, we have

$$\begin{aligned} {\text {rk}}(\varphi )\in {\mathcal {H}}_0({\text {supp}}^+(\varphi ))\subseteq \mathcal H_r(a), \end{aligned}$$

which entails \({\text {rk}}(\varphi )<\psi _\alpha (r+1)\le q(0)\) due to Lemma 8.10. To conclude the present case of the induction step, we can thus reapply clause (iii). Next, assume we have

$$\begin{aligned} \psi _{\alpha +1}0\le {\text {rk}}(\varphi )\notin \{\psi _{\beta +1}0\,|\,\beta <\nu \}. \end{aligned}$$

Due to \({\text {rng}}(\Gamma (I))\ni {\text {rk}}(\varphi )<s\in {\overline{K}}\), we may pick a \(\beta <\nu \) with \({\text {rk}}(\varphi )\le \psi _{\beta +1}0<s\), by Lemma 8.9. In the present case this upgrades to \({\text {rk}}(\varphi )<\psi _{\beta +1}0\), which entails that we have \(\alpha <\beta \). It follows that \(\Gamma ,\varphi ,\lnot \varphi \) consists of bounded \(\Sigma (\beta )\)-formulas. Indeed, for \(\psi ^{L[\alpha ]}\in \Gamma \) with a \(\Sigma (\alpha )\)-formula \(\psi \), we get

$$\begin{aligned} {\text {supp}}^{\textbf{L}}_{\Gamma (X)}(\psi ^{L[\alpha ]})\subseteq {\text {supp}}^{\textbf{L}}_{\Gamma (X)}(\psi )\cup \{\textbf{R}(\alpha )\}\subseteq _{\Gamma ({\textbf{X}})}{\textbf{R}}(\beta ). \end{aligned}$$

From \({\mathcal {A}}(a;r,\alpha ,s)\) and \(\alpha <\beta \) we immediately get \({\mathcal {A}}(a;r,\beta ,s)\). Thus the side induction hypothesis yields (\(\star \)), but now with \(q(0)=\psi _\beta p(0)\) for the same p(0). We can conclude the present case by the claim that we have established above. Finally, assume that we have \({\text {rk}}(\varphi )=\psi _{\beta +1}0\) with \(\alpha \le \beta <\nu \). Recall that \(\varphi \) and \(\lnot \lnot \varphi \) are syntactically equal, due to our treatment of negation as a defined operation. We may thus assume that \(\varphi \) (rather than \(\lnot \varphi \)) is disjunctive. In view of Definition 7.13, we must have \(\varphi =\exists x\in L[\beta ].\,\theta \) for some bounded \({\textbf{L}}^u_{\Gamma (\textbf{X})}\)-formula \(\theta =\theta (x)\) that satisfies \({\text {rk}}(\theta (0))<\psi _{\beta +1}0\). The latter entails that \(\exists x.\,\theta \) is a \(\Sigma (\beta )\)-formula. Now the side induction hypothesis and boundedness (Proposition 8.7) yield

$$\begin{aligned} (p(0),a)\vdash ^{q(0)}_{q(0)}\Gamma ,(\exists x.\,\theta )^{q(0)}\quad \text {with } p(0)=r+\varphi _0(s+t(0))\text { and } q(0)=\psi _\beta p(0). \end{aligned}$$

From \((r,a)\vdash ^{t(0)}_s\Gamma ,\lnot \varphi \) with \(\lnot \varphi =(\forall x\in L[\beta ].\,\lnot \theta )\) we also obtain

$$\begin{aligned} (p(0),a)\vdash ^{t(0)}_s\Gamma ,(\forall x.\,\lnot \theta )^{q(0)}, \end{aligned}$$

by weakening and Lemma 8.5. One readily derives \({\mathcal {A}}(a;p(0),\beta ,s)\). As \((\forall x.\lnot \theta )^{q(0)}\) is a bounded \(\Sigma (\beta )\)-formula, the side induction hypothesis provides

$$\begin{aligned} (p(1),a)\vdash ^{q(1)}_{q(1)}\Gamma ,(\forall x.\lnot \theta )^{q(0)}\quad \text {with }p(1)=p(0)+\varphi _0(s+t(0))\text { and }q(1)=\psi _\beta p(1). \end{aligned}$$

As we have \(p(0)<p(1)\) and \(q(0)<q(1)\in {\mathcal {H}}_{p(1)}(a)\) by Lemma 8.10, the above can be weakened to

$$\begin{aligned} (p(1),a)\vdash ^{q(1)}_{q(1)}\Gamma ,(\exists x.\,\theta )^{q(0)}. \end{aligned}$$

Note that we have \(p(1)<p\), due to Lemma 4.15. Using Lemma 7.14, we also see that \({\text {rk}}(\theta (0))<\psi _{\beta +1}0=\omega \cdot (1+\psi _{\beta +1}0)\) entails

$$\begin{aligned} {\text {supp}}^+\left( (\exists x.\,\theta )^{q(0)}\right) \subseteq {\text {supp}}^+(\theta (0))\cup \{q(0)\}\subseteq _{\textbf{O}}\psi _{\beta +1}0 \end{aligned}$$

and hence \({\text {rk}}((\exists x.\,\theta )^{q(0)})<\psi _{\beta +1}0={\text {rk}}(\varphi )<s\). We can thus conclude by the claim that was shown above (with p(1) and q(1) in place of p(0) and q(0)). \(\square \)

One can use collapsing and boundedness to obtain quantitative information from proofs, as in [7, Theorem 4.9]. For our purpose, it will be enough to have the following consistency result (recall that the empty sequent represents contradiction). Let us stress that our ordinal analysis was conditional on Assumptions 5.1 and 6.4. In fact, our aim was to refute these assumptions. This aim is achieved by the following result, since it contradicts Theorem 7.21 (embedding). The conclusions from this contradiction will be drawn in the next section.

Corollary 8.12

(Consistency) We do not have \((0,\emptyset )\vdash ^t_{\Omega (\nu )}\langle \rangle \) for any \(t\in {\textbf{O}}\).

Proof

Assume the claim is false. Then the previous theorem yields a \(p\in {\textbf{O}}\) with

$$\begin{aligned} (p,\emptyset )\vdash ^q_q\langle \rangle \quad \text {for}\quad q=\psi _0p. \end{aligned}$$

Note that we have \(q=\varphi (0,q)\le \psi _{\alpha +1}0\) for all \(\alpha <\nu \). We can thus use predicative cut elimination (Proposition 8.3) to get

$$\begin{aligned} (p,\emptyset )\vdash ^{\varphi (q,q)}_0\langle \rangle . \end{aligned}$$

The latter cannot hold, because no clause from Definition 7.16 applies: clauses (i,ii) and (iv) require a formula in \(\langle \rangle \), while clause (iii) demands a cut formula \(\varphi \) with \({\text {rk}}(\varphi )<0\). \(\square \)

9 Fixed points, comprehension, and admissible sets

In this section, we combine our previous work in order to prove Theorem 1.6 and its corollaries, which were stated in the introduction. The following result provides the most difficult implication. It relies on an extensive argument that was developed in Sects. 5 to 8. More intuitive explanations of the following proof can be found in the introduction and in Sect. 5.

Theorem 9.1

For the following statements from Theorem 1.6, the theory \(\mathsf {ATR_0^{set}}\) proves that (ii) implies (iv) for any infinite ordinal \(\nu \):

  1. (ii)

    any dilator has a well founded \(\nu \)-fixed point,

  2. (iv)

    for any set u, there is a sequence of admissible sets \(\textsf{Ad}_\alpha \ni u\) for \(\alpha <\nu \), such that \(\alpha<\beta <\nu \) entails \(\textsf{Ad}_\alpha \in \textsf{Ad}_\beta \).

Proof

As mentioned before, the restriction to infinite \(\nu \) is convenient because it allows us to reduce to the limit case. Indeed, it entails that we have \(\nu \le \mu +\omega \) for limits \(\mu ,\omega \le \nu \). Given that (ii) holds for \(\nu \), it also holds for \(\mu \) and for \(\omega \), by Corollary 2.10 in conjunction with Corollary 2.2 and Theorem 2.9. Assuming the limit case of the present theorem, we thus get (iv) for \(\mu \) and for \(\omega \). To deduce (iv) for \(\nu \) and a given set u, we build two increasing sequences of admissibles \(\textsf{Ad}'_\alpha \ni u\) for \(\alpha <\mu \) and \(\textsf{Ad}''_n\ni \bigcup _{\alpha <\mu }\textsf{Ad}'_\alpha \) for \(n<\omega \). Note that we always have \(\textsf{Ad}'_\alpha \in \textsf{Ad}''_n\), as admissible sets are transitive. To obtain the desired sequence of admissibles \(\textsf{Ad}_\alpha \) for \(\alpha <\nu \), we set \(\textsf{Ad}_\alpha :=\textsf{Ad}'_\alpha \) when \(\alpha <\mu \) and \(\textsf{Ad}_\alpha :=\textsf{Ad}''_n\) when \(\alpha =\mu +n<\nu \).

For the rest of this proof, we assume that \(\nu \) is a limit such that (ii) holds. Note that \(\Pi ^1_1\)-comprehension becomes available by Corollary 4.4. It suffices to establish (iv) for transitive u (replace u by the transitive closure \(u'\) of \(\{u\}\)). We may also assume that the intersection \(o(u)=u\cap {\text {Ord}}\) with the class of ordinals is a successor \(o(u)>1\) (replace \(u'\) by \(u'\cup \{0,1,o(u')\}\)). Since \(\mathsf {ATR_0^{set}}\) contains the axiom of countability (cf. the introduction), we can fix enumerations \(u=\{u_i\,|\,i\in {\mathbb {N}}\}\) and \(\nu =\{\nu _i\,|\,i\in {\mathbb {N}}\}\). By these preliminary considerations we have satisfied Assumption 5.1. Aiming at a contradiction, we now assume that (iv) fails for \(\nu \) and u as fixed. By Proposition 6.2, it follows that a certain predilator \({\textbf{S}}_0\) is a dilator. 
The latter gives rise to another dilator \(\Gamma \circ {\textbf{S}}\), due to Proposition 4.8 and Definition 6.3. We now use statement (ii) of the present theorem, which yields a well order \({\textbf{Y}}\) with a \(\nu \)-collapse

$$\begin{aligned} \pi _{{\textbf{Y}}}:{\textbf{Y}}\rightarrow \nu \times (\Gamma \circ {\textbf{S}})(\textbf{Y}). \end{aligned}$$

This means that Assumption 6.4 is satisfied as well. However, we have seen that the cited assumptions entail two incompatible results: Theorem 7.21 and Corollary 8.12 cannot both be valid, as we have \(\Omega (\nu )=\Gamma _{{\textbf{X}}+\langle \rangle }\) by Definition 8.8. Thus we have reached the desired contradiction. \(\square \)

The next implication follows from [43, Paragraph 3] (see also the English translation in [44, Section 5] as well as Section 3.3.5 of the survey [41]). We provide a proof because the cited references involve the notion of inductive definition.

Proposition 9.2

Over \(\mathsf {ATR_0^{set}}\), statement (iv) from Theorem 1.6 (or Theorem 9.1) entails the following, for any ordinal \(\nu \):

  1. (i)

    \(\Pi ^1_1\)-recursion along \(\nu \) holds.

Proof

We want to establish recursion for a given \(\Pi ^1_1\)-formula \(\varphi (x,\alpha ,X,{\textbf{Z}})\) with parameters \(x\in {\mathbb {N}}\), \(\alpha <\nu \) and \(X,{\textbf{Z}}\subseteq {\mathbb {N}}\). Recall (e. g. from [55, Lemma V.1.4]) that we have a set theoretic \(\Sigma \)-formula \(\psi (x,\alpha ,X,{\textbf{Z}})\) such that our base theory proves

$$\begin{aligned} ``A \text { is admissible''}\rightarrow \forall x,\alpha ,X,{\textbf{Z}}\in A\,\left( \varphi (x,\alpha ,X,\textbf{Z})\leftrightarrow \psi (x,\alpha ,X,{\textbf{Z}})^A\right) , \end{aligned}$$

where the superscript denotes relativization. Since the cited reference employs inductive definitions, we recall an alternative argument: We have \(\varphi (x,\alpha ,X,{\textbf{Z}})\) precisely when a certain computable tree \(T=T(x,\alpha ,X,{\textbf{Z}})\) is well founded (see e. g. [55, Lemma V.1.4]). Let \(\psi (x,\alpha ,X,{\textbf{Z}})\) assert that there is an \(f:T\rightarrow {\text {Ord}}\) that descends along branches. Crucially, if \(T\in A\) is indeed well founded, then such an f exists in A (see e. g. [31, Theorem 4.6]).

In the following, we rely on the presentation of \(\Pi ^1_1\)-recursion in the second paragraph after Theorem 1.6. Note that statement (iv) holds for \(\nu +1\) if it holds for \(\nu >0\). We may thus consider a sequence of admissibles \(\textsf{Ad}(\alpha )\in \textsf{Ad}(\beta )\) for \(\alpha <\beta \le \nu \), such that \(\textsf{Ad}(0)\) contains given parameters \({\textbf{Z}}\). By primitive recursion in the sense of [34], we define a function \(\nu \ni \alpha \mapsto Y_{<\alpha }\) with \(Y_{<0}:=\emptyset \) and

$$\begin{aligned} Y_{<\alpha +1}&:=Y_{<\alpha }\cup \{\langle \alpha ,x\rangle \,|\,x\in {\mathbb {N}}\text { and }\psi (x,\alpha ,Y_{<\alpha },{\textbf{Z}})^{\textsf{Ad}(\alpha +1)}\},\\ Y_{<\lambda }&:=\textstyle \bigcup _{\alpha<\lambda }Y_{<\alpha }\quad \text {for limit } \lambda . \end{aligned}$$

We then set \(Y:=\bigcup _{\alpha<\nu }Y_{<\alpha }\) and observe \(Y_{<\alpha }=\{\langle \gamma ,x\rangle \in Y\,|\,\gamma <\alpha \}\) for \(\alpha <\nu \), as in the presentation after Theorem 1.6. Our task is to establish

$$\begin{aligned} \{x\in {\mathbb {N}}\,|\,\langle \alpha ,x\rangle \in Y\}=\{x\in \mathbb N\,|\,\varphi (x,\alpha ,Y_{<\alpha },{\textbf{Z}})\}, \end{aligned}$$

where the left side is commonly denoted by \(Y_\alpha \). The claim reduces to

$$\begin{aligned} \varphi (x,\alpha ,Y_{<\alpha },\textbf{Z})\leftrightarrow \psi (x,\alpha ,Y_{<\alpha },\textbf{Z})^{\textsf{Ad}(\alpha +1)}. \end{aligned}$$

This equivalence holds by the choice of \(\psi \), once we have established \(Y_{<\alpha }\in \textsf{Ad}(\alpha +1)\). We show the latter by induction on \(\alpha <\nu \). In the crucial case of a limit \(\alpha \), we get

$$\begin{aligned} \psi (x,\gamma ,Y_{<\gamma },\textbf{Z})^{\textsf{Ad}(\gamma +1)}\leftrightarrow \psi (x,\gamma ,Y_{<\gamma },\textbf{Z})^{\textsf{Ad}(\alpha )}\quad \text {for}\quad \gamma <\alpha . \end{aligned}$$

Indeed, both sides are equivalent to \(\varphi (x,\gamma ,Y_{<\gamma },{\textbf{Z}})\), as \(Y_{<\gamma }\in \textsf{Ad}(\gamma +1)\subseteq \textsf{Ad}(\alpha )\) holds by induction hypothesis. So we can view \(\alpha \ge \gamma \mapsto Y_{<\gamma }\) as dependent on \(\textsf{Ad}(\alpha )\) rather than \(\alpha \ni \gamma \mapsto \textsf{Ad}(\gamma )\). Now since \(\textsf{Ad}(\alpha +1)\) contains \(\textsf{Ad}(\alpha )\), it will also contain \(Y_{<\alpha }\), as admissible sets are closed under primitive recursive set functions. \(\square \)
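The proof above relies on the classical characterization of \(\Pi ^1_1\)-statements via well founded trees and rank functions that descend along branches. Purely as an illustration (a finite miniature that is not part of the formal argument, with natural numbers in place of ordinals), this connection can be made concrete:

```python
# A tree is a set of finite sequences closed under prefixes.  It is well
# founded iff it admits f: T -> Ord with f(sigma) > f(tau) whenever sigma
# is a proper prefix of tau.  For finite trees, ordinal values can be
# replaced by natural numbers: take f(node) = height of the subtree below.

def rank_function(tree):
    """Assign to each node the height of the subtree it roots."""
    def height(node):
        children = [t for t in tree
                    if len(t) == len(node) + 1 and t[:len(node)] == node]
        return 0 if not children else 1 + max(height(c) for c in children)
    return {node: height(node) for node in tree}

def descends_along_branches(tree, f):
    """Check f(sigma) > f(tau) for every proper prefix sigma of tau."""
    return all(f[s] > f[t]
               for s in tree for t in tree
               if len(s) < len(t) and t[:len(s)] == s)

tree = {(), (0,), (1,), (0, 0), (0, 1), (0, 0, 0)}
f = rank_function(tree)
assert descends_along_branches(tree, f)
```

In the infinite setting of the proof, the crucial point is that such an f can be found inside any admissible set that contains the well founded tree T.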

In Sect. 2 we have constructed a linear order \(\psi _\nu (D)\), relative to a given well order \(\nu \) and predilator D. Besides the statements (i,ii) and (iv) that have been recalled above, Theorem 1.6 involves the following assertion:

  1. (iii)

    if D is a dilator (rather than just a predilator), then \(\psi _\nu (D)\) is a well order.

We now combine the previous results in order to deduce our main theorem.

Proof of Theorem 1.6

Due to Corollary 2.2 and Theorem 2.9, the order \(\psi _\nu (D)\) is the unique \(\nu \)-fixed point of D, up to isomorphism. Together with Theorem 3.12, it follows that we have

$$\begin{aligned} (i)\quad \Rightarrow \quad (ii)\quad \Leftrightarrow \quad (iii) \end{aligned}$$

for any well order \(\nu \), provably in \(\textsf{RCA}_0\). As in the desired Theorem 1.6, we now assume that \(\nu \) is infinite (though this could probably be avoided). From Corollary 4.4 we know that (ii) entails \(\Pi ^1_1\)-comprehension. To show that (ii) implies (i) over the theory \(\textsf{RCA}_0\), it is thus enough to prove the same implication in \(\textsf{ATR}_0\) or indeed in the conservative extension \(\mathsf {ATR_0^{set}}\). As stated in the introduction, our version of \(\mathsf {ATR_0^{set}}\) contains the axiom of countability, which is included in [55] but marked as ‘optional’ in [54]. Let us also recall that \(\mathsf {ATR_0^{set}}\) contains axiom beta, which allows us to assume that \(\nu \) is an ordinal (rather than just a well order). Over the theory \(\mathsf {ATR_0^{set}}\), Theorem 9.1 and Proposition 9.2 yield

$$\begin{aligned} (ii)\quad \Rightarrow \quad (iv)\quad \Rightarrow \quad (i), \end{aligned}$$

which closes our circle of implications. \(\square \)

In the introduction, we have stated a corollary which asserts that (ii) and (iii) for \(\nu =\omega \) are equivalent to the following:

  1. (i’)

    every subset of \({\mathbb {N}}\) is contained in a countable \(\beta \)-model of \(\Pi ^1_1\)-comprehension.

This result holds by our main theorem and the following standard argument.

Proof of Corollary 1.7

We first assume (i’) and derive (ii) for \(\nu =\omega \), over \(\textsf{RCA}_0\). In fact we may work in \(\textsf{ATR}_0\) (e. g. by [55, Exercise VII.2.10]). Due to Theorem 1.6, it is enough to establish \(\Pi ^1_1\)-recursion along \(\omega \). Given a \(\Pi ^1_1\)-formula \(\varphi (x,n,X,{\textbf{Z}})\) and parameters \({\textbf{Z}}\), we invoke (i’) to get a countable \(\beta \)-model \({\mathcal {M}}\ni {\textbf{Z}}\) of \(\Pi ^1_1\)-comprehension. Satisfaction in \({\mathcal {M}}\) is arithmetical for instances of \(\varphi \) (cf. [55, Definition VII.2.1]). We can thus use arithmetical recursion to construct the set

$$\begin{aligned} Y=\{\langle n,x\rangle \in \omega \times {\mathbb {N}}\,|\,\mathcal M\vDash \varphi (x,n,Y^n,{\textbf{Z}})\}, \end{aligned}$$

with \(Y^n=\{\langle m,x\rangle \in Y\,|\,m<n\}\) as before. The given definition presumes \(Y^n\in {\mathcal {M}}\), which we get by induction: in the step, \(\Pi ^1_1\)-comprehension in \({\mathcal {M}}\) yields

$$\begin{aligned} Y^{n+1}=Y^n\cup \{\langle n,x\rangle \,|\,x\in {\mathbb {N}}\text { and }{\mathcal {M}}\vDash \varphi (x,n,Y^n,{\textbf{Z}})\}\in {\mathcal {M}}. \end{aligned}$$

Since \({\mathcal {M}}\) is a \(\beta \)-model (cf. [55, Lemma VII.2.6]), we have

$$\begin{aligned} \varphi (x,n,Y^n,{\textbf{Z}})\quad \leftrightarrow \quad \mathcal M\vDash \varphi (x,n,Y^n,{\textbf{Z}}). \end{aligned}$$

In the notation from the introduction we thus have \(H_\varphi (Y)\), as needed to establish the given instance of \(\Pi ^1_1\)-recursion.

To show that (ii) for \(\nu =\omega \) entails (i’), we may work over \(\mathsf {ATR_0^{set}}\), as in the proof of Theorem 1.6. By the latter, we get a hierarchy of admissible sets \({\text {Ad}}(m)\in {\text {Ad}}(n)\) for \(m<n<\omega \), where we can assume that \({\text {Ad}}(0)\) contains a given subset of \({\mathbb {N}}\). Let us put

$$\begin{aligned} {\mathcal {S}}:=\{Z\in A\,|\,Z\subseteq {\mathbb {N}}\}\quad \text {with}\quad A:=\textstyle \bigcup _{n<\omega }{\text {Ad}}(n). \end{aligned}$$

We shall show that \({\mathcal {M}}:=({\mathbb {N}},{\mathcal {S}})\) is the \(\beta \)-model required by (i’). First note that the countability of \({\mathcal {S}}\) is for free, because \(\mathsf {ATR_0^{set}}\) includes an axiom that makes all sets countable (cf. the previous proof). To show that \({\mathcal {M}}\) is a \(\beta \)-model, we consider an arbitrary \(\Pi ^1_1\)-formula \(\varphi (x,Z)\). As in the proof of Proposition 9.2, we obtain a \(\Sigma \)-formula \(\psi (x,Z)\) such that \(\varphi (x,Z)\) and \(\psi (x,Z)^{{\text {Ad}}(n)}\) are equivalent for \(Z\in {\text {Ad}}(n)\). The indicated proof of equivalence relativizes to A (for details see [31, Section 7] or [41, Section 3.3.2], noting that \(A\vDash \textsf{KPl}^{{\text {r}}}\)). This means that we get

$$\begin{aligned} \varphi (x,Z)\leftrightarrow \psi (x,Z)^{{\text {Ad}}(n)}\leftrightarrow \mathcal M\vDash \varphi (x,Z)\quad \text {when}\quad Z\in {\text {Ad}}(n). \end{aligned}$$

As any \(Z\in A\) is contained in \({\text {Ad}}(n)\) for some \(n\in {\mathbb {N}}\), it follows that \({\mathcal {M}}\) is a \(\beta \)-model. Invoking bounded separation in \({\text {Ad}}(n+1)\), we also see that \(Z\in {\text {Ad}}(n)\) entails

$$\begin{aligned} \{x\in {\mathbb {N}}\,|\,{\mathcal {M}}\vDash \varphi (x,Z)\}=\{x\in \mathbb N\,|\,\psi (x,Z)^{{\text {Ad}}(n)}\}\in {\text {Ad}}(n+1)\subseteq A, \end{aligned}$$

which shows that \({\mathcal {M}}\) satisfies \(\Pi ^1_1\)-comprehension. \(\square \)
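The stagewise construction of Y in the first half of the proof follows a simple pattern: each level is determined by applying a fixed predicate to the part of the hierarchy built so far. The following sketch works under obvious simplifications (finitely many stages and a toy predicate `sat` in place of satisfaction in \({\mathcal {M}}\); both are our assumptions, not objects from the paper):

```python
def build_hierarchy(sat, stages, universe):
    """Build Y = {(n, x) | sat(x, n, Y^n)} stage by stage, where
    Y^n collects the pairs (m, x) already placed in Y with m < n."""
    Y = set()
    for n in range(stages):
        Yn = {(m, x) for (m, x) in Y if m < n}  # Y^n is complete before stage n
        Y |= {(n, x) for x in universe if sat(x, n, Yn)}
    return Y

# toy predicate: x enters at stage n iff x is at most the size of Y^n
sat = lambda x, n, Yn: x <= len(Yn)
Y = build_hierarchy(sat, 3, range(5))
```

The induction in the proof (which shows \(Y^n\in {\mathcal {M}}\) at every stage) corresponds to the fact that `Yn` is fully determined before stage `n` makes use of it.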

To conclude this paper, we derive the final result that was stated in the introduction. It is concerned with the principle of \(\Pi ^1_1\)-transfinite recursion, which asserts that statement (i) of Theorem 1.6 holds for every well order \(\nu \).

Proof of Corollary 1.8

Consider the statements (i) to (iii) from Theorem 1.6. For each of these statements, we define the variants

(\(\forall \)n):

statement (n) holds for every well order \(\nu \),

(\(\infty \)n):

statement (n) holds for every infinite well order \(\nu \).

By Theorem 1.6, statements (\(\infty \)i) and (\(\infty \)ii) and (\(\infty \)iii) are pairwise equivalent. The corollary claims that the same holds for (\(\forall \)i) and (\(\forall \)ii) and (\(\forall \)iii). This is true because statements (\(\forall \)n) and (\(\infty \)n) are in fact equivalent. The latter is immediate in the case of (i). For the other statements, it follows from Corollary 2.10 (in conjunction with Corollary 2.2 and Theorem 2.9). \(\square \)