A few more dissimilarities between second-order arithmetic and set theory

Fujimoto, Kentaro

doi:10.1007/s00153-022-00829-3

Open access
Published: 09 July 2022

Volume 62, pages 147–206, (2023)

Archive for Mathematical Logic


Kentaro Fujimoto ORCID: orcid.org/0000-0002-4830-5861¹


Abstract

Second-order arithmetic and class theory are second-order theories of mathematical subjects of foundational importance, namely, arithmetic and set theory. Despite the similarity in appearance, there turned out to be significant mathematical dissimilarities between them. The present paper studies various principles in class theory, from such a comparative perspective between second-order arithmetic and class theory, and presents a few new dissimilarities between them.


1 Introduction

The study of second-order set theory, also known as class theory, was recently reinvigorated with various motivations. In particular, the development of ordinal analysis and reverse mathematics brings new perspectives and techniques to the study of subsystems of Morse–Kelley theory ${\mathsf {M}}{\mathsf {K}}$, which has been a driving force of the recent trend of research on class theory.^{Footnote 1} In the course of the recent development of class theory, several significant dissimilarities between second-order arithmetic and class theory have been discovered. Both second-order arithmetic and class theory are second-order theories of mathematical subjects of foundational importance, namely, arithmetic and set theory. However, it turned out that the class-theoretic counterparts of some important theorems in second-order arithmetic fail in class theory, and some powerful tools and techniques in second-order arithmetic are not available in class theory. The existence of such dissimilarities, at the same time, attracts interest in “non-trivial” similarities; even if the standard or known proofs of some theorems in second-order arithmetic are no longer valid in class theory, it is sometimes the case that the corresponding theorems can be proven in class theory by different types of proofs. In the present paper, we present a few new such dissimilarities and “non-trivial” similarities.

It is well known that the schema of $\omega $-model reflection is equivalent to the schema of transfinite induction (also known as the schema of Bar induction). In the notation of the standard textbook [25], the system $\Pi ^{1}_{\infty }\text {-}\mathsf {RFN}$ of $\omega $-model reflection and the system $\Pi ^{1}_{\infty }\text {-}{\mathsf {T}}{\mathsf {I}}$ of transfinite induction (also referred to as $\Pi ^{1}_{\infty }\text {-}{\mathsf {B}}{\mathsf {i}}$ in [16]) have exactly the same theorems in the language of second-order arithmetic. They are also proof-theoretically equivalent to the first-order system ${\mathsf {I}}{\mathsf {D}}_{1}$ of inductive definitions as well as its second-order counterpart $\mathrm {LFP}_{0}^{-}$ (also referred to as $( {\mathsf {I}}{\mathsf {D}}_{1}^{2} )_{0}$ in [20]). Transfinite induction concerns the notion of well-foundedness, and, while the notion of well-foundedness is $\Pi ^{1}_{1}$-complete in second-order arithmetic, it is only elementary in class theory. Hence, the notion of well-foundedness is less robust in class theory than in second-order arithmetic, and the class-theoretic counterpart of $\Pi ^{1}_{\infty }\text {-}{\mathsf {T}}{\mathsf {I}}$ is naturally expected to be significantly weaker in the context of class theory than it is in the context of second-order arithmetic, which will be shown to be indeed the case in the present paper.

In the present paper, we will mainly study the class-theoretic counterparts of the systems of transfinite induction and $\omega $-reflection principle, as well as some related principles. We will show that transfinite induction is quite a weak principle in class theory, as is expected, and not equivalent to the class-theoretic counterpart of $\omega $-reflection. In fact, as we will show, the aforementioned three systems $\Pi ^{1}_{\infty } \text {-} \mathsf {RFN}$, $\Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}$, and ${\mathsf {I}}{\mathsf {D}}_{1}$, are all pairwise inequivalent in class theory, while they are all equivalent in second-order arithmetic. In addition, among other results, we will give an analysis of the class-theoretic counterpart of the axiom of $\Sigma ^{1}_{1}$ dependent choice, which we call $\Sigma ^{1}_{1}$ dependent collection, in relation to subsystems of $\Pi ^{1}_{\infty } \text {-} \mathsf {RFN}$; as a corollary, we will obtain an alternative proof of Sato’s theorem [22] that the class-theoretic counterpart $\mathsf {ETR}$ of the system of arithmetical transfinite recursion $\mathsf {ATR}$ is weaker than the class-theoretic counterpart of the system of $\Sigma ^{1}_{1}$ choice; $\Sigma ^{1}_{1}$ dependent collection will also be shown to be proof-theoretically equivalent to $\Pi ^{1}_{2} \text {-} \mathrm {RFN}$ in class theory, but the proof is quite different from the known proof of the equivalence of their second-order arithmetical counterparts. To conclude the paper, we will briefly consider alternative types of reflection principles.

Remark 1.1

Neither the Axiom of Choice ($\mathrm {AC}$) nor Global Choice ($\mathrm {GC}$) is counted as a default axiom of class theory in the present paper; in particular, neither is included among the axioms of the Von Neumann–Bernays–Gödel theory $\mathsf {NBG}$. However, the addition of $\mathrm {AC}$ or $\mathrm {GC}$ to $\mathsf {NBG}$ does not affect any of the proofs in the present paper, while the assumption of them would make some proofs simpler, and all the results of ${\mathcal {L}}_{\in }$-conservation, relative consistency, etc., in the present paper concerning class theory still hold even when we assume either $\mathrm {AC}$ or $\mathrm {GC}$.^{Footnote 2}

2 Definitions and basic facts

2.1 Basic systems

Let ${\mathcal {L}}_{\in }$ be the language of first-order set theory. The language ${\mathcal {L}}_{2}$ of second-order set theory, i.e., class theory, is a two-sorted language with variables $x, y, z, \ldots $ of first-sort (“first-order”) and variables $X, Y, Z, \ldots $ of second-sort (“second-order”), whose non-logical symbols are a binary membership predicate $\in _{\mathrm {set}}$ between first-order entities and another binary membership predicate $\in _{\mathrm {class}}$ between first- and second-order entities. We assume that ${\mathcal {L}}_{2}$ possesses the equality symbol $=$ as a logical symbol only for first-order entities, i.e., sets, and the equality between two classes Y and Z is definitionally introduced by putting $Y = Z :\Leftrightarrow \forall z (z \in _{\mathrm {class}}Y \leftrightarrow z \in _{\mathrm {class}}Z )$. The relation $Y = Z$ thus defined is a congruence relation allowing substitution salva veritate. The equality and subset relation between sets and classes are defined in an obvious manner: $x = X :\Leftrightarrow \forall z (z \in _{\mathrm {set}}x \leftrightarrow z \in _{\mathrm {class}}X )$; $x \subset X :\Leftrightarrow \forall z (z \in _{\mathrm {set}}x \rightarrow z \in _{\mathrm {class}}X )$. For simplicity, we will identify $\in _{\mathrm {set}}$ and $\in _{\mathrm {class}}$ throughout the present paper whenever there is no worry of confusion.

For each natural number n, we standardly define collections $\Pi ^{0}_{n}$, $\Sigma ^{0}_{n}$, $\Pi ^{1}_{n}$, and $\Sigma ^{1}_{n}$ of ${\mathcal {L}}_{2}$-formulae in obvious analogy with those in second-order arithmetic: we start by identifying $\Pi ^{0}_{0}$ and $\Sigma ^{0}_{0}$ with the collection of ${\mathcal {L}}_{2}$-formulae only with bounded first-order quantifiers and no second-order quantifiers, which may contain second-order free variables as parameters (“class parameters”), and also identifying $\Pi ^{1}_{0}$ and $\Sigma ^{1}_{0}$ with the collection of elementary formulae, namely, formulae with no second-order quantifiers but possibly with class parameters; then, $\Pi ^{0}_{n + 1}$, $\Sigma ^{0}_{n + 1}$, $\Pi ^{1}_{n + 1}$, $\Sigma ^{1}_{n + 1}$ are defined from $\Sigma ^{0}_{n}$, $\Pi ^{0}_{n}$, $\Sigma ^{1}_{n}$, and $\Pi ^{1}_{n}$, respectively, in the usual manner in terms of alternations of universal and existential quantifiers. We write $\Pi ^{1}_{\infty } = \bigcup _{n} \Pi ^{1}_{n}$ and $\Sigma ^{1}_{\infty } = \bigcup _{n} \Sigma ^{1}_{n}$. Given an ${\mathcal {L}}_{2}$-system $\mathsf {T}$, we call an ${\mathcal {L}}_{2}$-formula a $\Delta ^{i}_{n}$-formula ($i = 0, 1$) in $\mathsf {T}$, when it is equivalent in $\mathsf {T}$ both to some $\Pi ^{i}_{n}$-formula and to some $\Sigma ^{i}_{n}$-formula.

In what follows, we occasionally abuse the notation and say that an ${\mathcal {L}}_{2}$-formula is $\Pi ^{1}_{n}$ or $\Sigma ^{1}_{n}$ when it is equivalent to some $\Pi ^{1}_{n}$- or $\Sigma ^{1}_{n}$-formula in a system in question, respectively. By means of ordered pairs, we can show (in, say, $\mathsf {NBG}$) that for all $\Sigma ^{1}_{n}$-formulae $\Phi $, the result of prefixing a first-order or second-order existential quantifier, i.e., $\exists x \Phi $ or $\exists X \Phi $, is equivalent to a $\Sigma ^{1}_{n}$-formula; the dual holds for $\Pi ^{1}_{n}$.
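
As a small illustration of this contraction by pairing, consider two adjacent second-order existential quantifiers. The following is only a sketch under one particular coding, chosen here for illustration and not taken from the text: two classes X and Y are merged into the single class $W := ( X \times \{ 0 \} ) \cup ( Y \times \{ 1 \} )$, from which they are recovered as the sections $\{ x \mid \langle x, 0 \rangle \in W \}$ and $\{ x \mid \langle x, 1 \rangle \in W \}$.

$$\begin{aligned} \exists X \exists Y \Phi ( X, Y ) \ \leftrightarrow \ \exists W \Phi \bigl ( \{ x \mid \langle x, 0 \rangle \in W \}, \ \{ x \mid \langle x, 1 \rangle \in W \} \bigr ). \end{aligned}$$

For the left-to-right direction, W as above exists by elementary comprehension with X and Y as parameters; for the converse, the two sections of W are classes for the same reason. Substituting the sections into $\Phi $ only replaces atomic formulae such as $x \in X$ by the elementary formula $\langle x, 0 \rangle \in W$, so the number of alternations of second-order quantifiers does not increase.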

The Von Neumann–Bernays–Gödel class theory $\mathsf {NBG}$ consists of the standard first-order set-theoretic axioms of extensionality, pairing, union, powerset, and infinity, and the following four axioms regarding classes.

$$\begin{aligned} \mathrm {ECA}: \quad \exists X \forall x \bigl ( x \in X \leftrightarrow \Phi (x) \bigr ), \ \text {for all elementary } \Phi \text { with } X \text { not free}, \end{aligned}$$

where $\mathrm {ECA}$ is an acronym for the Elementary Comprehension Axiom; the (unique) class X satisfying $\forall x (x \in X \leftrightarrow \Phi (x))$ will be denoted by $\{ x \mid \Phi (x ) \}$.

$$\begin{aligned} \text {Class Separation}: \quad&\forall X \forall x \exists y (y = x \cap X ) \\ \text {Class Foundation}: \quad&\forall X [ X \ne \emptyset \rightarrow (\exists x \in X ) (\forall y \in x ) (y \not \in X ) ] \\ \text {Class Replacement}: \quad&\forall X [ Fun (X ) \rightarrow \forall x \exists y (X''x \subset y ) ], \end{aligned}$$

where $x \cap X := \{ z \mid z \in x \wedge z \in X \}$, $ Fun ( X )$ expresses “X is a function”, and $X''x := \{ u \mid (\exists v \in x ) \langle v, u \rangle \in X \}$ (i.e., the image of x under X). It is well known that $\mathsf {NBG}$ is finitely axiomatizable. Note that, as we remarked, neither the Axiom of Choice ($\mathrm {AC}$) nor Global Choice ($\mathrm {GC}$) is included in the axioms of $\mathsf {NBG}$.
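
For instance (a standard illustration, not part of the original text), applying $\mathrm {ECA}$ to the elementary formula $x \not \in x$ yields the class $R := \{ x \mid x \not \in x \}$. No set r can satisfy $r = R$, since such an r would give

$$\begin{aligned} r \in r&\ \leftrightarrow \ r \in R&&\text {(by } r = R \text {)} \\ r \in R&\ \leftrightarrow \ r \not \in r&&\text {(by the definition of } R \text {)}, \end{aligned}$$

which is contradictory. Hence R is a proper class. Class Separation, by contrast, only guarantees that $x \cap R$ is a set for each set x.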

We will occasionally consider adding the following axiom schemata to $\mathsf {NBG}$:

$$\begin{aligned} {\Sigma ^{1}_{n} \text {-} \mathrm {CA}:} \quad&\exists X \forall x \bigl (x \in X \leftrightarrow \Phi (x ) \bigr ) \\ {\Delta ^{1}_{n} \text {-} \mathrm {CA}:} \quad&\forall x \bigl (\Phi (x ) \leftrightarrow \Psi (x ) \bigr ) \rightarrow \exists X \forall x \bigl (x \in X \leftrightarrow \Phi (x ) \bigr ) \\ {\Sigma ^{1}_{n} \text {-} \mathrm {Sep}:} \quad&\forall x \exists w \forall z \bigl (z \in w \leftrightarrow z \in x \wedge \Phi (z ) \bigr ) \\ {\Sigma ^{1}_{n} \text {-} \mathrm {Repl}:} \quad&\forall z \bigl (\forall x \in z \exists ! y \Phi (x, y ) \rightarrow \exists w \forall x \in z \exists y \in w \Phi (x, y ) \bigr ) \\ {\Sigma ^{1}_{n} \text {-} \mathrm {Ind}:}\quad&\forall x \bigl (\forall y \in x \Phi ( y ) \rightarrow \Phi ( x ) \bigr ) \rightarrow \forall x \Phi ( x ) \end{aligned}$$

where $\Phi $ and $\Psi $ are any $\Sigma ^{1}_{n}$- and $\Pi ^{1}_{n}$-formula, respectively, with neither X nor w free; these axiom schemata for $\Pi ^{1}_{n}$-formulae, such as $\Pi ^{1}_{n} \text {-} \mathrm {CA}$, are defined similarly.^{Footnote 3} Throughout the present paper, we stipulate that, whenever we define axioms, the universal closures of displayed formulae are taken as the defined axioms; hence, $\Phi $ and $\Psi $ above may possibly contain other free variables, unless otherwise specified, and the names of the axioms above precisely mean the universal closures of the displayed formulae above. All the axioms listed above are underivable from $\mathsf {NBG}$ for $n > 0$, whereas they are all derivable from $\mathsf {NBG}$ for $n = 0$; note that $\mathrm {ECA}$ is just the same as $\Pi ^{1}_{0} \text {-} \mathrm {CA}$.

Proposition 2.1

The following are provable in $\mathsf {NBG}$.

1.
$\Sigma ^{1}_{n} \text {-} \mathrm {Sep}$ ($\Sigma ^{1}_{n} \text {-} \mathrm {CA}$) and $\Pi ^{1}_{n} \text {-} \mathrm {Sep}$ ($\Pi ^{1}_{n} \text {-} \mathrm {CA}$, resp.) are equivalent.
2.
$\Sigma ^{1}_{n} \text {-} \mathrm {Sep}$ implies $\Pi ^{1}_{n} \text {-} \mathrm {Ind}$, and $\Pi ^{1}_{n} \text {-} \mathrm {Sep}$ implies $\Sigma ^{1}_{n} \text {-} \mathrm {Ind}$.

Proof

Claim 1 is obvious. For claim 2, suppose $\lnot \Phi ( x )$ for some $\Pi ^{1}_{n}$ (or $\Sigma ^{1}_{n}$) formula $\Phi $ and x. Take $a := \{ z \in \mathrm {TC} ( \{ x \} ) \mid \lnot \Phi ( z ) \}$ ($\ne \emptyset $) by $\Sigma ^{1}_{n} \text {-} \mathrm {Sep}$ ($\Pi ^{1}_{n} \text {-} \mathrm {Sep}$, resp.), where $\mathrm {TC} ( y )$ denotes the transitive closure of y. By the foundation axiom, there is an $\in $-minimal $y \in a$. Hence, we have $\lnot \Phi ( y )$ but $\Phi ( z )$ for all $z \in y$. $\square $
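
The equivalences in the first claim come down to taking complements. The following sketch spells this out, with $w'$ and $X'$ as auxiliary names introduced here. Let $\Phi $ be a $\Pi ^{1}_{n}$-formula, so that $\lnot \Phi $ is equivalent to a $\Sigma ^{1}_{n}$-formula.

$$\begin{aligned} \{ z \in x \mid \Phi ( z ) \}&= x {\setminus } w',&&\text {where } w' := \{ z \in x \mid \lnot \Phi ( z ) \} \text { exists by } \Sigma ^{1}_{n} \text {-} \mathrm {Sep}; \\ \{ x \mid \Phi ( x ) \}&= \{ x \mid x \not \in X' \},&&\text {where } X' := \{ x \mid \lnot \Phi ( x ) \} \text { exists by } \Sigma ^{1}_{n} \text {-} \mathrm {CA}. \end{aligned}$$

The set $x {\setminus } w'$ and the class $\{ x \mid x \not \in X' \}$ exist already in $\mathsf {NBG}$, by $\mathrm {ECA}$ and Class Separation with $w'$ and $X'$ as parameters. The converse directions are symmetric.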

In analogy with second-order arithmetic, $\mathsf {NBG}$ corresponds to the system $\mathsf {ACA}_{0}$ of arithmetical comprehension, and (in one view) ${\mathsf {NBG}}+ \Sigma ^{1}_{\infty } \text {-} {\mathrm {Sep}} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Repl}$ corresponds to $\mathsf {ACA}$. We keep the nomenclature $\mathsf {NBG}$, following the long-standing convention, but we call the latter system $\mathsf {ECA}$ in this analogy with second-order arithmetic. We will also consider the following systems:

$$\begin{aligned}&\Pi ^{1}_{n} \text {-} {\mathsf {C}}{\mathsf {A}}_{0} := \mathsf {NBG} + \Pi ^{1}_{n} \text {-} \mathrm {CA}&\Pi ^{1}_{n} \text {-} {\mathsf {C}}{\mathsf {A}} := \mathsf {ECA} + \Pi ^{1}_{n} \text {-} \mathrm {CA}&\\&\Delta ^{1}_{n} \text {-} {\mathsf {C}}{\mathsf {A}}_{0} := \mathsf {NBG} + \Delta ^{1}_{n} \text {-} \mathrm {CA}&\Delta ^{1}_{n} \text {-} {\mathsf {C}}{\mathsf {A}} := \mathsf {ECA} + \Delta ^{1}_{n} \text {-} \mathrm {CA}.&\end{aligned}$$

We will use sans serif fonts to denote systems and normal fonts for axioms and axiom schemata.^{Footnote 4}

For all natural numbers $n \ge 1$ and $i, j \ge 0$, there is known to be a $\Pi ^{0}_{n}$ universal formula $\pi ^{0}_{n, i, j} ( v, u_{1}, \ldots , u_{i}, U_{1}, \ldots , U_{j} )$ only with the displayed variables free such that, for all $\Pi ^{0}_{n}$-formulae $\Phi ( u_{1}, \ldots , u_{i}, U_{1}, \ldots , U_{j} )$ only with the displayed variables free, there is a (standard) natural number e such that

$$\begin{aligned} \mathsf {NBG} \vdash \forall \vec {x} \forall \vec {X} \bigl ( \Phi ( x_{1}, \ldots , x_{i}, X_{1}, \ldots , X_{j} ) \leftrightarrow \pi ^{0}_{n, i, j} ( e, x_{1}, \ldots , x_{i}, X_{1}, \ldots , X_{j} ) \bigr ). \end{aligned}$$

We will suppress the second and third subscripts i and j of $\pi ^{0}_{n, i, j}$, which indicate the numbers of first- and second-order free variables, and simply write $\pi ^{0}_{n}$, when they are clear from the context.

The proof of the next lemma is essentially the standard argument by Skolem functions (such as [25, Lemma V.1.4]); however, due to the lack of $\mathrm {GC}$, we cannot take Skolem functions, and we instead take Skolem “multi-valued” functions.

Lemma 2.2

Let $\Phi ( \vec {z}, Y, \vec {Z} )$ be an elementary formula. Then we can find a $\Pi ^{0}_{2}$-formula $\Psi ( \vec {z}, X, \vec {Z} )$ such that

$$\begin{aligned} \mathsf {NBG} \vdash \ \forall \vec {z} \forall \vec {Z} \bigl ( \exists Y \Phi ( \vec {z}, Y, \vec {Z} ) \leftrightarrow \exists X \Psi ( \vec {z}, X, \vec {Z} ) \bigr ). \end{aligned}$$

Proof

Fix any $\vec {z}$ and $\vec {Z}$. We can assume without loss of generality that $\Phi ( \vec {z}, Y, \vec {Z} )$ is in the following prenex normal form:

$$\begin{aligned} \forall x_{1} \exists y_{1} \cdots \forall x_{k} \exists y_{k} \Theta ( x_{1}, \ldots , x_{k}, y_{1}, \ldots , y_{k}, \vec {z}, Y, \vec {Z} ), \end{aligned}$$

where $\Theta $ is $\Delta ^{0}_{0}$. By meta-induction on k, $\Phi ( \vec {z}, Y, \vec {Z} )$ is shown to be equivalent to

$$\begin{aligned} \exists G_{1} \cdots \exists G_{k} \Bigl (&\forall \vec {x} \exists \vec {y} ( \langle x_{1}, y_{1} \rangle \in G_{1} \wedge \cdots \wedge \langle x_{1}, \ldots , x_{k}, y_{k} \rangle \in G_{k} ) \ \wedge \\&\forall \vec {x} \forall \vec {y} \bigl ( ( \langle x_{1}, y_{1} \rangle \in G_{1} \wedge \cdots \wedge \langle x_{1}, \ldots , x_{k}, y_{k} \rangle \in G_{k} ) \rightarrow \Theta ( \vec {x}, \vec {y}, \vec {z}, Y, \vec {Z} ) \bigr ) \Bigr ), \end{aligned}$$

where $G_{1}, \ldots , G_{k}$ are distinct from each other and $Y, \vec {Z}$. By contracting $G_{i}$s into a single class (and $x_{i}$s and $y_{i}$s into single sets, respectively), we have a $\Delta ^{0}_{1}$-formula $\Theta '$ in $\mathsf {NBG}$ such that $\exists Y \Phi ( \vec {z}, Y, \vec {Z} )$ is equivalent to

$$\begin{aligned} \exists Y \exists G \Bigl ( \forall x \exists y ( \langle x, y \rangle \in G ) \wedge \forall x \forall y \bigl ( \langle x, y \rangle \in G \rightarrow \Theta ' ( x, y, \vec {z}, Y, \vec {Z} ) \bigr ) \Bigr ). \end{aligned}$$

We finally obtain the claim by contracting G and Y into one class. $\square $
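
To see the idea behind the “multi-valued” Skolem functions in the simplest case, take $k = 1$ and suppress the parameters. The witness G named below is one canonical choice, written out here only as an illustration.

$$\begin{aligned} \forall x \exists y \Theta ( x, y ) \ \leftrightarrow \ \exists G \Bigl ( \forall x \exists y ( \langle x, y \rangle \in G ) \wedge \forall x \forall y \bigl ( \langle x, y \rangle \in G \rightarrow \Theta ( x, y ) \bigr ) \Bigr ). \end{aligned}$$

For the left-to-right direction, the class $G := \{ \langle x, y \rangle \mid \Theta ( x, y ) \}$ exists by $\mathrm {ECA}$ and satisfies both conjuncts; it collects all witnesses y for each x, so no choice among them is needed. For the converse, given such a G and any x, pick some y with $\langle x, y \rangle \in G$; then $\Theta ( x, y )$ holds by the second conjunct.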

This implies that, for each $n > 0$ and $\Pi ^{1}_{n}$-formula $\forall Y_{1} \exists Y_{2} \cdots \Phi ( x, X, \vec {Y} )$, there is a (standard) natural number e such that

$$\begin{aligned} \forall x \forall X \bigl ( \forall Y_{1} \exists Y_{2} \cdots \Phi ( x, X, Y_{1}, \ldots , Y_{n} ) \leftrightarrow \forall Y_{1} \exists Y_{2} \cdots \pi ( e, x, X, Y_{1}, \ldots , Y_{n} ) \bigr ), \end{aligned}$$

where $\pi = \pi ^{0}_{2, 1, n + 1}$, if n is even, and $\pi = \lnot \pi ^{0}_{2, 1, n + 1}$, if n is odd. Hence, for each $n > 0$, we have a $\Pi ^{1}_{n}$ universal formula in $\mathsf {NBG}$, which will be denoted by $\pi ^{1}_{n} ( e, x, X )$. Note that we don’t have to consider universal formulae containing more first- and/or second-order free variables because we can always contract multiple free variables into one variable by pairing in $\mathsf {NBG}$. As a result, all the axioms listed above are finitely axiomatizable modulo $\mathsf {NBG}$.

2.2 Well-foundedness and transfinite recursion

For a class X, we will write $x \prec _{X} y$ for $\langle x, y \rangle \in X$. We define a formula $ Wf ( X )$ expressing the well-foundedness of $\prec _{X}$ and a formula $ TI _{\Phi } ( X )$ asserting transfinite induction along $\prec _{X}$ with respect to $\Phi ( x ) \in {\mathcal {L}}_{2}$ (possibly with parameters) as follows:

$$\begin{aligned} Wf ( X )&\ :\Leftrightarrow \ \forall Z \bigl [ \forall x \bigl ( \forall y ( y \prec _{X} x \rightarrow y \in Z ) \rightarrow x \in Z \bigr ) \rightarrow {\mathbb {V}} = Z \bigr ]; \\ TI _{\Phi } ( X )&\ :\Leftrightarrow \ \forall x \bigl ( \forall y ( y \prec _{X} x \rightarrow \Phi ( y ) ) \rightarrow \Phi ( x ) \bigr ) \rightarrow \forall x \Phi ( x ); \end{aligned}$$

the symbol ${\mathbb {V}}$ above denotes the universe of sets, namely, the class $\{ x \mid x = x \}$. For a collection $\Gamma $ of ${\mathcal {L}}_{2}$-formulae, we define the schema $\Gamma \text {-} \mathrm {TI}$ as follows:

$$\begin{aligned} {\Gamma \text {-} \mathrm {TI}:} \quad&\forall X \bigl ( Wf ( X ) \rightarrow TI _{\Phi } ( X ) \bigr ), \ \text {for all}\,\, \Phi \in \Gamma . \end{aligned}$$

Thereby we define the systems of $\Pi ^{1}_{n}$ transfinite induction as follows ($n \in {\mathbb {N}}$):

$$\begin{aligned}&\Pi ^{1}_{n} \text {-} {\mathsf {T}}{\mathsf {I}}_{0} := \mathsf {NBG} + \Pi ^{1}_{n} \text {-} \mathrm {TI}&&\Pi ^{1}_{n} \text {-} {\mathsf {T}}{\mathsf {I}} := \mathsf {ECA} + \Pi ^{1}_{n} \text {-} \mathrm {TI}.&\end{aligned}$$

By the existence of universal formulae, $\Pi ^{1}_{n} \text {-} {\mathsf {T}}{\mathsf {I}}_{0}$ is finitely axiomatizable for all $n \in {\mathbb {N}}$ (except the finite axiomatizability of $\Pi ^{1}_{0} \text {-} {\mathsf {T}}{\mathsf {I}}_{0}$, where $\Pi ^{1}_{0} \text {-} \mathrm {TI}$ is derivable from $\mathsf {NBG}$).
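
As a simple instance (a sketch added for illustration, with E a name introduced here), let $E := \{ \langle y, x \rangle \mid y \in x \}$, so that $y \prec _{E} x$ holds exactly when $y \in x$. Then $ Wf ( E )$ is provable in $\mathsf {NBG}$: if a class $Z \ne {\mathbb {V}}$ satisfied $\forall x \bigl ( \forall y ( y \in x \rightarrow y \in Z ) \rightarrow x \in Z \bigr )$, then Class Foundation applied to the non-empty class $\{ x \mid x \not \in Z \}$ would give some $x \not \in Z$ all of whose elements lie in Z, contradicting the assumption on Z. For this E, transfinite induction is literally $\in $-induction:

$$\begin{aligned} TI _{\Phi } ( E ) \ \leftrightarrow \ \Bigl [ \forall x \bigl ( \forall y \in x \, \Phi ( y ) \rightarrow \Phi ( x ) \bigr ) \rightarrow \forall x \Phi ( x ) \Bigr ]. \end{aligned}$$

Consequently, $\Pi ^{1}_{n} \text {-} \mathrm {TI}$ implies $\Pi ^{1}_{n} \text {-} \mathrm {Ind}$ over $\mathsf {NBG}$.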

The notion of well-foundedness is known to be $\Pi ^{1}_{1}$-complete in second-order arithmetic, but it is elementary in class theory, and the elementarity of well-foundedness causes a number of differences between second-order arithmetic and class theory, as seen in [7, 22, 23] for example. In the present paper, we adopt the following elementary expression of well-foundedness.

Proposition 2.3

For each class X, $ Wf ( X )$ is equivalent to the following in $\mathsf {NBG}$:

$$\begin{aligned} \begin{aligned}&\text {Every non-empty set has a } \prec _{X}\text {-minimal element; in other words}, \\&\text {there is no non-empty set } c \text { such that } ( \forall x \in c ) ( \exists y \in c ) ( y \prec _{X} x ). \end{aligned} \end{aligned}$$

(1)

Proof

Suppose (1) fails. Take $c \ne \emptyset $ with $( \forall x \in c ) ( \exists y \in c ) ( y \prec _{X} x )$. Let $Y = {\mathbb {V}} {\setminus } c$. Since $Y \ne {\mathbb {V}}$, it suffices to show that, for all x,

$$\begin{aligned} \forall y ( y \prec _{X} x \rightarrow y \in Y ) \rightarrow x \in Y. \end{aligned}$$

(2)

Take any x. If $x \not \in c$, then the succedent of (2) trivially holds. Otherwise, there is $y \prec _{X} x$ such that $y \in c$ and thus $y \not \in Y$, and the antecedent of (2) fails.

For the converse, suppose $\lnot Wf ( X )$. There is some class $Z \ne {\mathbb {V}}$ such that

$$\begin{aligned} \forall x \bigl ( x \not \in Z \rightarrow \exists y ( y \prec _{X} x \wedge y \not \in Z ) \bigr ). \end{aligned}$$

(3)

From (3), we will construct a pseudo $\omega $-descending chain of $\prec _{X}$, by which we mean a function $f :\omega \rightarrow {\mathbb {V}}$ such that

$$\begin{aligned} ( \forall n \in \omega ) \bigl ( f ( n ) \ne \emptyset \wedge ( \forall x \in f ( n ) ) ( \exists y \in f ( n + 1 ) ) ( {y \prec _{X} x} ) \bigr ). \end{aligned}$$

First, for each $x \not \in Z$, we define

$$\begin{aligned} g ( x ) := \{ z \mid z \prec _{X} x \wedge z \not \in Z \wedge \forall y ( y \prec _{X} x \wedge y \not \in Z \rightarrow {\mathsf {r}}{\mathsf {k}} ( z ) \le {\mathsf {r}}{\mathsf {k}} ( y ) ) \}, \end{aligned}$$

where ${\mathsf {r}}{\mathsf {k}} ( w )$ denotes the rank of a set w; that is, g(x) is the set of sets z with the least rank such that $z \prec _{X} x$ and $z \not \in Z$; this is an application of Scott’s trick and g(x) is a non-empty set by (3) for all $x \not \in Z$. We thereby recursively define $f :\omega \rightarrow {\mathbb {V}}$ so that

$$\begin{aligned} f ( 0 )&:= \{ z \mid z \not \in Z \wedge \forall y ( y \not \in Z \rightarrow {\mathsf {r}}{\mathsf {k}} ( z ) \le {\mathsf {r}}{\mathsf {k}} ( y ) ) \} \\ f ( k + 1 )&:= \bigcup \{ g ( z ) \mid z \in f ( k ) \}; \end{aligned}$$

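note that f(0) is the set of $z \not \in Z$ with the least rank. Let c be the union of the range of f, that is, $c := \bigcup _{n \in \omega } f ( n )$. Since $f ( n ) \ne \emptyset $ for all $n \in \omega $, c is non-empty and, by the construction of f, has no $\prec _{X}$-minimal element. $\square $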
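
The last step of the proof can be unfolded as follows. This is a sketch that reads c as the union $\bigcup _{n \in \omega } f ( n )$ of the values of f, which is what the argument uses. First, every element of each $f ( n )$ lies outside Z, by induction on n, so g is defined on all elements of $f ( n )$. Then, for any $x \in c$,

$$\begin{aligned} x \in f ( n ) \text { for some } n \in \omega \ \Rightarrow \ \emptyset \ne g ( x ) \subset f ( n + 1 ) \subset c \ \Rightarrow \ ( \exists y \in c ) ( y \prec _{X} x ), \end{aligned}$$

since every element of $g ( x )$ is $\prec _{X}$-below x. As $f ( 0 ) \subset c$ and $f ( 0 ) \ne \emptyset $, the set c is a non-empty set violating (1).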

We will denote von Neumann ordinals by lowercase Greek letters $\alpha $, $\beta $, ..., possibly with indices, and write $\alpha < \beta $ for $\alpha \in \beta $, viz., the canonical ordering of the ordinals. For a class X and a set a, we define $( X )_{a} := \{ x \mid \langle x, a \rangle \in X \}$. Then, for an ${\mathcal {L}}_{2}$-formula $\Phi ( x, z, Z )$ possibly with parameters, we define

$$\begin{aligned} {\mathcal {H}}_{\Phi } ( X, Y )&\ :\Leftrightarrow \ \forall a \Bigl ( ( Y )_{a} = \Bigl \{ x \mid \Phi \bigl ( x, a, \{ \langle u, b \rangle \mid u \in ( Y )_{b} \wedge b \prec _{X} a \} \bigr ) \Bigr \} \Bigr ) \\ {\mathcal {H}}_{\Phi } ( \alpha , Y )&\ :\Leftrightarrow \ ( \forall \beta< \alpha ) \Bigl ( ( Y )_{\beta } \ = \ \Bigl \{ x \mid \Phi \bigl ( x, \beta , \{ \langle u, \gamma \rangle \mid u \in ( Y )_{\gamma } \wedge \gamma < \beta \} \bigr ) \Bigr \} \Bigr ). \end{aligned}$$

We thereby define the axiom $\mathrm {ETR}$ of elementary transfinite recursion, which is the class-theoretic version of Friedman’s axiom $\mathrm {ATR}$ of arithmetical transfinite recursion, and its restriction $\mathrm {ETR} ( \alpha )$ to the set wellordering $\{ \langle \gamma , \beta \rangle \mid \gamma< \beta < \alpha \}$.

$$\begin{aligned} {\mathrm {ETR}:} \quad&\forall X \bigl ( Wf ( X ) \rightarrow \exists Y {\mathcal {H}}_{\Phi } ( X, Y ) \bigr ), \ \text {for all } \Phi \in \Pi ^{1}_{0} \text { without } Y \text { free}. \\ {\mathrm {ETR} (\alpha ):} \quad&\exists Y {\mathcal {H}}_{\Phi } ( \alpha , Y ), \ \text {for all } \Phi \in \Pi ^{1}_{0} \text { without } Y \text { free}. \end{aligned}$$

Obviously, $\mathrm {ETR}$ implies $\mathrm {ETR} ( \alpha )$ for all $\alpha \in On$, where On denotes the class of ordinals. We thus introduce four systems:

$$\begin{aligned} \mathsf {ETR}_{0}&:= \mathsf {NBG} + \mathrm {ETR}&\mathsf {ETR}&:= \mathsf {ECA} + \mathrm {ETR}&\\ \mathsf {ECA}_{0}^{+}&:= \mathsf {NBG} + \mathrm {ETR} ( \omega )&\mathsf {ECA}^{+}&:= \mathsf {ECA} + \mathrm {ETR} ( \omega );&\end{aligned}$$

recall that $\mathsf {ECA} = \mathsf {NBG} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Sep} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Repl}$. We can show, in a manner exactly parallel to second-order arithmetic, that both $\mathsf {ETR}_{0}$ and $\mathsf {ECA}_{0}^{+}$ are finitely axiomatizable; see [7, Theorem 90]. The third system $\mathsf {ECA}_{0}^{+}$ corresponds (in one sense) to the system $\mathsf {ACA}_{0}^{+}$ of $\omega $-Turing jumps in second-order arithmetic.^{Footnote 5}
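
To see how ${\mathcal {H}}_{\Phi }$ unfolds in a concrete case, consider the following choice of $\Phi $, which is an illustration introduced here and not an example from the paper:

$$\begin{aligned} \Phi ( x, \beta , Z ) \ :\Leftrightarrow \ ( \exists \gamma \in \beta ) ( \forall w \in x ) ( \langle w, \gamma \rangle \in Z ). \end{aligned}$$

Substituting $Z = \{ \langle u, \gamma \rangle \mid u \in ( Y )_{\gamma } \wedge \gamma < \beta \}$, the condition ${\mathcal {H}}_{\Phi } ( \alpha , Y )$ becomes

$$\begin{aligned} ( \forall \beta < \alpha ) \Bigl ( ( Y )_{\beta } = \bigl \{ x \mid ( \exists \gamma < \beta ) ( x \subset ( Y )_{\gamma } ) \bigr \} \Bigr ), \end{aligned}$$

so that, by induction on $\beta < \alpha $, $( Y )_{\beta }$ is the $\beta $-th level $V_{\beta }$ of the cumulative hierarchy, because $V_{\beta } = \bigcup _{\gamma < \beta } {\mathcal {P}} ( V_{\gamma } )$. Here every stage is a set, so this particular instance is harmless; the strength of $\mathrm {ETR}$ comes from instances whose stages are proper classes.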

2.3 Coded ${\mathbb {V}}$-models

We will modify the notion of coded $\omega $-model in second-order arithmetic for class theory, and define the notion of coded ${\mathbb {V}}$-model, by which we mean a class S viewed as an ${\mathcal {L}}_{2}$-structure $\bigl \langle {\mathbb {V}}, \{ ( S )_{x} \mid x \in {\mathbb {V}} \} \bigr \rangle $ in which the membership relations are standardly interpreted.

Definition 2.4

Let S be any class viewed as a coded ${\mathbb {V}}$-model. For each (standard) ${\mathcal {L}}_{2}$-formula $\Phi $, we inductively define the relativization of $\Phi $ to S, written as $\Phi ^{S}$ henceforth, as follows: $ A ^{S} :\Leftrightarrow A $ for all atomic ${\mathcal {L}}_{2}$-formulae $ A $, and

$$\begin{aligned}&( \lnot \Psi )^{S} \, :\Leftrightarrow \, \lnot \Psi ^{S}&( \Psi \wedge \Theta )^{S} \, :\Leftrightarrow \, \Psi ^{S} \wedge \Theta ^{S}&\\&( \forall z \Psi ( z ) )^{S} \, :\Leftrightarrow \, \forall z \Psi ^{S} ( z )&( \forall Z \Psi ( Z ) )^{S} \, :\Leftrightarrow \, \forall z \Psi ^{S} ( ( S )_{z} ).&\end{aligned}$$

The relativization $\Phi ^{S}$ of an ${\mathcal {L}}_{2}$-formula is always elementary, and thus, in particular, $\Psi ^{S}$ holds in $\mathsf {NBG}$ for every instance $\Psi $ of $\Sigma ^{1}_{\infty } \text {-} \mathrm {Sep}$ or $\Sigma ^{1}_{\infty } \text {-} \mathrm {Repl}$. Furthermore, for every elementary formula $\Phi $, $\Phi ^{S}$ is identical with $\Phi $.

For all classes X and S, we write $X \, {\dot{\in }} \, S$ for $( \exists x ) ( X = ( S )_{x} )$, which informally expresses that X is a member of the second-order domain of the coded ${\mathbb {V}}$-model S; hence, the coded ${\mathbb {V}}$-model S can be alternatively expressed as $\langle {\mathbb {V}}, \{ X \mid X \, {\dot{\in }} \, S \} \rangle $. With this notation, $( \forall Z \Psi ( Z ) )^{S}$ is equivalent to $( \forall Z \, {\dot{\in }} \, S ) \Psi ^{S} ( Z )$.
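
For example (an illustration added here, with U a free class variable), read $\exists $ as $\lnot \forall \lnot $ and relativize the statement that U has a complement:

$$\begin{aligned} \bigl ( \exists Z \forall x ( x \in Z \leftrightarrow x \not \in U ) \bigr )^{S} \ \Leftrightarrow \ \exists z \forall x \bigl ( \langle x, z \rangle \in S \leftrightarrow x \not \in U \bigr ). \end{aligned}$$

The second-order quantifier has become a first-order quantifier over indices z, and membership in $( S )_{z}$ unfolds to $\langle x, z \rangle \in S$, so the result is elementary with S and U as class parameters. In the dotted notation, the right-hand side reads $( \exists Z \, {\dot{\in }} \, S ) \forall x ( x \in Z \leftrightarrow x \not \in U )$.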

We next consider a restriction of a coded ${\mathbb {V}}$-model to a set. In what follows, we stipulate for simplicity that when we treat an ordered pair $\langle M, N \rangle $ of sets as an ${\mathcal {L}}_{2}$-structure only with the specification of the first-order domain M and second-order domain N, the membership relations are always assumed to be standardly interpreted in the ${\mathcal {L}}_{2}$-structure unless otherwise specified. We make a parallel stipulation for set-sized ${\mathcal {L}}_{\in }$-structures: if a set M is treated as an ${\mathcal {L}}_{\in }$-structure, the membership relation is standardly interpreted unless otherwise specified.

Given a coded ${\mathbb {V}}$-model S and a set M, we denote the set-sized ${\mathcal {L}}_{2}$-structure $\langle M, \, \{ ( S )_{x} \cap M \mid x \in M \} \rangle $ by $S^{M}$. For a set x and a class X, ${^{x} X}$ denotes the class of (set) functions from x to X. Then, given a set M, each $h \in {^{\omega } M}$ can be viewed as a first-order variable assignment on $S^{M}$ that assigns h(i) to the ith first-order variable $u_{i}$, and also as a second-order variable assignment on $S^{M}$ that assigns $( S )_{h ( j )} \cap M$ to the jth second-order variable $U_{j}$.

Let $ Fml _{2}$ be the (countable) set of codes of ${\mathcal {L}}_{2}$-formulae. For each ${\mathcal {L}}_{2}$-formula $\Phi $, $ Fml _{2}$ contains its code and we will simply denote it by $\Phi $; this notation neglects the distinction of formulae in the usual sense (as meta-theoretic syntactic entities) and their codes (which are only sets), but there should be no danger of confusion. Then, for each $f, g \in {^{\omega } M}$ and $\Phi \in Fml _{2}$ we write $S^{M} \models \Phi \, [ f, g ]$ to mean that $\Phi $ is satisfied in the set-sized ${\mathcal {L}}_{2}$-structure $S^{M}$ under the first-order variable assignment f and the second-order variable assignment g.

For a class X, let us write $X \,{\dot{\in }} \, S^{M}$ for $\exists z ( z \in M \wedge ( S )_{z} = X )$; with this notation, the set-sized ${\mathcal {L}}_{2}$-structure $S^{M}$ can also be expressed as $\langle M, \, \{ X \cap M \mid X \, {\dot{\in }} \, S^{M} \} \rangle $. When some specific sets $x_{1}, \ldots , x_{m} \in M$ and classes $X_{1}, \ldots , X_{n} \, {\dot{\in }} \, S^{M}$ are given, the relation $S^{M} \models \Phi ( x_{1}, \ldots , x_{m}, X_{1} \cap M, \ldots , X_{n} \cap M )$ (or $S^{M} \models \Phi ( \vec {x}, \vec {X} )$ more simply) is defined in the obvious manner: that is, it means that $S^{M} \models \Phi [ f, g ]$ for all variable assignments f and g that assign $x_{i}$ to $u_{i}$ ($1 \le i \le m$) and $X_{j} \cap M$ to $U_{j}$ ($1 \le j \le n$), respectively.

Now, the next notation is useful.

Definition 2.5

Let S be a coded ${\mathbb {V}}$-model and M any set. For each (standard) ${\mathcal {L}}_{2}$-formula $\Phi $, we inductively define the relativization of $\Phi $ to $S^{M}$, written as $\Phi ^{S^{M}}$ henceforth, as follows: $ A ^{S^{M}} :\Leftrightarrow A $ for all atomic ${\mathcal {L}}_{2}$-formulae $ A $, and

$$\begin{aligned}&( \lnot \Psi )^{S^{M}} \, :\Leftrightarrow \, \lnot \Psi ^{S^{M}}&&( \Psi \wedge \Theta )^{S^{M}} \, :\Leftrightarrow \, \Psi ^{S^{M}} \wedge \Theta ^{S^{M}}&\\&( \forall z \Psi ( z ) )^{S^{M}} \, :\Leftrightarrow \, ( \forall z \in M ) \Psi ^{S^{M}} ( z )&&( \forall Z \Psi ( Z ) )^{S^{M}} \, :\Leftrightarrow \, ( \forall z \in M ) \Psi ^{S^{M}} ( ( S )_{z} );&\end{aligned}$$

note that $( \forall Z \Psi ( Z ) )^{S^{M}}$ is equivalent to $( \forall Z \, {\dot{\in }} \, S^{M} ) \Psi ^{S^{M}} ( Z )$ by definition.

In the definition of $\Phi ^{S^{M}}$, classes are not restricted to the set M, but this notation is justified by the following proposition, which can be standardly shown by induction on the complexity of $\Phi $; recall that equality between classes is not counted as a primitive predicate symbol of ${\mathcal {L}}_{2}$ but defined in terms of $\in $.

Proposition 2.6

Let $\Phi ( \vec {u}, U_{1}, \ldots , U_{k} )$ be an ${\mathcal {L}}_{2}$-formula only with the displayed variables free. $\mathsf {NBG}$ proves the following: for every coded ${\mathbb {V}}$-model S, set M, $\vec {x} \in M$, and $X_{1}, \ldots , X_{k} \, {\dot{\in }} \, S^{M}$,

$$\begin{aligned} \Phi ^{S^{M}} ( \vec {x}, \vec {X} ) \, \leftrightarrow \, S^{M} \models \Phi ( \vec {x}, X_{1} \cap M, \ldots , X_{k} \cap M ); \end{aligned}$$

recall that the “$\Phi $” to the right of “$\models $” is, precisely, the code of $\Phi $ belonging to $ Fml _{2}$.

The next is a variation of the Montague–Lévy reflection principle.

Lemma 2.7

(Reflection principle in coded ${\mathbb {V}}$-models). Let $\Phi ( \vec {u}, \vec {U} )$ be any ${\mathcal {L}}_{2}$-formula only with the displayed variables free. Then, $\mathsf {NBG}$ proves the following: for all coded ${\mathbb {V}}$-models S and ordinals $\alpha $, there exists an ordinal $\beta > \alpha $ such that

$$\begin{aligned} ( \forall \vec {x} \in V_{\beta } ) ( \forall \vec {X} \, {\dot{\in }} \, S^{V_{\beta }} ) \bigl ( \Phi ^{S} ( \vec {x}, \vec {X} ) \ \leftrightarrow \ \Phi ^{S^{V_{\beta }}} ( \vec {x}, \vec {X} ) \bigr ). \end{aligned}$$

(4)

Note that, by Proposition 2.6, (4) is equivalent to the following:

$$\begin{aligned} ( \forall \vec {x} \in V_{\beta } ) ( \forall \vec {X} \, {\dot{\in }} \, S^{V_{\beta }} ) \bigl ( \Phi ^{S} ( \vec {x}, \vec {X} ) \ \leftrightarrow S^{V_{\beta }} \models \Phi ( \vec {x}, \vec {X} ) \bigr ). \end{aligned}$$

Proof

Let $\Psi _{1}, \ldots , \Psi _{n}$ be the enumeration of all the sub-formulae of $\Phi $. Take any coded ${\mathbb {V}}$-model S. For each $1 \le i \le n$, let $\Psi _{i} ( x_{1}, \ldots , x_{k_{i}}, X_{1}, \ldots , X_{m_{i}} )$ contain at most the displayed variables free, and we define a class function $G_{i} :{\mathbb {V}}^{k_{i} + m_{i}} \rightarrow On$ in the following manner: if $\Psi _{i}$ is of the form $\exists w \Theta ( w, \vec {x}, \vec {X} )$, then we set

$$\begin{aligned}&G_{i} ( a_{1}, \ldots , a_{k_{i}}, z_{1}, \ldots , z_{m_{i}} ) \\&\quad = {\left\{ \begin{array}{ll} \min \bigl \{ \eta \mid ( \exists w \in V_{\eta } ) \Theta ^{S} ( w, \vec {a}, ( S )_{z_{1}}, \ldots , ( S )_{z_{m_{i}}} ) \bigr \} &{} \text {if } \Psi _{i}^{S} ( \vec {a}, ( S )_{z_{1}}, \ldots , ( S )_{z_{m_{i}}} ) \\ 0 &{} \text {otherwise}; \end{array}\right. } \end{aligned}$$

if $\Psi _{i}$ is of the form $\exists Z \Theta ( Z, \vec {x}, \vec {X} )$, then we set

$$\begin{aligned}&G_{i} ( a_{1}, \ldots , a_{k_{i}}, z_{1}, \ldots , z_{m_{i}} ) \\&\quad = {\left\{ \begin{array}{ll} \min \{ \eta \mid ( \exists Z \, {\dot{\in }} \, S^{V_{\eta }} ) \Theta ^{S} ( Z, \vec {a}, ( S ) _{z_{1}}, \ldots , ( S )_{z_{m_{i}}} ) \} &{} \text {if } \Psi _{i}^{S} ( \vec {a}, ( S )_{z_{1}}, \ldots , ( S )_{z_{m_{i}}} ) \\ 0 &{} \text {otherwise}; \end{array}\right. } \end{aligned}$$

if $\Psi _{i}$ is of another form, then we just put $G_{i} ( \vec {a}, \vec {z} ) = 0$; note that $G_{i}$s can be taken as classes because $\Psi _{i}^{S}$s are elementary. We thereby define class functions $F_{i} :On \rightarrow On$ ($1 \le i \le n$) and $F :On \rightarrow On$ as follows:

$$\begin{aligned}&F_{i} ( \xi ) \ := \ \sup \{ G_{i} ( a_{1}, \ldots , a_{k_{i}}, z_{1}, \ldots , z_{m_{i}} ) \mid a_{1}, \ldots , a_{k_{i}}, z_{1}, \ldots , z_{m_{i}} \in V_{\xi } \} \\&F ( \xi ) \ := \ \max \{ \xi + 1, F_{1} ( \xi ), \ldots , F_{n} ( \xi ) \}. \end{aligned}$$

Then, we recursively define $F^{0} ( \xi ) = \xi $ and $F^{j + 1} ( \xi ) := F ( F^{j} ( \xi ) )$. Finally, we define $H :On \rightarrow On$ by $H ( \xi ) := \sup _{j \in \omega } F^{j} ( \xi )$.

Now, for any given $\alpha $, we set $\beta := H ( \alpha )$ and write $\gamma _{j}$ for $F^{j} ( \alpha )$; hence, we have $\beta = \sup _{j < \omega } \gamma _{j}$. We can show (4) for all $\Psi _{i}$s in place of $\Phi $ by a routine induction; we only go through the crucial case here. Let $\Psi _{i}$ be $\exists Z \Theta ( Z, \vec {x}, \vec {X} )$ and take $\vec {x} \in V_{\beta }$ and $\vec {X} \, {\dot{\in }} \, S^{V_{\beta }}$. Take the least $l < \omega $ such that $\vec {x} \in V_{\gamma _{l}}$ and $\vec {X} \, {\dot{\in }} \, S^{V_{\gamma _{l}}}$. Let $z_{j} \in V_{\gamma _{l}}$ be such that $X_{j} = ( S )_{z_{j}}$ for $1 \le j \le m_{i}$. We have $G_{i} ( \vec {x}, \vec {z} ) \le \gamma _{l + 1}$. Hence, if $\Psi _{i}^{S} ( \vec {x}, \vec {X} )$, then $\Theta ^{S} ( Z, \vec {x}, \vec {X} )$ for some $Z \, {\dot{\in }} \, S^{V_{\gamma _{l + 1}}}$ (and thus $Z \, {\dot{\in }} \, S^{V_{\beta }}$). By the induction hypothesis, we obtain $\Theta ^{S^{V_{\beta }}} ( Z, \vec {x}, \vec {X} )$ and thus $\Psi _{i}^{S^{V_{\beta }}} ( \vec {x}, \vec {X} )$. The converse is obvious from the induction hypothesis. $\square $
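The bookkeeping behind the choice of $\beta $ in this proof can be spelled out as follows; this is only a sketch of the routine facts that the argument relies on, written in the notation of the proof.

$$\begin{aligned}&\gamma _{0} = \alpha , \qquad \gamma _{j + 1} = F ( \gamma _{j} ) \ge \gamma _{j} + 1 > \gamma _{j}, \qquad \beta = \sup _{j< \omega } \gamma _{j} > \alpha ; \\&V_{\beta } = \bigcup _{j < \omega } V_{\gamma _{j}}, \quad \text {since } \beta \text { is a limit ordinal}; \\&G_{i} ( \vec {a}, \vec {z} ) \le F_{i} ( \gamma _{l} ) \le F ( \gamma _{l} ) = \gamma _{l + 1} < \beta \quad \text {for all } \vec {a}, \vec {z} \in V_{\gamma _{l}}. \end{aligned}$$

Hence every finite tuple from $V_{\beta }$ already lies in some $V_{\gamma _{l}}$, and the witnesses selected by $G_{i}$ for that tuple are found in $V_{\gamma _{l + 1}} \subset V_{\beta }$, which is what the existential cases of the induction use.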

By the standard trick, the last lemma implies the next one.

Lemma 2.8

For all ${\mathcal {L}}_{2}$-formulae $\Phi _{1} ( \vec {u}, \vec {U}, \vec {W} ), \ldots , \Phi _{k} ( \vec {u}, \vec {U}, \vec {W} )$ only with the displayed variables free, $\mathsf {NBG}$ proves the following: for all coded ${\mathbb {V}}$-models S, classes $\vec {Y} \, {\dot{\in }} \, S$, and ordinals $\alpha $, there exists an ordinal $\beta > \alpha $ such that

$$\begin{aligned} \vec {Y} \, {\dot{\in }} \, S^{V_{\beta }} \wedge \bigwedge _{1 \le i \le k} ( \forall \vec {x} \in V_{\beta } ) ( \forall \vec {X} \, {\dot{\in }} \, S^{V_{\beta }} ) \bigl ( \Phi _{i}^{S} ( \vec {x}, \vec {X}, \vec {Y} ) \ \leftrightarrow \ \Phi _{i}^{S^{V_{\beta }}} ( \vec {x}, \vec {X}, \vec {Y} ) \bigr ). \end{aligned}$$
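The text does not spell out the “standard trick”; one possible reading, offered here only as a sketch, is the following. The proof of Lemma 2.7 establishes (4) simultaneously for all sub-formulae of the given formula, so it can be run on the conjunction of the $\Phi _{i}$ (treating $\vec {W}$ as further class variables), starting from an ordinal large enough to contain codes $z_{1}, \ldots , z_{m}$ of the parameters $\vec {Y}$; the names $z_{j}$ and $\alpha '$ are introduced here for illustration.

$$\begin{aligned}&\Phi :\equiv \Phi _{1} \wedge \cdots \wedge \Phi _{k}, \qquad Y_{j} = ( S )_{z_{j}} \ ( 1 \le j \le m ), \qquad \alpha ' := \min \{ \eta \ge \alpha \mid z_{1}, \ldots , z_{m} \in V_{\eta } \}; \\&\beta > \alpha ' \text { obtained as in the proof of Lemma 2.7 for } \Phi \text { and } \alpha ' \\&\quad \Longrightarrow \ \vec {Y} \, {\dot{\in }} \, S^{V_{\beta }} \ \text {and (4) holds for each sub-formula } \Phi _{i} \text {, in particular with } \vec {Y} \text { in place of } \vec {W}. \end{aligned}$$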

In order to express $\Phi ^{S}$ for infinitely many $\Phi $’s at once, we need something like a satisfaction predicate for a coded ${\mathbb {V}}$-model S. Let S be a coded ${\mathbb {V}}$-model. Then, each $h \in {^{\omega } {\mathbb {V}}}$ can be viewed as a first-order variable assignment on S that assigns h(i) to the ith first-order variable $u_{i}$, as well as a second-order variable assignment on S that assigns $( S )_{h ( j )}$ to the jth second-order variable $U_{j}$. For each $h \in {^{\omega } {\mathbb {V}}}$, $x \in {\mathbb {V}}$, and $n \in \omega $, we define a new set function $h_{( x \mid n )} \in {^{\omega } {\mathbb {V}}}$ by putting $h_{( x \mid n )} ( m ) = x$, if $n = m$, and $h_{( x \mid n )} ( m ) = h ( m )$, if $n \ne m$. Finally, let $ Fml _{2} ( n )$ be the subset of $ Fml _{2}$ that comprises the codes of ${\mathcal {L}}_{2}$-formulae with at most n logical symbols; for notational convenience, we set $ Fml _{2} ( \omega ) = Fml _{2}$.

Definition 2.9

Let S be a coded ${\mathbb {V}}$-model and $\alpha \le \omega $. A class X is said to be an $\alpha $-satisfaction class for S, if and only if $X \subset Fml _{2} ( \alpha ) \times {^{\omega } {\mathbb {V}}} \times {^{\omega } {\mathbb {V}}}$ and the following holds for all $\Phi \in Fml _{2} ( \alpha )$ and $f, g \in {^{\omega } {\mathbb {V}}}$:

$$\begin{aligned} {\left\{ \begin{array}{ll} \text {If}\, \Phi \,\text {is}\, u_{i} = u_{j},\, \text {then} \,\langle \Phi , f, g \rangle \in \, X\, \leftrightarrow f ( i ) = f ( j ); \\ \text {If}\, \Phi \,\text {is} \,u_{i} \in u_{j}, \,\text {then} \,\langle \Phi , f, g \rangle \in X \leftrightarrow f ( i ) \in f ( j ); \\ \text {If}\, \Phi \,\text {is} \,u_{i} \in U_{j}, \,\text {then} \,\langle \Phi , f, g \rangle \in X \leftrightarrow f ( i ) \in ( S )_{g ( j )}; \\ \text {If} \,\Phi = \lnot \Psi , \,\text {then} \,\langle \Phi , f, g \rangle \in X \leftrightarrow \langle \Psi , f, g \rangle \not \in X; \\ \text {If} \,\Phi = \Psi \wedge \Theta , \,\text {then} \,\langle \Phi , f, g \rangle \in X \leftrightarrow ( \langle \Psi , f, g \rangle \in X \wedge \langle \Theta , f, g \rangle \in X ); \\ \text {If} \,\Phi = \forall u_{i} \Psi , \,\text {then} \,\langle \Phi , f, g \rangle \in X \leftrightarrow \forall x ( \langle \Psi , f_{( x \mid i)}, g \rangle \in X ); \\ \text {If}\, \Phi = \forall U_{j} \Psi , \,\text {then} \,\langle \Phi , f, g \rangle \in X \leftrightarrow \forall x ( \langle \Psi , f, g_{( x \mid j)} \rangle \in X ). \end{array}\right. } \end{aligned}$$

(5)

We particularly call an $\omega $-satisfaction class for S a full satisfaction class for S.
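To see how the clauses of (5) operate, consider the formula $\Phi :\equiv \forall U_{0} \lnot ( u_{0} \in U_{0} )$; the choice of formula is ours and serves only as an illustration. On the assumption that quantifiers and connectives are the logical symbols being counted, $\Phi $ contains two of them, so its code belongs to $ Fml _{2} ( \alpha )$ for every $\alpha \ge 2$. For an $\alpha $-satisfaction class X for S with $\alpha \ge 2$, the clauses unfold as follows:

$$\begin{aligned} \langle \Phi , f, g \rangle \in X \ &\leftrightarrow \ \forall x \bigl ( \langle \lnot ( u_{0} \in U_{0} ), f, g_{( x \mid 0 )} \rangle \in X \bigr ) \\ &\leftrightarrow \ \forall x \bigl ( \langle u_{0} \in U_{0}, f, g_{( x \mid 0 )} \rangle \not \in X \bigr ) \\ &\leftrightarrow \ \forall x \bigl ( f ( 0 ) \not \in ( S )_{g_{( x \mid 0 )} ( 0 )} \bigr ) \\ &\leftrightarrow \ \forall x \bigl ( f ( 0 ) \not \in ( S )_{x} \bigr ). \end{aligned}$$

Thus $\Phi $ is satisfied under f exactly when $f ( 0 )$ belongs to none of the classes coded in S; the second-order quantifier is interpreted as a quantifier over the codes x.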

The next two propositions are standardly shown and we omit the proofs.

Proposition 2.10

Let S be a coded ${\mathbb {V}}$-model and $\alpha \le \omega $. The following are provable in $\mathsf {NBG}$.

1.
If X is an $\alpha $-satisfaction class for S, then $X \cap ( Fml _{2} ( n ) \times {^{\omega } {\mathbb {V}}} \times {^{\omega } {\mathbb {V}}} )$ is an n-satisfaction class for S for all $n < \alpha $.
2.
If classes X and Y are $\alpha $-satisfaction classes for the same S, then $X = Y$.

Proposition 2.11

For each (standard) natural number n, $\mathsf {NBG}$ proves the existence of an n-satisfaction class for any coded ${\mathbb {V}}$-model S.

Definition 2.12

Let S be a coded ${\mathbb {V}}$-model and $\Gamma \subset Fml _{2}$. For each $f, g \in {^{\omega } {\mathbb {V}}}$, we write $S \models \Gamma \, [ f, g ]$ to mean that there is an $\alpha $-satisfaction class X for some $\alpha \le \omega $ such that $\Gamma \subset Fml _{2} ( \alpha )$ and $( \forall \Phi \in \Gamma ) ( \langle \Phi , f, g \rangle \! \in \! X )$; we write $S \models \Gamma $ if $S \models \Gamma [ f, g ]$ for all $f, g \in {^{\omega } {\mathbb {V}}}$. For a single formula $\Phi \in Fml _{2}$, we write $S \models \Phi \, [ f, g ]$ and $S \models \Phi $ for $S \models \{ \Phi \} \, [ f, g ]$ and $S \models \{ \Phi \}$, respectively. Let $\Phi ( u_{i_{0}}, \ldots , u_{i_{m}}, U_{j_{0}}, \ldots , U_{j_{k}} ) \in Fml _{2}$ with designated free variables $u_{i_{0}}, \ldots , u_{i_{m}}, U_{j_{0}}, \ldots , U_{j_{k}}$, which may contain other free variables. Then, for each sets $x_{0}, \ldots , x_{m} \in {\mathbb {V}}$ and classes $X_{0}, \ldots , X_{k} \ {\dot{\in }} \ S$, we write $S \models \Phi ( \vec {x}, \vec {X} )$ to mean

$$\begin{aligned} ( \forall f, g \in {^{\omega } {\mathbb {V}}} ) \Bigl ( \bigl ( \bigwedge _{0 \le l \le m} f ( i_{l} ) = x_{l} \wedge \bigwedge _{0 \le l \le k} ( S )_{g ( j_{l} )} = X_{l} \bigr ) \rightarrow S \models \Phi \, [ f, g ] \Bigr ). \end{aligned}$$

We say that a class S is a coded ${\mathbb {V}}$-model of a (recursive) ${\mathcal {L}}_{2}$-system $\mathsf {T}$, when $S \models \Gamma $ for the set $\Gamma $ ($\subset Fml _{2}$) of the codes of the axioms of $\mathsf {T}$. Note that if $\mathsf {T}$ is finite, then $S \models \Gamma $ does not necessitate the existence of a full satisfaction class for S.^{Footnote 6}

The next is shown by induction on the complexity of $\Phi $ using Proposition 2.11.

Proposition 2.13

For each (standard) ${\mathcal {L}}_{2}$-formula $\Phi ( \vec {u}, \vec {U} )$, $\mathsf {NBG}$ proves that, for every coded ${\mathbb {V}}$-model S, $\vec {x} \in {\mathbb {V}}$, and $\vec {X} \, {\dot{\in }} \, S$,

$$\begin{aligned} \Phi ^{S} ( \vec {x}, \vec {X} ) \, \leftrightarrow \, S \models \Phi ( \vec {x}, \vec {X} ); \end{aligned}$$

(6)

recall again that the “$\Phi $” to the right of “$\models $” is precisely the code of $\Phi $. Hence, in particular, for every elementary formula $\Phi ( \vec {x}, \vec {X} )$ and coded ${\mathbb {V}}$-model S, we have

$$\begin{aligned} \forall \vec {x} \forall \vec {X} \, {\dot{\in }} \, S \bigl ( \Phi ( \vec {x}, \vec {X} ) \leftrightarrow S \models \Phi ( \vec {x}, \vec {X} ) \bigr ); \end{aligned}$$

that is, elementary formulae are “absolute” for coded ${\mathbb {V}}$-models.

Hence, although the official definition of $S \models \Phi $ is $\Delta ^{1}_{1}$ in $\mathsf {NBG}$ (by Proposition 2.10), this proposition shows that $S \models \Phi $ is elementarily expressible for each (standard) ${\mathcal {L}}_{2}$-formula $\Phi $. It also follows that $S \models \mathsf {T}$ is elementarily expressible for every finite ${\mathcal {L}}_{2}$-system $\mathsf {T}$.

The proof of the next proposition is essentially the same as that of the well-known corresponding fact (about $\mathsf {ACA}_{0}^{+}$) in second-order arithmetic, and we only sketch it.

Proposition 2.14

1.
$\mathsf {ECA}_{0}^{+}$ proves the existence of a full satisfaction class for any coded ${\mathbb {V}}$-model S.
2.
$\mathsf {ECA}_{0}^{+}$ proves that for each class Z there is a coded ${\mathbb {V}}$-model S of $\mathsf {NBG}$ with $Z \, {\dot{\in }} \, S$.

Proof

1. For every S, $\mathrm {ETR} ( \omega )$ yields a class X such that, for each $n < \omega $, $( X )_{n}$ is an n-satisfaction class, from which we can define a full satisfaction class for S.

2. It suffices to show the existence of a coded ${\mathbb {V}}$-model of $\Sigma ^{0}_{1} \text {-} \mathrm {CA}$. Take any class Z. By $\mathrm {ETR} ( \omega )$, we construct X such that $( X )_{0} = Z$ and, for each $n < \omega $ and e, $( ( X )_{n + 1} )_{e} = \{ x \mid \pi ^{0}_{1} ( e, x, ( X )_{n} ) \}$: that is, $\{ S \mid S \, {\dot{\in }} \, ( X )_{n + 1} \}$ is the collection of all $\Sigma ^{0}_{1}$ definable classes with a parameter $( X )_{n}$. Then, $Y := \{ \langle x, \langle n, e \rangle \rangle \mid x \in ( ( X )_{n} )_{e} \}$ gives a coded ${\mathbb {V}}$-model of $\Sigma ^{0}_{1} \text {-} \mathrm {CA}$ with $Z \, {\dot{\in }} \, Y$. $\square $
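The last step of part 1, passing from the sequence of n-satisfaction classes to a full one, can be made explicit along the following lines. This is a sketch; the name $X_{\omega }$ is introduced here for illustration, and $( X )_{n}$ is assumed to denote the nth section of X as in the proof.

$$\begin{aligned}&X_{\omega } := \bigl \{ \langle \Phi , f, g \rangle \mid ( \exists n< \omega ) \bigl ( \Phi \in Fml _{2} ( n ) \wedge \langle \Phi , f, g \rangle \in ( X )_{n} \bigr ) \bigr \}; \\&( X )_{m} \cap \bigl ( Fml _{2} ( n ) \times {^{\omega } {\mathbb {V}}} \times {^{\omega } {\mathbb {V}}} \bigr ) = ( X )_{n} \quad \text {for all } n< m < \omega . \end{aligned}$$

The second line follows from Proposition 2.10: the left-hand side is an n-satisfaction class by item 1 and therefore coincides with $( X )_{n}$ by item 2. Consequently, whether $\langle \Phi , f, g \rangle $ belongs to $X_{\omega }$ does not depend on the choice of n, and each clause of (5) for $X_{\omega }$ reduces to the same clause for a single $( X )_{n}$ with $\Phi \in Fml _{2} ( n )$.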

The existence of a coded ${\mathbb {V}}$-model of an ${\mathcal {L}}_{2}$-system $\mathsf {T}$ implies the consistency of $\mathsf {T}$ in $\mathsf {NBG}$, in the same way as the existence of a coded $\omega $-model of a system $\mathsf {S}$ of second-order arithmetic implies the consistency of $\mathsf {S}$ in $\mathsf {ACA}_{0}$.^{Footnote 7} As we will see below, the existence of a coded ${\mathbb {V}}$-model bears more implications.

We first observe that Lemma 2.8 and Propositions 2.6 and 2.13 above immediately imply the following.

Corollary 2.15

Let $\mathsf {T}$ be a finite ${\mathcal {L}}_{2}$-system. $\mathsf {NBG}$ proves that if there is a coded ${\mathbb {V}}$-model of $\mathsf {T}$, then there are class-many set models of $\mathsf {T}$.

In this corollary, since $\mathsf {T}$ is finite, the condition of the existence of a coded ${\mathbb {V}}$-model of $\mathsf {T}$ does not require the existence of a full satisfaction class for the coded ${\mathbb {V}}$-model. The presence of a full satisfaction class has a stronger consequence.

Lemma 2.16

$\mathsf {NBG}$ proves the following: for each coded ${\mathbb {V}}$-model S, if there is a full satisfaction class X for S, then, for all ordinals $\alpha $, there is an ordinal $\beta > \alpha $ such that

$$\begin{aligned} ( \forall \Phi \in Fml _{2} ) ( \forall f, g \in {^{\omega } V_{\beta }} ) \bigl ( S \models \Phi \, [ f, g ] \leftrightarrow S^{V_{\beta }} \models \Phi \, [ f, g ] \bigr ). \end{aligned}$$

(7)

Proof

Let X be a full satisfaction class for S. For each $\Phi \in Fml _{2}$ and $f, g \in {^{\omega }{\mathbb {V}}}$, we will write $S \models _{X} \Phi \, [ f, g ]$ for $\langle \Phi , f, g \rangle \in X$. It follows by Proposition 2.10.2 that

$$\begin{aligned} ( \forall \Phi \in Fml _{2} ) ( \forall f, g \in {^{\omega }{\mathbb {V}}} ) \bigl ( S \models \Phi \, [ f, g ] \leftrightarrow S \models _{X} \Phi \, [ f, g ] \bigr ). \end{aligned}$$

Hence, it suffices to show that, for all $\alpha \in On$, there is $\beta > \alpha $ such that

$$\begin{aligned} ( \forall \Phi \in Fml _{2} ) ( \forall f, g \in {^{\omega } V_{\beta }} ) \bigl ( S \models _{X} \Phi \, [ f, g ] \leftrightarrow S^{V_{\beta }} \models \Phi \, [ f, g ] \bigr ). \end{aligned}$$

(8)

The proof idea is essentially the same as that of Lemma 2.7: instead of taking Skolem functions $G_{i}$’s separately for finitely many formulae $\Psi _{1}, \ldots , \Psi _{n}$, we take a single “global” Skolem function $G : Fml _{2} \times {^{\omega } {\mathbb {V}}} \times {^{\omega } {\mathbb {V}}} \rightarrow On$ uniformly for all $\Phi \in Fml _{2}$. Let $f, g \in {^{\omega } {\mathbb {V}}}$. The wanted function G is defined as follows: if $\Phi $ is of the form $\exists u_{j} \Theta $, then we set

$$\begin{aligned} G ( \Phi , f, g ) := {\left\{ \begin{array}{ll} \min \{ \eta \mid ( \exists w \in V_{\eta } ) S \models _{X} \Theta \, [ f_{( w \mid j )}, g ] \} &{} \text {if } S \models _{X} \Phi \, [ f, g ] \\ 0 &{} \text {otherwise}; \end{array}\right. } \end{aligned}$$

if $\Phi $ is of the form $\exists U_{l} \Theta $, then we set

$$\begin{aligned} G ( \Phi , f, g ) := {\left\{ \begin{array}{ll} \min \{ \eta \mid ( \exists w \in V_{\eta } ) S \models _{X} \Theta \, [ f, g_{( w \mid l)} ] \} &{} \text {if } S \models _{X} \Phi \, [ f, g ] \\ 0 &{} \text {otherwise}; \end{array}\right. } \end{aligned}$$

if $\Phi $ is of another form, then we just put $G ( \Phi , f, g ) = 0$. Then, we define a class function $F' :On \rightarrow On$ as follows:

$$\begin{aligned} F' ( \xi ) \ := \ \sup \{ G ( \Phi , f, g ) \mid \Phi \in Fml _{2} \, \wedge \, f, g \in {^{\omega } V_{\xi }} \}; \end{aligned}$$

this is well defined, since $ Fml _{2}$ and ${^{\omega } V_{\xi }}$ are sets. The rest is parallel to the proof of Lemma 2.7; note that the $\omega $-induction and $\omega $-recursion involved in the remaining part are possible because $F'$ is elementarily definable (and this is why we work with the elementary relation $S \models _{X} \Phi \, [ f, g ]$ instead of the $\Delta ^{1}_{1}$-relation $S \models \Phi \, [ f, g ]$). $\square $

Lemma 2.17

$\mathsf {NBG}$ proves the following.

1.
For every coded ${\mathbb {V}}$-model S, if there is a full satisfaction class for S, then $S \models \Sigma ^{1}_{\infty } \text {-} \mathrm {Sep} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Repl}$.
2.
For every standard $n \in {\mathbb {N}}$ and coded ${\mathbb {V}}$-model S, $S \models \Sigma ^{1}_{n} \text {-} \mathrm {Sep} + \Sigma ^{1}_{n} \text {-} \mathrm {Repl}$.

Proof

1. Let X be a full satisfaction class for a coded ${\mathbb {V}}$-model S. By Proposition 2.10.2, $S \models \Sigma ^{1}_{\infty } \text {-} \mathrm {Repl}$ is equivalent to the following:

$$\begin{aligned} ( \forall \Phi ( u, v ) \in Fml _{2} ) \forall a \Bigl ( ( \forall x \! \in \! a ) \exists ! y S \models _{X} \! \Phi ( x, y ) \rightarrow \exists b ( \forall x \! \in \! a ) ( \exists y \in b ) S \models _{X} \! \Phi ( x, y ) \Bigr ). \end{aligned}$$

This is equivalent to an instance of $\Pi ^{1}_{0} \text {-} \mathrm {Repl}$, which is derivable in $\mathsf {NBG}$. The other claim $S \models \Sigma ^{1}_{\infty } \text {-} \mathrm {Sep}$ can be shown similarly.

2. Similarly, $S \models \Sigma ^{1}_{n} \text {-} \mathrm {Sep}$ and $S \models \Sigma ^{1}_{n} \text {-} \mathrm {Repl}$ are equivalent to single instances of $\Pi ^{1}_{0} \text {-} \mathrm {Sep}$ and $\Pi ^{1}_{0} \text {-} \mathrm {Repl}$, respectively, under the assumption of the existence of an m-satisfaction class for S for a sufficiently large natural number m, but the existence of such a (partial) satisfaction class is derivable in $\mathsf {NBG}$ (by Proposition 2.11). $\square $

Given a coded ${\mathbb {V}}$-model S, let us write $S^{M} \prec S^{N}$ for transitive sets M and N to mean that $M \subset N$ and

$$\begin{aligned} ( \forall \Phi \in Fml _{2} ) ( \forall f, g \in {^{\omega }M} ) \bigl ( S^{M} \models \Phi \, [ f, g ] \leftrightarrow S^{N} \models \Phi \, [ f, g ] \bigr ); \end{aligned}$$

recall that $f, g \in {^{\omega }M}$ can be viewed as variable assignments both on $S^{M}$ and $S^{N}$, since $M \subset N$. This means that the mappings $x \mapsto x$ (for $x \in M$) and $X \cap M \mapsto X \cap N$ (for $X \, {\dot{\in }} \, S^{M}$) are an elementary embedding of $S^{M}$ in $S^{N}$. Now, Lemmas 2.16 and 2.17 imply the following.

Corollary 2.18

Let $\mathsf {T}$ be a recursive (possibly infinite) ${\mathcal {L}}_{2}$-system. The following is provable in $\mathsf {NBG}$: if there are a coded ${\mathbb {V}}$-model S with $S \models \mathsf {T}$ and a full satisfaction class for S, then there is a class Z of ordinals closed unbounded in On such that $S^{V_{\alpha }} \prec S^{V_{\beta }}$ and $S^{V_{\alpha }} \models \mathsf {T} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Sep} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Repl}$ for all $\alpha , \beta \in Z$ with $\alpha < \beta $. That is, Z is an elementary chain of models of $\mathsf {T} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Sep} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Repl}$ of length On.

Proof

Suppose $S \models \mathsf {T}$ and there is a full satisfaction class X for S. We take Z to be the (elementary) class of ordinals $\beta $ that satisfy (8). Clearly, Z is unbounded in On and $S^{V_{\beta }} \models \mathsf {T}$ for every $\beta \in Z$. The closedness of Z can be shown by the standard Tarski–Vaught argument. Let $\gamma = \bigcup \{ \beta _{\xi } \}_{\xi < \lambda }$ for a limit $\lambda $ such that $\{ \beta _{\xi } \}_{\xi < \lambda } \subset Z$ and $\beta _{\eta } < \beta _{\zeta }$ for $\eta< \zeta < \lambda $. Take any $f, g \in {^{\omega }V_{\gamma }}$. Then, for instance (the crucial case), suppose $S \models _{X} \exists U_{j} \Phi \, [ f, g ]$. Take $f', g' \in {^{\omega }V_{\gamma }}$ such that

$$\begin{aligned}&f' ( i ) = {\left\{ \begin{array}{ll} f ( i ) &{} \text {if } u_{i} \text { is free in } \exists U_{j} \Phi \\ 0 &{} \text {otherwise} \end{array}\right. }&&g' ( i ) = {\left\{ \begin{array}{ll} g ( i ) &{} \text {if } U_{i} \text { is free in } \exists U_{j} \Phi \\ 0 &{} \text {otherwise}. \end{array}\right. }&\end{aligned}$$

Since $\exists U_{j} \Phi $ contains only finitely many free variables, there is some $\xi < \lambda $ such that $f', g' \in {^{\omega }V_{\beta _{\xi }}}$; we will write $\beta $ for $\beta _{\xi }$. Now, in general, we can show by the standard argument that for all variable assignments $p, q \in {^{\omega }{\mathbb {V}}}$ and $p', q' \in {^{\omega }{\mathbb {V}}}$,

$$\begin{aligned} \begin{aligned}&\text {if } p, q \text { and } p', q' \text { coincide on all the free variables of } \Psi , \\&\text {then } ( S \models _{X} \Psi \, [ p, q ] ) \leftrightarrow ( S \models _{X} \Psi \, [ p', q' ] ). \end{aligned} \end{aligned}$$

(9)

Hence, by (9), we have $S \models _{X} \exists U_{j} \Phi \, [f', g' ]$. Since $\beta $ satisfies (8), there is some $w \in V_{\beta }$ such that $S^{V_{\beta }} \models \Phi \, [ f', g'_{( w \mid j)} ]$ and thus $S \models _{X} \Phi \, [ f', g'_{( w \mid j)} ]$, which entails $S \models _{X} \Phi \, [ f, g_{( w \mid j)} ]$ again by (9). We obtain $S^{V_{\gamma }} \models \Phi \, [ f, g_{( w \mid j)} ]$, by the induction hypothesis, and thus $S^{V_{\gamma }} \models \exists U_{j} \Phi \, [ f, g ]$. Finally, it follows by Lemma 2.17.1 that $S^{V_{\beta }} \models \Sigma ^{1}_{\infty } \text {-} \mathrm {Sep} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Repl}$ for all $\beta \in Z$. $\square $

2.4 ${\mathbb {V}}$-reflection

Definition 2.19

(${\mathbb {V}}$-Reflection) The schema $\Pi ^{1}_{n} \text {-} \mathrm {RFN}$ of $\Pi ^{1}_{n}$ ${\mathbb {V}}$-reflection is defined as follows:

$$\begin{aligned} \forall \vec {x} \forall \vec {X} \bigl ( \Phi ( \vec {x}, \vec {X} ) \rightarrow \exists S ( \vec {X} \, {\dot{\in }} \, S \wedge S \models \mathsf {NBG} \wedge S \models \Phi ( \vec {x}, \vec {X} ) ) \bigr ), \ \ \text {for all } \Phi \in \Pi ^{1}_{n}, \end{aligned}$$

where $\Phi $ only contains the displayed variables free (and without S free); the schema $\Sigma ^{1}_{n} \text {-} \mathrm {RFN}$ is similarly defined. Since $\mathsf {NBG}$ is finitely axiomatizable by $\Pi ^{1}_{2}$-sentences, we can drop the condition “$S \models \mathsf {NBG}$” when $n \ge 2$. We thereby define:

$$\begin{aligned}&\Pi ^{1}_{n} \text {-} \mathsf {RFN}_{0} := \mathsf {NBG} + \Pi ^{1}_{n} \text {-} \mathrm {RFN}&\text {and}&\Pi ^{1}_{n} \text {-} \mathsf {RFN} := \mathsf {ECA} + \Pi ^{1}_{n} \text {-} \mathrm {RFN}.&\end{aligned}$$

$\Pi ^{1}_{n} \text {-} \mathsf {RFN}_{0}$ is finitely axiomatizable for all n by the existence of a $\Pi ^{1}_{n}$ universal formula (for $n \ge 1$) and Proposition 2.20.1 below (for $n = 0$), but $\Pi ^{1}_{n} \text {-} \mathsf {RFN}$ and $\Pi ^{1}_{\infty } \text {-} \mathsf {RFN}_{0}$ are not finitely axiomatizable.

Proposition 2.20

The following hold in $\mathsf {NBG}$.

1.
$\Pi ^{1}_{0} \text {-} \mathrm {RFN}$ is equivalent to the following (single sentence):
$$\begin{aligned} \text {for all } X, \text { there is a coded } {\mathbb {V}} \text {-model } S \text { of } \mathsf {NBG} \text { with } X \, {\dot{\in }} \, S. \end{aligned}$$
2.
For all $n \in {\mathbb {N}}$, $\Sigma ^{1}_{n + 1} \text {-} \mathrm {RFN}$ is equivalent to $\Pi ^{1}_{n} \text {-} \mathrm {RFN}$.
3.
$\Sigma ^{1}_{2} \text {-} \mathrm {RFN}$ and $\Pi ^{1}_{1} \text {-} \mathrm {RFN}$ are equivalent to $\Pi ^{1}_{0} \text {-} \mathrm {RFN}$.

Proof

Claim 1 follows from Proposition 2.13. Claim 2 is obvious. Claim 3 follows from Claim 2 and the “downward absoluteness” of $\Pi ^{1}_{1}$-formulae in the sense that if $\Phi \in \Pi ^{1}_{1}$ holds then $\Phi $ holds in every coded ${\mathbb {V}}$-model containing $\Phi $’s parameters. $\square $
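The second claim of Proposition 2.20 is stated without argument; one way to verify it, given here as a sketch, runs as follows. Every $\Pi ^{1}_{n}$-formula can be regarded as a $\Sigma ^{1}_{n + 1}$-formula, so $\Sigma ^{1}_{n + 1} \text {-} \mathrm {RFN}$ includes $\Pi ^{1}_{n} \text {-} \mathrm {RFN}$. Conversely, let $\Phi ( \vec {x}, \vec {X} ) :\equiv \exists Y \Psi ( \vec {x}, \vec {X}, Y )$ with $\Psi \in \Pi ^{1}_{n}$ (a block of several existential class quantifiers is handled in the same way). Then, working in $\mathsf {NBG} + \Pi ^{1}_{n} \text {-} \mathrm {RFN}$,

$$\begin{aligned} \Phi ( \vec {x}, \vec {X} ) \ &\Longrightarrow \ \Psi ( \vec {x}, \vec {X}, Y ) \ \text {for some class } Y \\ &\Longrightarrow \ \exists S \bigl ( \vec {X}, Y \, {\dot{\in }} \, S \wedge S \models \mathsf {NBG} \wedge S \models \Psi ( \vec {x}, \vec {X}, Y ) \bigr ) \quad ( \Pi ^{1}_{n} \text {-} \mathrm {RFN} \text { for } \Psi ) \\ &\Longrightarrow \ \exists S \bigl ( \vec {X} \, {\dot{\in }} \, S \wedge S \models \mathsf {NBG} \wedge S \models \exists Y \Psi ( \vec {x}, \vec {X}, Y ) \bigr ) \quad ( \text {since } Y \, {\dot{\in }} \, S ). \end{aligned}$$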

The next lemma can be shown in a parallel manner to [1, Lemma 3.4] (one direction was already shown in Proposition 2.14.2).

Lemma 2.21

$\mathsf {ECA}_{0}^{+}$ proves exactly the same ${\mathcal {L}}_{2}$-theorems as $\Pi ^{1}_{0} \text {-} \mathsf {RFN}_{0}$ (and thus $\Sigma ^{1}_{2} \text {-} \mathsf {RFN}_{0}$).

Proposition 2.22

Let $\mathsf {F}$ be any finite ${\mathcal {L}}_{2}$-system whose axioms are all $\Pi ^{1}_{n}$. Then, $\Pi ^{1}_{n} \text {-} \mathsf {RFN}_{0} + \mathsf {F} \vdash Con ( \mathsf {ECA} + \mathsf {F} )$, where $ Con ( \mathsf {T} )$ is the consistency statement of $\mathsf {T}$.

Proof

$\Pi ^{1}_{n} \text {-} \mathsf {RFN}_{0} + \mathsf {F}$ proves $\mathrm {ETR} ( \omega )$ by Lemma 2.21 and the existence of a coded ${\mathbb {V}}$-model of $\mathsf {F}$, which implies the claim by Proposition 2.14.1 and Corollary 2.18. $\square $

Since each instance of $\Pi ^{1}_{n} \text {-} \mathrm {RFN}$ is $\Pi ^{1}_{n + 1}$, we have the following.

Corollary 2.23

For $n \ge 1$, $\Pi ^{1}_{n + 1} \text {-} \mathsf {RFN}_{0} \vdash Con ( \Pi ^{1}_{n} \text {-} \mathsf {RFN} )$.
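The step from Proposition 2.22 to Corollary 2.23 can be unpacked as follows; this is a sketch, and the name $\mathsf {F}_{n}$ is introduced here only for illustration. Let $\mathsf {F}_{n}$ be a finite axiomatization of $\Pi ^{1}_{n} \text {-} \mathsf {RFN}_{0}$. Since $\mathsf {NBG}$ is axiomatized by $\Pi ^{1}_{2}$-sentences, each instance of $\Pi ^{1}_{n} \text {-} \mathrm {RFN}$ is $\Pi ^{1}_{n + 1}$, and $n \ge 1$, the axioms of $\mathsf {F}_{n}$ can be taken to be $\Pi ^{1}_{n + 1}$. Assuming, as the definitions suggest, that $\mathsf {ECA}$ extends $\mathsf {NBG}$, we then have

$$\begin{aligned}&\Pi ^{1}_{n + 1} \text {-} \mathsf {RFN}_{0} \vdash \mathsf {F}_{n} \quad ( \text {every } \Pi ^{1}_{n} \text {-formula is also } \Pi ^{1}_{n + 1} ), \\&\Pi ^{1}_{n + 1} \text {-} \mathsf {RFN}_{0} + \mathsf {F}_{n} \vdash Con ( \mathsf {ECA} + \mathsf {F}_{n} ) \quad ( \text {Proposition 2.22 with } n + 1 \text { in place of } n ), \\&\mathsf {ECA} + \mathsf {F}_{n} \ \text {axiomatizes} \ \Pi ^{1}_{n} \text {-} \mathsf {RFN}. \end{aligned}$$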

2.5 Other systems

In this subsection, we will introduce a few more systems.

For a second-order variable X, an ${\mathcal {L}}_{2}$-formula $\Phi $ is said to be X-positive, when X only occurs positively in $\Phi $. The axiom schemata of $\mathrm {FP}$ and $\mathrm {LFP}$ are thereby defined as follows.

$$\begin{aligned} {\mathrm {FP}:}&\quad \exists X \forall x \bigl ( x \in X \leftrightarrow \Phi ( x, X ) \bigr ), \\ {\mathrm {LFP}:}&\quad \exists X \Bigl ( \forall x ( \Phi ( x, X ) \rightarrow x \in X ) \wedge \forall Y \bigl ( \forall x ( \Phi ( x, Y ) \rightarrow x \in Y ) \rightarrow X \subset Y \bigr ) \Bigr ), \end{aligned}$$

for each X-positive elementary $\Phi $ possibly with parameters. $\mathrm {FP}$ asserts the existence of a fixed-point of each X-positive elementary formula, and $\mathrm {LFP}$ asserts the existence of the least such fixed-points.^{Footnote 8} The weaker variants $\mathrm {FP}^{-}$ and $\mathrm {LFP}^{-}$ are obtained by restricting the range of $\Phi $s above to the X-positive elementary formulae without class parameters (but possibly with set parameters). We thereby set

$$\begin{aligned}&{\mathsf {F}}{\mathsf {P}}_{0}^{(-)} := \mathsf {NBG} + \mathrm {FP}^{(-)} \quad \quad \mathsf {LFP}_{0}^{(-)} := \mathsf {NBG} + \mathrm {LFP}^{(-)} \\&{\mathsf {F}}{\mathsf {P}}^{(-)} := \mathsf {ECA} + \mathrm {FP}^{(-)} \quad \quad \mathsf {LFP}^{(-)} := \mathsf {ECA} + \mathrm {LFP}^{(-)}. \end{aligned}$$

Note that ${\mathsf {F}}{\mathsf {P}}_{(0)}^{-}$ and $\mathsf {LFP}_{(0)}^{-}$ only prohibit the use of class parameters in the axiom schemata $\mathrm {FP}^{-}$ and $\mathrm {LFP}^{-}$, respectively, but still allow parameters in the other axiom schemata (such as $\mathrm {ECA}$ in particular). Both $\mathrm {FP}^{(-)}$ and $\mathrm {LFP}^{(-)}$ are derivable (in $\mathsf {NBG}$) from single (universal) instances of them; see [23, Lemma 2]. Hence, ${\mathsf {F}}{\mathsf {P}}_{0}^{(-)}$ and $\mathsf {LFP}_{0}^{(-)}$ are finitely axiomatizable.
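As a concrete instance of the schemata $\mathrm {FP}$ and $\mathrm {LFP}$ (our example, not taken from the text), consider the X-positive elementary formula $\Phi ( x, X ) :\equiv x = \emptyset \vee \exists y ( y \in X \wedge x = y \cup \{ y \} )$, which has no parameters. The set $\omega $, regarded as a class, satisfies

$$\begin{aligned}&\forall x \bigl ( x \in \omega \leftrightarrow \Phi ( x, \omega ) \bigr ), \\&\forall Y \bigl ( \forall x ( \Phi ( x, Y ) \rightarrow x \in Y ) \rightarrow \omega \subset Y \bigr ). \end{aligned}$$

The first line says that $\omega $ is a fixed-point of $\Phi $ and so witnesses this instance of $\mathrm {FP}$; the second line, which follows by induction on $\omega $, says that $\omega $ is contained in every class closed under $\Phi $, so that $\omega $ also witnesses the corresponding instance of $\mathrm {LFP}$. In this simple case the least fixed-point is a set; in general the fixed-points asserted by the schemata may be proper classes.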

In second-order arithmetic, $\mathrm {FP}$ is equivalent to $\mathrm {ATR}$ (due to Avigad [2]), and $\mathrm {LFP}$ is equivalent to $\Pi ^{1}_{1} \text {-} \mathrm {CA}$, but neither of the corresponding equivalences holds in class theory. The next is a remarkable theorem due to Sato.

Theorem 2.24

(Sato [23]). $\mathsf {NBG} \vdash \mathrm {LFP} \leftrightarrow \mathrm {FP}$ and $\mathsf {NBG} \vdash \mathrm {LFP}^{-} \leftrightarrow \mathrm {FP}^{-}$.

We next consider principles of class collection. For a collection $\Gamma $ of ${\mathcal {L}}_{2}$-formulae, the schema of $\Gamma $-collection is defined as follows:

$$\begin{aligned} {\Gamma \text {-} \mathrm {Coll}:}&\forall x \exists X \Phi ( x, X ) \rightarrow \exists Y \forall x \exists y \Phi ( x, ( Y )_{y} ), \ \text {for each }\Phi \in \Gamma , \end{aligned}$$

where $\Phi $ may have parameters. We also consider a parameter-free version of $\Gamma \text {-} \mathrm {Coll}$: $\Gamma \text {-} \mathrm {Coll}^{-}$ is obtained by restricting the above $\Phi $s to $\Gamma $-formulae with no class parameters (but possibly with set parameters). We can easily show (by using pairing),

$$\begin{aligned} \mathsf {NBG} \vdash \Sigma ^{1}_{n + 1} \text {-} \mathrm {Coll}^{(-)} \, \leftrightarrow \, \Pi ^{1}_{n} \text {-} \mathrm {Coll}^{(-)}, \ \text {for every }n \in {\mathbb {N}}. \end{aligned}$$

We thereby define

$$\begin{aligned}&\Sigma ^{1}_{n} \text {-} \mathsf {Coll}_{0}^{(-)} := \mathsf {NBG} + \Sigma ^{1}_{n} \text {-} \mathrm {Coll}^{(-)}&\text {and}&\Sigma ^{1}_{n} \text {-} \mathsf {Coll}^{(-)} := \mathsf {ECA} + \Sigma ^{1}_{n} \text {-} \mathrm {Coll}^{(-)};&\end{aligned}$$

again note that $\Sigma ^{1}_{n} \text {-} \mathsf {Coll}_{(0)}^{-}$ only prohibits the use of class parameters in the axiom schema $\Sigma ^{1}_{n} \text {-} \mathrm {Coll}^{-}$. By means of a $\Pi ^{1}_{n + 1}$ universal formula, $\Sigma ^{1}_{n + 1} \text {-} \mathsf {Coll}_{0}^{(-)}$ and thus $\Pi ^{1}_{n} \text {-} \mathsf {Coll}_{0}^{(-)}$ are finitely axiomatizable for every n. The system $\Sigma ^{1}_{1} \text {-} \mathsf {Coll}_{0} + \Pi ^{1}_{\infty } \text {-} \mathrm {Ind}$ is extensively studied in [15, 18]. In the presence of global choice, $\Sigma ^{1}_{1} \text {-} \mathrm {Coll}$ implies the axiom $\Sigma ^{1}_{1} \text {-} \mathrm {AC}$ of $\Sigma ^{1}_{1}$ choice (see [15] for its definition), and $\Sigma ^{1}_{1} \text {-} \mathrm {Coll}^{-}$ implies the parameter-free version $\Sigma ^{1}_{1} \text {-} \mathrm {AC}^{-}$ without class parameters, but these implications fail without assuming $\mathrm {GC}$, since $\Sigma ^{1}_{1} \text {-} \mathrm {AC}^{-}$ implies $\mathrm {GC}$ in $\mathsf {NBG}$ (see [7, Lemma 5]).
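The pairing argument behind the equivalence of $\Sigma ^{1}_{n + 1} \text {-} \mathrm {Coll}^{(-)}$ and $\Pi ^{1}_{n} \text {-} \mathrm {Coll}^{(-)}$ noted above can be sketched as follows. The sketch assumes the convention $( Y )_{y} = \{ u \mid \langle u, y \rangle \in Y \}$ and codes a pair of classes X, Z by a single class W with $( W )_{0} = X$ and $( W )_{1} = Z$; the names W and $Y'$ are ours. Let $\Psi ( x, X, Z )$ be $\Pi ^{1}_{n}$ and suppose $\forall x \exists X \exists Z \Psi ( x, X, Z )$. Then

$$\begin{aligned}&\forall x \exists W \Psi ( x, ( W )_{0}, ( W )_{1} ) \\&\quad \Longrightarrow \ \exists Y' \forall x \exists y \Psi \bigl ( x, ( ( Y' )_{y} )_{0}, ( ( Y' )_{y} )_{1} \bigr ) \quad ( \Pi ^{1}_{n} \text {-} \mathrm {Coll} ) \\&\quad \Longrightarrow \ \exists Y \forall x \exists y \exists Z \Psi ( x, ( Y )_{y}, Z ), \quad \text {with } Y := \{ \langle u, y \rangle \mid u \in ( ( Y' )_{y} )_{0} \}. \end{aligned}$$

The formula $\Psi ( x, ( W )_{0}, ( W )_{1} )$ is again $\Pi ^{1}_{n}$ and introduces no new class parameters, so the same argument covers the parameter-free versions. The converse direction is immediate, since every $\Pi ^{1}_{n}$-formula can be regarded as a $\Sigma ^{1}_{n + 1}$-formula.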

Proposition 2.25

1.
$\mathsf {NBG} \vdash \mathrm {FP} \rightarrow \mathrm {ETR}$.
2.
$\mathsf {NBG} \vdash \Sigma ^{1}_{1} \text {-} \mathrm {Coll} \rightarrow \Delta ^{1}_{1} \text {-} \mathrm {CA}$.

Proof

For Claim 1, Avigad’s [2] proof of $\mathsf {ACA}_{0} \vdash \mathrm {FP} \rightarrow \mathrm {ATR}$ can be applied to class theory as it is; also see [22, Proposition 29]. Claim 2 is proved in [7, Proposition 4]. $\square $

We will consider some first-order extensions of ${\mathsf {Z}}{\mathsf {F}}$. Let us start with a few general definitions. For a first- or second-order language ${\mathcal {L}}$ including ${\mathcal {L}}_{\in }$, we occasionally consider extending the axiom schemata of separation and replacement for ${\mathcal {L}}$:

$$\begin{aligned} {{\mathcal {L}}\hbox {-}\mathrm {Sep}}: \quad&\forall a \exists b \forall x \bigl [ x \in b \leftrightarrow x \in a \wedge \varphi ( x ) \bigr ], \\ {{\mathcal {L}}\hbox {-}\mathrm {Repl}}: \quad&\forall a \bigl [ ( \forall x \in a ) \exists ! y \varphi ( x, y ) \rightarrow \exists b ( \forall x \in a ) ( \exists y \in b ) \varphi ( x, y ) \bigr ], \end{aligned}$$

where $\varphi $ is an arbitrary ${\mathcal {L}}$-formula without b free; note that ${\mathcal {L}}_{2} \text {-} \mathrm {Sep}$ and ${\mathcal {L}}_{2} \text {-} \mathrm {Repl}$ are equivalent to $\Sigma ^{1}_{\infty } \text {-} \mathrm {Sep}$ and $\Sigma ^{1}_{\infty } \text {-} \mathrm {Repl}$, respectively, in $\mathsf {NBG}$. Next, given such ${\mathcal {L}}$ ($\supset {\mathcal {L}}_{\in }$), we set ${\mathcal {L}}( P _{1}, \ldots , P _{k} )$ to be the language obtained by adding fresh unary predicate symbols $ P _{1}, \ldots , P _{k}$ to ${\mathcal {L}}$: namely, ${\mathcal {L}}( P _{1}, \ldots , P _{k} ) = {\mathcal {L}}\cup \{ P _{1}, \ldots , P _{k} \}$. An inductive operator form is an ${\mathcal {L}}_{\in } ( P )$-formula ${\mathcal {A}} ( x, P )$ with at most one free variable in which $ P $ occurs only positively.

Now, we define a first-order language ${\mathcal {L}}_{\mathrm {ID}}$ as an extension of ${\mathcal {L}}_{\in }$ with unary predicates $J_{{\mathcal {A}}}$ associated to each inductive operator form ${\mathcal {A}} ( x, P )$. The ${\mathcal {L}}_{\mathrm {ID}}$-system $\widehat{{\mathsf {I}}{\mathsf {D}}}_{1}$ is defined as ${\mathsf {Z}}{\mathsf {F}} + {\mathcal {L}}_{\mathrm {ID}} \text {-} \mathrm {Sep} + {\mathcal {L}}_{\mathrm {ID}} \text {-} \mathrm {Repl}$ plus the following axiom schema $\widehat{\mathrm {ID}}$ asserting that each $J_{{\mathcal {A}}}$ is a fixed-point of ${\mathcal {A}} ( x, P )$.

$$\begin{aligned} {\widehat{\mathrm {ID}}:} \quad&\forall x \bigl ( {\mathcal {A}} ( x, J_{{\mathcal {A}}} ) \leftrightarrow J_{{\mathcal {A}}} x \bigr ), \ \text {for each inductive operator form } {\mathcal {A}} ( x, P ). \end{aligned}$$

This axiom schema says that $J_{{\mathcal {A}}}$ is a fixed-point of ${\mathcal {A}}$. The ${\mathcal {L}}_{\mathrm {ID}}$-system ${\mathsf {I}}{\mathsf {D}}_{1}$ is the strengthening of $\widehat{{\mathsf {I}}{\mathsf {D}}}_{1}$ defined as ${\mathsf {Z}}{\mathsf {F}} + {\mathcal {L}}_{\mathrm {ID}} \text {-} \mathrm {Sep} + {\mathcal {L}}_{\mathrm {ID}} \text {-} \mathrm {Repl}$ plus the following axiom schemata asserting that each $J_{{\mathcal {A}}}$ is the least fixed-point of ${\mathcal {A}} ( x, P )$ (cf. fn 8):

$$\begin{aligned} {\mathrm {ID1}:} \quad&\forall x \bigl ( {\mathcal {A}} ( x, J_{{\mathcal {A}}} ) \rightarrow J_{{\mathcal {A}}} x \bigr ) \\ {\mathrm {ID2}:} \quad&\forall x \bigl ( {\mathcal {A}} ( x, \Psi ( {\hat{u}} ) ) \rightarrow \Psi ( x ) \bigr ) \rightarrow \forall x ( J_{{\mathcal {A}}} x \rightarrow \Psi ( x ) ), \ \text {for all }\Psi ( u ) \in {\mathcal {L}}_{\mathrm {ID}}, \end{aligned}$$

where $\Psi ( u )$ may contain parameters, and “${\hat{u}}$” indicates which variable each term t is substituted for; hence, ${\mathcal {A}} ( x, \Psi ( {\hat{u}} ) )$ is obtained from ${\mathcal {A}} ( x, P )$ by replacing each occurrence of $ P t$ by $\Psi ( t )$ (with renaming of bound variables as necessary to avoid collision). Since $ P $ occurs only positively in an inductive operator form ${\mathcal {A}} ( x, P )$, ${\mathcal {A}} ( x, {\mathcal {A}} ( {\hat{u}}, J_{{\mathcal {A}}} ) )$ implies ${\mathcal {A}} ( x, J_{{\mathcal {A}}} )$ by $\mathrm {ID1}$, and thus $\forall x ( J_{{\mathcal {A}}} x \rightarrow {\mathcal {A}} ( x, J_{{\mathcal {A}}} ) )$ by $\mathrm {ID2}$; hence, $\widehat{{\mathsf {I}}{\mathsf {D}}}_{1}$ is a sub-theory of ${\mathsf {I}}{\mathsf {D}}_{1}$.
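The last step can be unfolded as follows. This is only a sketch of the argument indicated above; $\Psi _{0} ( u )$ is a name introduced here for illustration and abbreviates ${\mathcal {A}} ( u, J_{{\mathcal {A}}} )$.

$$\begin{aligned}&\forall u \bigl ( \Psi _{0} ( u ) \rightarrow J_{{\mathcal {A}}} u \bigr )&&\text {by } \mathrm {ID1}, \\&\forall x \bigl ( {\mathcal {A}} ( x, \Psi _{0} ( {\hat{u}} ) ) \rightarrow {\mathcal {A}} ( x, J_{{\mathcal {A}}} ) \bigr )&&\text {since } P \text { occurs only positively in } {\mathcal {A}}, \\&\forall x \bigl ( J_{{\mathcal {A}}} x \rightarrow \Psi _{0} ( x ) \bigr )&&\text {by } \mathrm {ID2} \text { applied to } \Psi _{0}. \end{aligned}$$

The second line is exactly the premise of $\mathrm {ID2}$ for $\Psi _{0}$, because ${\mathcal {A}} ( x, J_{{\mathcal {A}}} )$ is $\Psi _{0} ( x )$. Combining the third line with $\mathrm {ID1}$ gives the biconditional required by $\widehat{\mathrm {ID}}$.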

The next lemma is the class-theoretic version of what is known as Aczel’s trick.

Lemma 2.26

Let ${\mathcal {B}} ( x, \vec {y}, P , Q )$ be an ${\mathcal {L}}_{\in } ( P , Q )$-formula only with the displayed variables free in which $ P $ occurs only positively (but $ Q $ may occur negatively, and $\vec {y}$ may be empty). There is a $\Sigma ^{1}_{1}$-formula $\Psi ( u, \vec {y}, X )$ only with the displayed variables free such that

$$\begin{aligned} \Sigma ^{1}_{1} \text {-} \mathsf {Coll}_{0} \vdash \forall X \forall x \forall \vec {y} \bigl ( \Psi ( x, \vec {y}, X ) \leftrightarrow {\mathcal {B}} ( x, \vec {y}, \Psi ( {\hat{u}}, \vec {y}, X ), X ) \bigr ). \end{aligned}$$

Furthermore, when ${\mathcal {B}} ( x, \vec {y}, P )$ contains no $ Q $, there is a $\Sigma ^{1}_{1}$-formula $\Psi ( u, \vec {y} )$ only with the displayed variables free such that

$$\begin{aligned} \Sigma ^{1}_{1} \text {-} \mathsf {Coll}_{0}^{-} \vdash \forall x \forall \vec {y} \bigl ( \Psi ( x, \vec {y} ) \leftrightarrow {\mathcal {B}} ( x, \vec {y}, \Psi ( {\hat{u}}, \vec {y} ) ) \bigr ). \end{aligned}$$

Hence, in particular, for each inductive operator form ${\mathcal {A}} ( x, P )$, there is a $\Sigma ^{1}_{1}$-formula $\Psi ( u )$ such that $\Sigma ^{1}_{1} \text {-} \mathsf {Coll}_{0}^{-} \vdash \forall x \bigl ( \Psi ( x ) \leftrightarrow {\mathcal {A}} ( x, \Psi ( {\hat{u}} ) ) \bigr )$.

Proof

From a $\Pi ^{1}_{1}$-universal formula we can construct a $\Sigma ^{1}_{1}$ (universal) formula $\sigma ( e, x, \vec {y}, X )$ such that for each $\Sigma ^{1}_{1}$-formula $\Phi ( x, \vec {y}, X )$ there is a (standard) natural number e satisfying $\mathsf {NBG} \vdash \forall x \forall \vec {y} \forall X \bigl ( \Phi ( x, \vec {y}, X ) \leftrightarrow \sigma ( e, x, \vec {y}, X ) \bigr )$. Thereby we set

$$\begin{aligned} \Theta ( x, \vec {y}, X ) :\Leftrightarrow \forall z \forall w \Bigl ( x = \langle z, w \rangle \rightarrow {\mathcal {B}} \bigl ( w, \vec {y}, \, \sigma ( z, \langle z, {\hat{u}} \rangle , \vec {y}, X ), \, X \bigr ) \Bigr ). \end{aligned}$$

Since $\sigma $ is $\Sigma ^{1}_{1}$ and occurs only positively in ${\mathcal {B}}$, $\Theta $ is also $\Sigma ^{1}_{1}$ provably in $\Sigma ^{1}_{1} \text {-} \mathsf {Coll}_{0}$: by using $\Sigma ^{1}_{1} \text {-} \mathrm {Coll}$, we can always push any first-order quantifier prefixed to a $\Sigma ^{1}_{1}$-formula inside the existential class quantifier; note that if ${\mathcal {B}}$ contains no $ Q $ (which is a place-holder for a class parameter), $\Theta $ is $\Sigma ^{1}_{1}$ in $\Sigma ^{1}_{1} \text {-} \mathsf {Coll}_{0}^{-}$ for the same reason. Hence, there is some (standard) natural number e such that

$$\begin{aligned} \Sigma ^{1}_{1} \text {-} \mathsf {Coll}_{0} \vdash \forall x \forall \vec {y} \forall X \bigl ( \Theta ( x, \vec {y}, X ) \leftrightarrow \sigma ( e, x, \vec {y}, X ) \bigr ). \end{aligned}$$

We fix such e and put $\Psi ( u, \vec {y}, X ) :\Leftrightarrow \Theta ( \langle e, u \rangle , \vec {y}, X )$; note that e is definable and so $\Psi $ only has u, $\vec {y}$, and X free. Hence, we have

$$\begin{aligned} \Sigma ^{1}_{1} \text {-} \mathsf {Coll}_{0} \vdash \Psi ( x, \vec {y}, X ) \leftrightarrow {\mathcal {B}} \bigl ( x, \vec {y}, \, \sigma ( e, \langle e, {\hat{u}} \rangle , \vec {y}, X ), X \bigr ) \leftrightarrow {\mathcal {B}} \bigl ( x, \vec {y}, \Psi ( {\hat{u}}, \vec {y}, X ), X \bigr ). \end{aligned}$$

If ${\mathcal {B}}$ contains no class parameters, then neither do $\Theta $ and $\Psi $, and thus $\Sigma ^{1}_{1} \text {-} \mathsf {Coll}_{0}^{-}$ is sufficient to derive this equivalence. $\square $

Recall that the schemata $\mathrm {FP}^{-}$ and $\mathrm {LFP}^{-}$ allow set parameters. However, as the next proposition shows, forbidding set parameters (as well as class parameters) results in the same theories.

Proposition 2.27

Let $\mathrm {FP}^{\square }$ and $\mathrm {LFP}^{\square }$ denote the variants of $\mathrm {FP}$ and $\mathrm {LFP}$, respectively, obtained by restricting the schemata to X-positive $\Pi ^{1}_{0}$-formulae neither with set nor class parameters. Then, we have

$$\begin{aligned}&\mathsf {NBG} \vdash \mathrm {FP}^{-} \leftrightarrow \mathrm {FP}^{\square } \qquad \text {and} \qquad \mathsf {NBG} \vdash \mathrm {LFP}^{-} \leftrightarrow \mathrm {LFP}^{\square }.&\end{aligned}$$

Proof

Let $\Phi ( x, y, X )$ be an X-positive $\Pi ^{1}_{0}$-formula only with the displayed variables free. We set $\Psi ( u, X ) := \forall x \forall y \bigl ( u = \langle x, y \rangle \rightarrow \Phi ( x, y, ( X )_{y} ) \bigr )$. Then, for each x and y, if X is a fixed-point (or least fixed-point) of $\Psi $, then $( X )_{y}$ is a fixed-point (least fixed-point, resp.) of $\Phi $ with y as a parameter; cf. [8, Theorem 5.2]. $\square $
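For the fixed-point case, the computation behind this proof can be sketched as follows, assuming the usual convention $( X )_{y} = \{ x \mid \langle x, y \rangle \in X \}$ for sections. Suppose $\forall u \bigl ( u \in X \leftrightarrow \Psi ( u, X ) \bigr )$. Then, for all x and y,

$$\begin{aligned} x \in ( X )_{y} \ \Leftrightarrow \ \langle x, y \rangle \in X \ \Leftrightarrow \ \Psi ( \langle x, y \rangle , X ) \ \Leftrightarrow \ \Phi ( x, y, ( X )_{y} ), \end{aligned}$$

so $( X )_{y}$ is a fixed-point of $\Phi $ with parameter y. The least fixed-point case needs an additional minimality argument, which is not reproduced in this sketch.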

Now, we can show by an obvious model-theoretic argument (or partial cut-elimination) that $\widehat{{\mathsf {I}}{\mathsf {D}}}_{1}$ and $\mathsf {NBG} + {\mathsf {F}}{\mathsf {P}}_{0}^{\square }$ have the same ${\mathcal {L}}_{\in }$-theorems, and so do ${\mathsf {I}}{\mathsf {D}}_{1}$ and $\mathsf {NBG} + \mathsf {LFP}_{0}^{\square }$. Hence, the next corollary follows.^{Footnote 9}

Corollary 2.28

$\widehat{{\mathsf {I}}{\mathsf {D}}}_{1}$, ${\mathsf {I}}{\mathsf {D}}_{1}$, ${\mathsf {F}}{\mathsf {P}}_{0}^{\square }$, $\mathsf {LFP}_{0}^{\square }$, ${\mathsf {F}}{\mathsf {P}}_{0}^{-}$, and $\mathsf {LFP}_{0}^{-}$ have the same ${\mathcal {L}}_{\in }$-theorems.

In second-order arithmetic, the schema of $\Sigma ^{1}_{n}$-dependent choice is defined as:

$$\begin{aligned} \forall n \forall X \exists Y \Phi ( n, X, Y ) \rightarrow \exists Z \forall n \Phi ( n, ( Z )^{n}, ( Z )_{n} ), \ \text {for all } \Phi \in \Sigma ^{1}_{n}, \end{aligned}$$

where $( Z )^{n}$ is defined as $\{ \langle i, j \rangle \in Z \mid j < n \}$. This definition of $( Z )^{n}$ cannot be straightforwardly generalized to our current setting, because we do not assume a global wellordering of the universe ${\mathbb {V}}$. In fact, the axiom of $\Sigma ^{1}_{1}$ dependent choice in the above form with any reasonable definition of $( Z )^{x}$ implies $\Sigma ^{1}_{1} \text {-} \mathrm {AC}$ (by treating X as a dummy variable) and thus the axiom of global choice $\mathrm {GC}$ [7, Lemma 5]. Hence, for the current setting, we adopt an alternative axiom schema that we call the $\Sigma ^{1}_{n}$ dependent collection schema.^{Footnote 10} We set

$$\begin{aligned} ( Z )^{x} := \{ \langle y, z \rangle \in Z \mid {\mathsf {r}}{\mathsf {k}} ( z ) < {\mathsf {r}}{\mathsf {k}} ( x ) \}, \end{aligned}$$

and, for a collection $\Gamma $ of ${\mathcal {L}}_{2}$-formulae, define the schema $\Gamma \text {-} \mathrm {DColl}$ of $\Gamma $ dependent collection as follows.

$$\begin{aligned} {\Gamma \text {-} \mathrm {DColl}:}&\forall x \forall X \exists Y \Phi ( x, X, Y ) \rightarrow \exists Z \forall x \exists y \Phi ( x, ( Z )^{x}, ( Z )_{y} ), \ \text {for all } \Phi \in \Gamma . \end{aligned}$$

It is easy to see that $\mathsf {NBG} \vdash \Sigma ^{1}_{n + 1} \text {-} \mathrm {DColl} \leftrightarrow \Pi ^{1}_{n} \text {-} \mathrm {DColl}$ for all $n \in {\mathbb {N}}$. Thereby, for each $n \in {\mathbb {N}}$, we define

$$\begin{aligned}&\Sigma ^{1}_{n} \text {-} \mathsf {DColl}_{0} := \mathsf {NBG} + \Sigma ^{1}_{n} \text {-} \mathrm {DColl} \qquad \text {and} \qquad \Sigma ^{1}_{n} \text {-} \mathsf {DColl} := \mathsf {ECA} + \Sigma ^{1}_{n} \text {-} \mathrm {DColl}.&\end{aligned}$$

Thanks to universal formulae, $\Sigma ^{1}_{n} \text {-} \mathsf {DColl}_{0}$ is finitely axiomatizable for all $n \in {\mathbb {N}}$; note that, since $\Sigma ^{1}_{0} = \Pi ^{1}_{0}$, $\Sigma ^{1}_{1} \text {-} \mathsf {DColl}_{0}$ and $\Sigma ^{1}_{0} \text {-} \mathsf {DColl}_{0}$ have exactly the same ${\mathcal {L}}_{2}$-theorems.

The next is obvious by treating X above as a dummy variable.

Proposition 2.29

For all $n \in {\mathbb {N}}$, $\mathsf {NBG} \vdash \Sigma ^{1}_{n + 1} \text {-} \mathrm {DColl} \rightarrow \Sigma ^{1}_{n + 1} \text {-} \mathrm {Coll}$.
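The dummy-variable argument can be sketched as follows, under the assumption that the collection schema $\Sigma ^{1}_{n + 1} \text {-} \mathrm {Coll}$ has the form $\forall x \exists Y \Phi ( x, Y ) \rightarrow \exists Z \forall x \exists y \Phi ( x, ( Z )_{y} )$ (its exact formulation is not restated in this section). Given $\Phi ( x, Y ) \in \Sigma ^{1}_{n + 1}$, let $\Phi ' ( x, X, Y ) :\Leftrightarrow \Phi ( x, Y )$, where X does not occur in $\Phi $ and $\Phi '$ is a name used here for illustration only. Then

$$\begin{aligned} \forall x \exists Y \Phi ( x, Y ) \ &\Rightarrow \ \forall x \forall X \exists Y \Phi ' ( x, X, Y ) \\&\Rightarrow \ \exists Z \forall x \exists y \Phi ' ( x, ( Z )^{x}, ( Z )_{y} )&&\text {by } \Sigma ^{1}_{n + 1} \text {-} \mathrm {DColl} \\&\Rightarrow \ \exists Z \forall x \exists y \Phi ( x, ( Z )_{y} ). \end{aligned}$$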

We have a different, but equivalent, formulation of $\Sigma ^{1}_{n}$ dependent collection ($n \in {\mathbb {N}}$).^{Footnote 11} For a class X and a set x, let us define $[ \! [ X ] \! ]^{x} := \{ \langle y, z \rangle \in X \mid z \in x \}$. Then, an alternative formulation of $\Sigma ^{1}_{n}$ dependent collection is given as follows.

$$\begin{aligned} {\Sigma ^{1}_{n} \text {-} \mathrm {DColl}':} \quad&\forall x \forall X \exists Y \Phi ( x, X, Y ) \rightarrow \exists Z \forall x \exists y \Phi ( x, [ \! [ Z ] \! ]^{x}, ( Z )_{y} ). \end{aligned}$$

Proposition 2.30

For all $n \in {\mathbb {N}}$, $\mathsf {NBG} \vdash \Sigma ^{1}_{n + 1} \text {-} \mathrm {DColl} \leftrightarrow \Sigma ^{1}_{n + 1} \text {-} \mathrm {DColl}'$.

Proof

Suppose $\Sigma ^{1}_{n + 1} \text {-} \mathrm {DColl}$. Assume $\forall x \forall X \exists Y \Phi ( x, X, Y )$ for $\Phi \in \Sigma ^{1}_{n + 1}$. We have $\forall x \forall X \exists Y \Phi ( x, \, [ \! [ X ] \! ]^{x}, Y )$. $\Sigma ^{1}_{n + 1} \text {-} \mathrm {DColl}$ yields a class Z with $\forall x \exists y \Phi \bigl ( x, \, [ \! [ ( Z )^{x} ] \! ]^{x}, ( Z )_{y} \bigr )$. Since $x \subset V_{{\mathsf {r}}{\mathsf {k}} ( x )}$, we have $[ \! [ ( Z )^{x} ] \! ]^{x} = [ \! [ Z ] \! ]^{x}$.
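The final identity is a direct unfolding of the two definitions; as a sketch:

$$\begin{aligned}{} [ \! [ ( Z )^{x} ] \! ]^{x} \ &= \ \{ \langle y, z \rangle \in Z \mid {\mathsf {r}}{\mathsf {k}} ( z ) < {\mathsf {r}}{\mathsf {k}} ( x ) \wedge z \in x \} \\&= \ \{ \langle y, z \rangle \in Z \mid z \in x \} \ = \ [ \! [ Z ] \! ]^{x}, \end{aligned}$$

where the second equality holds because $z \in x$ implies ${\mathsf {r}}{\mathsf {k}} ( z ) < {\mathsf {r}}{\mathsf {k}} ( x )$.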

Suppose $\Sigma ^{1}_{n + 1} \text {-} \mathrm {DColl}'$ for the converse. It suffices to derive $\Pi ^{1}_{n} \text {-} \mathrm {DColl}$. Assume $\forall x \forall X \exists Y \Phi ( x, X, Y )$ for $\Phi \in \Pi ^{1}_{n}$. Put

$$\begin{aligned} \Theta ( x, X, Y ) \, :\Leftrightarrow \, \forall z \bigl ( x = V_{{\mathsf {r}}{\mathsf {k}} ( z )} \cup \{ z \} \rightarrow \Phi ( z, ( X )^{z}, Y ) \bigr ), \end{aligned}$$

which is $\Pi ^{1}_{n}$. Since there is at most one z with $x = V_{{\mathsf {r}}{\mathsf {k}} ( z )} \cup \{ z \}$ for each x, we have $\forall x \forall X \exists Y \Theta ( x, X, Y )$ by the assumption. Hence, $\Sigma ^{1}_{n + 1} \text {-} \mathrm {DColl}'$ yields a class Z such that $\forall x \exists y \Theta ( x, [ \! [ Z ] \! ]^{x}, ( Z )_{y} )$. Now, take any z and let $x = V_{{\mathsf {r}}{\mathsf {k}} ( z )} \cup \{ z \}$. Then, there is some y such that $\Theta ( x, [ \! [ Z ] \! ]^{x}, ( Z )_{y} )$ and thus $\Phi \bigl ( z, \, ( [ \! [ Z ] \! ]^{x} )^{z}, ( Z )_{y} \bigr )$. We have $( [ \! [ Z ] \! ]^{x} )^{z} = ( Z )^{z}$ and thus obtain $\Phi ( z, ( Z )^{z}, ( Z )_{y} )$. $\square $

As is expected, $\Sigma ^{1}_{n + 1}$ dependent collection implies $\Sigma ^{1}_{n + 1}$ dependent choice under the assumption of $\mathrm {GC}$.

Proposition 2.31

Let us define the schema of $\Sigma ^{1}_{n}$ dependent choice as follows:

$$\begin{aligned} {\Sigma ^{1}_{n} \text {-} \mathrm {DC}:} \quad&\forall x \forall X \exists Y \Phi ( x, X, Y ) \rightarrow \exists Z \forall x \Phi ( x, ( Z )^{x}, ( Z )_{x} ), \ \text {for all } \Phi \in \Sigma ^{1}_{n}. \end{aligned}$$

Then, for all $n \in {\mathbb {N}}$, $\mathsf {NBG} + \mathrm {GC} \vdash \Sigma ^{1}_{n + 1} \text {-} \mathrm {DColl} \leftrightarrow \Sigma ^{1}_{n + 1} \text {-} \mathrm {DC}$.

Proof

We work within $\mathsf {NBG} + \mathrm {GC}$. One direction is obvious. For the converse, suppose $\Sigma ^{1}_{n + 1} \text {-} \mathrm {DColl}$. Consider the next schema (the “choice” version of $\Sigma ^{1}_{n + 1} \text {-} \mathrm {DColl}'$):

$$\begin{aligned} {\Sigma ^{1}_{n + 1} \text {-} \mathrm {DC}':} \quad&\forall x \forall X \exists Y \Phi ( x, X, Y ) \rightarrow \exists Z \forall x \Phi ( x, [ \! [ Z ] \! ]^{x}, ( Z )_{x} ), \ \text {for all } \Phi \in \Sigma ^{1}_{n + 1}. \end{aligned}$$

By the same argument as Proposition 2.30, we can show that $\Sigma ^{1}_{n + 1} \text {-} \mathrm {DC}$ and $\Sigma ^{1}_{n + 1} \text {-} \mathrm {DC}'$ are equivalent in $\mathsf {NBG}$.^{Footnote 12} Hence, it suffices to derive $\Sigma ^{1}_{n + 1} \text {-} \mathrm {DC}'$.

For a class W and a set w, let us define $\langle W \rangle ^{w} := \{ z \mid ( \exists u \in w ) z \in ( W )_{u} \}$: namely, $\langle W \rangle ^{w} = \bigcup _{u \in w} ( W )_{u}$. Our first goal is to derive the following intermediate schema:

$$\begin{aligned} {\Sigma ^{1}_{n + 1} \text {-} \mathrm {DC}'':} \quad&\forall x \forall X \exists Y \Phi ( x, X, Y ) \rightarrow \exists Z \forall x \Phi ( x, \langle Z \rangle ^{x}, ( Z )_{x} ), \ \text {for all } \Phi \in \Sigma ^{1}_{n + 1}. \end{aligned}$$

Take any $\Sigma ^{1}_{n + 1}$-formula $\Phi ( x, X, Y )$ and assume $\forall x \forall X \exists Y \Phi ( x, X, Y )$. Let us put

$$\begin{aligned} \Theta ( x, X, Y ) :\Leftrightarrow \Phi ( x, \, \{ z \mid ( \exists u ) z \in ( X )_{u} \}, \, Y ). \end{aligned}$$

We obviously have $\forall x \forall X \exists Y \Theta ( x, X, Y )$. By Proposition 2.30, there is a class W such that $\forall x \exists y \Theta ( x, [ \! [ W ] \! ]^{x}, ( W )_{y} )$. Since we have $\{ z \mid ( \exists u ) z \in ( [ \! [ W ] \! ]^{x} )_{u} \} \, = \, \langle W \rangle ^{x}$, we obtain $\forall x \exists y \Phi ( x, \langle W \rangle ^{x}, ( W )_{y} )$. Now, take a global wellordering $\prec $ on ${\mathbb {V}}$. We define a (class) function $F :{\mathbb {V}} \rightarrow {\mathbb {V}}$ by $\in $-recursion so that

$$\begin{aligned} F ( x )&\ = \ \text {the } \prec \!\!\text {-least } z \text { such that } \Phi \bigl ( x, \{ w \mid ( \exists y \in x ) {w \in ( W )_{F ( y )}} \}, ( W )_{z} \bigr ) \\&\ = \ \text {the } \prec \!\!\text {-least } z \text { such that } \Phi ( x, \langle W \rangle ^{F''x}, ( W )_{z} ). \end{aligned}$$

Then, we put

$$\begin{aligned} Z := \{ \langle w, x \rangle \mid \langle w, F ( x ) \rangle \in W \}. \end{aligned}$$

For each x, we have $\langle Z \rangle ^{x} = \{ w \mid ( \exists y \in x ) {w \in ( W )_{F ( y )}} \} = \langle W \rangle ^{F''x}$ and $( Z )_{x} = ( W )_{F ( x )}$; hence, we obtain $\Phi \bigl ( x, \langle Z \rangle ^{x}, ( Z )_{x} \bigr )$.

We have derived $\Sigma ^{1}_{n + 1} \text {-} \mathrm {DC}''$. Our next goal is to show that $\Sigma ^{1}_{n + 1} \text {-} \mathrm {DC}''$ implies $\Sigma ^{1}_{n + 1} \text {-} \mathrm {DC}'$; this is shown by Krähenbühl [18], but let us rehearse it here for the reader’s convenience. Suppose $\forall x \forall X \exists Y \Psi ( x, X, Y )$ for any $\Psi \in \Sigma ^{1}_{n + 1}$. Let us put

$$\begin{aligned} \Xi ( x, X, Y ) \ :\Leftrightarrow \ \Psi \bigl ( x, X, ( Y )_{x} \bigr ) \wedge \bigl ( [ \! [ X ] \! ]^{x} = X \rightarrow ( \forall y \in Y ) \exists z \langle z, x \rangle = y \bigr ). \end{aligned}$$

We have $\forall x \forall X \exists Y \Xi ( x, X, Y )$. There is a class W such that $\forall x \Xi \bigl ( x, \langle W \rangle ^{x}, ( W )_{x} \bigr )$ by $\Sigma ^{1}_{n + 1} \text {-} \mathrm {DC}''$. We can show by $\in $-induction on x that $\forall x ( \forall y \in ( W )_{x} ) \exists z \langle z, x \rangle = y$. Put $Z := \{ \langle z, x \rangle \mid \langle \langle z, x \rangle , x \rangle \in W \}$. Then, we have $[ \! [ Z ] \! ]^{x} = \langle W \rangle ^{x}$ and $( Z )_{x} = ( ( W )_{x} )_{x}$ for all x; hence $\forall x \Psi \bigl ( x, [ \! [ Z ] \! ]^{x}, ( Z )_{x} \bigr )$. $\square $

As in second-order arithmetic, $\Sigma ^{1}_{1} \text {-} \mathsf {DColl}_{0}$ is a stronger system than $\mathsf {NBG}$, whereas $\Sigma ^{1}_{1} \text {-} \mathsf {Coll}_{0}$ has the same ${\mathcal {L}}_{\in }$-theorems as $\mathsf {NBG}$ (see [7, Theorem 15]).

Proposition 2.32

Let $\Omega = \{ \langle \alpha , \beta \rangle \in On \times On \mid \alpha < \beta \}$. $\Omega $ is obviously a class wellordering provably in $\mathsf {NBG}$. Then, we have $\Sigma ^{1}_{1} \text {-} \mathsf {DColl}_{0} \vdash \mathrm {ETR} ( \Omega )$.^{Footnote 13}

Proof

We work within $\Sigma ^{1}_{1} \text {-} \mathsf {DColl}_{0}$. Take any elementary $\Phi ( x, z, X )$. Let

$$\begin{aligned} \Psi ( x, y, X ) :\Leftrightarrow \forall z \forall f&\Bigl ( \bigl ( y = \langle z, f \rangle \wedge ``f \text {is a function''} \bigr ) \\&\rightarrow \Phi \bigl ( x, z, \bigl \{ \langle u, v \rangle \mid v \in \mathrm {dom} ( f ) \wedge \exists w ( w \in f ( v ) \wedge u \in ( X )_{w} ) \bigr \} \bigr ) \Bigr ), \end{aligned}$$

where $\mathrm {dom} ( f )$ denotes the domain of f. We have $\forall y \forall X \exists Y \bigl ( Y = \{ x \mid \Psi ( x, y, X ) \} \bigr )$ by $\mathrm {ECA}$. By $\Sigma ^{1}_{1} \text {-} \mathrm {DColl}$, there is a class Z such that $\forall y \exists a \bigl ( ( Z )_{a} = \{ x \mid \Psi ( x, y, ( Z )^{y} ) \} \bigr )$. We define a class function F by recursion on $ On $ so that $F ( \alpha )$ is the set of all sets b with the least rank such that $( Z )_{b} = \{ x \mid \Psi ( x, \langle \alpha , F \! \upharpoonright _{\alpha } \rangle , ( Z )^{\langle \alpha , F \upharpoonright _{\alpha } \rangle } ) \}$, where $F \! \upharpoonright _{\alpha }$ denotes the restriction of F to $\alpha $: hence, for each $b \in F ( \alpha )$, we have

$$\begin{aligned} ( Z )_{b} \, = \, \bigl \{ x \mid \Phi \bigl ( x, \alpha , \{ \langle u, \beta \rangle \mid \beta < \alpha \wedge \exists w ( w \in F ( \beta ) \wedge u \in ( Z )_{w} ) \} \bigr ) \bigr \}, \end{aligned}$$

since $( ( Z )^{\langle \alpha , f \rangle } )_{w} = ( Z )_{w}$ if f is a function and $w \in f ( v )$ for some $v \in \mathrm {dom} ( f )$ (by comparing their ranks). Let us put

$$\begin{aligned} W'&:= \bigl \{ \langle u, \alpha \rangle \mid \alpha \in On \wedge \exists w ( w \in F ( \alpha ) \wedge u \in ( Z )_{w} ) \bigr \} \\&\, = \bigl \{ \langle u, \alpha \rangle \mid \alpha \in On \wedge \forall w ( w \in F ( \alpha ) \rightarrow u \in ( Z )_{w} ) \bigr \}; \end{aligned}$$

the equality holds because $\{ x \mid \Psi ( x, \langle \alpha , F \! \upharpoonright _{\alpha } \rangle , ( Z )^{\langle \alpha , F \upharpoonright _{\alpha } \rangle } ) \}$ is unique and thus $( Z )_{b} = ( Z )_{c}$ for all $b, c \in F ( \alpha )$. Now, for each $\alpha \in On $, we have

$$\begin{aligned} \{ \langle u, \beta \rangle \mid \beta< \alpha \, \wedge \, \exists w ( w \in F ( \beta ) \wedge u \in ( Z )_{w} ) \} = \{ \langle u, \beta \rangle \mid \beta < \alpha \, \wedge \, u \in ( W' )_{\beta } \} = ( W' )^{\alpha }. \end{aligned}$$

Hence, for each $b \in F ( \alpha )$, we have $( Z )_{b} = \{ x \mid \Phi ( x, \alpha , ( W' )^{\alpha } ) \}$ and thus

$$\begin{aligned} ( W' )_{\alpha } = \{ x \mid \Phi ( x, \alpha , ( W' )^{\alpha } ) \}. \end{aligned}$$

We finally put $W := W' \cup \{ \langle x, a \rangle \mid a \not \in On \wedge \Phi ( x, a, \emptyset ) \}$ to handle the inessential case of $( W )_{a}$ for a outside the intended domain On of the wellordering $\Omega $. $\square $

Corollary 2.33

$\Sigma ^{1}_{1} \text {-} \mathsf {DColl}_{0}$ proves $ Con ( \mathsf {NBG} )$ and thus $ Con ( \Sigma ^{1}_{1} \text {-} \mathsf {Coll}_{0} )$ because the aforementioned fact that $\mathsf {NBG}$ and $\Sigma ^{1}_{1} \text {-} \mathsf {Coll}_{0}$ have the same ${\mathcal {L}}_{\in }$-theorems is provable in $\Sigma ^{1}_{1} \text {-} \mathsf {DColl}_{0}$ (or even much weaker systems).

3 Transfinite induction

Since $\Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}_{0}$ is a sequential theory in the sense of [9, Ch. III, Definition 1.12] and derives $\omega $-induction for every ${\mathcal {L}}_{2}$-formula, it follows from [9, Ch. III, Lemma 3.47] that $\Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}_{0}$ is reflexive and thus we have the following.

Proposition 3.1

$\Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}_{0}$ proves the consistency of $\mathsf {NBG}$ and $\Pi ^{1}_{n} \text {-} {\mathsf {T}}{\mathsf {I}}_{0}$ for each n.

Nonetheless, as we will see below, $\Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}_{0}$ and $\Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}$ are still quite weak extensions of $\mathsf {NBG}$.

We first consider $\Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}$. Recall that ${\mathcal {L}}_{\in } ( P _{1}, \ldots , P _{k} )$ denotes ${\mathcal {L}}_{\in } \cup \{ P _{1}, \ldots , P _{k} \}$ for fresh unary predicates $ P _{1}, \ldots , P _{k}$ (Sect. 2.5). We start with a technical definition and lemma for the subsequent argument.

Definition 3.2

Let $\Phi _{1} ( v_{1} ), \ldots , \Phi _{k} ( v_{k} )$ be ${\mathcal {L}}_{2}$-formulae with distinguished free variables $v_{1}, \ldots , v_{k}$, which need not be distinct and may overlap, and possibly with other free variables. The following definition is made in $\mathsf {NBG} + \Pi ^{1}_{\infty } \text {-} \mathrm {Sep}$. For each set x, a (set-sized) ${\mathcal {L}}_{\in } ( P _{1}, \ldots , P _{k} )$-structure $\langle x, \Phi _{1}, \ldots , \Phi _{k} \rangle $ is defined as $\langle x, \Phi _{1} \cap x, \ldots , \Phi _{k} \cap x \rangle $ where each $ P _{i}$ ($1 \le i \le k$) is interpreted by $\Phi _{i} \cap x$ ($= \{ z \in x \mid \Phi _{i} ( z ) \}$); here, note that $\Phi _{i} \cap x$ need not be equal to the relativization $\Phi _{i}^{x}$ of $\Phi _{i}$ to x.

Lemma 3.3

Let $\varphi ( \vec {x}, P _{1}, \ldots , P _{k} ) \in {\mathcal {L}}_{\in } ( P _{1}, \ldots , P _{k} )$ only with the displayed variables $\vec {x}$ free. Take arbitrary ${\mathcal {L}}_{2}$-formulae $\Phi _{1} ( v_{1} ), \ldots , \Phi _{k} ( v_{k} )$ (possibly with parameters). Then, the following is provable in $\mathsf {ECA}$:

$$\begin{aligned} \forall \alpha ( \exists \beta > \alpha ) ( \forall \vec {x} \in V_{\beta } ) \Bigl ( \bigl ( \langle V_{\beta }, \Phi _{1}, \ldots , \Phi _{k} \rangle \models \varphi ( \vec {x} ) \bigr ) \ \leftrightarrow \ \varphi ( \vec {x}, \Phi _{1}, \ldots , \Phi _{k} ) \Bigr ), \end{aligned}$$

where $\varphi ( \vec {x}, \Phi _{1}, \ldots , \Phi _{k} )$ is the result of simultaneously substituting $\Phi _{i} ( u )$ for $ P _{i} ( u )$ for all $1 \le i \le k$ with renaming of bound variables as necessary to avoid collision.

Proof

The proof is parallel to that of the Montague-Lévy reflection principle. Let $\psi _{1}, \ldots , \psi _{n}$ be an enumeration of all the sub-formulae of $\varphi $. Then, for each $1 \le i \le n$, let $\psi _{i} ( z_{1}, \ldots , z_{m_{i}} )$ contain only the displayed $m_{i}$ variables free, and we take the following ${\mathcal {L}}_{2}$-definable function ${\mathcal {G}}_{i} :{\mathbb {V}}^{m_{i}} \rightarrow On$ (which is not necessarily a class).

1.
If $\psi _{i}$ is of the form $\exists w \theta ( w, \vec {z}, P _{1}, \ldots , P _{k} )$, then we set
$$\begin{aligned} {\mathcal {G}}_{i} ( \vec {a} ) \ := \ {\left\{ \begin{array}{ll} \min \{ \eta \in On \mid ( \exists w \in V_{\eta } ) \theta ( w, \vec {a}, \Phi _{1}, \ldots , \Phi _{k} ) \} &{} \text {if } \psi _{i} ( \vec {a}, \vec {\Phi } ) \text { holds} \\ 0 &{} \text {otherwise}; \end{array}\right. } \end{aligned}$$
2.
If $\psi _{i}$ is of another form, then ${\mathcal {G}}_{i} ( \vec {a} ) = 0$.

We thereby take the following ${\mathcal {L}}_{2}$-definable (again, not necessarily classes) functions ${\mathcal {F}}_{i} :On \rightarrow On$ ($1 \le i \le n$) and ${\mathcal {F}} :On \rightarrow On$:

$$\begin{aligned}&{\mathcal {F}}_{i} ( \xi ) \ := \ \sup \{ {\mathcal {G}}_{i} ( a_{1}, \ldots , a_{m_{i}} ) \mid a_{1}, \ldots , a_{m_{i}} \in V_{\xi } \} \\&{\mathcal {F}} ( \xi ) \ := \ \max \{ \xi + 1, {\mathcal {F}}_{1} ( \xi ), \ldots , {\mathcal {F}}_{n} ( \xi ) \}; \end{aligned}$$

here, we essentially use $\Sigma ^{1}_{\infty } \text {-} \mathrm {Repl}$. By recursion (which also requires $\Sigma ^{1}_{\infty } \text {-} \mathrm {Repl}$ as well as $\Sigma ^{1}_{\infty } \text {-} \mathrm {Ind}$), we set ${\mathcal {F}}^{0} ( \xi ) = \xi $ and ${\mathcal {F}}^{k + 1} ( \xi ) = {\mathcal {F}} ( {\mathcal {F}}^{k} ( \xi ) )$ and then define ${\mathcal {H}} :On \rightarrow On$ by ${\mathcal {H}} ( \xi ) := \sup _{k < \omega } {\mathcal {F}}^{k} ( \xi )$. Now, for any given $\alpha $, let $\beta := {\mathcal {H}} ( \alpha )$ ($> \alpha $). It is routine to check by induction on the complexity of formulae that, for all $1 \le i \le n$ and for all $\vec {z} \in V_{\beta }$,

$$\begin{aligned} \qquad \qquad \qquad \bigl ( \langle V_{\beta }, \Phi _{1}, \ldots , \Phi _{k} \rangle \models \psi _{i} ( P _{1}, \ldots , P _{k} ) \bigr ) \ \leftrightarrow \ \psi _{i} ( \Phi _{1}, \ldots , \Phi _{k} ).\qquad \qquad \qquad \square \end{aligned}$$
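The only non-trivial case of this induction is the existential one, and the bound it needs can be sketched as follows (j is an index introduced here for illustration). Since ${\mathcal {F}} ( \xi ) \ge \xi + 1$, the sequence ${\mathcal {F}}^{j} ( \alpha )$ is strictly increasing with supremum $\beta $, so every tuple $\vec {a} \in V_{\beta }$ already lies in $V_{{\mathcal {F}}^{j} ( \alpha )}$ for some $j < \omega $, and then

$$\begin{aligned} {\mathcal {G}}_{i} ( \vec {a} ) \ \le \ {\mathcal {F}}_{i} ( {\mathcal {F}}^{j} ( \alpha ) ) \ \le \ {\mathcal {F}} ( {\mathcal {F}}^{j} ( \alpha ) ) \ = \ {\mathcal {F}}^{j + 1} ( \alpha ) \ < \ \beta . \end{aligned}$$

Hence, if $\exists w \theta ( w, \vec {a}, \Phi _{1}, \ldots , \Phi _{k} )$ holds, a witness can be found in $V_{{\mathcal {G}}_{i} ( \vec {a} )} \subset V_{\beta }$, which is what the existential step requires.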

Theorem 3.4

$\mathsf {ECA} \vdash \Pi ^{1}_{\infty } \text {-} \mathrm {TI}$. Hence, $\mathsf {ECA}$ and $\Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}$ have the same ${\mathcal {L}}_{2}$-theorems.

Proof

We work within $\mathsf {ECA}$. Suppose $ Wf ( X )$ and take any ${\mathcal {L}}_{2}$-formula $\Psi $. Then, by the last lemma, there exists $\alpha \in On$ such that

$$\begin{aligned} \langle V_{\alpha }, X, \Psi \rangle \models TI _{ P _{2}} ( P _{1} ), \text { if and only if } TI _{\Psi } ( X ). \end{aligned}$$

Hence, it suffices to show that $\langle V_{\alpha }, X, \Psi \rangle \models TI _{ P _{2}} ( P _{1} )$, i.e.,

$$\begin{aligned} ( \forall x \in V_{\alpha } ) \bigl ( ( \forall y \in V_{\alpha } ) ( y \prec _{X} x \rightarrow \Psi ( y ) ) \rightarrow \Psi ( x ) \bigr ) \rightarrow ( \forall x \in V_{\alpha } ) \Psi ( x ). \end{aligned}$$

Assume the antecedent. Let $Z := ( V_{\alpha } \cap \Psi ) \cup ( {\mathbb {V}} {\setminus } V_{\alpha } )$, which is a class by $\Pi ^{1}_{\infty } \text {-} \mathrm {Sep}$. Then, it follows from the antecedent that

$$\begin{aligned} \forall x \bigl ( \forall y ( y \prec _{X} x \rightarrow y \in Z ) \rightarrow x \in Z \bigr ). \end{aligned}$$

Thus $Z = {\mathbb {V}}$ follows from $ Wf ( X )$, which implies $( \forall x \in V_{\alpha } ) \Psi ( x )$. $\square $
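The passage from the antecedent to the progressiveness of Z can be unfolded by cases; the following is a sketch. Fix x and assume $\forall y ( y \prec _{X} x \rightarrow y \in Z )$. Then

$$\begin{aligned} x \not \in V_{\alpha } \ &\Rightarrow \ x \in {\mathbb {V}} {\setminus } V_{\alpha } \subset Z, \\ x \in V_{\alpha } \ &\Rightarrow \ ( \forall y \in V_{\alpha } ) ( y \prec _{X} x \rightarrow \Psi ( y ) ) \ \Rightarrow \ \Psi ( x ) \ \Rightarrow \ x \in Z. \end{aligned}$$

In the second case, the first implication uses that $y \in Z \cap V_{\alpha }$ means $\Psi ( y )$, and the second implication is the assumed antecedent applied to x.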

The next follows from this theorem, Proposition 2.14, and Corollary 2.18.

Corollary 3.5

$\mathsf {ECA}_{0}^{+}$ proves the consistency of $\Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}$.

Hence, whereas $\Sigma ^{1}_{1} \text {-} \mathrm {TI}$ implies $\mathrm {ATR}$ over $\mathsf {ACA}_{0}$ in second-order arithmetic ([24, Theorem 2.5]), the corresponding statement fails in class theory.

Proposition 3.6

$\Delta ^{1}_{1} \text {-} {\mathsf {C}}{\mathsf {A}}_{0} + \Sigma ^{1}_{1} \text {-} \mathrm {TI} \vdash \mathrm {ETR}$.

Proof

The claim follows from the proof of [7, Theorem 18], which actually establishes that, for each $\Phi \in \Pi ^{1}_{0}$, there exists some $\Psi \in \Sigma ^{1}_{1}$ such that

$$\begin{aligned} \Delta ^{1}_{1} \text {-} {\mathsf {C}}{\mathsf {A}}_{0} \vdash TI _{\Psi } ( X ) \rightarrow \exists Y {\mathcal {H}}_{\Phi } ( X, Y ). \square \end{aligned}$$

Hence, the next theorem follows from Theorem 3.4 and Proposition 3.6.

Theorem 3.7

$\Delta ^{1}_{1} \text {-} {\mathsf {C}}{\mathsf {A}} \vdash \mathrm {ETR}$. Hence, $\Sigma ^{1}_{1} \text {-} \mathsf {Coll} \ (\text {or } \Sigma ^{1}_{1} \text {-} \mathsf {DColl}) \vdash \mathrm {ETR}$.

Since $\Delta ^{1}_{1} \text {-} {\mathsf {C}}{\mathsf {A}}$ is reflexive, this gives an alternative proof of [7, Theorem 90].

Corollary 3.8

$\Delta ^{1}_{1} \text {-} {\mathsf {C}}{\mathsf {A}} \vdash Con ( \mathsf {ETR}_{0} )$. Hence, $\Sigma ^{1}_{1} \text {-} \mathsf {Coll} \ (\text {or } \Sigma ^{1}_{1} \text {-} \mathsf {DColl}) \vdash Con ( \mathsf {ETR}_{0} )$.

Sato [22] showed that $\Sigma ^{1}_{1} \text {-} \mathsf {Coll}$ even proves the consistency of $\mathsf {ETR}$ (i.e., $\mathsf {ETR}_{0} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Sep} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Repl}$), and we will give an alternative proof of this fact later in Sect. 5.

We next consider $\Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}_{0}$. It will be shown that the strength of $\Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}_{0}$ falls strictly between $\mathsf {NBG}$ and $\mathsf {ECA}$.

As a preliminary, we observe that the consistency of $\Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}_{0}$ ($+ \Sigma ^{1}_{\infty } \text {-} \mathrm {Sep}$) can be proved relatively easily if we assume $\mathrm {AC}$.

Lemma 3.9

$\mathsf {ECA} + \mathrm {AC} \vdash Con ( \Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}_{0} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Sep} + \mathrm {AC} )$.

Proof

$\mathsf {ECA}$ proves that there is an ${\mathcal {L}}_{2}$ formula $\Phi ( \alpha )$ of ordinals with $V_{\alpha } \models {\mathsf {Z}}{\mathsf {F}}$ such that $\{ \alpha \in On \mid \Phi ( \alpha ) \}$ is closed unbounded in On; see [7, Corollary 27] (or Fact 3.10 below). Hence, there is an ordinal $\kappa $ with cofinality greater than $\omega $ such that $V_{\kappa } \models {\mathsf {Z}}{\mathsf {F}}$.^{Footnote 14} Let D be the set of $V_{\kappa }$-definable sets with parameters from $V_{\kappa }$. Then, $\langle V_{\kappa }, D \rangle $ is a model of $\mathsf {NBG} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Sep} + \mathrm {AC}$. We claim that $\langle V_{\kappa }, D \rangle \models \Pi ^{1}_{\infty } \text {-} \mathrm {TI}$. Take any set $X \in D$ and suppose $\langle V_{\kappa }, D \rangle \models Wf ( X )$. This is equivalent to the non-existence of a pseudo $\omega $-descending chain of $\prec _{X}$ in $V_{\kappa }$. Hence, $\prec _{X}$ is indeed well-founded in ${\mathbb {V}}$, since ${\mathsf {c}}{\mathsf {f}} ( \kappa ) > \omega $ and thus any pseudo $\omega $-descending chain of $\prec _{X}$ would be contained in $V_{\kappa }$. We thereby obtain $\langle V_{\kappa }, D \rangle \models TI ( \prec _{X} )$. $\square $
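
The role of ${\mathsf {c}}{\mathsf {f}} ( \kappa ) > \omega $ in this proof can be made explicit by a rank computation. The following is only a sketch under the simplifying assumption, made here for illustration, that a descending chain is given as an $\omega $-sequence $s = \langle x_{n} \mid n < \omega \rangle $ of elements of $V_{\kappa }$, coded as a set of Kuratowski pairs $\langle n, x_{n} \rangle $; the ordinals $\gamma $ and $\delta $ are auxiliary names not used elsewhere.

$$\begin{aligned} \gamma&:= \sup _{n< \omega } \bigl ( {\mathsf {r}}{\mathsf {k}} ( x_{n} ) + 1 \bigr ) < \kappa&&\text {(since } {\mathsf {c}}{\mathsf {f}} ( \kappa ) > \omega \text { and each } {\mathsf {r}}{\mathsf {k}} ( x_{n} ) + 1 < \kappa \text {)} \\ \delta&:= \max ( \gamma , \omega ) < \kappa&&\text {(since } \kappa > \omega \text {)} \\ s&\subseteq V_{\delta + 2}, \text { hence } s \in V_{\delta + 3} \subseteq V_{\kappa }&&\text {(since } \kappa \text { is a limit ordinal).} \end{aligned}$$

Thus every such sequence already belongs to $V_{\kappa }$, so $\langle V_{\kappa }, D \rangle $ cannot regard $\prec _{X}$ as well-founded while ${\mathbb {V}}$ contains a descending sequence for it.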

We need to eliminate the assumption of $\mathrm {AC}$, but the above proof still gives guidance on how to achieve this: that is, we should aim to give a transitive model of $\mathsf {NBG} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Sep}$ for which the notion of well-foundedness (of class orderings) is absolute.^{Footnote 15} For that goal (and for other purposes later on), we introduce a theory ${\mathsf {T}}{\mathsf {C}}$ of the Tarskian typed truth defined in [7]; we will only repeat necessary facts about ${\mathsf {T}}{\mathsf {C}}$ and refer the reader to [7] for the proofs and details.

The language ${\mathcal {L}}_{\mathrm {T}}$ of ${\mathsf {T}}{\mathsf {C}}$ is defined as ${\mathcal {L}}_{\in } \cup \{ T \}$ for a unary predicate symbol $ T $ (“truth predicate”). Let ${\mathcal {L}}_{\in }^{\infty }$ be the language obtained by adding constant symbols $\mathsf {c}_{a}$ to ${\mathcal {L}}_{\in }$ for all $a \in {\mathbb {V}}$. We fix a coding of ${\mathcal {L}}_{\in }^{\infty }$ in a sufficiently weak sub-theory of ${\mathsf {Z}}{\mathsf {F}}$, say, ${\mathsf {K}}{\mathsf {P}}\omega $. We will denote the classes of codes of ${\mathcal {L}}_{\in }^{\infty }$-formulae and ${\mathcal {L}}_{\in }^{\infty }$-sentences by $ Fml _{\in }^{\infty }$ and $ St _{\in }^{\infty }$, respectively; we will also write $ Fml _{\in }$ and $ St _{\in }$ for the (countably infinite) sets of ${\mathcal {L}}_{\in }$-formulae and ${\mathcal {L}}_{\in }$-sentences, respectively. For each ${\mathcal {L}}_{\in }$-formula $\varphi $, $ Fml _{\in }$ and $ Fml _{\in }^{\infty }$ both contain its code and we will simply denote it by $\varphi $; this notation again neglects the distinction between formulae and their codes, but there should be no danger of confusion.^{Footnote 16} By writing $\varphi ( u_{1}, \ldots , u_{k} ) \in Fml _{\in }^{\infty }$ we indicate that it codes an ${\mathcal {L}}_{\in }^{\infty }$-formula with only the displayed variables free, and, for all sets $a_{1}, \ldots , a_{k}$, we simply write $\varphi ( a_{1}, \ldots , a_{k} )$ to denote the code of the ${\mathcal {L}}_{\in }^{\infty }$-formula $\varphi ( \mathsf {c}_{a_{1}}, \ldots , \mathsf {c}_{a_{k}} )$ obtained by substituting the constants $\mathsf {c}_{a_{i}}$ for the variables $u_{i}$ ($1 \le i \le k$); accordingly, $ T ( \varphi ( \vec {a} ) )$ expresses that $\varphi ( \vec {u} )$ is true of $a_{1}$, ..., $a_{k}$.

The ${\mathcal {L}}_{\mathrm {T}}$-system ${\mathsf {T}}{\mathsf {C}}$ comprises ${\mathsf {Z}}{\mathsf {F}} + {\mathcal {L}}_{\mathrm {T}} \text {-} \mathrm {Sep} + {\mathcal {L}}_{\mathrm {T}} \text {-} \mathrm {Repl}$ (Sect. 2.5) plus the axioms expressing Tarski’s inductive clauses of the truth predicate for ${\mathcal {L}}_{\in }^{\infty }$, such as “a sentence $\mathsf {c}_{a} \in \mathsf {c}_{b}$ is true, if and only if a is indeed a member of b”, more precisely, the following four axioms:

$$\begin{aligned} {( \mathrm {T1} )} \quad&\forall a \forall b \bigl ( T ( a \in b ) \leftrightarrow a \in b \bigr ) \wedge \forall a \forall b \bigl ( T ( a = b ) \leftrightarrow a = b \bigr ) \\ {( \mathrm {T2} )} \quad&( \forall \sigma \in St _{\in }^{\infty } ) \bigl ( T ( \lnot \sigma ) \leftrightarrow \lnot T ( \sigma ) \bigr ) \\ {( \mathrm {T3})} \quad&( \forall \sigma , \tau \in St _{\in }^{\infty } ) \bigl ( T ( \sigma \wedge \tau ) \leftrightarrow ( T ( \sigma ) \wedge T ( \tau ) ) \bigr ) \\ {( \mathrm {T4} )} \quad&( \forall \varphi ( x ) \in Fml _{\in }^{\infty } ) \bigl ( T ( \forall x \varphi ( x ) ) \leftrightarrow \forall a T ( \varphi ( a ) ) \bigr ), \end{aligned}$$

where “$( \forall \varphi ( x ) \in Fml _{\in }^{\infty } )$” means “for all codes of ${\mathcal {L}}_{\in }^{\infty }$-formulae with exactly one free variable”, as we have stipulated above.
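
As a small illustration of how (T1)–(T4) unfold, consider the formula $\varphi ( u ) := \forall x \lnot ( x \in u )$, which is chosen here only as an example and is not used elsewhere. For every set a, the clauses give

$$\begin{aligned} T ( \forall x \lnot ( x \in a ) )&\leftrightarrow \forall b \, T ( \lnot ( b \in a ) )&&\text {by (T4)} \\&\leftrightarrow \forall b \, \lnot T ( b \in a )&&\text {by (T2)} \\&\leftrightarrow \forall b \, \lnot ( b \in a )&&\text {by (T1),} \end{aligned}$$

so that $ T ( \varphi ( a ) )$ holds exactly when a is empty, as expected.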

It is shown in [7] that $\mathsf {ECA}$ and ${\mathsf {T}}{\mathsf {C}}$ are mutually interpretable in such a way that the ${\mathcal {L}}_{\in }$-part is preserved; hence, they have the same ${\mathcal {L}}_{\in }$-theorems. Such an interpretation of $\mathsf {ECA}$ in ${\mathsf {T}}{\mathsf {C}}$ is obtained by translating second-order quantifiers “$\forall X$” into “for all codes of ${\mathcal {L}}_{\in }^{\infty }$-formulae with exactly one free variable” (i.e., “$\forall \varphi ( x ) \in Fml _{\in }^{\infty }$”) and also translating the membership relation “$a \in X$” into “the ${\mathcal {L}}_{\in }^{\infty }$-formula X is true of a” (i.e., “$ T ( \varphi ( a ) )$”). We will denote this translation of ${\mathcal {L}}_{2}$ in ${\mathcal {L}}_{\mathrm {T}}$ by ${\mathcal {I}}$.
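
To illustrate the translation, consider the ${\mathcal {L}}_{2}$-sentence asserting that every class has a complement. This sentence is chosen here only as an example, and the translation is written out following the informal description above rather than the precise clauses of [7].

$$\begin{aligned} \Phi&:\Leftrightarrow \forall X \exists Y \forall a \bigl ( a \in Y \leftrightarrow \lnot \, a \in X \bigr ) \\ \Phi ^{{\mathcal {I}}}&\Leftrightarrow ( \forall \varphi ( x ) \in Fml _{\in }^{\infty } ) ( \exists \psi ( x ) \in Fml _{\in }^{\infty } ) \forall a \bigl ( T ( \psi ( a ) ) \leftrightarrow \lnot T ( \varphi ( a ) ) \bigr ). \end{aligned}$$

In ${\mathsf {T}}{\mathsf {C}}$, the sentence $\Phi ^{{\mathcal {I}}}$ is witnessed by taking $\psi ( x ) := \lnot \varphi ( x )$, since $ T ( \lnot \varphi ( a ) ) \leftrightarrow \lnot T ( \varphi ( a ) )$ by (T2).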

In ${\mathsf {T}}{\mathsf {C}}$, we can directly express that a transitive set M is an elementary sub-structure of ${\mathbb {V}}$. Let M be a transitive set. Following the notation of [7], by $ St _{\in }^{M}$ we denote the set of codes of ${\mathcal {L}}_{\in }^{\infty }$ sentences only with constants $\mathsf {c}_{a}$ from $a \in M$. If M is a model of a sufficiently strong theory such as ${\mathsf {K}}{\mathsf {P}}\omega $, then we have $ St _{\in }^{M} \ = \ St _{\in }^{\infty } \cap M = ( St _{\in }^{\infty } )^{M}$. Thereby we define^{Footnote 17}

$$\begin{aligned} M \prec {\mathbb {V}} :\Leftrightarrow ( \forall \sigma \in St _{\in }^{M} ) ( T ( \sigma ^{M} ) \leftrightarrow T ( \sigma ) ), \end{aligned}$$

where $\sigma ^{M}$ is (the code of) the ordinary relativization of $\sigma $ to M. Since ${\mathsf {T}}{\mathsf {C}}$ proves that all the axioms of ${\mathsf {Z}}{\mathsf {F}}$ are true and that $( \forall \vec {a} \in M ) \bigl ( T ( \varphi ^{M} ( \vec {a} ) ) \leftrightarrow M \models \varphi ( \vec {a} ) \bigr )$ for all $\varphi ( \vec {u} ) \in Fml _{\in }$, $M \prec {\mathbb {V}}$ implies $M \models {\mathsf {Z}}{\mathsf {F}}$ in ${\mathsf {T}}{\mathsf {C}}$.

Let $\varphi $ be an ${\mathcal {L}}_{\mathrm {T}}$-formula and M a set. We inductively define the relativization of $\varphi $ to M in the following obvious manner:

$$\begin{aligned}&(x = y)^{M} :\Leftrightarrow x = y&(x \in y)^{M} :\Leftrightarrow x \in y&(T x)^{M} :\Leftrightarrow T x&\\&(\lnot \psi )^{M} :\Leftrightarrow \lnot \psi ^{M}&(\psi \wedge \theta )^{M} :\Leftrightarrow \psi ^{M} \wedge \theta ^{M}&(\exists x \psi ( x ))^{M} :\Leftrightarrow ( \exists x \in M ) \psi ^{M} ( x ).&\end{aligned}$$

Next, we denote the (countably infinite) sets of codes of ${\mathcal {L}}_{\mathrm {T}}$-formulae and codes of ${\mathcal {L}}_{\mathrm {T}}$-sentences by $ Fml _{\mathrm {T}}$ and $ St _{\mathrm {T}}$; we will not consider adding set constants to them. Then, for each $\varphi ( \vec {u} )\in Fml _{\mathrm {T}}$ and $\vec {a} \in M$, we write $M \models \varphi ( \vec {a} )$ to mean $\langle M, T \cap M \rangle \models \varphi ( \vec {a} )$, in which $ T $ is interpreted by $ T \cap M$ ($= \{ x \in M \mid T x \}$) and the membership relation is standardly interpreted; note that $ T \cap M$ is a set by ${\mathcal {L}}_{\mathrm {T}} \text {-} \mathrm {Sep}$. It is routine to show that $( \forall \vec {a} ) \bigl ( \varphi ^{M} ( \vec {a} ) \leftrightarrow M \models \varphi ( \vec {a} ) \bigr )$; recall that “$\varphi $” to the right of “$\leftrightarrow $” is the code of an ${\mathcal {L}}_{\mathrm {T}}$-formula, while “$\varphi $” to the left of “$\leftrightarrow $” is a genuine formula as a meta-theoretic syntactic entity.

Let M be a transitive model of a sufficiently strong sub-theory of ${\mathsf {Z}}{\mathsf {F}}$, say, ${\mathsf {K}}{\mathsf {P}}\omega $. By ${\mathcal {I}}^{-1} ( M )$ we denote the ${\mathcal {L}}_{2}$-structure $\langle M, N \rangle $ where N is defined as follows:

$$\begin{aligned} N \, := \, \Bigl \{ \{ a \in M \mid M \models T ( \varphi ( a ) ) \} \, \mid \, \varphi ( u ) \in ( Fml _{\in }^{\infty } )^{M} \text { with one free variable} \Bigr \}: \end{aligned}$$

that is, ${\mathcal {I}}^{-1} ( M )$ is the ${\mathcal {L}}_{2}$-structure induced from the ${\mathcal {L}}_{\mathrm {T}}$-structure $\langle M, T \cap M \rangle $ by ${\mathcal {I}}$. Hence, for every (code of) ${\mathcal {L}}_{2}$-sentence $\Phi $ and a set M, we have

$$\begin{aligned} M \models \Phi ^{{\mathcal {I}}} \leftrightarrow {\mathcal {I}}^{-1} ( M ) \models \Phi . \end{aligned}$$

(10)

The next fact [7, Lemma 29] will be used in the proof of our claim.

Fact 3.10

For any ${\mathcal {L}}_{\mathrm {T}}$-formulae $\varphi _{1}, \ldots , \varphi _{k}$, ${\mathsf {T}}{\mathsf {C}}$ proves the following:

$$\begin{aligned} ( \forall \alpha ) ( \exists \beta > \alpha ) \Bigl ( V_{\beta } \prec {\mathbb {V}} \wedge \bigwedge _{i \le k} ( \forall \vec {x}_{i} \in V_{\beta } ) \bigl ( \varphi _{i}^{V_{\beta }} ( \vec {x}_{i} ) \leftrightarrow \varphi _{i} ( \vec {x}_{i} ) \bigr ) \Bigr ). \end{aligned}$$

Theorem 3.11

$\mathsf {ECA} \vdash Con ( \Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}_{0} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Sep} )$.

Proof

It suffices to prove the claimed consistency in ${\mathsf {T}}{\mathsf {C}}$. We will work within ${\mathsf {T}}{\mathsf {C}}$.

We first note that the ${\mathcal {I}}$-translation $ Wf ^{{\mathcal {I}}} ( x )$ of $ Wf ( X )$ is an ${\mathcal {L}}_{\mathrm {T}}$-formula that takes a code $\varphi ( u ) \in Fml _{\in }^{\infty }$ of an ${\mathcal {L}}_{\in }^{\infty }$-formula with only one free variable as its argument, and $ Wf ^{{\mathcal {I}}} ( \varphi ( u ) )$ precisely denotes:

$$\begin{aligned} \bigl ( \forall \psi ( u ) \in&Fml _{\in }^{\infty } \bigr ) \Bigl ( \forall x \Bigl ( \forall y \bigl ( T ( \varphi ( \langle y, x \rangle ) ) \rightarrow T ( \psi ( y ) ) \bigr ) \rightarrow T ( \psi ( x ) ) \Bigr ) \rightarrow \forall x T ( \psi ( x ) ) \Bigr ). \end{aligned}$$

For readability, we will write $y \prec _{\varphi } x$ for $ T ( \varphi ( \langle y, x \rangle ) )$.

By Fact 3.10, there is some $\beta $ such that $V_{\beta } \prec {\mathbb {V}}$ and the following hold:

$$\begin{aligned}&( \forall x \in V_{\beta } ) \bigl ( ( Wf ^{{\mathcal {I}}} )^{V_{\beta }} ( x ) \leftrightarrow Wf ^{{\mathcal {I}}} ( x ) \bigr ); \end{aligned}$$

(11)

$$\begin{aligned}&( \mathsf {NBG}^{{\mathcal {I}}} )^{V_{\beta }} \leftrightarrow \mathsf {NBG}^{{\mathcal {I}}}. \end{aligned}$$

(12)

Since $V_{\beta } \prec {\mathbb {V}}$, we have $( Fml _{\in }^{\infty } )^{V_{\beta }} = Fml _{\in }^{\infty } \cap V_{\beta }$. Hence, (11) entails

$$\begin{aligned} ( \forall \varphi ( u ) \in ( Fml _{\in }^{\infty } )^{V_{\beta }} ) \bigl ( ( Wf ^{{\mathcal {I}}} )^{V_{\beta }} ( \varphi ( u ) ) \leftrightarrow Wf ^{{\mathcal {I}}} ( \varphi ( u ) ) \bigr ). \end{aligned}$$

(13)

We claim ${\mathcal {I}}^{-1} ( V_{\beta } )$ is a model of $\Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}_{0} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Sep}$. We have $( \mathsf {NBG}^{{\mathcal {I}}} )^{V_{\beta }}$ by (12), since ${\mathsf {T}}{\mathsf {C}} \vdash \mathsf {NBG}^{{\mathcal {I}}}$, and thus ${\mathcal {I}}^{-1} ( V_{\beta } ) \models \mathsf {NBG}$ by (10). Since $\beta $ is a limit ordinal, we also have ${\mathcal {I}}^{-1} ( V_{\beta } ) \models \Sigma ^{1}_{\infty } \text {-} \mathrm {Sep}$.

Now, take any $\varphi ( u ) \in ( Fml _{\in }^{\infty } )^{V_{\beta }}$. Since $V_{\beta } \prec {\mathbb {V}}$, we have $( Fml _{\in }^{\infty } )^{V_{\beta }} = Fml _{\in }^{\infty } \cap V_{\beta }$ and thus $\varphi ( u ) \in Fml _{\in }^{\infty }$. Suppose $( Wf ^{{\mathcal {I}}} ( \varphi ( u ) ) )^{V_{\beta }}$; we have $ Wf ^{{\mathcal {I}}} ( \varphi ( u ) )$ by (13). Take any (code of an ${\mathcal {L}}_{2}$-formula) $\Phi ( x ) \in Fml _{2}$ and suppose

$$\begin{aligned} V_{\beta } \models \, \forall x \bigl ( \forall y ( y \prec _{\varphi } x \rightarrow \Phi ^{{\mathcal {I}}} ( y ) ) \rightarrow \Phi ^{{\mathcal {I}}} ( x ) \bigr ). \end{aligned}$$

Since $V_{\beta } \prec {\mathbb {V}}$, all the relevant syntactic notions and operations concerning the codes of ${\mathcal {L}}_{\in }^{\infty }$ are absolute for $V_{\beta }$, and thus it follows that

$$\begin{aligned} ( \forall x \in V_{\beta } ) \Bigl ( ( \forall y \in V_{\beta } ) \bigl ( y \prec _{\varphi } x \rightarrow V_{\beta } \models \Phi ^{{\mathcal {I}}} ( y ) \bigr ) \rightarrow V_{\beta } \models \Phi ^{{\mathcal {I}}} ( x ) \Bigr ). \end{aligned}$$

Pick $a := \{ x \in V_{\beta } \mid V_{\beta } \models \Phi ^{{\mathcal {I}}} ( x ) \}$ by ${\mathcal {L}}_{\mathrm {T}} \text {-} \mathrm {Sep}$. We have

$$\begin{aligned} \forall x \bigl ( \forall y ( y \prec _{\varphi } x \rightarrow ( y \in a \vee y \not \in V_{\beta } ) ) \rightarrow ( x \in a \vee x \not \in V_{\beta } ) \bigr ). \end{aligned}$$
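
One way to verify the last display is by a case distinction on x; this is a sketch of the omitted routine step. If $x \not \in V_{\beta }$, the conclusion holds trivially. If $x \in V_{\beta }$ and $\forall y ( y \prec _{\varphi } x \rightarrow ( y \in a \vee y \not \in V_{\beta } ) )$, then

$$\begin{aligned}&( \forall y \in V_{\beta } ) ( y \prec _{\varphi } x \rightarrow y \in a ) \\ \Rightarrow \&( \forall y \in V_{\beta } ) \bigl ( y \prec _{\varphi } x \rightarrow V_{\beta } \models \Phi ^{{\mathcal {I}}} ( y ) \bigr )&&\text {(by the definition of } a \text {)} \\ \Rightarrow \&V_{\beta } \models \Phi ^{{\mathcal {I}}} ( x )&&\text {(by the progressiveness shown before the definition of } a \text {)} \\ \Rightarrow \&x \in a. \end{aligned}$$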

If $a \ne V_{\beta }$, then $\{ x \mid x \in a \vee x \not \in V_{\beta } \} \ne {\mathbb {V}}$, which contradicts $ Wf ^{{\mathcal {I}}} ( \varphi ( u ) )$, since $\{ x \mid x \in a \vee x \not \in V_{\beta } \}$ is a class in the sense of $\mathsf {NBG}^{{\mathcal {I}}}$; hence, we obtain $V_{\beta } \models \forall x \Phi ^{{\mathcal {I}}} ( x )$. $\square $

As a consequence, in particular, $\Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}$ is consistency-wise stronger than $\Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}_{0}$.

4 Reflection

In contrast to $\Pi ^{1}_{\infty } \text {-} \mathrm {TI}$, the schema $\Pi ^{1}_{\infty } \text {-} \mathrm {RFN}$ is rather strong, whereas they are equivalent in second-order arithmetic (in $\mathsf {ACA}_{0}$).

Proposition 4.1

$\mathsf {NBG} + \Pi ^{1}_{n + 2} \text {-} \mathrm {RFN} \vdash \Sigma ^{1}_{n} \text {-} \mathrm {Sep} + \Sigma ^{1}_{n} \text {-} \mathrm {Repl}$.

Proof

The negation of each instance of $\Sigma ^{1}_{n} \text {-} \mathrm {Sep}$ or $\Sigma ^{1}_{n} \text {-} \mathrm {Repl}$ is a $\Sigma ^{1}_{n + 3}$-sentence. Hence, if the negation of any instance of either $\Sigma ^{1}_{n} \text {-} \mathrm {Sep}$ or $\Sigma ^{1}_{n} \text {-} \mathrm {Repl}$ were to hold, then $\Sigma ^{1}_{n + 3} \text {-} \mathrm {RFN}$ and thus $\Pi ^{1}_{n + 2} \text {-} \mathrm {RFN}$ (by Proposition 2.20.2) would yield a coded ${\mathbb {V}}$-model S that falsifies the instance, which contradicts Lemma 2.17.2. $\square $

Corollary 4.2

$\Pi ^{1}_{\infty } \text {-} \mathsf {RFN}_{0}$ and $\Pi ^{1}_{\infty } \text {-} \mathsf {RFN}$ have exactly the same ${\mathcal {L}}_{2}$-theorems, and thus $\Pi ^{1}_{\infty } \text {-} \mathsf {RFN}_{0} \vdash \Pi ^{1}_{\infty } \text {-} \mathrm {TI}$ (by Theorem 3.4).

Lemma 4.3

$\Pi ^{1}_{2} \text {-} \mathsf {RFN}_{0} \vdash \Sigma ^{1}_{1} \text {-} \mathrm {Coll}$.

Proof

Suppose $\forall x \exists X \Phi ( x, X, \vec {Z} )$ for $\Phi \in \Pi ^{1}_{0}$ with parameters $\vec {Z}$. There is a coded ${\mathbb {V}}$-model S such that $\vec {Z} \, {\dot{\in }} \, S$ and $S \models \forall x \exists X \Phi ( x, X, \vec {Z} )$, namely, $\forall x \exists z \Phi ( x, ( S )_{z}, \vec {Z} )$. $\square $

Corollary 4.4

$\Pi ^{1}_{2} \text {-} \mathsf {RFN} \vdash \mathrm {ETR}$; by Proposition 3.6 and Proposition 2.25.2.

Corollary 4.5

$\Pi ^{1}_{2} \text {-} \mathsf {RFN} \vdash Con ( \mathsf {ETR} )$.

Proof

Since $\mathsf {ETR}_{0}$ is finitely axiomatizable by $\Pi ^{1}_{2}$-sentences, there is a coded ${\mathbb {V}}$-model S of $\mathsf {ETR}_{0}$ by Corollary 4.4. By Lemma 2.21 and Proposition 2.14.1, S has a full satisfaction class. Hence, the claim follows from Corollary 2.18. $\square $

Corollary 4.6

$\Pi ^{1}_{3} \text {-} \mathsf {RFN}_{0} \vdash Con ( \Sigma ^{1}_{1} \text {-} \mathsf {Coll} )$.

Proof

Shown similarly to the last corollary, using Lemma 4.3 (instead of Corollary 4.4) and the finite axiomatizability of $\Sigma ^{1}_{1} \text {-} \mathsf {Coll}_{0}$ by $\Pi ^{1}_{3}$-sentences. $\square $

Theorem 4.7

$\Pi ^{1}_{3} \text {-} \mathsf {RFN}_{0} \vdash \mathrm {FP}$. Hence, $\Pi ^{1}_{3} \text {-} \mathsf {RFN}_{0} \vdash \mathrm {LFP}$ by Sato’s Theorem 2.24.

Proof

Take any X-positive elementary $\Phi ( x, z, X, Z )$. Let us fix the parameters z and Z and take a coded ${\mathbb {V}}$-model S of $\Sigma ^{1}_{1} \text {-} \mathsf {Coll}_{0}$ with $Z \, {\dot{\in }} \, S$. By Lemma 2.26, there is a $\Sigma ^{1}_{1}$-formula $\Psi ( u )$ with parameters from S such that

$$\begin{aligned} S \models \forall x \bigl ( \Psi ( x ) \leftrightarrow \Phi ( x, z, \Psi ( {\hat{u}} ), Z ) \bigr ). \end{aligned}$$

Take $Y := \{ x \mid S \models \Psi ( x ) \}$ by $\mathrm {ECA}$; we have $\forall x \bigl ( x \in Y \leftrightarrow \Phi ( x, z, Y, Z ) \bigr )$. $\square $

The next follows from the last theorem and the finite axiomatizability of $\mathsf {LFP}_{0}$ by $\Pi ^{1}_{3}$-sentences; the proof is similar to those of Corollaries 4.5 and 4.6.

Corollary 4.8

$\Pi ^{1}_{3} \text {-} \mathsf {RFN}_{0} \vdash Con ( \mathsf {LFP} )$.

This makes another contrast with second-order arithmetic; in second-order arithmetic, $\mathsf {LFP}$ rather proves the consistency of $\Pi ^{1}_{3} \text {-} \mathsf {RFN}_{0}$.^{Footnote 18}

The proof of the next lemma is essentially parallel to that of the corresponding statement of [25, Theorem VIII.5.12] in second-order arithmetic.

Lemma 4.9

$\Sigma ^{1}_{1} \text {-} \mathsf {DColl}_{0} \vdash \Pi ^{1}_{2} \text {-} \mathrm {RFN}$.

Proof

We will work within $\Sigma ^{1}_{1} \text {-} \mathsf {DColl}_{0}$. Let $\Phi ( z, U ) = \forall X \exists Y \Psi ( z, X, Y, U )$ be a $\Pi ^{1}_{2}$-formula where $\Psi $ is elementary; we can similarly show the claim for formulae with more free variables. Suppose $\Phi ( z, U )$ holds. We define

$$\begin{aligned} \Theta ( x, z, X, Y, U ) \, :\Leftrightarrow \, ( x = 0 \rightarrow Y = U ) \wedge \forall b \bigl [ x = \langle 1, b \rangle \rightarrow \Psi \bigl ( z, ( X )_{b}, Y, U \bigr ) \bigr ]. \end{aligned}$$
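
The definition of $\Theta $ splits into three cases according to the shape of x. The following summary is only a reading aid, and it assumes that $0 \ne \langle 1, b \rangle $ for all b, which holds for the usual Kuratowski pairing.

$$\begin{aligned} x = 0:&\quad \Theta ( x, z, X, Y, U ) \leftrightarrow Y = U, \\ x = \langle 1, b \rangle :&\quad \Theta ( x, z, X, Y, U ) \leftrightarrow \Psi \bigl ( z, ( X )_{b}, Y, U \bigr ), \\ \text {otherwise:}&\quad \Theta ( x, z, X, Y, U ) \text { holds for every } Y. \end{aligned}$$

Accordingly, a witness Y for given x and X is obtained by taking $Y = U$ in the first case, by applying $\Phi ( z, U )$ to the class $( X )_{b}$ in the second case, and by taking any class in the third case.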

By the supposition, we have $\forall x \forall X \exists Y \Theta ( x, z, X, Y, U )$, and thus $\Sigma ^{1}_{1} \text {-} \mathrm {DColl}$ yields a class S with $\forall x \exists y \Theta ( x, z, ( S )^{x}, ( S )_{y}, U )$. Since $\Theta ( 0, z, ( S )^{0}, ( S )_{y}, U ) \leftrightarrow ( S )_{y} = U$ for some y, we have $U \, {\dot{\in }} \, S$. It remains to show that $S \models \Phi ( z, U )$; recall that we can ignore the condition $S \models \mathsf {NBG}$ when working with $\Pi ^{1}_{n} \text {-} \mathrm {RFN}$ for $n \ge 2$ (see Sect. 2.4). Take any $X \, {\dot{\in }} \, S$. Let b be such that $( S )_{b} = X$, and put $x = \langle 1, b \rangle $. Then, there exists some y such that $\Theta ( x, z, ( S )^{x}, ( S )_{y}, U )$, which entails $\Psi \bigl ( z, ( ( S )^{x} )_{b}, ( S )_{y}, U \bigr )$. Since ${\mathsf {r}}{\mathsf {k}} ( b ) < {\mathsf {r}}{\mathsf {k}} ( x )$, we obtain $\Psi \bigl ( z, ( S )_{b}, ( S )_{y}, U \bigr )$; hence, $S \models \Phi ( z, U )$. $\square $

Corollary 4.10

$\Sigma ^{1}_{1} \text {-} \mathsf {DColl} \vdash Con ( \mathsf {ETR} )$.

It is known that $\mathsf {LFP}_{0} \vdash \Sigma ^{1}_{1} \text {-} \mathrm {DC}$ in second-order arithmetic,^{Footnote 19} but this fails in class theory, even if we add $\Sigma ^{1}_{\infty } \text {-} \mathrm {Sep}$ and $\Sigma ^{1}_{\infty } \text {-} \mathrm {Repl}$.

Corollary 4.11

$\mathsf {LFP} \not \vdash \Sigma ^{1}_{1} \text {-} \mathrm {DColl}$.

Proof

For contradiction, suppose $\mathsf {LFP} \vdash \Sigma ^{1}_{1} \text {-} \mathrm {DColl}$, which implies ${\mathsf {F}}{\mathsf {P}} \vdash \Pi ^{1}_{2} \text {-} \mathrm {RFN}$. Since ${\mathsf {F}}{\mathsf {P}}_{0}$ is finitely axiomatizable by $\Pi ^{1}_{2}$-formulae, we would have a coded ${\mathbb {V}}$-model of ${\mathsf {F}}{\mathsf {P}}_{0}$ in ${\mathsf {F}}{\mathsf {P}}$. Since ${\mathsf {F}}{\mathsf {P}}_{0} \vdash \mathrm {ETR}$, we would have ${\mathsf {F}}{\mathsf {P}} \vdash Con ( {\mathsf {F}}{\mathsf {P}} )$; a contradiction. $\square $

Remark 4.12

Sato has already proved $\mathsf {LFP} \not \vdash \Sigma ^{1}_{1} \text {-} \mathrm {Coll}$ [23, Corollary 12], which entails the last corollary. Furthermore, Gitman, Hamkins, and Johnstone announced (by private communication with Gitman) a much stronger result: they have shown that even ${\mathsf {M}}{\mathsf {K}}$ does not derive $\Sigma ^{1}_{1} \text {-} \mathrm {Coll}$.^{Footnote 20}

We have seen that $\Pi ^{1}_{\infty } \text {-} \mathrm {RFN}$ is a relatively strong axiom (schema), and it is stronger than some systems, such as $\Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}$ and $\mathsf {LFP}$, that are equivalent to or even stronger than it in second-order arithmetic.

5 $\Sigma ^{1}_{1}$-dependent collection

In second-order arithmetic, $\Sigma ^{1}_{1} \text {-} \mathrm {DC}$ is equivalent to both $\Pi ^{1}_{2} \text {-} \mathrm {RFN}$ and $\Pi ^{1}_{1} \text {-} \mathrm {TI}$ over $\mathsf {ACA}_{0}$ [25, Theorem VIII.5.12]. In class theory, as we have seen, $\Sigma ^{1}_{1} \text {-} \mathrm {DColl}$ and $\Pi ^{1}_{1} \text {-} \mathrm {TI}$ are not equivalent in $\mathsf {NBG}$; furthermore, $\Sigma ^{1}_{1} \text {-} \mathsf {DColl}_{0}$ has stronger consistency strength than even $\Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}$ (by Proposition 2.32 and Corollary 3.5). In contrast, we have also shown that $\Sigma ^{1}_{1} \text {-} \mathrm {DColl}$ still implies $\Pi ^{1}_{2} \text {-} \mathrm {RFN}$ in $\mathsf {NBG}$. Unfortunately, we do not yet know whether the converse holds. Nonetheless, we will show in the present section that $\Sigma ^{1}_{1} \text {-} \mathsf {Coll}$ and $\Sigma ^{1}_{1} \text {-} \mathsf {DColl}$ have the same ${\mathcal {L}}_{\in }$-theorems, and this implies that $\Pi ^{1}_{2} \text {-} \mathsf {RFN}$ and $\Sigma ^{1}_{1} \text {-} \mathsf {DColl}$ have the same ${\mathcal {L}}_{\in }$-theorems, too, since $\Pi ^{1}_{2} \text {-} \mathsf {RFN}_{0} \vdash \Sigma ^{1}_{1} \text {-} \mathrm {Coll}$ (Lemma 4.3). For this purpose, we need to make a detour to some first-order extensions of ${\mathsf {Z}}{\mathsf {F}}$ studied in [8].

It is shown in [8] that $\Sigma ^{1}_{1} \text {-} \mathsf {Coll}$ has the same ${\mathcal {L}}_{\in }$-theorems as the first-order system ${\mathsf {S}}{\mathsf {C}}_{1}$ of stage comparison prewellorderings (of inductive definitions) and also that they have the same ${\mathcal {L}}_{\in }$-theorems as the Kripke–Platek system ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}$ over the set-theoretic universe ${\mathbb {V}}$ with respect to the canonical translation $\star $ of ${\mathcal {L}}_{\in }$ into the language ${\mathcal {L}}_{\mathrm {KP}}$ of ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}$ (see Sect. 5.1.1 for its definition). Using this fact, our claim will be shown in two steps. First, we will define a new system ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}_{p}$, which essentially is ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}$ augmented with a new predicate for a $\Delta $ projection of the domain of sets, and then show that ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}_{p}$ plus the “axiom of constructibility” ${\mathcal {S}}= L ^{{\mathcal {S}}} ({\mathsf {V}} )$, asserting that every set is constructible relative to the set of urelements, is interpretable in ${\mathsf {S}}{\mathsf {C}}_{1}$ in the way that the ${\mathcal {L}}_{\in }$-part is preserved (with respect to $\star $). Second, we will show that $\Sigma ^{1}_{1} \text {-} \mathsf {DColl}$ is also interpretable in ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}_{p} + {\mathcal {S}}= L ^{{\mathcal {S}}} ({\mathsf {V}} )$ in the way that preserves the ${\mathcal {L}}_{\in }$-part (with respect to $\star $). The argument of the present section presupposes and makes use of the results of [8], and we refer the reader to [8] for the definitions and basic facts of the relevant systems.

5.1 ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}_{p} + {{\mathcal {S}}= L ^{{\mathcal {S}}} ({\mathsf {V}} )}$ and its interpretation in ${\mathsf {S}}{\mathsf {C}}_{1}$

Throughout this Sect. 5.1, we do not work with any second-order systems, and we identify “classes” with formulae without danger of confusion: hence, a class is always an abbreviation of some formula possibly with parameters in the present subsection. The definitions and notions we have so far made for classes (as second-order entities of class theory) carry over to classes in this sense; e.g., $y \in ( X )_{x}$ precisely means $\Phi ( \langle y, x \rangle )$ for some formula $\Phi $ that the “class” X denotes. We remark that we adopt some different notations from [8], where $( X )_{x}$ is denoted by $X^{x}$, for example.

5.1.1 The systems ${\mathsf {S}}{\mathsf {C}}_{1}$ and ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$

For the reader’s convenience, we will repeat the definitions of the systems ${\mathsf {S}}{\mathsf {C}}_{1}$ and ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ and explain some basic facts about them necessary for the subsequent argument.

The language ${\mathcal {L}}_{\mathrm {SC}}$ of ${\mathsf {S}}{\mathsf {C}}_{1}$ is an extension of ${\mathcal {L}}_{\mathrm {ID}}$ (see Sect. 2.5) with additional unary predicates $\prec _{{\mathcal {A}}}$ associated to each inductive operator form ${\mathcal {A}} ( x, P )$. Let us denote the class of ordered pairs by $ Pair $, and, for each $x \in Pair $, we denote its first component by $( x )_{0}$ and the second by $( x )_{1}$; hence, $( \langle a, b \rangle )_{0} = a$ and $( \langle a, b \rangle )_{1} = b$. Let us write $x \prec _{{\mathcal {A}}} y$ for $\prec _{{\mathcal {A}}} \! \! ( \langle x, y \rangle )$ and also write $\prec _{{\mathcal {A}}} \upharpoonright _{x}$ for the class of $\prec _{{\mathcal {A}}}$-predecessors of x, i.e., $\{ y \mid y \prec _{{\mathcal {A}}} x \}$ ($= ( \prec _{{\mathcal {A}}} )_{x}$, in other words). The ${\mathcal {L}}_{\mathrm {SC}}$-system ${\mathsf {S}}{\mathsf {C}}_{1}$ is defined as $\widehat{{\mathsf {I}}{\mathsf {D}}}_{1} + {\mathcal {L}}_{\mathrm {SC}} \text {-} \mathrm {Sep} + {\mathcal {L}}_{\mathrm {SC}} \text {-} \mathrm {Repl}$ (see Sect. 2.5) plus the following axiom schemata of $\prec _{{\mathcal {A}}}$ for each inductive operator form ${\mathcal {A}} ( x, P )$.

$$\begin{aligned} {\mathrm {SC1}:} \quad&\prec _{{\mathcal {A}}} \, \subset \, Pair \wedge \forall x \forall y \bigl [ \, x \prec _{{\mathcal {A}}} y \, \leftrightarrow \, \bigl ( x \in J_{{\mathcal {A}}} \wedge \lnot {\mathcal {A}} ( y, \prec _{{\mathcal {A}}} \upharpoonright _{x} ) \bigr ) \, \bigr ] \\ {\mathrm {SC2}:} \quad&\forall x \bigl ( \forall y ( y \prec _{{\mathcal {A}}} x \rightarrow \Psi ( y ) ) \rightarrow \Psi ( x ) \bigr ) \rightarrow \forall x \Psi ( x ), \ \text {for all } \Psi ( u ) \in {\mathcal {L}}_{\mathrm {SC}}. \end{aligned}$$

The axioms $\mathrm {SC1}$ and $\mathrm {SC2}$ express that $\prec _{{\mathcal {A}}}$ is the stage comparison (strict) prewellordering of the least fixed-point $J_{{\mathcal {A}}}$ of the monotone operator induced by ${\mathcal {A}}$; see [19] for the definitions of these notions.^{Footnote 21} ${\mathsf {S}}{\mathsf {C}}_{1}$ is a definitional extension of ${\mathsf {I}}{\mathsf {D}}_{1}$ (whether formulated over arithmetic or over set theory) [8, Theorem 9.4]; hence, in particular, ${\mathsf {S}}{\mathsf {C}}_{1}$ has the same ${\mathcal {L}}_{\in }$-theorems as ${\mathsf {I}}{\mathsf {D}}_{1}$ and thus $\widehat{{\mathsf {I}}{\mathsf {D}}}_{1}$ by Corollary 2.28.

The relation $\preceq _{{\mathcal {A}}}$ is defined by $x \preceq _{{\mathcal {A}}} y :\Leftrightarrow {\mathcal {A}} ( x, \prec _{{\mathcal {A}}} \upharpoonright _{y} )$. We have the following basic facts concerning $\prec _{{\mathcal {A}}}$ and $\preceq _{{\mathcal {A}}}$; see [8, §4] for their proofs.

Fact 5.1

Let ${\mathcal {A}}$ be an inductive operator form. ${\mathsf {S}}{\mathsf {C}}_{1}$ proves the following.

1. $\forall x \bigl ( x \in J_{{\mathcal {A}}} \leftrightarrow {\mathcal {A}} ( x, \prec _{{\mathcal {A}}} \upharpoonright _{x} ) \bigr )$.
2. $\prec _{{\mathcal {A}}}$ is transitive and $\forall x \forall y \bigl ( x \prec _{{\mathcal {A}}} y \vee y \prec _{{\mathcal {A}}} x \vee \prec _{{\mathcal {A}}} \upharpoonright _{x} = \prec _{{\mathcal {A}}} \upharpoonright _{y} \bigr )$.
3. $\forall x \forall y \bigl ( x \preceq _{{\mathcal {A}}} y \, \leftrightarrow \, ( x \in J_{{\mathcal {A}}} \wedge y \nprec _{{\mathcal {A}}} x ) \bigr ) \wedge \forall x \forall y \bigl ( x \prec _{{\mathcal {A}}} y \, \leftrightarrow \, ( x \in J_{{\mathcal {A}}} \wedge y \npreceq _{{\mathcal {A}}} x ) \bigr )$.
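
A simple example may help to fix intuitions about these facts. It is only a sketch: it assumes that ${\mathcal {A}} ( x, P ) :\Leftrightarrow ( \forall z \in x ) P ( z )$ is admitted as an inductive operator form (it is P-positive), and it describes the intended interpretation of $J_{{\mathcal {A}}}$ and $\prec _{{\mathcal {A}}}$ rather than a derivation in ${\mathsf {S}}{\mathsf {C}}_{1}$. Under this reading, the stage at which a set x enters the fixed point is its rank ${\mathsf {r}}{\mathsf {k}} ( x )$, so that

$$\begin{aligned} J_{{\mathcal {A}}}&= {\mathbb {V}}, \qquad x \prec _{{\mathcal {A}}} y \leftrightarrow {\mathsf {r}}{\mathsf {k}} ( x )< {\mathsf {r}}{\mathsf {k}} ( y ), \qquad x \preceq _{{\mathcal {A}}} y \leftrightarrow {\mathsf {r}}{\mathsf {k}} ( x ) \le {\mathsf {r}}{\mathsf {k}} ( y ), \\ \lnot {\mathcal {A}} ( y, \prec _{{\mathcal {A}}} \upharpoonright _{x} )&\leftrightarrow ( \exists z \in y ) \bigl ( {\mathsf {r}}{\mathsf {k}} ( z ) \ge {\mathsf {r}}{\mathsf {k}} ( x ) \bigr ) \leftrightarrow {\mathsf {r}}{\mathsf {k}} ( y ) > {\mathsf {r}}{\mathsf {k}} ( x ), \\ {\mathcal {A}} ( x, \prec _{{\mathcal {A}}} \upharpoonright _{x} )&\leftrightarrow ( \forall z \in x ) \bigl ( {\mathsf {r}}{\mathsf {k}} ( z ) < {\mathsf {r}}{\mathsf {k}} ( x ) \bigr ). \end{aligned}$$

The second line matches $\mathrm {SC1}$, and the third line, which holds for every x, matches item 1. Item 2 reduces to the trichotomy of ordinals: if neither rank is smaller than the other, then x and y have the same rank and hence the same $\prec _{{\mathcal {A}}}$-predecessors.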

The following definitions are made in ${\mathsf {S}}{\mathsf {C}}_{1}$: a class X is called inductive, if X is equal to $( J_{{\mathcal {A}}} )_{a}$ ($= \{ x \mid J_{{\mathcal {A}}} ( \langle x, a \rangle ) \}$) for some a and inductive operator form ${\mathcal {A}}$; X is called coinductive, if the negation of X is inductive, and it is hyperelementary, if it is both inductive and coinductive. ${\mathsf {S}}{\mathsf {C}}_{1}$ is strong enough to prove most of the basic properties of inductive and hyperelementary classes, such as those given in [19].

For an inductive operator form ${\mathcal {A}}$ and a set a, we set $x \prec _{{\mathcal {A}}, a} y :\Leftrightarrow \langle x, a \rangle \prec _{{\mathcal {A}}} \langle y, a \rangle $, and $x \preceq _{{\mathcal {A}}, a} y :\Leftrightarrow \langle x, a \rangle \preceq _{{\mathcal {A}}} \langle y, a \rangle $. This $\prec _{{\mathcal {A}}, a}$ strictly prewellorders $( J_{{\mathcal {A}}} )_{a}$; hence, for an inductive class $X = ( J_{{\mathcal {A}}} )_{a}$, the relation $\prec _{{\mathcal {A}}, a}$ prewellorders X. The way in which $\prec _{{\mathcal {A}}, a}$ prewellorders X depends on the choice of ${\mathcal {A}}$ and a, but the choice of the pair will not matter for our subsequent argument. So we let $\prec _{X}$ and $\preceq _{X}$ denote $\prec _{{\mathcal {A}}, a}$ and $\preceq _{{\mathcal {A}}, a}$, respectively, for some fixed ${\mathcal {A}}$ and a that define X.

We will use the following facts; their proofs are found in [8, §5].

Fact 5.2

For each inductive class X, the following are provable in ${\mathsf {S}}{\mathsf {C}}_{1}$.

1. $\forall x \forall y \bigl ( x \preceq _{X} y \, \leftrightarrow \, ( x \in X \wedge y \nprec _{X} x ) \bigr ) \wedge \forall x \forall y \bigl ( x \prec _{X} y \, \leftrightarrow \, ( x \in X \wedge y \npreceq _{X} x ) \bigr )$.
2. Both $\prec _{X}$ and $\preceq _{X}$ are inductive (Stage Comparison Theorem).
3. All the ${\mathcal {L}}_{\in }$-definable relations are hyperelementary, and the inductive relations are closed under conjunction, disjunction, and universal and existential quantifiers (Transitivity Theorem).

Fact 5.3

(Good Parametrization Theorem for Hyperelementary Classes) ${\mathsf {S}}{\mathsf {C}}_{1}$ proves the following. There are inductive classes I and H and a coinductive class ${\check{H}}$ with the following properties.

1.
If $a \in I$ then $( H )_{a} = ( {\check{H}} )_{a}$ (and thus $( H )_{a}$ is hyperelementary for all $a \in I$).
2.
If X is hyperelementary, then $X = ( H )_{a}$ for some $a \in I$.

Fact 5.3 says that hyperelementary classes can be nicely coded by sets.

We next turn to the Kripke–Platek system ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}$. In terms of [3], ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ is an extension of $\mathsf {KPU}^{+}$ obtained by incorporating ${\mathsf {Z}}{\mathsf {F}}$ as the theory of urelements and extending the axiom schemata of ${\mathsf {Z}}{\mathsf {F}}$ to the entire language. We adopt the one-sorted formulation of $\mathsf {KPU}^{+}$: let ${\mathcal {L}}_{\mathrm {KP}} = \{ \in _{0}, \in _{1}, {\mathcal {U}}, \mathsf {V} \}$ (with equality as a logical symbol), where ${\mathcal {U}}$ is a unary predicate for urelements, $\in _{0}$ is the membership relation among urelements, $\in _{1}$ is the membership relation for sets, and $\mathsf {V}$ is a constant symbol for the set of urelements. We will write ${\mathcal {S}} x$ for $\lnot \, {\mathcal {U}} x$ to express set-hood. The Lévy hierarchy of formulae is introduced to ${\mathcal {L}}_{\mathrm {KP}}$ in an obvious manner: by $\Delta _{0}^{{\mathcal {S}}}$ we denote the least collection of ${\mathcal {L}}_{\mathrm {KP}}$-formulae containing all atomic ${\mathcal {L}}_{\mathrm {KP}}$-formulae and closed under the Boolean connectives and bounded quantifiers $( \forall x \in _{1} t )$ and $( \exists x \in _{1} t )$ for ${\mathcal {L}}_{\mathrm {KP}}$-terms t; $\Sigma _{n}^{{\mathcal {S}}}$, $\Pi _{n}^{{\mathcal {S}}}$, $\Sigma ^{{\mathcal {S}}}$, and $\Pi ^{{\mathcal {S}}}$ are defined from $\Delta _{0}^{{\mathcal {S}}}$ in the standard manner.

We express various sets and classes in the language ${\mathcal {L}}_{\in }$ of (first-order) set theory, but ${\mathcal {L}}_{\mathrm {KP}}$ possesses two different membership relations $\in _{0}$ and $\in _{1}$, and they have different intended domains ${\mathcal {U}}$ and ${\mathcal {U}} \cup {\mathcal {S}}$. Hence, each set or class expressible in ${\mathcal {L}}_{\in }$ can be expressed in two different ways in ${\mathcal {L}}_{\mathrm {KP}}$ depending on which structure, ${\langle {\mathcal {U}}, \in _{0} \rangle }$ or $\langle {\mathcal {U}} \cup {\mathcal {S}}, \in _{1} \rangle $, is considered. For each set-theoretic notion, such as ordinals, subsets, and functions, we will distinguish two different notions, in terms of $\langle {\mathcal {U}}, \in _{0} \rangle $ and of $\langle {\mathcal {U}} \cup {\mathcal {S}}, \in _{1} \rangle $, by attaching prefixes “${\mathcal {U}}$-” and “${\mathcal {S}}$-”; for instance, a ${\mathcal {U}}$-set means an urelement (an element of ${\mathcal {U}}$), and an ${\mathcal {S}}$-set means an element of ${\mathcal {S}}$; a ${\mathcal {U}}$-unordered pair of $a, b \in {\mathcal {U}}$ is a ${\mathcal {U}}$-set x such that $( a, b \in _{0} x ) \wedge ( \forall u \in _{0} x ) ( u = a \vee u = b )$; x is an ${\mathcal {S}}$-unordered pair of a and b, if $( a, b \in _{1} x ) \wedge ( \forall y \in _{1} x ) ( y = a \vee y = b )$.
If a set-theoretic notion or operation is given an abbreviation, such as On and $\langle \cdot , \cdot \rangle $, we will distinguish the two different notions by attaching superscript ${\mathcal {U}}$ or ${\mathcal {S}}$ to them; for example, $On^{{\mathcal {U}}}$ and $On^{{\mathcal {S}}}$ denote the classes of ${\mathcal {U}}$-ordinals and ${\mathcal {S}}$-ordinals, respectively; $\langle u, v \rangle ^{{\mathcal {U}}}$ denotes the ${\mathcal {U}}$-ordered pair (defined in terms of ${\mathcal {U}}$-unordered pairs) of u and v, while $\langle x, y \rangle ^{{\mathcal {S}}}$ denotes the ${\mathcal {S}}$-ordered pair of x and y. We will, however, sometimes abuse this notation and suppress the superscripts ${\mathcal {U}}$ and ${\mathcal {S}}$ for simplicity, when it is clear from the context. For an ${\mathcal {L}}_{\mathrm {KP}}$-definable class X, such as ${\mathcal {S}}$ and $On^{{\mathcal {U}}}$, we write $x \in X$ to mean that x is a member of the class, but it is precisely a mere abbreviation of $\Phi ( x )$ for the ${\mathcal {L}}_{\mathrm {KP}}$-formula $\Phi $ defining X and the symbol “$\in $” here should not be confused with “$\in _{0}$” and “$\in _{1}$”.

Each ${\mathcal {L}}_{\in }$-sentence is canonically translated into ${\mathcal {L}}_{\mathrm {KP}}$ by restricting every quantifier to ${\mathcal {U}}$ and replacing the membership relation of ${\mathcal {L}}_{\in }$ with $\in _{0}$. Let us denote this canonical translation of ${\mathcal {L}}_{\in }$ in ${\mathcal {L}}_{\mathrm {KP}}$ by $\star $. In other words, for each ${\mathcal {L}}_{\in }$-sentence $\sigma $, $\sigma ^{\star }$ is the relativization of $\sigma $ to the structure $\langle {\mathcal {U}}, \in _{0} \rangle $. Thereby we first define a minimal ${\mathcal {L}}_{\mathrm {KP}}$-system ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{\min }$ as the collection of the following axioms:

$$\begin{aligned} ( \mathrm {Ext} ) :&\ \, ( \forall a, b \in {\mathcal {S}} ) \bigl ( \forall x ( x \in _{1} a \leftrightarrow x \in _{1} b ) \rightarrow a = b \bigr ), \\ {( \mathrm {Pair} ) :}&\ \, \forall x \forall y ( \exists a \in {\mathcal {S}} ) \bigl ( x \in _{1} a \wedge y \in _{1} a \bigr ), \\ {( \mathrm {Union} ) :}&\ \, ( \forall a \in {\mathcal {S}} ) ( \exists b \in {\mathcal {S}} ) ( \forall x \in _{1} a ) ( \forall y \in _{1} x ) y \in _{1} b, \\ {( \Delta _{0}^{{\mathcal {S}}} \text {-} \mathrm {Sep}_{1} ) :}&\ \, ( \forall a \in {\mathcal {S}} ) ( \exists b \in {\mathcal {S}} ) \forall x \bigl ( x \in _{1} b \leftrightarrow x \in _{1} a \wedge \psi ( x ) \bigr ), \\ {( \Delta _{0}^{{\mathcal {S}}} \text {-} \mathrm {Coll}_{1} ) :}&\ \, (\forall a \in {\mathcal {S}} ) \bigl [ ( \forall x \in _{1} a ) \exists y \psi ( x, y ) \rightarrow ( \exists b \in {\mathcal {S}} ) ( \forall x \in _{1} a ) ( \exists y \in _{1} b ) \psi ( x, y ) \bigr ], \\ {( \mathrm {U} ) :}&\ \, \mathsf {V} \in {\mathcal {S}} \wedge \forall x \forall y \bigl ( ( x \in _{1} \mathsf {V} \leftrightarrow x \in {\mathcal {U}} ) \wedge ( x \in _{1} y \rightarrow y \in {\mathcal {S}} ) \wedge ( x \in _{0} y \rightarrow x, y \in {\mathcal {U}} ) \bigr ), \\ ( {\mathsf {ZF}}^{\star } ):&\ \sigma ^{\star } \text { for each axiom } \sigma \text { of } {\mathsf {ZF}}, \end{aligned}$$

where $\psi $ is any $\Delta _{0}^{{\mathcal {S}}}$-formula without b free. We also consider the following additional axiom schemata.

$$\begin{aligned} ( \Gamma \text {-} \mathrm {Found}_{1} ):&\ \, \forall x \bigl ( ( \forall y \in _{1} x ) \varphi ( y ) \rightarrow \varphi ( x ) \bigr ) \rightarrow \forall x \varphi ( x ), \\ ( \Gamma \text {-} \mathrm {Found}_{0}^{+} ):&\ ( \forall x \in {\mathcal {U}} ) \bigl ( ( \forall y \in _{0} x ) \varphi ( y ) \rightarrow \varphi ( x ) \bigr ) \rightarrow ( \forall x \in {\mathcal {U}} ) \varphi ( x ), \\ ( \Gamma \text {-} \mathrm {Sep}_{0}^{+} ):&\ ( \forall a \in {\mathcal {U}} ) ( \exists b \in {\mathcal {U}} ) ( \forall z \in {\mathcal {U}} ) \bigl ( z \in _{0} b \leftrightarrow z \in _{0} a \wedge \varphi ( z ) \bigr ), \\ ( \Gamma \text {-} \mathrm {Repl}_{0}^{+} ):&\ ( \forall a \in {\mathcal {U}} ) \bigl [ ( \forall x \in _{0} a ) ( \exists ! y \in {\mathcal {U}} ) \varphi \rightarrow ( \exists b \in {\mathcal {U}} ) ( \forall x \in _{0} a ) ( \exists y \in _{0} b ) \varphi \bigr ], \end{aligned}$$

where $\varphi $ is any formula without b free belonging to a collection $\Gamma $ of formulae (of a language including ${\mathcal {L}}_{\mathrm {KP}}$). The full system ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ is thereby defined as follows:

$$\begin{aligned} {{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}\ := \ {{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{\min }+ \Sigma _{\infty }^{{\mathcal {S}}} \text {-} \mathrm {Found}_{1} + \Sigma _{\infty }^{{\mathcal {S}}} \text {-} \mathrm {Sep}_{0}^{+} + \Sigma _{\infty }^{{\mathcal {S}}} \text {-} \mathrm {Repl}_{0}^{+}. \end{aligned}$$

We will use the same interpretation $*$ of ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ in ${\mathsf {S}}{\mathsf {C}}_{1}$ as in [8]. The entire domain of ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ is interpreted by the direct sum of ${\mathbb {V}}$ and a certain inductive class M, say, $( {\mathbb {V}} \times \{ 0 \} ) \cup ( M \times \{ 1 \} )$, so that every ordered pair $\langle x, 0 \rangle $ represents some ${\mathcal {U}}$-set, and an ordered pair $\langle x, 1 \rangle $ represents an ${\mathcal {S}}$-set when $x \in M$. The special inductive class M consists of the codes (in the sense of Fact 5.3) of hyperelementary well-founded trees; here, by a well-founded tree we mean the same thing as what Simpson [25] calls a suitable tree, which is defined as a tree T, in the sense that T is a non-empty class of finite sequences of sets closed under initial segments, which is strictly prewellordered by the canonical ordering $\sqsubset _{T}$ defined as $x \sqsubset _{T} y :\Leftrightarrow \text {``} y \text { is a proper initial segment of } x \text {''}$. In sum, we put

$$\begin{aligned}&M \ := \ \{ a \in I \mid ( H )_{a} \text { is a suitable tree} \} \\&{\mathcal {S}}^{*} \ := \ \{ \langle a, 1 \rangle \mid a \in M \} \\&{\mathcal {U}}^{*} \ := \ \{ \langle a, 0 \rangle \mid a \in {\mathbb {V}} \}, \end{aligned}$$

where M is indeed inductive, since the well-foundedness of $\sqsubset _{( H )_{a}}$ is uniformly expressible for all trees $a \in I$ in terms of a certain inductive class $ Acc ( \sqsubset )$.^{Footnote 22} Then, the domain of quantifiers of ${\mathcal {L}}_{\mathrm {KP}}$ is interpreted as ${\mathcal {S}}^{*} \cup {\mathcal {U}}^{*}$, and the $*$-interpretation of $\in _{0}$ is simply defined by $\langle x, 0 \rangle \in _{0}^{*} \langle y, 0 \rangle :\Leftrightarrow x \in y$.

Each hyperelementary suitable tree is intended to represent its Mostowski collapse. However, since we allow urelements in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$, the notion of Mostowski collapse must be modified so as to accommodate urelements; each leaf of a suitable tree corresponds to an object with no $\in _{1}$-member that is contained in the transitive closure of the Mostowski collapse of the tree, and we must somehow distinguish the cases where the leaf represents the ${\mathcal {S}}$-emptyset and where it represents an urelement (i.e., a ${\mathcal {U}}$-set), both of which have no $\in _{1}$-member. For this purpose, we stipulate that, for a leaf u of a tree T, if $u = \langle u_{0}, \ldots , u_{k} \rangle $ ends with an element of the form $u_{k} = \langle x, 0 \rangle \in {\mathcal {U}}^{*}$, then it represents the urelement that $\langle x, 0 \rangle $ represents, and otherwise represents $\emptyset ^{{\mathcal {S}}}$. Informally, given a transitive model of ${\mathsf {Z}}{\mathsf {F}}$ with domain D, which is regarded as the domain of urelements, if $T \subset D$ is a suitable tree, we define the collapse m(T, s) of T at each $s \in T$ by recursion along $\sqsubset _{T}$ so that

$$\begin{aligned} m ( T, s ) := {\left\{ \begin{array}{ll} p \quad \text {(as an urelement)} &{} \text {if}\,\, s \! = \! \langle u_{0}, \ldots , \langle p, 0 \rangle \rangle \text { is a leaf of }T \\ \{ m ( T, s *\langle v \rangle ) \mid s \! *\! \langle v \rangle \! \in \! T \} &{} \text {otherwise}; \end{array}\right. } \end{aligned}$$

(14)

thereby we let T represent the ${\mathcal {S}}$-set $m ( T, \epsilon )$ (a member of ${\mathbb {V}}_{D}$ in terms of [3]).
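The recursion (14) can be tried out on finite trees. The sketch below is a toy model under stated assumptions rather than the definition used in the interpretation: a tree is a finite Python set of tuples closed under initial segments, an urelement $p$ is rendered as the tagged pair `("urelement", p)` so that it cannot be confused with a set, and ${\mathcal {S}}$-sets are rendered as `frozenset` values. The function name `collapse` and the sample trees are hypothetical.

```python
def collapse(tree, s=()):
    """Toy version of m(T, s) from (14) for a finite tree of tuples."""
    # Immediate successors of s in the tree.
    children = [t for t in tree if len(t) == len(s) + 1 and t[:len(s)] == s]
    if not children:
        last = s[-1] if s else None
        # A leaf ending in a pair (p, 0) stands for the urelement p.
        if isinstance(last, tuple) and len(last) == 2 and last[1] == 0:
            return ("urelement", last[0])
        # Any other leaf stands for the empty S-set.
        return frozenset()
    return frozenset(collapse(tree, c) for c in children)


# Two differently labelled trees, both intended to represent {a, {b}}.
T0 = {(), (("a", 0),), ("x",), ("x", ("b", 0))}
T1 = {(), (1,), (1, ("b", 0)), (("a", 0),), (2,), (2, ("b", 0))}

SAME_COLLAPSE = collapse(T0) == collapse(T1)
```

The tree `T1` has two branches that collapse to the same set, so the comparison also shows how distinct trees can represent one and the same ${\mathcal {S}}$-set.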

Finally, to define the $*$-interpretations of $\in _{1}$ and $=$, we first take a special inductive relation B(a, b, u, v), which expresses, for each $a, b \in M$, $u \in ( H )_{a}$, and $v \in ( H )_{b}$, that the sub-tree of the suitable tree $( H )_{a}$ below u is bisimilar to the sub-tree of $( H )_{b}$ below v in a suitably modified sense, in which the notion of bisimulation is modified so as to distinguish the leaves representing the empty set and those representing urelements so that suitable trees $T_{0}$ and $T_{1}$ are bisimilar (in the modified sense) if and only if $m ( T_{0}, \epsilon ) = m ( T_{1}, \epsilon )$. Hence, for $a, b \in M$, $\exists z ( \langle z \rangle \in ( H )_{b} \wedge B ( a, b, \epsilon , \langle z \rangle ) )$ expresses that $( H )_{a}$ is bisimilar to some immediate sub-tree of $( H )_{b}$ and thus that $m ( ( H )_{a}, \epsilon )$ is a member of $m ( ( H )_{b}, \epsilon )$. Next, let A(a, b) express that $\langle \langle a, 0 \rangle \rangle $ is a leaf of a tree $( H )_{b}$, which means that the urelement represented by $\langle a, 0 \rangle $ is a member of $m ( ( H )_{b}, \epsilon )$ (when $b \in M$). Thereby, we define

$$\begin{aligned} P^{+}_{=} ( x, y )&\, :\Leftrightarrow \, \bigl [ x, y \in {\mathcal {U}}^{*} \wedge ( x )_{0} = ( y )_{0} \bigr ] \vee \bigl [ x, y \in {\mathcal {S}}^{*} \wedge B ( ( x )_{0}, ( y )_{0}, \epsilon , \epsilon ) \bigr ] \\ P^{+}_{\in _{1}} ( x, y )&\, :\Leftrightarrow \, y \in {\mathcal {S}}^{*} \wedge \bigl [ \bigl ( x \in {\mathcal {U}}^{*} \wedge A ( ( x )_{0}, ( y )_{0} ) \bigr ) \\&\qquad \qquad \qquad \qquad \ \vee \bigl ( x \in {\mathcal {S}}^{*} \wedge \exists z ( \langle z \rangle \in ( H )_{( y )_{0}} \wedge B ( ( x )_{0}, ( y )_{0}, \epsilon , \langle z \rangle ) ) \bigr ) \bigr ]. \end{aligned}$$

Then, there are binary inductive relations $P^{-}_{=}$ and $P^{-}_{\in _{1}}$ such that

$$\begin{aligned} ( \forall x, y \in {\mathcal {U}}^{*} \cup {\mathcal {S}}^{*} ) \Bigl ( \bigl ( \lnot P^{+}_{=} ( x, y ) \leftrightarrow P^{-}_{=} ( x, y ) \bigr ) \wedge \bigl ( \lnot P^{+}_{\in _{1}} ( x, y ) \leftrightarrow P^{-}_{\in _{1}} ( x, y ) \bigr ) \Bigr ). \end{aligned}$$

(15)

Finally, $( x = y )^{*}$ and $( x \in _{1} y )^{*}$ are defined as $P^{+}_{=} ( x, y )$ and $P^{+}_{\in _{1}} ( x, y )$, respectively.
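The roles of $B$, $A$, $P^{+}_{=}$, and $P^{+}_{\in _{1}}$ can be mimicked on finite trees. The following is a hedged sketch, not the inductive definitions used in ${\mathsf {S}}{\mathsf {C}}_{1}$: trees are finite Python sets of tuples closed under initial segments, a leaf ending in a pair `(p, 0)` stands for the urelement `p`, and all function names (`bisimilar`, `equal_sets`, `set_member`, `urelement_member`) are hypothetical.

```python
def children(tree, s):
    """Immediate successors of the node s in the tree."""
    return [t for t in tree if len(t) == len(s) + 1 and t[:len(s)] == s]


def is_urelement_leaf(tree, s):
    """True if s is a leaf whose last entry has the form (p, 0)."""
    if not s or children(tree, s):
        return False
    last = s[-1]
    return isinstance(last, tuple) and len(last) == 2 and last[1] == 0


def bisimilar(t0, u, t1, v):
    """Toy analogue of B: modified bisimilarity of t0 below u and t1 below v."""
    leaf0, leaf1 = is_urelement_leaf(t0, u), is_urelement_leaf(t1, v)
    if leaf0 or leaf1:
        # Urelement leaves match only urelement leaves for the same urelement.
        return leaf0 and leaf1 and u[-1] == v[-1]
    kids0, kids1 = children(t0, u), children(t1, v)
    forth = all(any(bisimilar(t0, c, t1, d) for d in kids1) for c in kids0)
    back = all(any(bisimilar(t0, c, t1, d) for c in kids0) for d in kids1)
    return forth and back


def equal_sets(t0, t1):
    """Toy analogue of the S-set clause of the equality predicate."""
    return bisimilar(t0, (), t1, ())


def set_member(t0, t1):
    """Toy analogue of the S-set clause of membership: t0 matches an immediate sub-tree of t1."""
    return any(bisimilar(t0, (), t1, c) for c in children(t1, ()))


def urelement_member(p, t1):
    """Toy analogue of A: the one-element sequence ((p, 0),) is a leaf of t1."""
    return ((p, 0),) in t1 and not children(t1, ((p, 0),))


# Hypothetical sample trees: {b}, {a, {b}} (two codings), the empty set, and {empty set}.
T_B = {(), (("b", 0),)}
T0 = {(), (("a", 0),), ("x",), ("x", ("b", 0))}
T1 = {(), (1,), (1, ("b", 0)), (("a", 0),), (2,), (2, ("b", 0))}
T_EMPTY = {()}
T_E1 = {(), ("y",)}
```

For instance, `equal_sets(T0, T1)` compares two codes of the same set, while `set_member(T_B, T0)` and `urelement_member("a", T0)` correspond to the two disjuncts in the definition of $P^{+}_{\in _{1}}$.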

Then, we have the following fact [8, Theorem 7.19].

Fact 5.4

$*$ is an interpretation of ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ in ${\mathsf {S}}{\mathsf {C}}_{1}$.

For each ${\mathcal {L}}_{\in }$-formula $\varphi ( x_{1}, \ldots , x_{k} )$ with only the displayed variables free, we can show the following by straightforward induction on $\varphi $:

$$\begin{aligned} {\mathsf {S}}{\mathsf {C}}_{1} \vdash \ \forall x_{1} \cdots \forall x_{k} \bigl [ \, \varphi ( \vec {x} ) \, \leftrightarrow \, ( \varphi ^{\star } )^{*} \bigl ( \langle x_{1}, 0 \rangle , \ldots , \langle x_{k}, 0 \rangle \bigr ) \, \bigr ]; \end{aligned}$$

(16)

this fact (16) is used in the proof of Fact 5.4. Hence, the next follows.

Fact 5.5

${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}$ has the same ${\mathcal {L}}_{\in }$-theorems as ${\mathsf {S}}{\mathsf {C}}_{1}$ with respect to $\star $.

We will need a certain generalization of (16). We first extend $\star $ to a translation of ${\mathcal {L}}_{\in } ( P )$ into ${\mathcal {L}}_{\mathrm {KP}} ( P )$ simply by putting $ P ^{\star } t :\Leftrightarrow P t$, where $ P ^{\star }$ is intended to be a predicate of ${\mathcal {U}}$-sets. We next also extend $*$ to a translation of ${\mathcal {L}}_{\mathrm {KP}} ( P )$ into ${\mathcal {L}}_{\mathrm {SC}} ( P )$ again by putting $ P ^{*} t :\Leftrightarrow P t$. Then, in particular, for each inductive operator form ${\mathcal {A}} ( x, P )$, ${\mathcal {A}}^{\star } ( x, P )$ and $( {\mathcal {A}}^{\star } )^{*} ( x, P )$ are an ${\mathcal {L}}_{\mathrm {KP}} ( P )$-formula and an ${\mathcal {L}}_{\mathrm {SC}} ( P )$-formula, respectively, in which $ P $ occurs only positively. Let $\psi ( u )$ be any ${\mathcal {L}}_{\mathrm {SC}}$-formula with a distinguished free variable u (possibly with parameters). Then, for each ${\mathcal {L}}_{\in } ( P )$-formula $\varphi ( x_{1}, \ldots , x_{k}, P )$ with only the displayed variables free, we can show

$$\begin{aligned} {\mathsf {S}}{\mathsf {C}}_{1} \vdash \ \forall x_{1} \cdots \forall x_{k} \bigl [ \, \varphi ( \vec {x}, \psi ( {\hat{u}} ) ) \, \leftrightarrow \, ( \varphi ^{\star } )^{*} \bigl ( \langle x_{1}, 0 \rangle , \ldots , \langle x_{k}, 0 \rangle , \{ \langle u, 0 \rangle \mid \psi ( u ) \} \bigr ) \, \bigr ], \end{aligned}$$

(17)

where $( \varphi ^{\star } )^{*} \bigl ( \langle x_{1}, 0 \rangle , \ldots , \langle x_{k}, 0 \rangle , \{ \langle u, 0 \rangle \mid \psi ( u ) \} \bigr )$ is, precisely, the result of replacing each occurrence of an atomic formula $ P t$ in $( \varphi ^{\star } )^{*}$ by $\exists u ( t = \langle u, 0 \rangle \wedge \psi ( u ) )$. This equivalence (17) is proved by the same induction on $\varphi $ as (16) with a trivial additional case for the base step where $\varphi $ is of the form $ P t$.

5.1.2 The constructible universe relative to $\mathsf {V}$

Let $ L ^{{\mathcal {S}}} ({\mathsf {V}} )$ be the ${\mathcal {S}}$-class of constructible (${\mathcal {S}}$-) sets relative to $\mathsf {V}$ (or, from $\mathsf {V}$, in terms of [3]); $ L ^{{\mathcal {S}}} ({\mathsf {V}} )$ is standardly built up from $L_{0}^{{\mathcal {S}}} ( \mathsf {V} ) := \mathsf {V}$ ($\in {\mathcal {S}}$); see [3, Ch. II] for the precise definition. By $L_{\xi }^{{\mathcal {S}}} ( \mathsf {V} )$ we will denote the $\xi $th stage of the construction of $ L ^{{\mathcal {S}}} ({\mathsf {V}} )$, in other words, the ${\mathcal {S}}$-set of constructible sets of the level (or the “constructible rank”) $\xi $ in the constructible hierarchy relative to $\mathsf {V}$. $ L ^{{\mathcal {S}}} ({\mathsf {V}} )$ is definable in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ (in fact, in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{\min }+ \Sigma _{1}^{{\mathcal {S}}} \text {-} \mathrm {Found}_{1}$), and $ L ^{{\mathcal {S}}} ({\mathsf {V}} )\cup {\mathcal {U}}$ is an inner model of ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}$ provably in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$. More precisely, for each ${\mathcal {L}}_{\mathrm {KP}}$-formula $\varphi $, let $\varphi ^{ L ^{{\mathcal {S}}} ({\mathsf {V}} )}$ denote the result of restricting each quantifier in $\varphi $ to $ L ^{{\mathcal {S}}} ({\mathsf {V}} )\cup {\mathcal {U}}$ (but keeping all the other vocabulary unchanged, in particular, interpreting ${\mathcal {U}}$ by itself and thus ${\mathcal {S}} x$ ($\Leftrightarrow x \not \in {\mathcal {U}}$) as $x \in L ^{{\mathcal {S}}} ({\mathsf {V}} )$); then, for each axiom $\sigma $ of ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$, $\sigma ^{ L ^{{\mathcal {S}}} ({\mathsf {V}} )}$ is provable in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$.

Fact 5.4 is proved, in essence, by carrying out the proof of the Barwise–Gandy–Moschovakis theorem [4] within ${\mathsf {S}}{\mathsf {C}}_{1}$, but the Barwise–Gandy–Moschovakis theorem bears richer implications. In particular, it shows that the companion of the inductive sets on a transitive infinite set A, as a Spector class on A, is the least admissible set containing A. Indeed, ${\mathsf {S}}{\mathsf {C}}_{1}$ can “see” this fact within it. Let us call the ${\mathcal {L}}_{\mathrm {KP}}$-statement ${\mathcal {S}}= L ^{{\mathcal {S}}} ({\mathsf {V}} )$ the axiom of constructibility relative to $\mathsf {V}$. The goal of the present sub-subsection is to show the following.

Theorem 5.6

${\mathsf {S}}{\mathsf {C}}_{1} \vdash \bigl ( {\mathcal {S}}= L ^{{\mathcal {S}}} ({\mathsf {V}} )\bigr )^{*}$. Hence, it follows from Fact 5.4 that $*$ is an interpretation of ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}+ {\mathcal {S}}= L ^{{\mathcal {S}}} ({\mathsf {V}} )$ in ${\mathsf {S}}{\mathsf {C}}_{1}$.

To show this, we need to prove some preliminary results and use some facts implicit in the proof of Fact 5.4.

Let ${\mathcal {A}} ( x, P )$ be an inductive operator form. Then ${\mathcal {A}}^{\star }$ is an ${\mathcal {L}}_{\mathrm {KP}} ( P )$-formula with every quantifier bounded (by $\mathsf {V}$). Hence, by $\Sigma ^{{\mathcal {S}}}$-recursion (see [3, Ch. I]), there exists, provably in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$, a $\Sigma ^{{\mathcal {S}}}$-function $ I _{{\mathcal {A}}} :On^{{\mathcal {S}}} \rightarrow {\mathcal {S}}$ such that

$$\begin{aligned} I _{{\mathcal {A}}} ( \alpha ) := \{ u \in _{1} \mathsf {V} \mid {\mathcal {A}}^{\star } ( u, {\bigcup _{\xi < \alpha }}^{{\mathcal {S}}} I _{{\mathcal {A}}} ( \xi ) ) \}^{{\mathcal {S}}}, \end{aligned}$$

where ${\mathcal {A}}^{\star } ( u, \, \bigcup _{\xi < \alpha }^{{\mathcal {S}}} I _{{\mathcal {A}}} ( \xi ) )$ is the result of replacing each occurrence of $ P t$ in ${\mathcal {A}}^{\star } ( u, P )$ with $t \in _{1} \bigcup _{\xi < \alpha }^{{\mathcal {S}}} I _{{\mathcal {A}}} ( \xi )$; let us write $ I _{{\mathcal {A}}}^{\alpha }$ for $ I _{{\mathcal {A}}} ( \alpha )$ and $ I _{{\mathcal {A}}}^{< \alpha }$ for $\bigcup _{\xi < \alpha }^{{\mathcal {S}}} I _{{\mathcal {A}}}^{\xi }$ for readability. We thereby extend $\star $ to a translation of ${\mathcal {L}}_{\mathrm {SC}}$ into ${\mathcal {L}}_{\mathrm {KP}}$ by putting

$$\begin{aligned} J_{{\mathcal {A}}}^{\star } ( x ) \ :\Leftrightarrow \&( \exists \alpha \in On^{{\mathcal {S}}} ) x \in _{1} I _{{\mathcal {A}}}^{\alpha } \\ \prec _{{\mathcal {A}}}^{\star } ( x ) \ :\Leftrightarrow \&( \exists y, z \in {\mathcal {U}} ) ( \exists \alpha \in On^{{\mathcal {S}}} ) \bigl ( x = \langle y, z \rangle ^{{\mathcal {U}}} \wedge y \in _{1} I _{{\mathcal {A}}}^{\alpha } \wedge z \not \in _{1} I _{{\mathcal {A}}}^{\alpha } \bigr ); \end{aligned}$$

note that both $J_{{\mathcal {A}}}^{\star }$ and $\prec _{{\mathcal {A}}}^{\star }$ are $\Sigma ^{{\mathcal {S}}}$-predicates. We can standardly show that $\star $ is an interpretation of ${\mathsf {S}}{\mathsf {C}}_{1}$ in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$, and the notion of an inductive class in ${\mathsf {S}}{\mathsf {C}}_{1}$ is accordingly translated into ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$, namely, an inductive class in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ is an ${\mathcal {S}}$-class ${\mathfrak {X}}$ such that ${\mathfrak {X}} = ( J_{{\mathcal {A}}}^{\star } )_{a} = \{ u \in _{1} \mathsf {V} \mid \langle u, a \rangle ^{{\mathcal {U}}} \in J_{{\mathcal {A}}}^{\star } \}$ for some inductive operator form ${\mathcal {A}}$ and ${\mathcal {U}}$-set a ($\in _{1} \mathsf {V}$).

Since $ L ^{{\mathcal {S}}} ({\mathsf {V}} )\cup {\mathcal {U}}$ is an inner model of ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$, ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}\vdash ( \sigma ^{\star } )^{ L ^{{\mathcal {S}}} ({\mathsf {V}} )}$ for each axiom $\sigma $ of ${\mathsf {S}}{\mathsf {C}}_{1}$. Now, for each fixed $\alpha \in On^{{\mathcal {S}}}$ as a parameter, $x \in _{1} I _{{\mathcal {A}}}^{\alpha }$ is a $\Delta ^{{\mathcal {S}}}$-predicate on $\mathsf {V}$ in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ and thus $ I _{{\mathcal {A}}}^{\alpha } = ( I _{{\mathcal {A}}}^{\alpha } )^{ L ^{{\mathcal {S}}} ({\mathsf {V}} )}$, provably in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$, for each $\alpha \in ( On^{{\mathcal {S}}} )^{ L ^{{\mathcal {S}}} ({\mathsf {V}} )}$; hence, $( J_{{\mathcal {A}}}^{\star } )^{ L ^{{\mathcal {S}}} ({\mathsf {V}} )} = J_{{\mathcal {A}}}^{\star }$ provably in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$, since $On^{{\mathcal {S}}} = ( On^{{\mathcal {S}}} )^{ L ^{{\mathcal {S}}} ({\mathsf {V}} )}$. In particular, every inductive or coinductive class in the sense of ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ has the same meaning in ${\mathcal {S}} \cup {\mathcal {U}}$ and in $ L ^{{\mathcal {S}}} ({\mathsf {V}} )\cup {\mathcal {U}}$.

To prove the subsequent Lemma 5.8, we need the following fact, which is implicit in the proof of [8, Lemma 7.17].

Fact 5.7

${\mathsf {S}}{\mathsf {C}}_{1}$ proves the following: for every ${\mathcal {L}}_{\mathrm {SC}}$-formula $\varphi ( x )$,

$$\begin{aligned} ( \forall x \in {\mathcal {S}}^{*} \cup {\mathcal {U}}^{*} ) \bigl ( \forall y ( y \in _{1}^{*} x \rightarrow \varphi ( y ) ) \rightarrow \varphi ( x ) \bigr ) \rightarrow ( \forall x \in {\mathcal {S}}^{*} \cup {\mathcal {U}}^{*} ) \varphi ( x ). \end{aligned}$$

Namely, ${\mathsf {S}}{\mathsf {C}}_{1}$ proves the principle of induction along $\in _{1}^{*}$ with respect to not only the $*$-translations of ${\mathcal {L}}_{\mathrm {KP}}$-formulae but also arbitrary ${\mathcal {L}}_{\mathrm {SC}}$-formulae; recall that $\in _{1}^{*}$ is defined in terms of bisimulations between well-founded trees.

Lemma 5.8

${\mathsf {S}}{\mathsf {C}}_{1}$ proves the following.

1.
$\forall x \bigl ( x \in J_{{\mathcal {A}}} \leftrightarrow \langle x, 0 \rangle \in ( J_{{\mathcal {A}}}^{\star } )^{*} \bigr )$.
2.
$\forall x \forall y \bigl ( \prec _{{\mathcal {A}}} ( \langle x, y \rangle ) \leftrightarrow ( \prec _{{\mathcal {A}}}^{\star } )^{*} ( \langle \langle x, y \rangle , 0 \rangle ) \bigr )$.

Note that for every $u, v \in {\mathcal {U}}^{*}$, it follows by the definition of $*$ that there are some x and y with $u = \langle x, 0 \rangle $ and $v = \langle y, 0 \rangle $ and $( \langle u, v \rangle ^{{\mathcal {U}}} )^{*} = \langle \langle x, y \rangle , 0 \rangle $.

Proof

We work within ${\mathsf {S}}{\mathsf {C}}_{1}$. For claim 1, we first show that

$$\begin{aligned} ( \forall \alpha \in ( On^{{\mathcal {S}}} )^{*} ) \forall x \bigl ( \langle x, 0 \rangle \in _{1}^{*} ( I _{{\mathcal {A}}}^{\alpha } )^{*} \rightarrow x \in J_{{\mathcal {A}}} \bigr ). \end{aligned}$$

(18)

This is shown by induction on $\alpha $ (along $\in _{1}^{*}$) using Fact 5.7; for each x,

$$\begin{aligned} \langle x, 0 \rangle \in _{1}^{*} ( I _{{\mathcal {A}}}^{\alpha } )^{*} \ \Rightarrow \ ( {\mathcal {A}}^{\star } )^{*} \bigl ( \langle x, 0 \rangle , ( I _{{\mathcal {A}}}^{< \alpha } )^{*} \bigr )&\ \Rightarrow \ ( {\mathcal {A}}^{\star } )^{*} \bigl ( \langle x, 0 \rangle , \{ \langle y, 0 \rangle \mid y \in J_{{\mathcal {A}}} \} \bigr ) \\&\ \Rightarrow \ {\mathcal {A}} ( x, J_{{\mathcal {A}}} ) \ \Rightarrow \ x \in J_{{\mathcal {A}}}; \end{aligned}$$

recall that $( {\mathcal {A}}^{\star } )^{*} \bigl ( \langle x, 0 \rangle , ( I _{{\mathcal {A}}}^{< \alpha } )^{*} \bigr )$ is the result of replacing each occurrence of $ Pt $ in $( {\mathcal {A}}^{\star } )^{*} ( \langle x, 0 \rangle , P )$ with $t \in _{1}^{*} ( I _{{\mathcal {A}}}^{< \alpha } )^{*}$; the first implication holds by the definition of $ I _{{\mathcal {A}}}^{\alpha }$ (in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$) and the fact that $*$ is an interpretation of ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ in ${\mathsf {S}}{\mathsf {C}}_{1}$; the second obtains by the induction hypothesis and the fact that $ P $ occurs only positively in ${\mathcal {A}}$; the third follows from (17); the fourth holds by the axiom $\widehat{\mathrm {ID}}$. We next show that

$$\begin{aligned} \forall x \bigl ( x \in J_{{\mathcal {A}}} \rightarrow \langle x, 0 \rangle \in _{1}^{*} ( J_{{\mathcal {A}}}^{\star } )^{*} \bigr ). \end{aligned}$$

(19)

This is shown by induction on x along $\prec _{{\mathcal {A}}}$; given $x \in J_{{\mathcal {A}}}$, we have $\langle y, 0 \rangle \in ( J_{{\mathcal {A}}}^{\star } )^{*}$ for every $y \prec _{{\mathcal {A}}} x$ by the induction hypothesis, and, since ${\mathcal {A}} ( x, \prec _{{\mathcal {A}}} \upharpoonright _{x} )$ holds by Fact 5.1.1, we can thereby infer

$$\begin{aligned} {\mathcal {A}} \bigl ( x, \, \{ y \mid \langle y, 0 \rangle \in ( J_{{\mathcal {A}}}^{\star } )^{*} \} \bigr ) \ \Rightarrow \ ( {\mathcal {A}}^{\star } )^{*} \bigl ( \langle x, 0 \rangle , ( J_{{\mathcal {A}}}^{\star } )^{*} \bigr ) \ \Rightarrow \ \langle x, 0 \rangle \in _{1}^{*} ( J_{{\mathcal {A}}}^{\star } )^{*}; \end{aligned}$$

the first implication holds by (17); the second holds because $\star $ interprets ${\mathsf {S}}{\mathsf {C}}_{1}$ in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ and $*$ interprets ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ in ${\mathsf {S}}{\mathsf {C}}_{1}$. Claim 1 follows from (18) and (19).

For claim 2, we first remark that ${\mathsf {S}}{\mathsf {C}}_{1}$ proves that if an ${\mathcal {L}}_{\mathrm {SC}}$-definable relation $\prec $ satisfies $\mathrm {SC1}$ and (some finite instances of) $\mathrm {SC2}$ for ${\mathcal {A}}$, then $\prec $ is identical (co-extensive) with $\prec _{{\mathcal {A}}}$. Now, put $x \prec y :\Leftrightarrow \langle \langle x, y \rangle , 0 \rangle \in ( \prec _{{\mathcal {A}}}^{\star } )^{*}$. It suffices to show that $\prec $ satisfies $\mathrm {SC1}$ and $\mathrm {SC2}$ for ${\mathcal {A}}$. $\mathrm {SC2}$ is readily verified: since $\prec _{{\mathcal {A}}}^{\star }$ orders ${\mathcal {U}}$-sets $u \in J_{{\mathcal {A}}}^{\star }$ by comparing the least ${\mathcal {S}}$-ordinals $\alpha $ with $u \in I_{{\mathcal {A}}}^{\alpha }$, the well-foundedness of $\prec _{{\mathcal {A}}}^{\star }$ derives from that of $\in _{1}$, and thus the well-foundedness of $\prec $ follows from Fact 5.7. To verify $\mathrm {SC1}$ for $\prec $, we first observe that $x \prec y$ is equivalent to

$$\begin{aligned} \langle x, 0 \rangle \in ( J_{{\mathcal {A}}}^{\star } )^{*} \wedge \lnot ( {\mathcal {A}}^{\star } )^{*} \bigl ( \langle y, 0 \rangle , \{ \langle z, 0 \rangle \mid ( \prec _{{\mathcal {A}}}^{\star } )^{*} \langle \langle z, x \rangle , 0 \rangle \} \bigr ), \end{aligned}$$

(20)

since $( \mathrm {SC1}^{\star } )^{*}$ holds in ${\mathsf {S}}{\mathsf {C}}_{1}$. By claim 1 and the definition of $\prec $, (20) is equivalent to $x \in J_{{\mathcal {A}}} \wedge \lnot ( {\mathcal {A}}^{\star } )^{*} ( \langle y, 0 \rangle , \{ \langle z, 0 \rangle \mid z \prec x \} )$, which is equivalent to $x \in J_{{\mathcal {A}}} \wedge \lnot {\mathcal {A}} (y, \prec \upharpoonright _{x})$ by (17). $\square $

Hence, (16) is extended to arbitrary ${\mathcal {L}}_{\mathrm {SC}}$-formulae in the following way.

Lemma 5.9

For every ${\mathcal {L}}_{\mathrm {SC}}$-formula $\varphi ( x_{1}, \ldots , x_{k} )$,

$$\begin{aligned} {\mathsf {S}}{\mathsf {C}}_{1} \vdash \forall x_{1} \ldots \forall x_{k} \bigl ( \varphi ( \vec {x} ) \leftrightarrow ( \varphi ^{\star } )^{*} \bigl ( \langle x_{1}, 0 \rangle , \ldots , \langle x_{k}, 0 \rangle \bigr ) \bigr ). \end{aligned}$$

In particular, for every definable inductive or coinductive class $ Q $, such as I, H, ${\check{H}}$, M, and B, its $\star $-translation $ Q ^{\star }$ satisfies the following:

$$\begin{aligned} {\mathsf {S}}{\mathsf {C}}_{1} \vdash \forall u \bigl ( u \in Q \leftrightarrow \langle u, 0 \rangle \in ( Q^{\star } )^{*} \bigr ). \end{aligned}$$

Next, we will show that the collapsing function m(T, s) can be adequately defined within ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ for the $\star $-translation of each hyperelementary suitable tree T.

We first canonically extend $\star $ (restricted to ${\mathcal {L}}_{\in }$) to a translation of ${\mathcal {L}}_{2}$ into ${\mathcal {L}}_{\mathrm {KP}}$ by interpreting classes of second-order set theory as ${\mathcal {S}}$-subsets of $\mathsf {V}$. It is easily verified that the thus extended translation $\star $ is an interpretation of $\mathsf {ECA}$ in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$; see [8, §9.1].

For each ${\mathcal {L}}_{\mathrm {KP}}$-formula $\Phi $ and ${\mathcal {S}}$-set $X \subset ^{{\mathcal {S}}} \mathsf {V}$, let us define

$$\begin{aligned} TI _{\Phi }^{\star } ( X ) \ :\Leftrightarrow \ ( \forall x \in _{1} \mathsf {V} ) \bigl ( ( \forall y \in _{1} \mathsf {V} ) ( \langle y, x \rangle ^{{\mathcal {U}}} \in _{1} X \rightarrow \Phi ( y ) ) \rightarrow \Phi ( x ) \bigr ) \rightarrow ( \forall x \in _{1} \mathsf {V} ) \Phi ( x ). \end{aligned}$$

Given a collection $\Gamma $ of ${\mathcal {L}}_{\mathrm {KP}}$-formulae, we thereby define the schema $\Gamma \text {-} \mathrm {TI}^{\star }$ as follows.

$$\begin{aligned} {\Gamma \text {-} \mathrm {TI}^{\star }:} \quad&( \forall X \subset ^{{\mathcal {S}}} \mathsf {V} ) \bigl ( Wf ^{\star } ( X ) \rightarrow TI _{\Phi }^{\star } ( X ) \bigr ), \ \text {for all } \Phi \in \Gamma ; \end{aligned}$$

this is the ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}$-counterpart of the class-theoretic axiom schema $\Gamma \text {-} \mathrm {TI}$. Note that since ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}} \vdash \mathsf {NBG}^{\star }$, it follows from Proposition 2.3 that $ Wf ^{\star } ( X )$ is equivalent to the $\Delta _{0}^{{\mathcal {S}}}$-statement that every non-empty ${\mathcal {U}}$-set u with $( \forall v \in _{0} u ) v \in _{1} X$ has a $\prec _{X}$-minimal element.
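As a finite illustration only (it plays no role in the formal development), the minimal-element formulation of $ Wf ^{\star } ( X )$ and a single instance of $ TI _{\Phi }^{\star } ( X )$ can be checked by brute force on a small relation. The sketch below assumes a hypothetical encoding in which a relation is a Python set of pairs `(y, x)`, read as "y lies below x"; the names `is_well_founded` and `ti_holds` are introduced here for illustration and do not come from the paper.

```python
from itertools import combinations

def is_well_founded(field, rel):
    """Every non-empty subset of `field` has a rel-minimal element.

    `rel` is a set of pairs (y, x), read as "y lies below x".
    Brute force over all subsets, so only suitable for tiny fields.
    """
    elems = list(field)
    for size in range(1, len(elems) + 1):
        for subset in combinations(elems, size):
            s = set(subset)
            # u is minimal in s if no v in s lies below u
            if not any(all((v, u) not in rel for v in s) for u in s):
                return False
    return True

def ti_holds(field, rel, phi):
    """One instance of induction along rel for the predicate phi.

    Returns True when: (phi is progressive) implies (phi holds everywhere).
    """
    progressive = all(
        phi(x) or not all(phi(y) for y in field if (y, x) in rel)
        for x in field
    )
    return (not progressive) or all(phi(x) for x in field)

field = {0, 1, 2, 3}
less = {(a, b) for a in field for b in field if a < b}   # well-founded
cycle = {(0, 1), (1, 2), (2, 0)}                          # contains a loop
```

On the cyclic relation the predicate `lambda x: x == 3` is progressive but not everywhere true, so induction fails there, whereas on `less` every predicate passes.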

The next lemma can be proved in a parallel manner to Theorem 3.4 by using $\Sigma _{\infty }^{{\mathcal {S}}} \text {-} \mathrm {Sep}_{0}^{+}$ and $\Sigma _{\infty }^{{\mathcal {S}}} \text {-} \mathrm {Repl}_{0}^{+}$ instead of $\Sigma ^{1}_{\infty } \text {-} \mathrm {Sep}$ and $\Sigma ^{1}_{\infty } \text {-} \mathrm {Repl}$ in the construction of the Skolem functions.

Lemma 5.10

${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}\vdash \Pi _{\infty }^{{\mathcal {S}}} \text {-} \mathrm {TI}^{\star }$.

In particular, we have ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}\vdash \Sigma _{1}^{{\mathcal {S}}} \text {-} \mathrm {TI}^{\star }$ and thus can standardly show that $\Sigma ^{{\mathcal {S}}}$-recursive definitions are possible in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ along any well-founded ${\mathcal {S}}$-set relation on $\mathsf {V}$; hence, the Mostowski collapsing function m can be defined as a $\Sigma ^{{\mathcal {S}}}$-function in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$; cf. [12, Theorem 4.6]. More precisely, we have the next lemma.

Lemma 5.11

${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ proves the following: for each $\Sigma ^{{\mathcal {S}}}$-function $\mathsf {G} ( x, y, r )$ (possibly with other parameters), there exists a $\Sigma ^{{\mathcal {S}}}$-function $\mathsf {F} ( x, r )$ such that if r is a well-founded ${\mathcal {S}}$-set relation whose field a is an ${\mathcal {S}}$-subset of $\mathsf {V}$, then

$$\begin{aligned} ( \forall x \in _{1} a ) \bigl ( \, \mathsf {F} ( x, r ) = \mathsf {G} ( x, \, \mathsf {F} \! \upharpoonright _{ pred _{r} ( x )}, \, r ) \, \bigr ), \end{aligned}$$

where $ pred _{r} ( x ) := \{ y \in _{1} a \mid \langle y, x \rangle ^{{\mathcal {U}}} \in _{1} r \}^{{\mathcal {S}}}$, that is, the ${\mathcal {S}}$-set of r-predecessors of x, which is an ${\mathcal {S}}$-set by $\Delta _{0}^{{\mathcal {S}}} \text {-} \mathrm {Sep}$. Hence, in particular, ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ has a $\Sigma ^{{\mathcal {S}}}$-function m that satisfies (14) for every ${\mathcal {S}}$-set suitable tree $T \subset ^{{\mathcal {S}}} \mathsf {V}$ and its nodes $s \in _{1} T$.

This lemma implies that ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ proves the axiom Beta (suitably modified for ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}$) restricted to well-founded ${\mathcal {S}}$-set relations on $\mathsf {V}$; also see Remark 5.19 below for the full-fledged version of the axiom Beta.
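The recursion scheme of Lemma 5.11 can be pictured on a finite well-founded relation. The following is a minimal sketch under simplifying assumptions: the field is a finite Python set, the relation is a set of pairs `(y, x)` meaning that y is an r-predecessor of x, and hereditarily finite sets are modelled by `frozenset`. The helper names `wf_recursion` and `mostowski_collapse` are hypothetical, and the toy collapse ignores the special treatment of leaves as urelements that the suitable trees of this paper require.

```python
def wf_recursion(field, rel, G):
    """Compute F with F(x) = G(x, F restricted to the rel-predecessors of x).

    Assumes `rel` (pairs (y, x), "y is a predecessor of x") is well-founded
    on the finite set `field`; otherwise ValueError is raised.
    """
    preds = {x: {y for y in field if (y, x) in rel} for x in field}
    F = {}
    remaining = set(field)
    while remaining:
        # elements all of whose predecessors already have a value
        ready = {x for x in remaining if preds[x].issubset(F)}
        if not ready:
            raise ValueError("relation is not well-founded")
        for x in ready:
            F[x] = G(x, {y: F[y] for y in preds[x]})
        remaining -= ready
    return F

def mostowski_collapse(field, rel):
    """Collapse m(x) = { m(y) : y is a predecessor of x }, as frozensets."""
    return wf_recursion(field, rel, lambda x, below: frozenset(below.values()))

# A non-extensional example: b and c have the same predecessors.
field = {"a", "b", "c", "d"}
rel = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
m = mostowski_collapse(field, rel)
```

In this example the nodes `b` and `c` receive the same value, which is the finite shadow of the fact that the collapse identifies bisimilar nodes.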

Since $\star $ interprets ${\mathsf {S}}{\mathsf {C}}_{1}$ in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$, it follows from Fact 5.3 that for every $x \in M^{\star }$, $( H^{\star } )_{x}$ and $( {\check{H}}^{\star } )_{x}$ are true of the same ${\mathcal {U}}$-sets, and thus $\{ u \in _{1} \mathsf {V} \mid u \in ( H^{\star } )_{x} \} = \{ u \in _{1} \mathsf {V} \mid u \in ( {\check{H}}^{\star } )_{x} \}$ exists as an ${\mathcal {S}}$-set, provably in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$, by $\Delta ^{{\mathcal {S}}}$-Separation for ${\mathcal {S}}$-sets (derivable in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{\min }$ in the same way as [3, Theorem I.4.5]), since $H^{\star }$ is a $\Sigma ^{{\mathcal {S}}}$-predicate and ${\check{H}}^{\star }$ is a $\Pi ^{{\mathcal {S}}}$-predicate. Hence, for each $x \in M^{\star }$, the collapse $m ( ( H^{\star } )_{x}, \epsilon )$ of $( H^{\star } )_{x}$ exists as an ${\mathcal {S}}$-set provably in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$.

Now, since $ L ^{{\mathcal {S}}} ({\mathsf {V}} )\cup \, {\mathcal {U}}$ is an inner model of ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$, $m^{ L ^{{\mathcal {S}}} ({\mathsf {V}} )}$ defines a function in $ L ^{{\mathcal {S}}} ({\mathsf {V}} )\cup \, {\mathcal {U}}$ that satisfies (14) in the sense of $ L ^{{\mathcal {S}}} ({\mathsf {V}} )\cup \, {\mathcal {U}}$. Then, since the condition (14) characterizing m is absolute, $m^{ L ^{{\mathcal {S}}} ({\mathsf {V}} )} ( T, s ) = m ( T, s )$ for every ${\mathcal {S}}$-set suitable tree $T \in L ^{{\mathcal {S}}} ({\mathsf {V}} )$ and $s \in _{1} T$. Hence, in sum, we have the following.

Lemma 5.12

${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ proves the following: for every ${\mathcal {S}}$-set suitable tree $T \in L ^{{\mathcal {S}}} ({\mathsf {V}} )$ and $s \in _{1} T$, $m ( T, s ) \in L ^{{\mathcal {S}}} ({\mathsf {V}} )$.

We are finally ready to prove Theorem 5.6.

Proof of Theorem 5.6

We will work within ${\mathsf {S}}{\mathsf {C}}_{1}$. We have to show $( {\mathcal {S}}\subset L ^{{\mathcal {S}}} ({\mathsf {V}} ))^{*}$. Note that if $x \in {\mathcal {S}}^{*}$, then $x = \langle a, 1 \rangle $ for some $a \in M$, for which we also have $\langle a, 0 \rangle \in ( M^{\star } )^{*}$ by Lemma 5.9. Hence, by the last lemma, it suffices to show

$$\begin{aligned} ( \forall x \in {\mathcal {S}}^{*} ) \forall a \bigl ( x = \langle a, 1 \rangle \rightarrow \langle a, 1 \rangle =^{*} ( m ( ( H^{\star } )_{\langle a, 0 \rangle }, \epsilon ) )^{*} \bigr ). \end{aligned}$$

This will be shown by induction on x along $\in _{1}^{*}$ using Fact 5.7.

Recall that for each $a, b \in M$, an inductive relation B(a, b, u, v) expresses that the sub-tree of $( H )_{a}$ below $u \in ( H )_{a}$ is bisimilar (in the aforementioned modified sense) to the sub-tree of $( H )_{b}$ below $v \in ( H )_{b}$, and m is so defined as to satisfy the following:

$$\begin{aligned} {{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}\vdash ( \forall x, y \in M^{\star } ) \forall u \forall v&\Bigl ( \bigl ( u \in ( H^{\star } )_{x} \wedge v \in ( H^{\star } )_{y} \bigr ) \nonumber \\&\rightarrow \bigl ( B^{\star } ( x, y, u, v ) \leftrightarrow m ( ( H^{\star } )_{x}, u ) \! = \! m ( ( H^{\star } )_{y}, v ) \bigr ) \Bigr ). \end{aligned}$$

(21)

Let us define an ${\mathcal {L}}_{\mathrm {SC}}$-formula $\psi ( x, y )$ and an ${\mathcal {L}}_{\mathrm {KP}}$-formula $\theta ( x, y )$ as follows:

$$\begin{aligned} \psi ( x, y )&\ :\Leftrightarrow \ \exists z \bigl ( \langle z \rangle \in ( H )_{y} \wedge B ( x, y, \epsilon , \langle z \rangle ) \bigr ). \\ \theta ( x, y )&\ :\Leftrightarrow \ m ( ( H^{\star } )_{x}, \epsilon ) \in _{1} m ( ( H^{\star } )_{y}, \epsilon ). \end{aligned}$$

Then, by (21) and the definition of m, we obtain

$$\begin{aligned} {{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}\vdash ( \forall x, y \in M^{\star } ) \bigl ( \psi ^{\star } ( x, y ) \leftrightarrow \theta ( x, y ) \bigr ). \end{aligned}$$

(22)

Also recall that A(b, a) means that $\langle \langle b, 0 \rangle \rangle $ is a leaf of the suitable tree $( H )_{a}$ (see Sect. 5.1.1) when $a \in M$, and m is so defined as to satisfy the following as well:

$$\begin{aligned} {{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}\vdash ( \forall x \in {\mathcal {U}} ) ( \forall y \in M^{\star } ) \bigl ( A^{\star } ( x, y ) \leftrightarrow x \in _{1} m ( ( H^{\star } )_{y}, \epsilon ) \bigr ). \end{aligned}$$

(23)

Now, fix $x \in {\mathcal {S}}^{*}$ and let $x = \langle a, 1 \rangle $, where $a \in M$ and thus $\langle a, 0 \rangle \in ( M^{\star } )^{*}$. Take any $z \in {\mathcal {S}}^{*} \cup {\mathcal {U}}^{*}$. First suppose $z \in {\mathcal {S}}^{*}$ and let $z = \langle b, 1 \rangle $, where $b \in M$ and thus $\langle b, 0 \rangle \in ( M^{\star } )^{*}$. Then, $z \in _{1}^{*} x$ means $\psi ( b, a )$, and we have

$$\begin{aligned} z \in _{1}^{*} x {\mathop { \quad \Leftrightarrow \quad }\limits ^{\text {def.}}} \psi ( b, a ) {\mathop { \quad \Leftrightarrow \quad }\limits ^{\text {Lem. 5.9}}} ( \psi ^{\star } )^{*} ( \langle b, 0 \rangle , \langle a, 0 \rangle )&{\mathop { \quad \Leftrightarrow \quad }\limits ^{(22) }} \theta ^{*} ( \langle b, 0 \rangle , \langle a, 0 \rangle ) \\&{\mathop { \quad \Leftrightarrow \quad }\limits ^{\text {I.H.}}} ( z \in _{1} m ( ( H^{\star } )_{\langle a, 0 \rangle }, \epsilon ) )^{*}. \end{aligned}$$

Second suppose $z \in {\mathcal {U}}^{*}$ and let $z = \langle b, 0 \rangle $. Then, $z \in _{1}^{*} x$ means A(b, a), and we have

$$\begin{aligned} z \in _{1}^{*} x {\mathop { \quad \Leftrightarrow \quad }\limits ^{\text {def.}}} A ( b, a ) {\mathop { \quad \Leftrightarrow \quad }\limits ^{\text {Lem. 5.9}}} ( A^{\star } )^{*} ( \langle b, 0 \rangle , \langle a, 0 \rangle ) {\mathop { \quad \Leftrightarrow \quad }\limits ^{(23) }} ( z \in _{1} m ( ( H^{\star } )_{\langle a, 0 \rangle }, \epsilon ) )^{*}. \end{aligned}$$

Finally, since the axiom of extensionality is true under the interpretation $*$, we thereby obtain $x =^{*} ( m ( ( H^{\star } )_{\langle a, 0 \rangle }, \epsilon ) )^{*}$. The proof is completed. $\square $

Remark 5.13

One might well question if a parallel statement to Theorem 5.6 holds over arithmetic, namely, if the arithmetical counterpart of the interpretation $*$ of ${\mathsf {S}}{\mathsf {C}}_{1}$ over arithmetic in the Kripke–Platek theory over ${\mathbb {N}}$ (such as $\mathsf {KPN}$ or ${\mathsf {K}}{\mathsf {P}}\omega $) verifies the axiom of constructibility. The proof in this section cannot be applied to the arithmetical case as it is, because the addition of the axiom Beta or $\Pi _{\infty }^{{\mathcal {S}}} \text {-} \mathrm {TI}^{\star }$ to the Kripke–Platek theory over ${\mathbb {N}}$ increases consistency strength. Nonetheless, the answer to the question is affirmative. Lemma 5.10 is needed for the proof of Theorem 5.6 only in proving the existence of the Mostowski collapsing function m, and m need not be the collapsing function for all arbitrary suitable trees but only for all hyperelementary suitable trees. Then, to prove the existence of m for them, we only need to avail ourselves of transfinite induction along $\sqsubset _{T}$ for each hyperelementary suitable tree T. Now, in my definition in [8], the well-foundedness of a hyperelementary suitable tree $( H )_{a}$ is defined in ${\mathsf {S}}{\mathsf {C}}_{1}$ in terms of the accessible part of $\sqsubset _{( H )_{a}}$, and well-foundedness thus defined implies transfinite induction along $\sqsubset _{( H )_{a}}$ for arbitrary ${\mathsf {S}}{\mathsf {C}}_{1}$-formulae; in fact, there is a ‘universal’ (coinductive) relation $\sqsubset $ such that $\langle x, a \rangle \sqsubset \langle y, a \rangle \leftrightarrow x \sqsubset _{( H )_{a}} y$, and the well-foundedness of $( H )_{a}$ can be expressed in terms of the (inductive) accessible part of this $\sqsubset $ uniformly for all a. 
Hence, transfinite induction along the hyperelementary well-founded relations $\sqsubset _{( H )_{a}}$ comes for free in ${\mathsf {S}}{\mathsf {C}}_{1}$ without the need to resort to set-theoretic axioms such as ${\mathcal {L}}_{\mathrm {SC}} \text {-} \mathrm {Repl}$. The same argument can be carried out in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ in terms of the $\star $-interpretation of the inductive predicates (because they are defined by $\in $-recursion along ordinals in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ and because their stage comparison prewellorderings are defined in terms of ordinals, transfinite induction along hyperelementary well-founded relations in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ derives from the axiom schema of foundation), and we can thereby define the Mostowski collapsing function m for all hyperelementary suitable trees $( H^{\star } )_{x}$ in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$. This argument can be straightforwardly adapted for a proof of the corresponding statement over arithmetic.

5.1.3 The axiom of projectibility

The Barwise–Gandy–Moschovakis theorem has one more important implication: the companion M of the inductive sets on a transitive infinite set A is also projectible, namely, there is a $\Delta _{1}$-definable partial surjective function from some set belonging to M onto the entire universe M of the companion, which is called a projection of M; see [19, Ch.9.D] or [3, Ch.V]. In the present Sect. 5.1.3, we consider adding to ${\mathcal {L}}_{\mathrm {KP}}$ a new predicate symbol for a projection of the universe of ${\mathcal {S}}$-sets.

Let ${\mathcal {L}}_{\mathrm {KP}} ( Pr )$ be a language extending ${\mathcal {L}}_{\mathrm {KP}}$ with one new binary predicate $ Pr $. We then consider the following new axiom expressed in ${\mathcal {L}}_{\mathrm {KP}} ( Pr )$.

$$\begin{aligned} {( \mathrm {Prj} ):} \quad&\forall x \forall y \bigl ( Pr ( x, y ) \rightarrow ( x \in {\mathcal {S}} \wedge y \in {\mathcal {U}} ) \bigr ) \wedge ( \forall x \in {\mathcal {S}} ) ( \exists y \in {\mathcal {U}} ) Pr ( x, y ) \\&\wedge \forall x \forall y \forall z \bigl ( ( Pr ( x, z ) \wedge Pr ( y, z ) ) \rightarrow x = y \bigr ). \end{aligned}$$

This axiom asserts that $ Pr $ associates each ${\mathcal {S}}$-set with some urelements (i.e., ${\mathcal {U}}$-sets) so that the inverse of $ Pr $ gives a surjection from the range of $ Pr $ onto ${\mathcal {S}}$. Let us call $( \mathrm {Prj} )$ the axiom of projectibility.^{Footnote 23}
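To make the three conjuncts of $( \mathrm {Prj} )$ concrete, one can test them on a finite relation. The sketch below is only a toy model: `S` and `U` are finite Python sets standing in for the ${\mathcal {S}}$-sets and the ${\mathcal {U}}$-sets, `Pr` is a set of pairs, and the function name `satisfies_prj` is an assumption made for this illustration.

```python
def satisfies_prj(S, U, Pr):
    """Check the three conjuncts of (Prj) for a finite relation Pr."""
    # Pr(x, y) implies x in S and y in U
    typed = all(x in S and y in U for (x, y) in Pr)
    # every x in S is related to at least one y in U
    total = all(any((x, y) in Pr for y in U) for x in S)
    # Pr(x, z) and Pr(y, z) imply x = y
    injective = all(
        x1 == x2 for (x1, z1) in Pr for (x2, z2) in Pr if z1 == z2
    )
    return typed and total and injective

S = {"s0", "s1"}
U = {0, 1, 2}
good = {("s0", 0), ("s0", 1), ("s1", 2)}   # s0 has two codes, which is allowed
shared_code = {("s0", 0), ("s1", 0)}        # one code for two sets: rejected
not_total = {("s0", 0)}                     # s1 has no code: rejected

# The inverse of a good Pr is a partial surjection from U onto S.
inverse = {y: x for (x, y) in good}
```

Note that the axiom does not require $ Pr $ to be functional: an ${\mathcal {S}}$-set may be associated with several urelements, as `s0` is in `good`.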

We extend the definition of the collections $\Delta _{0}^{{\mathcal {S}}}$, $\Sigma _{n}^{{\mathcal {S}}}$, $\Pi _{n}^{{\mathcal {S}}}$, $\Sigma ^{{\mathcal {S}}}$, and $\Pi ^{{\mathcal {S}}}$ of ${\mathcal {L}}_{\mathrm {KP}}$-formulae to $\Delta _{0}^{p}$, $\Sigma _{n}^{p}$, $\Pi _{n}^{p}$, $\Sigma ^{p}$, and $\Pi ^{p}$ of ${\mathcal {L}}_{\mathrm {KP}} ( Pr )$-formulae in the obvious manner by counting $ Pr $ in $\Delta _{0}^{p}$. We thereby define ${\mathcal {L}}_{\mathrm {KP}} ( Pr )$-systems ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{\min }_{p}$ and ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}_{p}$ as ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{\min }+ ( \mathrm {Prj} )$ and ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}+ ( \mathrm {Prj} )$, respectively, with all their axiom schemata extended for ${\mathcal {L}}_{\mathrm {KP}} ( Pr )$.

Now, we will extend $*$ to an interpretation of ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}_{p}$ in ${\mathsf {S}}{\mathsf {C}}_{1}$. We first define two inductive relations $ R $ and $\check{ R }$ as follows:

$$\begin{aligned} R ( x, y ) :\Leftrightarrow \,&( \exists b \in M ) \Bigl ( y = \langle b, 0 \rangle \, \wedge P^{+}_{=} ( x, \langle b, 1 \rangle ) \wedge \forall c \bigl ( b \preceq _{M} c \vee P^{-}_{=} ( x, \langle c, 1 \rangle ) \bigr ) \Bigr ) \\ \check{ R } ( x, y ) : \Leftrightarrow \,&x \in {\mathcal {U}}^{*} \vee y \not \in {\mathcal {U}}^{*} \\&\vee \exists b \Bigl ( {y = \langle b, 0 \rangle } \, \wedge \Bigl ( P^{-}_{=} ( x, \langle b, 1 \rangle ) \vee \exists c \bigl ( c \prec _{M} b \wedge P^{+}_{=} ( x, \langle c, 1 \rangle ) \bigr ) \Bigr ) \Bigr ); \end{aligned}$$

namely, R(x, y) says that $y = \langle b, 0 \rangle $ ($\in {\mathcal {U}}^{*}$) for a $\prec _{M}$-minimal element b of M with $x =^{*} \langle b, 1 \rangle $, and $\check{ R } ( x, y )$ expresses its negation. We then define $ Pr ^{*} ( x, y ) :\Leftrightarrow R ( x, y )$.

Proposition 5.14

${\mathsf {S}}{\mathsf {C}}_{1}$ proves the following:

1. $( \forall x \in {\mathcal {S}}^{*} ) ( \exists y \in {\mathcal {U}}^{*} ) R ( x, y )$.
2. $( \forall x, y \in {\mathcal {S}}^{*} \cup {\mathcal {U}}^{*} ) \bigl ( R ( x, y ) \rightarrow x \in {\mathcal {S}}^{*} \wedge y \in {\mathcal {U}}^{*} \bigr )$.
3. $( \forall x_{0}, x_{1} \in {\mathcal {S}}^{*} ) ( \forall y \in {\mathcal {U}}^{*} ) \bigl ( R ( x_{0}, y ) \wedge R ( x_{1}, y ) \rightarrow x_{0} =^{*} x_{1} \bigr )$.

Proof

1. Take any $x = \langle a, 1 \rangle \in {\mathcal {S}}^{*}$. Recall that $P^{+}_{=}$ (i.e., the $*$-translation of $=$) satisfies the axioms of equality for all elements of ${{\mathcal {S}}^{*}} \cup {{\mathcal {U}}^{*}}$. Hence, $P^{+}_{=} ( x, x )$ holds and the class $X := \{ d \mid d \in M \wedge P^{+}_{=} ( x, \langle d, 1 \rangle ) \}$ is non-empty. Take a $\prec _{M}$-minimal element b of X and put $y = \langle b, 0 \rangle $. For every c, if $b \npreceq _{M} c$, then $c \prec _{M} b$ and $c \in M$ by $b \in M$ and Fact 5.2.1, which implies $\lnot P^{+}_{=} ( x, \langle c, 1 \rangle )$ by the minimality of b and thus $P^{-}_{=} ( x, \langle c, 1 \rangle )$ by (15). Hence, we have R(x, y).

2. If $ R ( x, y )$ for $x, y \in {\mathcal {S}}^{*} \cup {\mathcal {U}}^{*}$, then $y = \langle b, 0 \rangle \in {\mathcal {U}}^{*}$ for some $b \in M$ and $P^{+}_{=} ( x, \langle b, 1 \rangle )$, which implies $x \in {\mathcal {S}}^{*}$, since $P^{+}_{=}$ satisfies the axioms of equality.

3. Let $ R ( x_{0}, y )$ and $ R ( x_{1}, y )$ for some $y = \langle b, 0 \rangle \in {\mathcal {U}}^{*}$ and $b \in M$. Then, $\langle b, 1 \rangle \in {\mathcal {S}}^{*}$ and thus $P^{+}_{=} ( x_{0}, x_{1} )$, since $P^{+}_{=}$ satisfies the axioms of equality. $\square $
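The idea behind $ R $, namely choosing a least code within each $=^{*}$-class, can be imitated in a finite toy model. In the sketch below the codes are integers ordered by `<`, standing in for $( M, \prec _{M} )$, and `eq` is an arbitrary equivalence relation standing in for $=^{*}$; both are assumptions made for illustration. Since `<` is a linear order, each class has exactly one least code here, which the general setting does not require.

```python
def least_code_relation(codes, eq):
    """R(a, b): eq(a, b) holds and b is the least code equivalent to a.

    Mirrors the shape of the definition of R: a positive equality fact
    for b, together with "b <= c or a is not equivalent to c" for every c.
    """
    return {
        (a, b)
        for a in codes
        for b in codes
        if eq(a, b) and all(b <= c or not eq(a, c) for c in codes)
    }

codes = list(range(6))
eq = lambda a, b: a % 3 == b % 3      # three equivalence classes
R = least_code_relation(codes, eq)
```

With these choices every code is related to the least member of its class, and two codes related to the same least member are equivalent, which corresponds to items 1 and 3 of Proposition 5.14.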

Proposition 5.15

${\mathsf {S}}{\mathsf {C}}_{1}$ proves the following:

$$\begin{aligned} ( \forall x, y \in {\mathcal {S}}^{*} \cup {\mathcal {U}}^{*} ) \bigl ( \lnot R ( x, y ) \leftrightarrow {\check{R}} ( x, y ) \bigr ). \end{aligned}$$

Proof

Take any $x, y \in {\mathcal {S}}^{*} \cup {\mathcal {U}}^{*}$. First suppose $\check{ R } ( x, y )$. If $x \in {\mathcal {U}}^{*}$, then $P^{+}_{=} ( x, \langle b, 1 \rangle )$ for no $b \in M$; if $y \not \in {\mathcal {U}}^{*}$, then $y = \langle b, 0 \rangle $ for no b. Assume otherwise. Then, $x = \langle a, 1 \rangle $ and $y = \langle b, 0 \rangle $ for some $a \in M$ and b. If $b \not \in M$, then we trivially have $y \ne \langle d, 0 \rangle $ for any $d \in M$. Let $b \in M$. If $P^{-}_{=} ( x, \langle b, 1 \rangle )$, then $\lnot P^{+}_{=} ( x, \langle b, 1 \rangle )$ by (15). Finally, if there is some $c \prec _{M} b$ with $P^{+}_{=} ( x, \langle c, 1 \rangle )$, then $c \in M \wedge b \npreceq _{M} c$ by Fact 5.2.1 and $\lnot P^{-}_{=} ( x, \langle c, 1 \rangle )$ by (15).

Conversely, suppose $\lnot R ( x, y )$ and $x \not \in {\mathcal {U}}^{*}$ and $y \in {\mathcal {U}}^{*}$. Then, $x = \langle a, 1 \rangle $ and $y = \langle b, 0 \rangle $ for some $a \in M$ and b. If $b \not \in M$, then $a \prec _{M} b$ by $a \in M$ and Fact 5.2.1, and $P^{+}_{=} ( x, \langle a, 1 \rangle )$ because $P^{+}_{=}$ satisfies the axioms of equality. Now, assume $b \in M$. If $\lnot P^{+}_{=} ( x, \langle b, 1 \rangle )$, then we have $P^{-}_{=} ( x, \langle b, 1 \rangle )$ by (15). Otherwise, there is some c such that $b \npreceq _{M} c$ and $\lnot P^{-}_{=} ( x, \langle c, 1 \rangle )$. Then, we have $c \prec _{M} b$ and $c \in M$ by $b \in M$ and Fact 5.2.1, and $P^{+}_{=} ( x, \langle c, 1 \rangle )$ by (15). $\square $

The next is the main result of this sub-subsection.

Lemma 5.16

The extended translation $*$ is an interpretation of ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}_{p}$ in ${\mathsf {S}}{\mathsf {C}}_{1}$.

Proof

We work within ${\mathsf {S}}{\mathsf {C}}_{1}$. It immediately follows from Proposition 5.14 that ${\mathsf {S}}{\mathsf {C}}_{1} \vdash ( \mathrm {Prj} )^{*}$. We next show that for each $\Delta _{0}^{p}$-formula $\varphi ( \vec {x} )$ of ${\mathcal {L}}_{\mathrm {KP}} ( Pr )$, there are inductive relations $P^{+}_{\varphi } ( \vec {x} )$ and $P^{-}_{\varphi } ( \vec {x} )$ such that

$$\begin{aligned} ( \forall \vec {x} \in {\mathcal {S}}^{*} \! \cup {\mathcal {U}}^{*} ) \Bigl ( \bigl ( \varphi ^{*} ( \vec {x} ) \leftrightarrow P^{+}_{\varphi } ( \vec {x} ) \bigr ) \wedge \bigl ( \lnot \varphi ^{*} ( \vec {x} ) \leftrightarrow P^{-}_{\varphi } ( \vec {x} ) \bigr ) \Bigr ); \end{aligned}$$

(24)

this is shown by induction on $\varphi $ in a parallel manner to [8, Lemma 7.12], in which we use Proposition 5.15 for the additional base step where $\varphi $ is an atomic formula of the form $ Pr ( t_{0}, t_{1} )$. Then, using (24), we can show in a parallel manner to [8, Lemmata 7.14 and 7.15] that the $*$-translations of $( \Delta _{0}^{p} \text {-} \mathrm {Sep}_{1} )$ and $( \Delta _{0}^{p} \text {-} \mathrm {Coll}_{1} )$ are provable in ${\mathsf {S}}{\mathsf {C}}_{1}$. Finally, since $=^{*}$ (i.e., $P^{+}_{=}$) is an equivalence relation, it is obvious from the definition of $ R $ that $=^{*}$ satisfies the axioms of equality with respect to $ Pr ^{*}$, namely,

$$\begin{aligned} ( \forall x, y, z \in {\mathcal {S}}^{*} \cup \, {\mathcal {U}}^{*} ) \Bigl ( x =^{*} \! y \rightarrow \Bigl ( \bigl ( Pr ^{*} ( x, z ) \leftrightarrow Pr ^{*} ( y, z ) \bigr ) \wedge \bigl ( Pr ^{*} ( z, x ) \leftrightarrow Pr ^{*} ( z, y ) \bigr ) \Bigr ) \Bigr ). \end{aligned}$$

The $*$-translation of the remaining axioms of ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}_{p}$ can be verified in exactly the same manner as in [8]. $\square $

Combined with Theorem 5.6, we obtain the following.

Theorem 5.17

${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}_{p} + {\mathcal {S}}\! = \! L ^{{\mathcal {S}}} ({\mathsf {V}} )$ has the same ${\mathcal {L}}_{\in }$-theorems as ${\mathsf {S}}{\mathsf {C}}_{1}$ with respect to the canonical translation $\star $ of ${\mathcal {L}}_{\in }$ in ${\mathcal {L}}_{\mathrm {KP}}$.

Remark 5.18

We have shown in Sect. 5.1.2 that $*$ is an interpretation of ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ plus the assertion that every ${\mathcal {S}}$-set is the Mostowski collapse of some hyperelementary suitable tree. Under this extra assumption, $ Pr $ becomes definable within ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ by a predicate stating that x is the Mostowski collapse of some suitable tree $( H^{\star } )_{y}$ for some ${\mathcal {U}}$-set $y \in M^{\star }$; this definition of $ Pr $ involves $\Sigma $-notions such as $M^{\star }$ and $H^{\star }$, but they can be shown to be $\Delta $ by the argument presented in Remark 5.13 (or by $\Pi _{\infty }^{{\mathcal {S}}} \text {-} \mathrm {TI}^{\star }$ if we additionally postulate it).

Remark 5.19

${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}_{p}$ proves the class-theoretic counterpart of Simpson’s axiom of countability [25, Ch. VII. 3], which asserts that every ${\mathcal {S}}$-set can be injectively mapped into $\mathsf {V}$, by assigning the ${\mathcal {U}}$-set $\langle v, 0^{{\mathcal {U}}} \rangle ^{{\mathcal {U}}}$ to each ${\mathcal {U}}$-set member v of an ${\mathcal {S}}$-set y and the following ${\mathcal {U}}$-set to each ${\mathcal {S}}$-set member x of y:

$$\begin{aligned} \bigl \{ \langle u, 1^{{\mathcal {U}}} \rangle ^{{\mathcal {U}}} \in _{1} \mathsf {V} \mid Pr ( x, u ) \wedge ( \forall w \in _{1} \mathsf {V} ) \bigl ( Pr ( x, w ) \rightarrow {\mathsf {r}}{\mathsf {k}}^{{\mathcal {U}}} ( u ) \le {\mathsf {r}}{\mathsf {k}}^{{\mathcal {U}}} ( w ) \bigr ) \bigr \}; \end{aligned}$$

note that the domain of the injection need not be a transitive hull of the ${\mathcal {S}}$-set y because the existence of a transitive hull is provable for every ${\mathcal {S}}$-set in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$. It follows from this and Lemma 5.11 that ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}_{p}$ proves the axiom Beta (suitably modified for ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}$) unrestrictedly. Combined with Fact 5.5, it follows that ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ plus the axiom Beta has the same ${\mathcal {L}}_{\in }$-theorems as ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$. This makes an interesting dissimilarity between Kripke–Platek systems over ${\mathbb {V}}$ and over ${\mathbb {N}}$, since the axiom Beta makes $\mathsf {KPN}$ or ${\mathsf {K}}{\mathsf {P}}\omega $ a strictly stronger system (see [20] for example).

Remark 5.20

${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ plus the aforementioned class-theoretic counterpart of the axiom of countability directly interprets $\Sigma ^{1}_{1} \text {-} \mathsf {Coll}$, and this interpretation requires no instances of $\Sigma _{\infty }^{{\mathcal {S}}} \text {-} \mathrm {Found}_{1}$; cf. [8, Theorem 9.1]. This makes another dissimilarity between Kripke–Platek systems over ${\mathbb {V}}$ and over ${\mathbb {N}}$, since the strength of the Kripke–Platek system $\mathsf {KPu}$ over ${\mathbb {N}}$ (either with or without the axiom of countability) essentially relies on the full axiom of foundation; restricting the axiom of foundation to any fixed complexity results in a weaker theory; cf. [12, 21].

5.2 Interpretation of $\Sigma ^{1}_{1} \text {-} \mathsf {DColl}$ in ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}_{p} + {\mathcal {S}}\! = \! L ^{{\mathcal {S}}} ({\mathsf {V}} )$

In this subsection, we will show that ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}_{p} + {\mathcal {S}}\! = \! L ^{{\mathcal {S}}} ({\mathsf {V}} )\vdash ( \Sigma ^{1}_{1} \text {-} \mathrm {DColl} )^{\star }$, which entails that ${\mathsf {S}}{\mathsf {C}}_{1}$ and $\Sigma ^{1}_{1} \text {-} \mathsf {DColl}$ have the same ${\mathcal {L}}_{\in }$-theorems due to Theorem 5.17 and the aforementioned fact that $\star $ is an interpretation of $\mathsf {ECA}$ in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$.

We will first show a variant of the $\Sigma $-recursion theorem. It is observed that all the basic principles proved in [3, Ch.I.4], such as $\Delta $-Separation and $\Sigma $-Replacement, can be proved in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{\min }_{p}$ (in suitably modified forms where $\Delta $ and $\Sigma $ are replaced by $\Delta ^{p}$ and $\Sigma ^{p}$) in exactly the same way. For each ${\mathcal {U}}$-class X (such as $On^{{\mathcal {U}}}$), let us define ${\mathcal {S}} ( X ) := \{ x \in _{1} \mathsf {V} \mid x \in X \}$ ($\in {\mathcal {S}}$), which exists as an ${\mathcal {S}}$-set by $\Delta _{0}^{p} \text {-} \mathrm {Sep}$, since ${\mathcal {U}}$-classes are defined by formulae with all the quantifiers restricted to ${\mathcal {U}}$ ($= \mathsf {V}$). Then, we have the following useful lemma.

Lemma 5.21

$(\Sigma ^{p}$-recursion on $On^{{\mathcal {U}}})$ ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{\min }_{p} + \Sigma _{1}^{p} \text {-} \mathrm {Found}_{0}^{+}$ proves the following. Let $\Phi ( x, y, z )$ (possibly with parameters) define a $\Sigma ^{p}$-function: that is, $\Phi $ is a $\Sigma ^{p}$-formula and $\forall x \forall y \exists ! z \Phi ( x, y, z )$. For readability, let us write $\mathsf {G} ( x, y ) = z$ for $\Phi ( x, y, z )$. Then, there exists an ${\mathcal {S}}$-set function f with domain ${\mathcal {S}} ( On^{{\mathcal {U}}} )$ such that

$$\begin{aligned} ( \forall \alpha \in _{1} {\mathcal {S}} ( On^{{\mathcal {U}}} ) ) \bigl ( \, f ( \alpha ) = \mathsf {G} ( \alpha , f \! \upharpoonright _{{\mathcal {S}} ( \alpha )} ) \, \bigr ); \end{aligned}$$

(25)

where $f \! \upharpoonright _{{\mathcal {S}} ( \alpha )}$ is the restriction of f to ${\mathcal {S}} ( \alpha )$; recall that ${\mathcal {S}} ( \alpha ) = \{ x \in _{1} \mathsf {V} \mid x \in _{0} \alpha \}$.

Proof

The proof is parallel to the standard proof of the $\Sigma $-recursion theorem; e.g., [3, Theorem I.6.4]. The goal is to show that, for all ${\mathcal {U}}$-ordinals $\alpha $, there uniquely exist $f_{\alpha } \in {\mathcal {S}}$ and $z_{\alpha } \in {\mathcal {S}} \cup {\mathcal {U}}$ such that the following $ A $ holds:

$$\begin{aligned} A ( \alpha , f_{\alpha }, z_{\alpha } ) \ :\Leftrightarrow \&f_{\alpha } \text { is an } {\mathcal {S}} \text {-function with domain } {\mathcal {S}} ( \alpha ) \\&\wedge \ \forall \beta \in _{1} {\mathcal {S}} ( \alpha ) \bigl ( f_{\alpha } ( \beta ) = \mathsf {G} ( \beta , \, f_{\alpha } \! \upharpoonright _{{\mathcal {S}} ( \beta )} ) \bigr ) \ \wedge \ z_{\alpha } = \mathsf {G} ( \alpha , f_{\alpha } ). \end{aligned}$$

We will first show the following:

$$\begin{aligned}&\forall g \forall h \forall u \forall w ( \forall \alpha \in On^{{\mathcal {U}}} ) \bigl ( \bigl ( A ( \alpha , g, u ) \wedge A ( \alpha , h, w ) \bigr ) \rightarrow ( g = h \wedge u = w ) \bigr ). \end{aligned}$$

(26)

Take any g, h, u, and w, and suppose $ A ( \alpha , g, u ) \wedge A ( \alpha , h, w )$. If $g \ne h$, then we can pick the least $\beta <^{{\mathcal {U}}} \alpha $ such that $g ( \beta ) \ne h ( \beta )$ by $\Sigma _{1}^{p} \text {-} \mathrm {Found}_{0}^{+}$ (applied to the $\Delta _{0}^{p}$-formula $g ( v ) \ne h ( v )$), but we then have $g \! \upharpoonright _{{\mathcal {S}} ( \beta )} = h \! \upharpoonright _{{\mathcal {S}} ( \beta )}$ and thus $g ( \beta ) = \mathsf {G} ( \beta , \, g \! \upharpoonright _{{\mathcal {S}} ( \beta )} ) = \mathsf {G} ( \beta , \, h \! \upharpoonright _{{\mathcal {S}} ( \beta )} ) = h ( \beta )$, which is a contradiction. Hence, $g = h$, and thus $u = \mathsf {G} ( \alpha , \, g ) = \mathsf {G} ( \alpha , \, h ) = w$.

We will next show the following by induction on $\alpha $ (applied to a $\Sigma _{1}^{p}$-formula):

$$\begin{aligned}&\forall \alpha \in On^{{\mathcal {U}}} \exists f_{\alpha } \exists z_{\alpha } A ( \alpha , f_{\alpha }, z_{\alpha } ). \end{aligned}$$

(27)

Suppose (27) for all $\beta <^{{\mathcal {U}}} \alpha $. We have $( \forall \beta \in _{0} \alpha ) \exists ! f_{\beta } \exists ! z_{\beta } A ( \beta , f_{\beta }, z_{\beta } )$ by (26). Hence, we have

$$\begin{aligned} ( \forall \beta \in _{1} {\mathcal {S}} ( \alpha ) ) \exists ! z_{\beta } \exists f_{\beta } A ( \beta , f_{\beta }, z_{\beta } ). \end{aligned}$$

Since ${\mathcal {S}} ( \alpha )$ is an ${\mathcal {S}}$-set, we can apply $\Sigma ^{p}$-Replacement, which is provable in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{\min }_{p}$, and thus obtain an ${\mathcal {S}}$-set function $f_{\alpha }$ with domain ${\mathcal {S}} ( \alpha )$ such that

$$\begin{aligned} ( \forall \beta \in _{1} {\mathcal {S}} ( \alpha ) ) \exists f_{\beta } A ( \beta , f_{\beta }, f_{\alpha } ( \beta ) ). \end{aligned}$$

Then, by (26), we have $( \forall \beta \in _{1} {\mathcal {S}} ( \alpha ) ) A ( \beta , {f_{\alpha } \! \! \upharpoonright _{{\mathcal {S}} ( \beta )}}, f_{\alpha } ( \beta ) )$. Finally, we simply set $z_{\alpha } := \mathsf {G} ( \alpha , f_{\alpha } )$, from which we obtain $ A ( \alpha , f_{\alpha }, z_{\alpha } )$.

By (26) and (27), we have now established that

$$\begin{aligned} ( \forall \alpha \in _{1} {\mathcal {S}} ( On^{{\mathcal {U}}} ) ) \exists ! z_{\alpha } \exists f_{\alpha } A ( \alpha , f_{\alpha }, z_{\alpha } ). \end{aligned}$$

Since ${\mathcal {S}} ( On^{{\mathcal {U}}} )$ is an ${\mathcal {S}}$-set, again by $\Sigma ^{p}$-Replacement, we take the unique ${\mathcal {S}}$-function f with domain ${\mathcal {S}} ( On^{{\mathcal {U}}} )$ such that $( \forall \alpha \in _{1} {\mathcal {S}} ( On^{{\mathcal {U}}} ) ) \exists f_{\alpha } A ( \alpha , f_{\alpha }, f ( \alpha ) )$. Hence, again by (26), we have $( \forall \alpha \in _{1} {\mathcal {S}} ( On^{{\mathcal {U}}} ) ) A ( \alpha , \, f \! \upharpoonright _{{\mathcal {S}} ( \alpha )}, \, f ( \alpha ) )$, and thus f satisfies (25). $\square $
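The stage-by-stage construction in this proof has a simple finite analogue, sketched below for the natural numbers below a fixed bound in place of $On^{{\mathcal {U}}}$. The function name `recursion_on_ordinals` and the sample `G` are hypothetical; the limit case and the appeals to $\Sigma ^{p}$-Replacement and $\Sigma _{1}^{p} \text {-} \mathrm {Found}_{0}^{+}$ have no counterpart in this finite setting.

```python
def recursion_on_ordinals(bound, G):
    """Build f on {0, ..., bound - 1} with f(a) = G(a, f restricted to a).

    Mirrors the proof: stage a assembles f_a from the earlier values
    and then sets z_a = G(a, f_a), which becomes f(a).
    """
    f = {}
    for a in range(bound):
        f_a = {b: f[b] for b in range(a)}   # f restricted to a, domain {0..a-1}
        f[a] = G(a, f_a)                     # z_a
    return f

# Sample G: one more than the sum of all earlier values.
G = lambda a, g: 1 + sum(g.values())
f = recursion_on_ordinals(6, G)
```

For this choice of `G` the resulting function is $f ( a ) = 2^{a}$, and it satisfies the finite form of (25) at every argument below the bound.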

Now, we are ready to prove our main claim.

Lemma 5.22

${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}_{p} + {\mathcal {S}}\! = \! L ^{{\mathcal {S}}} ({\mathsf {V}} )\vdash ( \Pi ^{1}_{0} \text {-} \mathrm {DColl} )^{\star }$.^{Footnote 24}

Proof

We work within ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}_{p} + {\mathcal {S}}\! = \! L ^{{\mathcal {S}}} ({\mathsf {V}} )$. In this proof (and only in this proof), we use capital Roman letters X, Y, Z, ..., to designate ${\mathcal {S}}$-subsets of $\mathsf {V}$ (viz., $\star $-translations of classes) for readability. We first note that $( X )_{x}$ and $( X )^{x}$ are interpreted by $\star $ as $\{ y \in _{1} \mathsf {V} \mid \langle y, x \rangle ^{{\mathcal {U}}} \in _{1} X \}$ and $\{ \langle y, z \rangle ^{{\mathcal {U}}} \in _{1} X \mid {\mathsf {r}}{\mathsf {k}}^{{\mathcal {U}}} ( z ) < {\mathsf {r}}{\mathsf {k}}^{{\mathcal {U}}} ( x ) \}$, respectively, both of which exist as ${\mathcal {S}}$-sets.

Take any $\Pi ^{1}_{0}$-formula $\Phi ( u, X, Y )$, and let us write $\Psi $ for $\Phi ^{\star }$, which is a $\Delta _{0}^{{\mathcal {S}}}$-formula. Suppose $( \forall u \in _{1} \mathsf {V} ) \forall X \exists Y \Psi ( u, X, Y )$. For each $\alpha \in On^{{\mathcal {U}}}$, we will define an ${\mathcal {S}}$-ordinal $\sigma _{\alpha }$, ${\mathcal {U}}$-ordinals $\upsilon _{\alpha } \ge \alpha $ and $\mu _{\alpha }$, and an ${\mathcal {S}}$-set $X_{\alpha } \subset ^{{\mathcal {S}}} \mathsf {V} \times ^{{\mathcal {U}}} V_{\upsilon _{\alpha }}^{{\mathcal {U}}}$ ($= \{ \langle y, u \rangle ^{{\mathcal {U}}} \! \in _{1} \mathsf {V} \mid u \in _{0} V_{\upsilon _{\alpha }}^{{\mathcal {U}}} \}$), by $\Sigma ^{p}$-recursion along $On^{{\mathcal {U}}}$ (Lemma 5.21) in the way that we will describe below.

For the base step, when $\alpha = 0^{{\mathcal {U}}}$, then we set

$$\begin{aligned}&\sigma _{\alpha } := 0^{{\mathcal {S}}}, \quad \quad \mu _{\alpha } := 0^{{\mathcal {U}}}, \quad \quad \upsilon _{\alpha } := 0^{{\mathcal {U}}},&\text {and} \quad \quad X_{\alpha } := \emptyset ^{{\mathcal {S}}}. \end{aligned}$$

When $\alpha $ is a limit ${\mathcal {U}}$-ordinal, then we set

$$\begin{aligned}&\sigma _{\alpha } := 0^{{\mathcal {S}}}, \quad \quad \mu _{\alpha } := 0^{{\mathcal {U}}}, \quad \quad \upsilon _{\alpha } := {\sup _{\beta< \alpha }}^{{\mathcal {U}}} \, \upsilon _{\beta }, \quad \quad \text {and} \quad \quad X_{\alpha } := {\bigcup _{\beta < \alpha }}^{{\mathcal {S}}} \, X_{\beta }; \end{aligned}$$

in fact, $\sigma _{\alpha }$ and $\mu _{\alpha }$ can be arbitrary for a limit $\alpha $.^{Footnote 25} Now, assume that $\alpha $ is a successor ${\mathcal {U}}$-ordinal and $\alpha = \beta + 1$ for some ${\mathcal {U}}$-ordinal $\beta $. Let $ LH ( f, \zeta )$ be an ${\mathcal {L}}_{\mathrm {KP}}$-formula expressing that $\zeta $ is an ${\mathcal {S}}$-ordinal and f is an ${\mathcal {S}}$-set function with domain $\zeta + 1$ such that $f ( \eta ) = L_{\eta }^{{\mathcal {S}}} ( \mathsf {V} )$ for each ${\mathcal {S}}$-ordinal $\eta \le \zeta $; such $ LH ( f, \zeta )$ can be taken as $\Delta ^{{\mathcal {S}}}$ (in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{\min }+ \Sigma _{1}^{{\mathcal {S}}} \text {-} \mathrm {Found}$). Then, firstly, we define $\sigma _{\alpha }$ as the least ${\mathcal {S}}$-ordinal $\xi $ such that

$$\begin{aligned}&( \forall u \in _{0} ( V_{\alpha } {\setminus } V_{\beta } )^{{\mathcal {U}}} ) \exists f \bigl ( LH ( f, \xi ) \wedge ( \exists Y \in _{1} f ( \xi ) ) \Psi ( u, ( X_{\beta } )^{u}, Y ) \bigr ), \nonumber \\&\ \text {namely}, \ ( \forall u \in _{0} ( V_{\alpha } {\setminus } V_{\beta } )^{{\mathcal {U}}} ) ( \exists Y \in _{1} L_{\xi }^{{\mathcal {S}}} ( \mathsf {V} ) ) \Psi ( u, ( X_{\beta } )^{u}, Y ). \end{aligned}$$

(28)

Such $\xi $ exists for the following reason. It follows from the supposition $( \forall u \in _{1} \mathsf {V} ) \forall X \exists Y \Psi ( u, X, Y )$ and the postulate ${\mathcal {S}}\! = \! L ^{{\mathcal {S}}} ({\mathsf {V}} )$ that for each $u \in _{1} {\mathcal {S}} ( ( V_{\alpha } {\setminus } V_{\beta } )^{{\mathcal {U}}} )$ there is some ${\mathcal {S}}$-ordinal $\zeta $ such that $( \exists Y \in _{1} L_{\zeta }^{{\mathcal {S}}} ( \mathsf {V} ) ) \Psi ( u, ( X_{\beta } )^{u}, Y )$, which is a $\Sigma ^{p}$-formula. Thus, it follows by $\Sigma ^{p}$-Collection (for ${\mathcal {S}}$-sets), which is derivable in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{\min }_{p}$ (in the same way as [3, Theorem I.4.4]), that there is an ${\mathcal {S}}$-ordinal $\xi $ that satisfies (28). Then, taking any such $\xi $ and f with $ LH ( f, \xi )$, we can pick $\sigma _{\alpha }$ as the least ${\mathcal {S}}$-ordinal $\eta \le \xi $ such that $( \forall u \in _{0} ( V_{\alpha } {\setminus } V_{\beta } )^{{\mathcal {U}}} ) ( \exists Y \in _{1} f ( \eta ) ) \Psi ( u, ( X_{\beta } )^{u}, Y )$ by $\Delta _{0}^{p} \text {-} \mathrm {Found}_{1}$ using $\xi $ and f (as well as $\alpha $ and $X_{\beta }$) as parameters. Hence, $\sigma _{\alpha }$ can be $\Sigma ^{p}$-defined with the parameters $\alpha $ and $X_{\beta }$ as the unique ordinal $\xi $ such that

$$\begin{aligned} \exists f \bigl ( LH ( f, \xi )&\, \wedge \, ( \forall u \in _{0} ( V_{\alpha } {\setminus } V_{\beta } )^{{\mathcal {U}}} ) ( \exists Y \in _{1} f ( \xi ) ) \, \Psi ( u, ( X_{\beta } )^{u}, Y ) \\&\, \wedge \, ( \forall \eta < \xi ) ( \exists u \in _{0} ( V_{\alpha } {\setminus } V_{\beta } )^{{\mathcal {U}}} ) ( \forall Y \in _{1} f ( \eta ) ) \lnot \Psi ( u, ( X_{\beta } )^{u}, Y ) \bigr ) . \end{aligned}$$

Secondly, we define $\mu _{\alpha }$ as the least ${\mathcal {U}}$-ordinal $\nu $ such that

$$\begin{aligned} ( \forall u \in _{0} ( V_{\alpha } {\setminus } V_{\beta } )^{{\mathcal {U}}} ) ( \exists v \in _{0} V_{\nu }^{{\mathcal {U}}} ) ( \exists Y \in _{1} L_{\sigma _{\alpha }}^{{\mathcal {S}}} ( \mathsf {V} ) ) \bigl ( Pr ( Y, v ) \wedge \Psi ( u, ( X_{\beta } )^{u}, Y ) \bigr ). \end{aligned}$$

(29)

Since we can assume that an ${\mathcal {S}}$-function $f_{\alpha }$ with $ LH ( f_{\alpha }, \sigma _{\alpha } )$ has already been given in defining $\sigma _{\alpha }$, the formula (29) can be taken as $\Delta _{0}^{p}$ (using $f_{\alpha }$ as a parameter). Hence, $\mu _{\alpha }$ exists by the definition of $\sigma _{\alpha }$ and the axiom $( \mathrm {Prj} )$, with the help of $\Delta _{0}^{p} \text {-} \mathrm {Repl}_{0}^{+}$ and $\Delta _{0}^{p} \text {-} \mathrm {Found}_{0}^{+}$, and is $\Sigma ^{p}$-definable (actually, $\Delta _{0}^{p}$-definable) with the parameters $\alpha $, $\sigma _{\alpha }$, $f_{\alpha }$, and $X_{\beta }$. Thirdly, we define $\upsilon _{\alpha }$ as follows:

$$\begin{aligned} \upsilon _{\alpha } := {\sup }^{{\mathcal {U}}} \bigl \{ {\mathsf {r}}{\mathsf {k}}^{{\mathcal {U}}} \bigl ( \langle \upsilon _{\beta }, v \rangle ^{{\mathcal {U}}} \bigr ) + 1 \mid v \in _{0} V_{\mu _{\alpha }}^{{\mathcal {U}}} \bigr \}^{{\mathcal {U}}}; \end{aligned}$$

note that $\upsilon _{\alpha }$ is $\Delta _{0}^{p}$-definable with the parameters $\upsilon _{\beta }$ and $\mu _{\alpha }$; obviously, we have $\upsilon _{\alpha } > \upsilon _{\beta }$ and $\upsilon _{\alpha } \ge \alpha $. Finally, we define $X_{\alpha }$ as the following ${\mathcal {S}}$-set:

$$\begin{aligned}&\bigl \{ \langle y, w \rangle ^{{\mathcal {U}}} \! \in _{1} X_{\beta } \mid w \in _{0} \, V_{\upsilon _{\beta }}^{{\mathcal {U}}} \bigr \} \\&\cup \Bigl \{ \langle y, \langle \upsilon _{\beta }, v \rangle \rangle ^{{\mathcal {U}}} \! \in _{1} \mathsf {V} \mid v \in _{0} V_{\mu _{\alpha }}^{{\mathcal {U}}} \, \wedge \, ( \exists u \in _{0} ( V_{\alpha } {\setminus } V_{\beta } )^{{\mathcal {U}}} ) ( \exists Y \in _{1} L_{\sigma _{\alpha }}^{{\mathcal {S}}} ( \mathsf {V} ) ) \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \ \ \ \Bigl ( Pr ( Y, v ) \wedge \Psi \bigl ( u, ( X_{\beta } )^{u}, Y \bigr ) \wedge y \in _{1} Y \Bigr ) \Bigr \}; \end{aligned}$$

note that $X_{\alpha }$ exists by $\Delta ^{p}$-Separation (for ${\mathcal {S}}$-sets) and thus is $\Sigma ^{p}$-definable; since $\langle \upsilon _{\beta }, v \rangle \not \in _{0} V_{\upsilon _{\beta }}^{{\mathcal {U}}}$ is always the case, the first and second sets are disjoint. Then, we observe that the following holds:

$$\begin{aligned} ( X_{\alpha } )_{w} = {\left\{ \begin{array}{ll} Pr ^{-1} ( v ) &{} \text {if}\,( \exists Y \in _{1} L_{\sigma _{\alpha }}^{{\mathcal {S}}} ( \mathsf {V} ) ) \bigl ( \Psi ( u, ( X_{\beta } )^{u}, Y ) \wedge Pr ( Y, v ) \bigr ) \\ &{} \text {and}\, w = \langle \upsilon _{\beta }, v \rangle \,\text {for some} \,v \in _{0} V_{\mu _{\alpha }}^{{\mathcal {U}}}\, \text {and}\, u \in _{0} ( V_{\alpha } {\setminus } V_{\beta } )^{{\mathcal {U}}} \\ ( X_{\beta } )_{w} &{} \text {if}\, w \in _{0} V_{\upsilon _{\beta }}^{{\mathcal {U}}} \\ \emptyset &{} \text {otherwise}; \end{array}\right. } \end{aligned}$$

hence, we have $( X_{\alpha } )^{\upsilon _{\beta }} = ( X_{\beta } )^{\upsilon _{\beta }}$, and thus $( X_{\beta } )^{u} = ( X_{\beta } )^{\beta } = ( X_{\alpha } )^{\beta } = ( X_{\alpha } )^{u}$ for all $u \in _{0} ( V_{\alpha } {\setminus } V_{\beta } )^{{\mathcal {U}}}$, since ${\mathsf {r}}{\mathsf {k}}^{{\mathcal {U}}} ( u ) = \beta \le \upsilon _{\beta }$, which implies

$$\begin{aligned} \bigl ( \forall u \in _{0} ( V_{\alpha } {\setminus } V_{\beta } )^{{\mathcal {U}}} \bigr ) \bigl ( \exists v \in _{0} ( V_{\upsilon _{\alpha }} {\setminus } V_{\upsilon _{\beta }} )^{{\mathcal {U}}} \bigr ) \, \Psi \bigl ( u, ( X_{\alpha } )^{u}, ( X_{\alpha } )_{v} \bigr ). \end{aligned}$$

(30)

We have completed the recursive definitions of $\sigma _{\alpha }$, $\mu _{\alpha }$, $\upsilon _{\alpha }$, and $X_{\alpha }$. It is easy to see that for each $u \in _{0} V_{\upsilon _{\alpha }}^{{\mathcal {U}}}$, we have $( X_{\alpha } )_{u} = ( X_{\gamma } )_{u}$ for all $\gamma > \alpha $.

Finally, we set $X := \bigcup _{\alpha \in On^{{\mathcal {U}}}} X_{\alpha }$; hence, we have

$$\begin{aligned} X = \bigl \{ \langle y, u \rangle ^{{\mathcal {U}}} \in _{1} \mathsf {V} \mid ( \exists \xi \in On^{{\mathcal {U}}} ) \bigl ( u \in _{0} V_{\upsilon _{\xi }}^{{\mathcal {U}}} \wedge y \in _{1} ( X_{\xi } )_{u} \bigr ) \bigr \}. \end{aligned}$$

We have to check $( \forall u \in _{1} \mathsf {V} ) ( \exists v \in _{1} \mathsf {V} ) \Psi ( u, ( X )^{u}, ( X )_{v} )$. Take any $u \in _{1} \mathsf {V}$ and let ${\mathsf {r}}{\mathsf {k}}^{{\mathcal {U}}} ( u ) = \alpha $. Then $u \in _{0} ( V_{\alpha + 1} {\setminus } V_{\alpha } )^{{\mathcal {U}}}$. By (30) we get $\Psi ( u, ( X_{\alpha + 1} )^{u}, ( X_{\alpha + 1} )_{v} )$ for some $v \in _{0} V_{\upsilon _{\alpha + 1}}^{{\mathcal {U}}}$. Finally, the claim obtains, since we generally have $( X )^{\upsilon _{\beta }} = ( X_{\beta } )^{\upsilon _{\beta }}$ and $\beta \le \upsilon _{\beta }$ for all ${\mathcal {U}}$-ordinals $\beta $. $\square $

By combining this with results from [7, 8, 23], we obtain the next theorem.

Theorem 5.23

The following systems all have the same ${\mathcal {L}}_{\in }$-theorems:

$$\begin{aligned}&\widehat{{\mathsf {I}}{\mathsf {D}}}_{1},&{\mathsf {I}}{\mathsf {D}}_{1},&{\mathsf {F}}{\mathsf {P}}_{0}^{-},&\mathsf {LFP}_{0}^{-},&{\mathsf {S}}{\mathsf {C}}_{1},&\Delta ^{1}_{1} \text {-} {\mathsf {C}}{\mathsf {A}},&\Sigma ^{1}_{1} \text {-} \mathsf {Coll},&\Pi ^{1}_{2} \text {-} \mathsf {RFN},&\text {and}&\Sigma ^{1}_{1} \text {-} \mathsf {DColl}.&\end{aligned}$$

They still have the same ${\mathcal {L}}_{\in }$-theorems even if we assume $\mathrm {AC}$ or $\mathrm {GC}$ (Remark 1.1).

Proof

For brevity, let us write $\mathsf {S} \subset _{{\mathcal {L}}_{\in }} \mathsf {T}$ to mean that all the ${\mathcal {L}}_{\in }$-theorems of a system $\mathsf {S}$ are provable in a system $\mathsf {T}$; we will write $\mathsf {S} =_{{\mathcal {L}}_{\in }} \mathsf {T}$ for $\mathsf {S} \subset _{{\mathcal {L}}_{\in }} \mathsf {T} \wedge \mathsf {T} \subset _{{\mathcal {L}}_{\in }} \mathsf {S}$. First, we have seen $\mathsf {LFP}_{0}^{-} =_{{\mathcal {L}}_{\in }} {\mathsf {F}}{\mathsf {P}}_{0}^{-} =_{{\mathcal {L}}_{\in }} \widehat{{\mathsf {I}}{\mathsf {D}}}_{1} =_{{\mathcal {L}}_{\in }} {\mathsf {I}}{\mathsf {D}}_{1}$ (Corollary 2.28) and $\widehat{{\mathsf {I}}{\mathsf {D}}}_{1} \subset _{{\mathcal {L}}_{\in }} \Sigma ^{1}_{1} \text {-} \mathsf {Coll}$ (by Lemma 2.26). Second, $\widehat{{\mathsf {I}}{\mathsf {D}}}_{1} =_{{\mathcal {L}}_{\in }} {\mathsf {S}}{\mathsf {C}}_{1}$ is due to Sato [23]. Third, $\Delta ^{1}_{1} \text {-} {\mathsf {C}}{\mathsf {A}} =_{{\mathcal {L}}_{\in }} \Sigma ^{1}_{1} \text {-} \mathsf {Coll}$ is due to [7, Theorem 80]. Fourth, we have shown $\Sigma ^{1}_{1} \text {-} \mathsf {Coll} \subset _{{\mathcal {L}}_{\in }} \Pi ^{1}_{2} \text {-} \mathsf {RFN} \subset _{{\mathcal {L}}_{\in }} \Sigma ^{1}_{1} \text {-} \mathsf {DColl}$ (Lemmata 4.3 and 4.9). Finally, we have just proven $\Sigma ^{1}_{1} \text {-} \mathsf {DColl} \subset _{{\mathcal {L}}_{\in }} {\mathsf {S}}{\mathsf {C}}_{1}$. $\square $
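Read together, the inclusions used in this proof close up into a cycle. The following display merely restates the steps above in one line, using the notation $\subset _{{\mathcal {L}}_{\in }}$ and $=_{{\mathcal {L}}_{\in }}$ introduced at the beginning of the proof:

$$\begin{aligned} \widehat{{\mathsf {I}}{\mathsf {D}}}_{1} \subset _{{\mathcal {L}}_{\in }} \Sigma ^{1}_{1} \text {-} \mathsf {Coll} \subset _{{\mathcal {L}}_{\in }} \Pi ^{1}_{2} \text {-} \mathsf {RFN} \subset _{{\mathcal {L}}_{\in }} \Sigma ^{1}_{1} \text {-} \mathsf {DColl} \subset _{{\mathcal {L}}_{\in }} {\mathsf {S}}{\mathsf {C}}_{1} =_{{\mathcal {L}}_{\in }} \widehat{{\mathsf {I}}{\mathsf {D}}}_{1}. \end{aligned}$$

Hence the five systems in this cycle have the same ${\mathcal {L}}_{\in }$-theorems, and the remaining systems join them via $\mathsf {LFP}_{0}^{-} =_{{\mathcal {L}}_{\in }} {\mathsf {F}}{\mathsf {P}}_{0}^{-} =_{{\mathcal {L}}_{\in }} \widehat{{\mathsf {I}}{\mathsf {D}}}_{1} =_{{\mathcal {L}}_{\in }} {\mathsf {I}}{\mathsf {D}}_{1}$ and $\Delta ^{1}_{1} \text {-} {\mathsf {C}}{\mathsf {A}} =_{{\mathcal {L}}_{\in }} \Sigma ^{1}_{1} \text {-} \mathsf {Coll}$.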

This and Corollary 4.5 give an alternative proof of the following result by Sato.

Corollary 5.24

(Sato [22]) ${\mathsf {F}}{\mathsf {P}}_{0}^{-}$ proves the consistency of $\mathsf {ETR}$.

5.3 A digression—an urelement-free formulation of ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}$

What ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}$ is to set theory is what $\mathsf {KPu}$ (see [11] for its definition) is to arithmetic,^{Footnote 26} and $\mathsf {KPu}$ has a urelement-free variant, namely, ${\mathsf {K}}{\mathsf {P}}\omega $, which theorizes the pure set part of $\mathsf {KPu}$. In this subsection, we will introduce and briefly discuss a urelement-free variant of ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}$, which will play an important role in the study of stronger systems of classes (and the author’s future work on this subject).

We will no longer consider the axiom of projectibility in what follows, and focus on ${\mathsf {K}}{\mathsf {P}}$-systems in the language ${\mathcal {L}}_{\mathrm {KP}}$.

It is observed that the axiom $( \mathrm {Prj} )$ plays no role in the proof of Lemma 5.21, and $\Sigma ^{{\mathcal {S}}}$-recursion on $On^{{\mathcal {U}}}$ is available in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{\min }+ \Sigma _{1}^{{\mathcal {S}}} \text {-} \mathrm {Found}_{0}^{+}$. Hence, in this system, we can define an ${\mathcal {S}}$-set function f with domain ${\mathcal {S}} ( On^{{\mathcal {U}}} )$ such that, for each $\alpha \in On^{{\mathcal {U}}}$, $f ( \alpha )$ is an ${\mathcal {S}}$-set function $g_{\alpha }$ with domain ${\mathcal {S}} ( V_{\alpha }^{{\mathcal {U}}} )$ so that

$$\begin{aligned} g_{\alpha + 1} ( u )&:= {\left\{ \begin{array}{ll} \{ g_{\alpha } ( w ) \mid w \in _{0} u \}^{{\mathcal {S}}} &{} \text {if } u \in _{0} ( V_{\alpha + 1} {\setminus } V_{\alpha } )^{{\mathcal {U}}} \\ g_{\alpha } ( u ) &{} \text {if }u \in _{0} V_{\alpha }^{{\mathcal {U}}}. \end{array}\right. } \\ g_{\lambda }&:= \bigcup _{\xi < \lambda } g_{\xi }, \quad \text {if } \lambda \text { is a limit ordinal}. \end{aligned}$$

Let $g := \bigcup _{\alpha } g_{\alpha }$, which is an ${\mathcal {S}}$-set function with domain ${\mathcal {S}} ( {\mathcal {U}} ) = \mathsf {V}$. We can show (by $\Sigma _{1}^{{\mathcal {S}}} \text {-} \mathrm {Found}_{0}^{+}$) that g is an injection. Let ${{\textsf {\textit{v}}}}$ denote the range of g. Then, g is an isomorphism between $\mathsf {V}$ and ${{\textsf {\textit{v}}}}$ in the sense that

$$\begin{aligned} ``g \text { is bijective''} \wedge ( \forall x, y \in _{1} \mathsf {V} ) \bigl ( x \in _{0} y \leftrightarrow g ( x ) \in _{1} g ( y ) \bigr ). \end{aligned}$$

(31)
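As an informal illustration of the map g and of property (31), the following Python sketch computes the analogous collapse on a small finite structure. It is only a toy model under simplifying assumptions: the dictionary `toy` (a hypothetical stand-in for $\langle {\mathcal {U}}, \in _{0} \rangle $) lists the $\in _{0}$-members of each node, Python `frozenset`s play the role of ${\mathcal {S}}$-sets, and the recursion runs directly along membership rather than stage by stage along $On^{{\mathcal {U}}}$ as in the text. The names `collapse` and `toy` are ours and do not come from the paper.

```python
from typing import Dict, FrozenSet, Hashable


def collapse(members: Dict[Hashable, FrozenSet[Hashable]]) -> Dict[Hashable, frozenset]:
    """Return g with g[u] = { g[w] : w is a member of u }.

    `members[u]` is the set of nodes w with w ∈_0 u. The relation is
    assumed to be well-founded, so the recursion terminates.
    """
    g: Dict[Hashable, frozenset] = {}

    def image(u: Hashable) -> frozenset:
        # Memoised recursion on membership.
        if u not in g:
            g[u] = frozenset(image(w) for w in members[u])
        return g[u]

    for u in members:
        image(u)
    return g


# A toy extensional, well-founded structure (hypothetical example data).
toy = {
    "a": frozenset(),            # no members: plays the role of the empty set
    "b": frozenset({"a"}),       # {a}
    "c": frozenset({"a", "b"}),  # {a, b}
    "d": frozenset({"b"}),       # {b}
}

g = collapse(toy)

# Finite analogue of (31): g is injective and membership is preserved.
is_injective = len(set(g.values())) == len(toy)
preserves_membership = all(
    (x in toy[y]) == (g[x] in g[y]) for x in toy for y in toy
)
```

In this toy model the range of `g` is a transitive collection of pure `frozenset`s, which mirrors the role of ${{\textsf {\textit{v}}}}$ in the text. Injectivity relies on the example structure being extensional.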

As before, for each ${\mathcal {L}}_{\in }$-formula $\varphi $, we define its relativization $\varphi ^{\langle {{\textsf {\textit{v}}}}, \in _{1} \rangle }$ as the result of replacing $\in $ by $\in _{1}$ and also replacing each quantifier $\forall x$ by $\forall x \in _{1} {{\textsf {\textit{v}}}}$. It follows from (31) that, for all ${\mathcal {L}}_{\in }$-formulae $\varphi ( x_{1}, \ldots , x_{k} )$, ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{\min }+ \Sigma _{1}^{{\mathcal {S}}} \text {-} \mathrm {Found}_{0}^{+}$ proves

$$\begin{aligned}&( \forall \vec {x} \in {\mathcal {U}} ) \bigl ( \varphi ^{\langle {\mathcal {U}}, \in _{0} \rangle } ( x_{1}, \ldots , x_{k} ) \leftrightarrow \varphi ^{\langle {{\textsf {\textit{v}}}}, \in _{1} \rangle } ( g ( x_{1} ), \ldots , g ( x_{k} ) ) \bigr ) \nonumber \\&\wedge ( \forall \vec {x} \in _{1} {{\textsf {\textit{v}}}}) \bigl ( \varphi ^{\langle {\mathcal {U}}, \in _{0} \rangle } ( g^{-1} ( x_{1} ), \ldots , g^{-1} ( x_{k} ) ) \leftrightarrow \varphi ^{\langle {{\textsf {\textit{v}}}}, \in _{1} \rangle } ( x_{1}, \ldots , x_{k} ) \bigr ). \end{aligned}$$

(32)

In particular, the system proves $\sigma ^{\langle {{\textsf {\textit{v}}}}, \in _{1} \rangle }$ for every axiom $\sigma $ of ${\mathsf {Z}}{\mathsf {F}}$. We will prove further useful properties of ${{\textsf {\textit{v}}}}$ under the assumptions of some extra axioms. For the sake of readability, let us define the ${\mathcal {L}}_{\mathrm {KP}}$-system ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{1}$ as follows:

$$\begin{aligned} {{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{1} \ := \ {{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{\min }+ \Sigma _{1}^{{\mathcal {S}}} \text {-} \mathrm {Found}_{1} + \Sigma _{1}^{{\mathcal {S}}} \text {-} \mathrm {Found}_{0}^{+} + \Delta _{0}^{{\mathcal {S}}} \text {-} \mathrm {Sep}_{0}^{+} + \Delta _{0}^{{\mathcal {S}}} \text {-} \mathrm {Repl}_{0}^{+}. \end{aligned}$$

Proposition 5.25

${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{1}$ proves that ${{\textsf {\textit{v}}}}$ is supertransitive, namely,

$$\begin{aligned} ( \forall x \in _{1} {{\textsf {\textit{v}}}}) ( \forall y \in _{1} x ) y \in _{1} {{\textsf {\textit{v}}}}\, \wedge \, ( \forall x \in _{1} {{\textsf {\textit{v}}}}) \forall y \in {\mathcal {S}} ( y \subset ^{{\mathcal {S}}} x \rightarrow y \in _{1} {{\textsf {\textit{v}}}}). \end{aligned}$$

Proof

The first conjunct (i.e., transitivity) is obvious from the definition of ${{\textsf {\textit{v}}}}$. For the second, let $x \in {{\textsf {\textit{v}}}}$ and $y \subset ^{{\mathcal {S}}} x$. By $\Delta _{0}^{{\mathcal {S}}} \text {-} \mathrm {Sep}_{0}^{+}$, we have

$$\begin{aligned} w := \{ u \in _{0} g^{-1} ( x ) \mid g ( u ) \in _{1} y \}^{{\mathcal {U}}} \ \in \ {\mathcal {U}}. \end{aligned}$$

We can easily check $g ( w ) = y$ (using the transitivity of ${{\textsf {\textit{v}}}}$). $\square $

Proposition 5.26

${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{1}$ proves that for every ${\mathcal {S}}$-set function f, if $\mathrm {dom}^{{\mathcal {S}}} ( f ) \in _{1} {{\textsf {\textit{v}}}}$ and $\mathrm {ran}^{{\mathcal {S}}} ( f ) \subset ^{{\mathcal {S}}} {{\textsf {\textit{v}}}}$, then $\mathrm {ran}^{{\mathcal {S}}} ( f ) \in {{\textsf {\textit{v}}}}$, where $\mathrm {dom} ( f )$ and $\mathrm {ran} ( f )$ are ${\mathcal {L}}_{\in }$-expressions denoting the domain and range of f respectively.

Proof

Take any $f \in {\mathcal {S}}$ with some domain $a \in _{1} {{\textsf {\textit{v}}}}$, and suppose $\mathrm {ran}^{{\mathcal {S}}} ( f ) \subset ^{{\mathcal {S}}} {{\textsf {\textit{v}}}}$. We have $f \subset ^{{\mathcal {S}}} {{\textsf {\textit{v}}}}$, since ${{\textsf {\textit{v}}}}$ is transitive and closed under pairing. Let us write

$$\begin{aligned} g^{-1} ( f ) := \{ g^{-1} ( z ) \in {\mathcal {U}} \mid z \in _{1} f \}^{{{\mathcal {S}}}} {\mathop { \ \ \ = \ \ \ }\limits ^{\text {by (32)}}} \{ \langle g^{-1} ( u ), g^{-1} ( v ) \rangle ^{{\mathcal {U}}} \in {\mathcal {U}} \mid \langle u, v \rangle ^{{\mathcal {S}}} \in _{1} f \}^{{{\mathcal {S}}}}, \end{aligned}$$

which exists as an ${\mathcal {S}}$-set by $\Delta _{0}^{{\mathcal {S}}} \text {-} \mathrm {Sep}_{1}$. Since g is bijective, we have

$$\begin{aligned} ( \forall u \in _{0} g^{-1} ( a ) ) ( \exists ! w \in {\mathcal {U}} ) ( \langle u, w \rangle ^{{\mathcal {U}}} \in _{1} g^{-1} ( f ) ). \end{aligned}$$

Finally, by $\Delta _{0}^{{\mathcal {S}}} \text {-} \mathrm {Sep}_{0}^{+}$ and $\Delta _{0}^{{\mathcal {S}}} \text {-} \mathrm {Repl}_{0}^{+}$, we obtain a $\mathcal {U}$-set b such that

$$\begin{aligned} b = \{ w \in \mathcal {U} \mid ( \exists u \in _{0} g^{-1} ( a ) ) ( \langle u, w \rangle ^{\mathcal {U}} \in _{1} g^{-1} ( f ) ) \} \end{aligned}$$

and thus $g ( b ) = \mathrm {ran}^{\mathcal {S}} ( f ) \in _{1} {{\textsf {\textit{v}}}}$ again by (32). $\square $

These observations suggest the following urelement-free formulation of ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}$.

Definition 5.27

Let ${\mathcal {L}}_{\in } ( V )$ be ${\mathcal {L}}_{\in }$ plus a new set constant $ V $, and let $\Delta _{0}^{ V }$, $\Sigma _{n}^{ V }$, and $\Pi _{n}^{ V }$ denote the collections of ${\mathcal {L}}_{\in } ( V )$-formulae in the Lévy hierarchy modified so that the constant $ V $ is allowed to appear in $\Delta _{0}^{ V }$. The ${\mathcal {L}}_{\in } ( V )$-system $ {{\mathsf {K}}}{{\mathsf {P}}}V ^{r}$, which corresponds to ${\mathsf {K}}{\mathsf {P}}\omega ^{r}$ over $\omega $ (see [20] for its definition), consists of the axioms of extensionality, pairing, union, $\Delta _{0}^{ V }$-separation, $\Delta _{0}^{ V }$-collection, and $\Delta _{0}^{ V }$-foundation, as well as the following axioms stating certain closure properties of $ V $:

$$\begin{aligned} ({\textit{V}1}) \quad&( \forall x \in V ) ( \exists y \in V ) \forall z \bigl ( z \in y \leftrightarrow z \subset x \bigr ). \\ ({\textit{V}2}) \quad&\forall f \bigl ( ( f \text { is a function} \wedge \mathrm {dom} ( f ) \in V \wedge \mathrm {ran} ( f ) \subset V ) \rightarrow \mathrm {ran} ( f ) \in V \bigr ). \\ ({\textit{V}3}) \quad&V \text { is a non-empty transitive model of the axioms of pairing, union}, \\&\text {and infinity, namely, the relativizations of these axioms to}~ V \text { are true}. \end{aligned}$$

The full system $ {{\mathsf {K}}}{{\mathsf {P}}}V $ is obtained by extending $\Delta _{0}^{ V }$-foundation to $\Sigma _{\infty }^{ V }$-foundation and strengthening the closure property of $ V $ by adding the following schemata.

$$\begin{aligned} {( \Sigma _{\infty }^{ V } \text {-} \mathrm {Sep}^{ V } ):}&( \forall a \in V ) ( \exists b \in V ) ( \forall z \in V ) \bigl ( z \in b \leftrightarrow z \in a \wedge \varphi ( z ) \bigr ) \\ {(\Sigma _{\infty }^{ V } \text {-} \mathrm {Repl}^{ V } ):}&( \forall a \in V ) \bigl [ ( \forall x \in a ) ( \exists ! y \in V ) \varphi \rightarrow ( \exists b \in V ) ( \forall x \in a ) ( \exists y \in b ) \varphi \bigr ], \end{aligned}$$

where $\varphi $ is an arbitrary $\Sigma _{\infty }^{ V }$-formula without b free.^{Footnote 27}

Lemma 5.28

For each axiom $\sigma $ of ${\mathsf {Z}}{\mathsf {F}}$, ${\mathsf {K}}{\mathsf {P}} V ^{r} \vdash \sigma ^{\langle V , \in \rangle }$.

Proof

The axioms of extensionality, pairing, union, and infinity are made true in $\langle V , \in \rangle $ by V3. The axiom of foundation is true in $\langle V , \in \rangle $ due to $\Delta _{0}^{ V }$-foundation and the transitivity of $ V $. The powerset axiom in $\langle V , \in \rangle $ follows from V1 and the transitivity of $ V $. For each ${\mathcal {L}}_{\in }$-formula $\varphi $ with all parameters from $ V $, $\varphi ^{\langle V , \in \rangle }$ is $\Delta _{0}^{ V }$ and thus $\{ x \in a \mid \varphi ^{\langle V , \in \rangle } ( x ) \}$ ($\subset a$) exists for every $a \in V $ by $\Delta _{0}^{ V }$-separation, which belongs to $ V $ due to V1 and the transitivity of $ V $. Finally, suppose $( \forall x \in a ) ( \exists ! y \in V ) \psi ^{\langle V , \in \rangle } ( x, y )$ for any $a \in V $ and ${\mathcal {L}}_{\in }$-formula $\psi $ with all parameters from $ V $. Then, since $\psi ^{\langle V , \in \rangle }$ is $\Delta _{0}^{ V }$, the set function $f := \{ \langle x, y \rangle \in a \times V \mid \psi ^{\langle V , \in \rangle } ( x, y ) \}$ exists by $\Delta _{0}^{ V }$-separation (as well as $\Delta _{0}^{ V }$-collection and the axiom of pairing to ensure the existence of cartesian products), and thus $\mathrm {ran} ( f )$ belongs to $ V $ by V2. $\square $

Since $\Sigma ^{{\mathcal {S}}}$-recursion along $\in $ is available in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{1}$, we can define the support function ${\mathsf {s}}{\mathsf {p}}$ in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{1}$ (see [3, Ch. 1.6]) so that

$$\begin{aligned} {\mathsf {s}}{\mathsf {p}} ( x ) := {\left\{ \begin{array}{ll} \{ x \} &{} \text {if }x \in {\mathcal {U}} \\ \textstyle {\bigcup _{y \in _{1} x}} {\mathsf {s}}{\mathsf {p}} ( y ) &{} \text {if } x \in {\mathcal {S}}; \end{array}\right. } \end{aligned}$$

${\mathsf {s}}{\mathsf {p}}$ is a $\Sigma ^{{\mathcal {S}}}$-function, and we call x a pure set when $x \in {\mathcal {S}} \wedge {\mathsf {s}}{\mathsf {p}} ( x ) = \emptyset ^{{\mathcal {S}}}$. Let $A^{{\mathcal {S}}}$ be the class of pure sets, which is a $\Delta ^{{\mathcal {S}}}$-predicate in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{1}$. It is obvious that ${{\textsf {\textit{v}}}}\in A^{{\mathcal {S}}}$ and $A^{{\mathcal {S}}}$ is transitive provably in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{1}$. Now, we define a translation $\sharp $ of ${\mathcal {L}}_{\in } ( V )$ to ${\mathcal {L}}_{\mathrm {KP}}$ as follows:

$$\begin{aligned}&( x \in y )^{\sharp } = x \in _{1} y; \quad \quad V ^{\sharp } = {{\textsf {\textit{v}}}}; \quad \quad ( \forall x \varphi )^{\sharp } = ( \forall x \in A^{{\mathcal {S}}} ) \varphi ^{\sharp }. \end{aligned}$$
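The support function ${\mathsf {s}}{\mathsf {p}}$ and the notion of a pure set, over which the translation $\sharp $ relativizes its quantifiers, can be illustrated on hereditarily finite objects. The sketch below is a toy model only: it assumes that Python `frozenset`s stand for ${\mathcal {S}}$-sets and that any other hashable value (here, strings) stands for an element of ${\mathcal {U}}$. The names `support` and `is_pure` are ours, not the paper's.

```python
def support(x):
    """Toy version of sp: collect the non-set objects occurring hereditarily in x.

    Assumption: a frozenset models an S-set; anything else models an
    element of U (an object sitting at the bottom of the hierarchy).
    """
    if not isinstance(x, frozenset):
        # sp(x) = {x} when x is not an S-set.
        return frozenset({x})
    result = frozenset()
    for y in x:
        # sp(x) is the union of sp(y) over the members y of x.
        result |= support(y)
    return result


def is_pure(x):
    """x is a pure set when it is an S-set with empty support."""
    return isinstance(x, frozenset) and not support(x)


empty = frozenset()
mixed = frozenset({frozenset({"u"}), "w", empty})  # mentions the non-sets "u" and "w"
pure = frozenset({empty, frozenset({empty})})      # built from the empty set alone
```

Here `support(mixed)` collects exactly the two non-set objects occurring inside `mixed`, whereas `pure` has empty support, so only `pure` would fall in the range of the relativized quantifiers of $\sharp $ in this toy setting.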

Lemma 5.29

$\sharp $ is an interpretation of $ {{\mathsf {K}}}{{\mathsf {P}}}V ^{r}$ plus $\Sigma _{1}^{ V }$-foundation in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{1}$; furthermore, it is an interpretation of $ {{\mathsf {K}}}{{\mathsf {P}}}V $ in ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}$.

Proof

We will work within ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{1}$. We have shown that ${{\textsf {\textit{v}}}}$ is transitive; since $\langle {\mathcal {U}}, \in _{0} \rangle $ is a model of the axioms of pairing, union, and infinity, so is $\langle {{\textsf {\textit{v}}}}, \in _{1} \rangle $ by (32); hence, $( {\textit{V}3} )^{\sharp }$ holds. $( \textit{V}1 )^{\sharp }$ follows from Proposition 5.25, since $\langle {{\textsf {\textit{v}}}}, \in _{1} \rangle $ is a model of the axiom of powerset by (32). $( {\textit{V}2} )^{\sharp }$ follows from Proposition 5.26. We can show in essentially the same manner as [3, Theorem II.1.5], using the $\Sigma ^{{\mathcal {S}}}$-definability of ${\mathsf {s}}{\mathsf {p}}$, that the remaining axioms of $ {{\mathsf {K}}}{{\mathsf {P}}}V ^{r}$ plus $\Sigma _{1}^{ V }$-foundation are preserved by $\sharp $; for example, for each $\Sigma _{1}^{ V }$-formula $\varphi $, if

$$\begin{aligned} ( \forall x \in A^{{\mathcal {S}}} ) \bigl ( ( \forall y \in A^{{\mathcal {S}}} ) ( y \in x \rightarrow \varphi ^{\sharp } ( y ) ) \rightarrow \varphi ^{\sharp } ( x ) \bigr ), \end{aligned}$$

where all the parameters of $\varphi ^{\sharp }$ are taken from $A^{{\mathcal {S}}}$, then we have

$$\begin{aligned} \forall x \bigl ( ( \forall y \in x ) ( y \in A^{{\mathcal {S}}} \rightarrow \varphi ^{\sharp } ( y ) ) \rightarrow ( x \in A^{{\mathcal {S}}} \rightarrow \varphi ^{\sharp } ( x ) ) \bigr ), \end{aligned}$$

from which we obtain $( \forall x \in A^{{\mathcal {S}}} ) \varphi ^{\sharp } ( x )$ by $\Sigma _{1}^{{\mathcal {S}}} \text {-} \mathrm {Found}_{1}$, since $A^{{\mathcal {S}}}$ is transitive and $\Delta ^{{\mathcal {S}}}$ (in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{1}$).

It remains to show that $( \Sigma _{\infty }^{ V } \text {-} \mathrm {Sep}^{ V } )^{\sharp }$ and $( \Sigma _{\infty }^{ V } \text {-} \mathrm {Repl}^{ V } )^{\sharp }$ hold in ${\mathsf {K}}{\mathsf {P}}{\mathbb {V}}$. The proof is similar to those of Propositions 5.25 and 5.26. For the latter, take any ${\mathcal {L}}_{\in } ( V )$-formula $\varphi $ and $a \in _{1} {{\textsf {\textit{v}}}}$, and suppose $( \forall x \in _{1} a ) ( \exists ! y \in _{1} {{\textsf {\textit{v}}}}) \varphi ^{\sharp } ( x, y, a )$. By (31), we have

$$\begin{aligned} ( \forall u \in _{0} g^{-1} ( a ) ) ( \exists ! v \in {\mathcal {U}} ) \varphi ^{\sharp } ( g ( u ), g ( v ), a ). \end{aligned}$$

By $( \Sigma _{\infty }^{{\mathcal {S}}} \text {-} \mathrm {Repl}_{0}^{+} )$, there is some $w \in {\mathcal {U}}$ such that

$$\begin{aligned} ( \forall u \in _{0} g^{-1} ( a ) ) ( \exists ! v \in _{0} w ) \varphi ^{\sharp } ( g ( u ), g ( v ), a ). \end{aligned}$$

Let $b = g ( w ) \in {{\textsf {\textit{v}}}}$, and it follows from (31) that $( \forall x \in _{1} a ) ( \exists y \in _{1} b ) \varphi ^{\sharp } ( x, y, a )$. We can similarly show $( \Sigma _{\infty }^{ V } \text {-} \mathrm {Sep}^{ V } )^{\sharp }$. $\square $

The composition $*\circ \sharp $ of the two translations gives an interpretation of $ {{\mathsf {K}}}{{\mathsf {P}}}V $ in ${\mathsf {S}}{\mathsf {C}}_{1}$. By interpreting sets as elements of $ V $ (i.e., $\forall x \mapsto ( \forall x \in V )$) and classes as subsets of $ V $ (i.e., $\forall X \mapsto ( \forall x \subset V )$), we have an interpretation of $\mathsf {ECA}$ in $ {{\mathsf {K}}}{{\mathsf {P}}}V $; let $\natural $ denote this interpretation. Then, the restriction of $\natural $ to ${\mathcal {L}}_{\in }$ interprets ${\mathsf {Z}}{\mathsf {F}}$ in $ {{\mathsf {K}}}{{\mathsf {P}}}V $, and we can standardly extend it to an interpretation of ${\mathsf {I}}{\mathsf {D}}_{1}$ in $ {{\mathsf {K}}}{{\mathsf {P}}}V $ (in the same manner as the standard interpretation of ${\mathsf {I}}{\mathsf {D}}_{1}$ in ${\mathsf {K}}{\mathsf {P}}\omega $ over arithmetic). Hence, $ {{\mathsf {K}}}{{\mathsf {P}}}V $ and the systems listed in Theorem 5.23 have the same ${\mathcal {L}}_{\in }$-theorems (with respect to the canonical translation $\natural $).

Remark 5.30

In our formulation of ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}$ and related systems, we define ${\mathcal {S}} x :\Leftrightarrow \lnot \, {\mathcal {U}} x$, and thus ${\mathcal {S}}$ and ${\mathcal {U}}$ are disjoint. However, we may formulate these systems without postulating that ${\mathcal {S}}$ and ${\mathcal {U}}$ are disjoint. For example, we may introduce a separate predicate for ${\mathcal {S}}$ and allow the possibility that ${\mathcal {S}}$ and ${\mathcal {U}}$ overlap; alternatively, we may adopt two-sorted first-order logic in the formulation of those systems. In such a formulation that permits the overlap of ${\mathcal {S}}$ and ${\mathcal {U}}$, we can also interpret ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{1}$ in $ {{\mathsf {K}}}{{\mathsf {P}}}V ^{r}$ plus $\Sigma _{1}^{ V }$-foundation by translating ${\mathcal {U}} x$ by $x \in V $, ${\mathcal {S}} x$ by $x = x$, $\mathsf {V}$ by $ V $, $x \in _{0} y$ by $x, y \in V \wedge x \in y$, and $x \in _{1} y$ by $x \in y$.

6 Other forms of reflection

$\Pi ^{1}_{n} \text {-} \mathrm {RFN}$ is a type of reflection principle that reflects an assertion about the entire universe (of sets and classes) onto a class structure (a coded ${\mathbb {V}}$-model). In this section, we will briefly consider some alternative types of reflection principles and give some observations about them.

6.1 Reflection onto class structures

Let $\Gamma $ be a collection of ${\mathcal {L}}_{2}$-formulae. We first consider a natural strengthening of $\Gamma \text {-} \mathrm {RFN}$.

$$\begin{aligned} {\Gamma \text {-} \mathrm {RFN}^{+}:} \qquad&\forall X \exists S \bigl ( X \, {\dot{\in }} \, S \wedge S \models \mathsf {NBG} \wedge \forall x ( \Phi ( x, X ) \leftrightarrow S \models \Phi ( x, X ) ) \bigr ), \ \text {for all }\Phi \in \Gamma , \end{aligned}$$

where $\Phi $ only contains the displayed variables free and S does not occur free in $\Phi $; note that we need not consider formulae $\Phi $ with more first- and/or second-order free variables because we can always contract multiple free variables into one variable by pairing. By the existence of universal formulae (for $n > 0$) and Proposition 2.13 (for $n = 0$), $\Pi ^{1}_{n} \text {-} \mathrm {RFN}$ and $\Sigma ^{1}_{n} \text {-} \mathrm {RFN}$ are finitely axiomatizable (relative to $\mathsf {NBG}$). Obviously, $\mathsf {NBG} \vdash \Pi ^{1}_{n} \text {-} \mathrm {RFN}^{+} \! \! \rightarrow \Pi ^{1}_{n} \text {-} \mathrm {RFN}$ and $\mathsf {NBG} \vdash \Pi ^{1}_{n} \text {-} \mathrm {RFN}^{+} \! \! \leftrightarrow \Sigma ^{1}_{n} \text {-} \mathrm {RFN}^{+}$.

Proposition 6.1

Let $n \ge 1$.

1.
$\mathsf {NBG} \vdash \Pi ^{1}_{n} \text {-} \mathrm {RFN}^{+} \rightarrow \Pi ^{1}_{n} \text {-} \mathrm {CA}$.
2.
$\mathsf {NBG} \vdash ( \Pi ^{1}_{n + 1} \text {-} \mathrm {RFN} \wedge \Pi ^{1}_{n} \text {-} \mathrm {CA} ) \rightarrow \Pi ^{1}_{n} \text {-} \mathrm {RFN}^{+}$.
3.
$\mathsf {NBG} \vdash ( \Pi ^{1}_{n} \text {-} \mathrm {RFN} \wedge \Pi ^{1}_{n} \text {-} \mathrm {CA} \wedge \Sigma ^{1}_{n} \text {-} \mathrm {Coll} ) \rightarrow \Pi ^{1}_{n} \text {-} \mathrm {RFN}^{+}$.
4.
$\mathsf {NBG} \vdash \Pi ^{1}_{1} \text {-} \mathrm {RFN}^{+} \rightarrow \Sigma ^{1}_{1} \text {-} \mathrm {Coll}$.

Proof

1. Take any $\Pi ^{1}_{n}$-formula $\Phi ( x, X )$. $\Pi ^{1}_{n} \text {-} \mathrm {RFN}^{+}$ yields a coded ${\mathbb {V}}$-model S with $X \, {\dot{\in }} \, S$ such that $\forall x \bigl ( \Phi ( x, X ) \leftrightarrow S \models \Phi ( x, X ) \bigr )$. Take $Y := \{ x \mid S \models \Phi ( x, X ) \}$ by $\mathrm {ECA}$. Then, we have $\forall x \bigl ( x \in Y \leftrightarrow \Phi ( x, X ) \bigr )$.

2. Take any $\Pi ^{1}_{n}$-formula $\Phi ( x, X )$. We take $Y := \{ x \mid \Phi ( x, X ) \}$ by $\Pi ^{1}_{n} \text {-} \mathrm {CA}$. $\Pi ^{1}_{n + 1} \text {-} \mathrm {RFN}$ yields a coded ${\mathbb {V}}$-model S such that $X, Y \, {\dot{\in }} \, S$, $S \models \mathsf {NBG}$, and

$$\begin{aligned} S \models \forall x ( x \in Y \leftrightarrow \Phi ( x, X ) ). \end{aligned}$$

(33)

Then, for all x, $\Phi ( x, X )$ and $S \models \Phi ( x, X )$ are both equivalent to $x \in Y$.

3. In the presence of $\Sigma ^{1}_{n} \text {-} \mathrm {Coll}$, the formula reflected in (33) becomes $\Sigma ^{1}_{n + 1}$, and thus $\Pi ^{1}_{n} \text {-} \mathrm {RFN}$, which is equivalent to $\Sigma ^{1}_{n + 1} \text {-} \mathrm {RFN}$, is enough to obtain the coded ${\mathbb {V}}$-model S.

4. Suppose $\forall x \exists X \Phi ( x, X )$ for $\Phi \in \Pi ^{1}_{0}$. Take a coded ${\mathbb {V}}$-model S containing all the parameters of $\Phi $ such that $\forall x \bigl ( \exists X \Phi ( x, X ) \leftrightarrow S \models \exists X \Phi ( x, X ) \bigr )$. Then, we have $\forall x ( \exists X {\dot{\in }} S ) \Phi ( x, X )$, namely, $\forall x \exists y \Phi ( x, ( S )_{y} )$. $\square $
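
The last step in the proof of claim 2 can be unpacked as the following chain of equivalences, for an arbitrary x and the coded ${\mathbb {V}}$-model S obtained there. This is only a sketch: the middle step assumes that atomic formulae with parameters $\dot{\in }$-belonging to S are evaluated in S exactly as in the universe, which we take to be part of the notion of a coded ${\mathbb {V}}$-model.

$$\begin{aligned} \Phi ( x, X )&\leftrightarrow x \in Y&&\text {(definition of } Y \text {)} \\&\leftrightarrow S \models x \in Y&&\text {(since } Y \, {\dot{\in }} \, S \text {)} \\&\leftrightarrow S \models \Phi ( x, X )&&\text {(by (33))}. \end{aligned}$$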

We next consider a further strengthening of $\Gamma \text {-} \mathrm {RFN}$:

$$\begin{aligned} {\Gamma \text {-} \mathrm {RFN}^{B}:}\quad&\forall X \exists S \bigl ( X \, {\dot{\in }} \, S \wedge S \models \mathsf {NBG} \wedge \forall x ( \forall Y \, {\dot{\in }} \, S ) ( \Phi ( x, Y ) \leftrightarrow S \models \Phi ( x, Y ) ) \bigr ), \\&\ \text {for all } \Phi \in \Gamma \text { only with the displayed variables free (and }S \text { not free)}. \end{aligned}$$

Obviously, $\mathsf {NBG} \vdash \Pi ^{1}_{n} \text {-} \mathrm {RFN}^{B} \rightarrow \Pi ^{1}_{n} \text {-} \mathrm {RFN}^{+}$ and $\mathsf {NBG} \vdash \Pi ^{1}_{n} \text {-} \mathrm {RFN}^{B} \leftrightarrow \Sigma ^{1}_{n} \text {-} \mathrm {RFN}^{B}$.

In more familiar terminology, when $k > 0$, $\Pi ^{1}_{k} \text {-} \mathrm {RFN}^{B}$ asserts that, for all X, there is a class-theoretic analogue of a coded $\beta _{k}$-model S with $X \, {\dot{\in }} \, S$. By analogy, let us call a coded ${\mathbb {V}}$-model S a coded $B_{k}$-model if the following holds:

$$\begin{aligned} \forall x ( \forall X \, {\dot{\in }} \, S ) \bigl ( \Phi ( x, X ) \leftrightarrow S \models \Phi ( x, X ) \bigr ), \ \text {for all } \Phi \in \Pi ^{1}_{k}. \end{aligned}$$

Note that we can show in the same manner as [25, Lemma VII.2.4] that every coded $ B _{k}$-model is a model of $\mathsf {NBG}$ for all $k > 0$. The notion of coded $B_{1}$-model, which we will particularly call a coded B-model, is of some significance in the current context because of the following property.^{Footnote 28}

Remark 6.2

The proof of Proposition 6.1 can be carried out as it is in second-order arithmetic. In addition, as Jäger and Strahm [16] showed, we have $\mathsf {ACA}_{0} \vdash \Pi ^{1}_{n + 1} \text {-} \mathrm {RFN} \leftrightarrow \Pi ^{1}_{n} \text {-} \mathrm {TI}$ in second-order arithmetic. Since we obviously have $\mathsf {ACA}_{0} \vdash \Pi ^{1}_{n} \text {-} \mathrm {CA} \rightarrow \Pi ^{1}_{n} \text {-} \mathrm {TI}$, it follows that $\mathsf {ACA}_{0} \vdash \Pi ^{1}_{n} \text {-} \mathrm {RFN}^{+} \leftrightarrow \Pi ^{1}_{n} \text {-} \mathrm {CA}$ for all $n \ge 1$.^{Footnote 29} There is also an intimate relationship between $\Pi ^{1}_{n} \text {-} \mathrm {CA}$ and $\Pi ^{1}_{n} \text {-} \mathrm {RFN}^{B}$ in second-order arithmetic. For $n = 1, 2$, by [25, Theorem VII.2.10] (for the case $n = 1$) and [25, Theorems VII.6.9.3 and VII.7.4] (for the case $n = 2$), $\Pi ^{1}_{n} \text {-} \mathrm {CA}$ is equivalent to $\Pi ^{1}_{n} \text {-} \mathrm {RFN}^{B}$ in $\mathsf {ACA}_{0}$. For $n \ge 3$, by [25, Theorems VII.6.20 and VII.7.4], $\Pi ^{1}_{n} \text {-} {\mathsf {C}}{\mathsf {A}}_{0}$ and $\Pi ^{1}_{n} \text {-} \mathrm {RFN}^{B}$ have the same $\Pi ^{1}_{4}$-consequences modulo $\mathsf {ACA}_{0}$.
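
The equivalence between $\Pi ^{1}_{n} \text {-} \mathrm {RFN}^{+}$ and $\Pi ^{1}_{n} \text {-} \mathrm {CA}$ stated in the remark above can be assembled from the cited facts roughly as follows, for $n \ge 1$. This is a sketch of how the pieces fit together over $\mathsf {ACA}_{0}$, assuming (as the remark says) that Proposition 6.1 transfers verbatim to second-order arithmetic.

$$\begin{aligned} \Pi ^{1}_{n} \text {-} \mathrm {CA}&\rightarrow \Pi ^{1}_{n} \text {-} \mathrm {TI}&&\text {(obvious)} \\ \Pi ^{1}_{n} \text {-} \mathrm {TI}&\leftrightarrow \Pi ^{1}_{n + 1} \text {-} \mathrm {RFN}&&\text {(J\"ager and Strahm [16])} \\ \Pi ^{1}_{n + 1} \text {-} \mathrm {RFN} \wedge \Pi ^{1}_{n} \text {-} \mathrm {CA}&\rightarrow \Pi ^{1}_{n} \text {-} \mathrm {RFN}^{+}&&\text {(Proposition 6.1.2)} \\ \Pi ^{1}_{n} \text {-} \mathrm {RFN}^{+}&\rightarrow \Pi ^{1}_{n} \text {-} \mathrm {CA}&&\text {(Proposition 6.1.1)}. \end{aligned}$$

The first three lines give the direction from $\Pi ^{1}_{n} \text {-} \mathrm {CA}$ to $\Pi ^{1}_{n} \text {-} \mathrm {RFN}^{+}$, and the last line gives the converse.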

Proposition 6.3

For each natural number n, $\mathsf {NBG}$ proves the following: every coded B-model is a coded ${\mathbb {V}}$-model of $\Pi ^{1}_{n} \text {-} \mathsf {RFN}$.

Proof

Let S be a coded B-model. Suppose $S \models \Phi ( x, X )$ for $X \, {\dot{\in }} \, S$. This implies $\exists S' ( X \, {\dot{\in }} \, S' \wedge S' \models \Phi )$, witnessed by S itself, which is a $\Sigma ^{1}_{1}$-statement. Since S is a coded B-model, we have $S \models \exists S' ( X \, {\dot{\in }} \, S' \wedge S' \models \Phi )$. $\square $

Lastly, we give a coarse upper bound of the strength of $\Pi ^{1}_{n} \text {-} \mathrm {RFN}^{B}$ ($n > 0$).

Proposition 6.4

For $n > 0$, $\Sigma ^{1}_{n + 1} \text {-} \mathsf {DColl}_{0} \vdash \Sigma ^{1}_{n} \text {-} \mathrm {RFN}^{B}$.

Proof

The proof is just parallel to [25, Theorem VII.7.4]. We trivially have

$$\begin{aligned} \forall x \forall X \exists Y \bigl ( \exists Z \Phi ( x, ( X )^{x}, Z ) \rightarrow \Phi ( x, ( X )^{x}, Y ) \bigr ), \ \text {for all } \Phi \in \Sigma ^{1}_{n}. \end{aligned}$$

Hence, $\Sigma ^{1}_{n + 1} \text {-} \mathrm {DColl}$ implies the following.

$$\begin{aligned} \exists X \forall x \exists y \bigl ( \exists Z \Phi ( x, ( X )^{x}, Z ) \rightarrow \Phi ( x, ( X )^{x}, ( X )_{y} ) \bigr ), \ \text {for all } \Phi \in \Sigma ^{1}_{n}. \end{aligned}$$

This schema is the class-theoretic counterpart of the axiom of strong $\Sigma ^{1}_{n}$ dependent choice in second-order arithmetic; see [25, Definition VII.6.1]. For the rest of the proof, we refer the reader to [25, Theorem VII.7.4]. $\square $

As an immediate corollary, $\Pi ^{1}_{\infty } \text {-} \mathsf {RFN}$ is consistent relative to $\Sigma ^{1}_{2} \text {-} \mathsf {DColl}_{0}$ (actually, consistent relative to $\mathsf {NBG}$ plus strong $\Sigma ^{1}_{1}$ dependent collection).

Remark 6.5

In second-order arithmetic, $\Pi ^{1}_{1} \text {-} {\mathsf {C}}{\mathsf {A}}_{0}$ proves $\Pi ^{1}_{1} \text {-} \mathrm {RFN}^{B}$ (by the argument for the Kleene basis theorem [25, Lemma VII.2.9]). However, the proof cannot be carried out in class theory. In general, by the result of Gitman, Hamkins, and Johnston mentioned in Remark 4.12, we have $\Pi ^{1}_{n} \text {-} {\mathsf {C}}{\mathsf {A}}_{0} \not \vdash \Pi ^{1}_{m} \text {-} \mathrm {RFN}^{+}$ for all $n > 0$ and $m \ge 1$ in class theory, since $\Pi ^{1}_{1} \text {-} \mathrm {RFN}^{+} \vdash \Sigma ^{1}_{1} \text {-} \mathrm {Coll}$ by Proposition 6.1.4. Nonetheless, $\Pi ^{1}_{n} \text {-} {\mathsf {C}}{\mathsf {A}}_{0}$ and $\Pi ^{1}_{n} \text {-} \mathsf {RFN}_{0}^{+}$ ($n \ge 1$) still have the same ${\mathcal {L}}_{\in }$-theorems under the assumption of $\mathrm {GC}$, and they are also equiconsistent even if $\mathrm {GC}$ is dropped. The proof of this fact will be given in the sequel to the present paper.

6.2 Reflection onto set structures

We next consider reflection principles that reflect a second-order formula onto a set structure. We first consider the following two natural formulations of such reflection principles:

$$\begin{aligned} {\Gamma \text {-}\mathrm {Indes}:} \quad&\forall x \forall X \bigl ( \Phi ( x, X ) \rightarrow \exists \beta ( x \in V_{\beta } \wedge \langle V_{\beta }, V_{\beta + 1} \rangle \models \Phi ( x, X \cap V_{\beta } ) ) \bigr ) \\ {\Gamma \text {-} \mathrm {Indes}^{+}:} \quad&\forall X \forall \alpha ( \exists \beta > \alpha ) ( \forall x \in V_{\beta } ) \bigl ( \Phi ( x, X ) \leftrightarrow \langle V_{\beta }, V_{\beta + 1} \rangle \models \Phi ( x, X \cap V_{\beta } ) \bigr ), \end{aligned}$$

where $\Phi \in \Gamma $ for a collection $\Gamma $ of ${\mathcal {L}}_{2}$-formulae only with the displayed variables free (and without $\beta $ free). The former expresses the $\Gamma $-indescribability of ${\mathbb {V}}$.^{Footnote 30} However, while they are consistent relative to moderate large cardinal axioms, their strengths go far beyond ${\mathsf {M}}{\mathsf {K}}$ and do not fall under the scope of the present paper.^{Footnote 31} Actually, as the next proposition shows, even the parameter-free version of $\Pi ^{1}_{1} \text {-} \mathrm {Indes}$ derives the consistency of ${\mathsf {M}}{\mathsf {K}}$ in $\mathsf {NBG}$, while ${\mathsf {Z}}{\mathsf {F}}$ proves the existence of its set-sized model (which is not necessarily a model of $\mathsf {NBG}$ at the same time).

Proposition 6.6

Let $\Gamma $ be a collection of ${\mathcal {L}}_{2}$-formulae and $\Gamma \text {-} \mathrm {Indes}^{-}$ denote the following schema:

$$\begin{aligned} {\Gamma \text {-} \mathrm {Indes}^{-}:} \quad&\forall x \bigl ( \Phi ( x ) \rightarrow \exists \beta ( x \in V_{\beta } \wedge \langle V_{\beta }, V_{\beta + 1} \rangle \models \Phi ( x ) ) \bigr ), \end{aligned}$$

for all $\Phi \in \Gamma $ only with the displayed variables free (and without any second-order variable nor $\beta $ free). Then, we have the following.

1.
${\mathsf {Z}}{\mathsf {F}} \vdash \exists \alpha ( \langle V_{\alpha }, V_{\alpha + 1} \rangle \models \Pi ^{1}_{\infty } \text {-} \mathrm {Indes}^{-} )$.
2.
$\mathsf {NBG} + \Pi ^{1}_{1} \text {-} \mathrm {Indes}^{-}$ proves the existence of a regular cardinal $\kappa $ with $V_{\kappa } \models {\mathsf {Z}}{\mathsf {F}}$ and thus a $\Sigma ^{1}_{1}$-indescribable cardinal.^{Footnote 32}

Proof

We only prove claim 2 and refer the reader to [6, Ch.9, Exercise 1.11] for claim 1. For each $\Phi ( x, P ) \in {\mathcal {L}}_{\in } ( P )$ (see Sect. 2.5 and Definition 3.2), $\mathsf {NBG}$ proves

$$\begin{aligned} \forall X \forall \alpha ( \exists \beta > \alpha ) ( \forall x \in V_{\beta } ) \bigl ( \Phi ( x, X ) \leftrightarrow \langle V_{\beta }, X \cap V_{\beta } \rangle \models \Phi ( x, P ) \bigr ), \end{aligned}$$

(34)

which is shown by the standard Montague-Lévy argument. Let $\Psi ( x, P )$ be an ${\mathcal {L}}_{\in } ( P )$-formula expressing that $ P $ is a function with domain x, and take a sufficiently rich finite fragment $\mathsf {S}$ of ${\mathsf {Z}}{\mathsf {F}}$ such that $V_{\alpha } \models \mathsf {S}$ holds only for a limit ordinal $> \omega $. Since (34) is a $\Pi ^{1}_{1}$-sentence, under the assumption of $\Pi ^{1}_{1} \text {-} \mathrm {Indes}^{-}$ there is an ordinal $\kappa $ such that $\langle V_{\kappa }, V_{\kappa + 1} \rangle $ satisfies $\mathsf {S}$ and (34) for the ${\mathcal {L}}_{\in } ( P )$-formula $\Psi \wedge \bigwedge \mathsf {S}$ in the place of $\Phi $. Then, $\kappa $ is a limit ordinal $> \omega $, and thus $V_{\kappa }$ satisfies the axioms of ${\mathsf {Z}}{\mathsf {F}}$ except the axiom of replacement. It suffices to show that $\langle V_{\kappa }, V_{\kappa + 1} \rangle $ satisfies Class Replacement. Let F be any function from some $a \in V_{\kappa }$ to $V_{\kappa }$. Then, $\langle V_{\kappa }, F \rangle \models \Psi ( a, P )$. Hence, there is some $\beta $ with $a \in V_{\beta }$ such that $V_{\beta } \models \mathsf {S}$ and $\langle V_{\beta }, F \cap V_{\beta } \rangle \models \Psi ( a, P )$. Since $\beta $ is a limit ordinal, $\Psi ( a, P )$ also expresses in $\langle V_{\beta }, F \cap V_{\beta } \rangle $ that $ P $ is a function with domain a, and thus $F \cap V_{\beta }$ is a function with domain a and codomain $V_{\beta }$. We have shown that $\kappa $ is regular and $V_{\kappa } \models {\mathsf {Z}}{\mathsf {F}}$. Such a cardinal is shown to be a $\Pi ^{1}_{0}$-indescribable cardinal in a parallel manner to [17, Lemma 6.1], in which we need not use $\mathrm {AC}$, and thus a $\Sigma ^{1}_{1}$-indescribable cardinal. $\square $
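
The passage from "$F \cap V_{\beta }$ is a function with domain a" to the boundedness of F can be spelled out as below. This is a sketch under our reading of the proof: the notation $\mathrm {dom}$ and $\mathrm {ran}$ is introduced here only for illustration, and we assume $\beta < \kappa $ because (34) is applied inside the structure $\langle V_{\kappa }, V_{\kappa + 1} \rangle $.

$$\begin{aligned} F \cap V_{\beta } \subseteq F \ \text {and} \ \mathrm {dom} ( F \cap V_{\beta } ) = a = \mathrm {dom} ( F )&\Rightarrow F \cap V_{\beta } = F \\&\Rightarrow F \subseteq V_{\beta } \\&\Rightarrow \mathrm {ran} ( F ) \subseteq V_{\beta }. \end{aligned}$$

Hence $\mathrm {ran} ( F ) \in V_{\beta + 1} \subseteq V_{\kappa }$, so the range of F is a set in the sense of $V_{\kappa }$, which is what Class Replacement requires for F.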

So, a reflection principle asserting the existence of a model of a second-order formula of the form $\langle V_{\kappa }, V_{\kappa + 1} \rangle $ is too strong for the context of the study of the present paper. Hence, we weaken this condition to the existence of a model of the form $\langle V_{\kappa }, s \rangle $ for some $s \subset V_{\kappa + 1}$. This restriction leads to the following principles:

$$\begin{aligned} {\Gamma \text {-} \mathrm {Rfn}:} \quad&\forall x \forall X \bigl ( \Phi ( x, X ) \rightarrow \\&\exists \beta ( \exists s \subset V_{\beta + 1} ) ( x \in V_{\beta } \wedge X \cap V_{\beta } \in s \wedge \langle V_{\beta }, s \rangle \models \mathsf {NBG} + \Phi ( x, X \cap V_{\beta } ) ) \bigr ); \\ {\Gamma \text {-} \mathrm {SRfn}:} \quad&\forall X \exists \beta ( \exists s \subset V_{\beta + 1} ) \ \Bigl ( X \cap V_{\beta } \in s \, \wedge \langle V_{\beta }, s \rangle \models \mathsf {NBG} \\&\qquad \qquad \qquad \qquad \qquad \wedge ( \forall x \in V_{\beta } ) \bigl ( \Phi ( x, X ) \leftrightarrow \langle V_{\beta }, s \rangle \models \Phi ( x, X \cap V_{\beta } ) \bigr ) \Bigr ); \end{aligned}$$

here, as before, $\Phi \in \Gamma $ and $\Phi $ only contains the displayed variables free (and without $\beta $ free). By the existence of universal formulae, both $\Pi ^{1}_{n} \text {-} \mathrm {Rfn}$ and $\Pi ^{1}_{n} \text {-} \mathrm {SRfn}$ are finitely axiomatizable for every $n > 0$. $\Pi ^{1}_{0} \text {-} \mathrm {Rfn}$ is also finitely axiomatizable, but we need a little more argument. Let $\exists Z \Psi ( e, x, X, Z )$ be a $\Sigma ^{1}_{1}$-universal formula, where $\Psi \in \Pi ^{1}_{0}$. Then, on the one hand, $\Pi ^{1}_{0} \text {-} \mathrm {Rfn}$ proves

$$\begin{aligned}&\forall X \forall Z ( \forall e \in \omega ) \forall x \Bigl ( \Psi ( e, x, X, Z ) \rightarrow \exists \beta ( \exists s \subset V_{\beta + 1} ) \nonumber \\&\bigl ( \omega , x \in V_{\beta } \, \wedge \, X \! \cap \! V_{\beta }, Z \! \cap \! V_{\beta } \in s \, \wedge \, \langle V_{\beta }, s \rangle \models \mathsf {NBG} + \Psi ( e, x, X \! \cap \! V_{\beta }, Z \! \cap \! V_{\beta } ) \bigr ) \Bigr ); \end{aligned}$$

(35)

on the other hand, since $\exists Z \Psi $ is a $\Sigma ^{1}_{1}$ universal formula in any model $\langle V_{\beta }, s \rangle $ of $\mathsf {NBG}$, the $\Pi ^{1}_{1}$-sentence (35) implies each instance of $\Pi ^{1}_{0} \text {-} \mathrm {Rfn}$.

The next proposition is obvious.

Proposition 6.7

1.
$\mathsf {NBG} + \Pi ^{1}_{\infty } \text {-} \mathrm {SRfn} \vdash \Sigma ^{1}_{\infty } \text {-} \mathrm {Repl} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Sep}$.
2.
For every n, $\mathsf {NBG} \vdash ( \Pi ^{1}_{n} \text {-} \mathrm {Rfn} \rightarrow \Sigma ^{1}_{n + 1} \text {-} \mathrm {Rfn} ) \wedge ( \Sigma ^{1}_{n} \text {-} \mathrm {SRfn} \rightarrow \Sigma ^{1}_{n + 1} \text {-} \mathrm {Rfn} )$.
3.
No finitely axiomatizable consistent ${\mathcal {L}}_{2}$-system proves $\Pi ^{1}_{\infty } \text {-} \mathrm {Rfn}$.

Proposition 6.8

For all n, $\mathsf {NBG} + \Pi ^{1}_{n + 1} \text {-} \mathrm {Rfn} \vdash Con ( \mathsf {NBG} + \Pi ^{1}_{n} \text {-} \mathrm {Rfn} + \Pi ^{1}_{\infty } \text {-} \mathrm {Sep} )$.^{Footnote 33}

Proof

Each instance of $\Pi ^{1}_{n} \text {-} \mathrm {Rfn}$ is $\Pi ^{1}_{n + 1}$. Hence, $\Pi ^{1}_{n + 1} \text {-} \mathrm {Rfn}$ yields a set-sized model $\langle V_{\beta }, s \rangle $ of $\mathsf {NBG}$ plus the finitely many (actually just one) true $\Pi ^{1}_{n + 1}$-sentences that finitely axiomatize $\Pi ^{1}_{n} \text {-} \mathrm {Rfn}$. Since the first-order domain is $V_{\beta }$ (and $\beta $ is limit), $\Pi ^{1}_{\infty } \text {-} \mathrm {Sep}$ is also true in the model. $\square $

The next proposition indicates that $\Pi ^{1}_{\infty } \text {-} \mathrm {RFN}$ is an essentially stronger principle than $\Pi ^{1}_{\infty } \text {-} \mathrm {SRfn}$ (and thus than $\Pi ^{1}_{\infty } \text {-} \mathrm {Rfn}$).

Proposition 6.9

Let $\mathsf {F}$ be a finite set of $\Pi ^{1}_{n}$-sentences. Then,

$$\begin{aligned} \mathsf {NBG} + \mathsf {F} + \Pi ^{1}_{n} \text {-} \mathrm {RFN} \vdash Con ( \mathsf {NBG} + \mathsf {F} + \Pi ^{1}_{\infty } \text {-} \mathrm {Sep} + \Pi ^{1}_{\infty } \text {-} \mathrm {Repl} + \Pi ^{1}_{\infty } \text {-} \mathrm {SRfn} ). \end{aligned}$$

Proof

By Proposition 2.14 and Lemmata 2.16 and 2.21, $\Pi ^{1}_{0} \text {-} \mathsf {RFN}_{0}$ proves that $S \models \Pi ^{1}_{\infty } \text {-} \mathrm {SRfn}$ for every coded ${\mathbb {V}}$-model S. By Corollary 2.18, $\mathsf {F} + \Pi ^{1}_{n} \text {-} \mathrm {RFN}$ proves the existence of a coded ${\mathbb {V}}$-model S of $\mathsf {ECA} + \mathsf {F}$. $\square $

As the next proposition shows, $\Pi ^{1}_{n} \text {-} \mathrm {SRfn}$ and $\Pi ^{1}_{n} \text {-} \mathrm {Rfn}$ become equivalent in sufficiently strong (finite) systems.

Proposition 6.10

1.
$\mathsf {NBG} \vdash ( \Pi ^{1}_{n + 1} \text {-} \mathrm {Rfn} \wedge \Pi ^{1}_{n} \text {-} \mathrm {CA} ) \rightarrow \Pi ^{1}_{n} \text {-} \mathrm {SRfn}$.
2.
$\mathsf {NBG} \vdash ( \Pi ^{1}_{n} \text {-} \mathrm {Rfn} \wedge \Pi ^{1}_{n} \text {-} \mathrm {CA} \wedge \Sigma ^{1}_{n} \text {-} \mathrm {Coll} ) \rightarrow \Pi ^{1}_{n} \text {-} \mathrm {SRfn}$.

Proof

1. The proof is similar to Proposition 6.1. Let $\Phi ( x, X )$ be $\Pi ^{1}_{n}$. Take $Y := \{ x \mid \Phi ( x, X ) \}$. $\Pi ^{1}_{n + 1} \text {-} \mathrm {Rfn}$ yields $\alpha $ and $s \subset V_{\alpha + 1}$ such that $X \cap V_{\alpha }, Y \cap V_{\alpha } \in s$ and

$$\begin{aligned} \langle V_{\alpha }, s \rangle \models \forall x ( x \in Y \cap V_{\alpha } \leftrightarrow \Phi ( x, X \cap V_{\alpha } ) ); \end{aligned}$$

(36)

then, for all $x \in V_{\alpha }$, $\Phi ( x, X )$ and $\langle V_{\alpha }, s \rangle \models \Phi ( x, X \cap V_{\alpha } )$ are equivalent to $x \in Y \cap V_{\alpha }$.

2. In the presence of $\Sigma ^{1}_{n} \text {-} \mathrm {Coll}$, the formula reflected in (36) becomes $\Sigma ^{1}_{n + 1}$, and thus $\Pi ^{1}_{n} \text {-} \mathrm {Rfn}$ (equivalent to $\Sigma ^{1}_{n + 1} \text {-} \mathrm {Rfn}$ in $\mathsf {NBG}$) is enough to carry out the rest of the proof. $\square $

Remark 6.11

By the same argument as the second claim of the last proposition, we can show that $\Pi ^{1}_{n} \text {-} \mathrm {Indes}$ and $\Pi ^{1}_{n} \text {-} \mathrm {Indes}^{+}$ are equivalent in $\Pi ^{1}_{n} \text {-} {\mathsf {C}}{\mathsf {A}}_{0} + \Sigma ^{1}_{n} \text {-} \mathrm {Coll}$. Now, since $( V_{\kappa }, V_{\kappa + 1} ) \models \Pi ^{1}_{n} \text {-} \mathrm {CA} + \Sigma ^{1}_{n} \text {-} \mathrm {Coll}$ holds for all n and cardinals $\kappa $ under the assumption of $\mathrm {AC}$, $\langle V_{\kappa }, V_{\kappa + 1} \rangle \models \Pi ^{1}_{n} \text {-} \mathrm {Indes}$ if and only if $\langle V_{\kappa }, V_{\kappa + 1} \rangle \models \Pi ^{1}_{n} \text {-} \mathrm {Indes}^{+}$, both of which are equivalent to $\kappa $ being $\Pi ^{1}_{n}$-indescribable. The same applies to higher-order indescribable cardinals. However, we do not know if the same holds without $\mathrm {AC}$.

Proposition 6.12

For each $n \in {\mathbb {N}}$, $\mathsf {NBG} + \Pi ^{1}_{n + 1} \text {-} \mathrm {Rfn} \vdash \Pi ^{1}_{n} \text {-} \mathrm {TI}$.

Proof

Let $\Phi ( x )$ be a $\Pi ^{1}_{n}$-formula with parameters z and Z, and suppose for contradiction that $ Wf ( X )$ and $\lnot TI _{\Phi } ( X )$ for some X. Since $\lnot TI _{\Phi } ( X )$ is a $\Pi ^{1}_{n + 1}$-formula, $\Pi ^{1}_{n + 1} \text {-} \mathrm {Rfn}$ yields some $\alpha $ and $s \subset V_{\alpha + 1}$ such that $z \in V_{\alpha }$, $X \cap V_{\alpha }, Z \cap V_{\alpha } \in s$, and $\langle V_{\alpha }, s \rangle \models \lnot TI _{\Phi } ( X \cap V_{\alpha } )$, namely,

$$\begin{aligned} \begin{aligned}&( \forall x \in V_{\alpha } ) \bigl ( ( \forall y \in V_{\alpha } ) \bigl ( y \prec _{X} x \rightarrow \langle V_{\alpha }, s \rangle \models \Phi ( y ) \bigr ) \rightarrow \langle V_{\alpha }, s \rangle \models \Phi ( x ) \bigr ) \\&\wedge ( \exists x \in V_{\alpha } ) \langle V_{\alpha }, s \rangle \not \models \Phi ( x ); \end{aligned} \end{aligned}$$

(37)

here we suppress the parameters z and $Z \cap V_{\alpha }$ in $\Phi $ for saving space. Let $Y := \{ x \in V_{\alpha } \mid \langle V_{\alpha }, s \rangle \models \Phi ( x, z, Z \cap V_{\alpha } ) \} \cup ( {\mathbb {V}} {\setminus } V_{\alpha } )$. The first conjunct of (37) implies $\forall x \bigl ( ( \forall y \prec _{X} x ) ( y \in Y ) \rightarrow x \in Y \bigr )$. By $ Wf ( X )$, we obtain $Y = {\mathbb {V}}$, which contradicts the second conjunct of (37). $\square $
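
The step from the first conjunct of (37) to the progressiveness of Y can be checked by cases; the following is a sketch under the assumption that $\prec _{X}$ and $\prec _{X \cap V_{\alpha }}$ agree on elements of $V_{\alpha }$ (which holds when $\alpha $ is a limit ordinal, as pairs of elements of $V_{\alpha }$ then lie in $V_{\alpha }$). If $x \notin V_{\alpha }$, then $x \in {\mathbb {V}} {\setminus } V_{\alpha } \subseteq Y$ trivially. If $x \in V_{\alpha }$, then

$$\begin{aligned} ( \forall y \prec _{X} x ) ( y \in Y )&\Rightarrow ( \forall y \in V_{\alpha } ) \bigl ( y \prec _{X} x \rightarrow \langle V_{\alpha }, s \rangle \models \Phi ( y ) \bigr )&&\text {(definition of } Y \text {)} \\&\Rightarrow \langle V_{\alpha }, s \rangle \models \Phi ( x )&&\text {(first conjunct of (37))} \\&\Rightarrow x \in Y&&\text {(definition of } Y \text {)}. \end{aligned}$$

On the other hand, the second conjunct of (37) provides some $x_{0} \in V_{\alpha }$ with $\langle V_{\alpha }, s \rangle \not \models \Phi ( x_{0} )$, that is, $x_{0} \notin Y$, which is incompatible with $Y = {\mathbb {V}}$.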

As the next theorem shows, $\Pi ^{1}_{n} \text {-} \mathrm {SRfn}$ is still quite a weak principle (relative to $\mathsf {ECA}$).

Theorem 6.13

$\mathsf {ECA} + \Pi ^{1}_{\infty } \text {-} \mathrm {SRfn}$ (and thus $\mathsf {NBG} + \Pi ^{1}_{\infty } \text {-} \mathrm {SRfn}$ by Proposition 6.7.1) has the same ${\mathcal {L}}_{\in }$-theorems as $\mathsf {ECA}$ does.

Proof

Recall that ${\mathcal {I}}$ is an interpretation of $\mathsf {ECA}$ in ${\mathsf {T}}{\mathsf {C}}$ (see Sect. 3) and that $\mathsf {ECA}$ and ${\mathsf {T}}{\mathsf {C}}$ have the same ${\mathcal {L}}_{\in }$-theorems. Hence, it suffices to show that ${\mathsf {T}}{\mathsf {C}} \vdash ( \Pi ^{1}_{\infty } \text {-} \mathrm {SRfn} )^{{\mathcal {I}}}$.

We work within ${\mathsf {T}}{\mathsf {C}}$. Let $\Phi ( x, X ) \in {\mathcal {L}}_{2}$. Take any $\varphi ( u ) \in Fml _{\in }^{\infty }$ (which corresponds to the parameter “X”). By Fact 3.10, since $\mathsf {NBG}$ consists of only finitely many axioms, there exists $\beta $ such that $\varphi ( u ) \in V_{\beta }$, $( \mathsf {NBG}^{{\mathcal {I}}} )^{V_{\beta }}$, and

$$\begin{aligned} ( \forall x \in V_{\beta } ) \bigl ( \Phi ^{{\mathcal {I}}} ( x, \varphi ) \leftrightarrow ( \Phi ^{{\mathcal {I}}} )^{V_{\beta }} ( x, \varphi )\bigr ); \end{aligned}$$

recall that, for each variable z, all the occurrences of $z \in X$ in $\Phi ( x, X )$ are replaced by $ T ( \varphi ( z ) )$ in $\Phi ^{{\mathcal {I}}} ( x, \varphi )$. Take ${\mathcal {I}}^{-1} ( V_{\beta } )$ (see Sect. 3), and let ${\mathcal {I}}^{-1} ( V_{\beta } ) = ( V_{\beta }, s )$. Let us write X for $\{ z \mid T ( \varphi ( z ) ) \}$. Since $V_{\beta } \models {\mathsf {Z}}{\mathsf {F}}$, each syntactic notion or operation on ${\mathcal {L}}_{\in }^{\infty }$ is absolute for $V_{\beta }$; therefore, in particular, we have $( Fml _{\in }^{\infty } )^{V_{\beta }} = Fml _{\in }^{\infty } \cap V_{\beta }$ and $\forall x \in V_{\beta } \bigl ( T ( \varphi ( x ) ) \leftrightarrow V_{\beta } \models T ( \varphi ( x ) ) \bigr )$. Hence, by (10), we obtain

$$\begin{aligned} X \cap V_{\beta } \in s \wedge ( \forall x \in V_{\beta } ) \bigl ( \Phi ^{{\mathcal {I}}} ( x, \varphi ) \leftrightarrow ( V_{\beta }, s ) \models \Phi ( x, X \cap V_{\beta } ) \bigr ). \end{aligned}$$

Hence, the ${\mathcal {I}}$-translation of the instance of $\Pi ^{1}_{\infty } \text {-} \mathrm {SRfn}$ for $\Phi $ holds in ${\mathsf {T}}{\mathsf {C}}$. $\square $

Finally, we will show that $\Pi ^{1}_{\infty } \text {-} \mathrm {Rfn}$ is an even weaker principle (relative to $\mathsf {ECA}$).

Theorem 6.14

$\mathsf {ECA} \vdash Con ( \mathsf {NBG} + \Pi ^{1}_{\infty } \text {-} \mathrm {Rfn} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Sep} )$.

Proof

Again, we will show the claimed consistency in ${\mathsf {T}}{\mathsf {C}}$ and work within ${\mathsf {T}}{\mathsf {C}}$.

We first take an ${\mathcal {L}}_{\mathrm {T}}$-definable closed unbounded class ${\mathcal {C}}$ of ordinals $\zeta $ such that $( \mathsf {NBG}^{{\mathcal {I}}} )^{V_{\zeta }}$. Next, for each (code of) ${\mathcal {L}}_{2}$-formula $\Phi ( x, X )$, we define a function $F_{\Phi } :{\mathbb {V}} \times {\mathbb {V}} \rightarrow On$ as follows:

$$\begin{aligned} F_{\Phi } ( x, y ) := {\left\{ \begin{array}{ll} \min \{ \eta \mid \eta \in {\mathcal {C}} \wedge V_{\eta } \models \Phi ^{{\mathcal {I}}} ( x, \varphi ) \} &{} \text {if}\,x \in V_{\eta }, y = \varphi ( u ) \in Fml _{\in }^{\infty } \cap V_{\eta },\\ &{} \quad \text {and such}\,\eta \,\text {exists}. \\ 0 &{} \text {otherwise} \end{array}\right. } \end{aligned}$$

Then, we set

$$\begin{aligned} F ( \xi )&\, = \, \sup \{ F_{\Phi } ( x, \varphi ( u ) ) \mid x \in V_{\xi } \wedge \varphi ( u ) \in Fml _{\in }^{\infty } \cap V_{\xi } \wedge \Phi \in Fml _{2} \} \\ G ( \xi )&\, = \, \min \{ \eta \mid \eta \in {\mathcal {C}} \wedge \eta > F ( \xi ) \}. \end{aligned}$$

Thereby we inductively define $G^{0} ( \xi ) = \xi $ and $G^{k + 1} ( \xi ) = G ( G^{k} ( \xi ) )$ and set $H ( \xi ) = \sup _{n \in \omega } G^{n} ( \xi )$. By the closedness of ${\mathcal {C}}$, we have $H ( \xi ) \in {\mathcal {C}}$ for any $\xi \in On$. We claim that if $\kappa = H ( \xi )$ for some $\xi $, then ${\mathcal {I}}^{-1} ( V_{\kappa } ) \models \mathsf {NBG} + \Pi ^{1}_{\infty } \text {-} \mathrm {Rfn} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Sep}$.

Let $\kappa = H ( \xi )$ for some $\xi $. We will write $\gamma _{j} = G^{j} ( \xi )$ and $\langle V_{\kappa }, s \rangle = {\mathcal {I}}^{-1} ( V_{\kappa } )$. Since $\kappa \in {\mathcal {C}}$ and it is a limit ordinal, we have $\langle V_{\kappa }, s \rangle \models \mathsf {NBG} + \Sigma ^{1}_{\infty } \text {-} \mathrm {Sep}$. Now take any ${\mathcal {L}}_{2}$-formula $\Phi ( x, X )$ with parameters $x \in V_{\kappa }$ and $X \in s$. By definition, there is some $\varphi ( u ) \in ( Fml _{\in }^{\infty } )^{V_{\kappa }}$ ($= Fml _{\in }^{\infty } \cap V_{\kappa }$) such that $X = \{ a \in V_{\kappa } \mid T ( \varphi ( a ) ) \}$. By the definition of $\kappa $, there must be some $n < \omega $ such that $x \in V_{\gamma _{n}}$ and $\varphi ( u ) \in Fml _{\in }^{\infty } \cap V_{\gamma _{n}} = ( Fml _{\in }^{\infty } )^{V_{\gamma _{n}}}$. Now, suppose $\langle V_{\kappa }, s \rangle \models \Phi ( x, X )$, which is equivalent to $V_{\kappa } \models \Phi ^{{\mathcal {I}}} ( x, \varphi )$. Then, since $\kappa \in {\mathcal {C}}$ and $F_{\Phi } ( x, \varphi )< \gamma _{n + 1} < \kappa $, there is some $\eta \in \kappa \cap {\mathcal {C}}$ such that $x, \varphi \in V_{\eta }$ and $V_{\eta } \models \Phi ^{{\mathcal {I}}} ( x, \varphi )$. Let ${\mathcal {I}}^{-1} ( V_{\eta } ) = ( V_{\eta }, s' )$. We have $X \cap V_{\eta } = \{ a \in V_{\eta } \mid T ( \varphi ( a ) ) \} \in s'$ and $\langle V_{\eta }, s' \rangle \models \mathsf {NBG} + \Phi ( x, X \cap V_{\eta } )$. $\square $
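
The bound on $F_{\Phi } ( x, \varphi )$ used in the last paragraph of the proof can be read off from the definitions of F, G, and H; the following chain is a sketch of that bookkeeping, stated with a non-strict last inequality, which already suffices for the argument. Since $\kappa \in {\mathcal {C}}$ and $V_{\kappa } \models \Phi ^{{\mathcal {I}}} ( x, \varphi )$, the minimum in the definition of $F_{\Phi } ( x, \varphi )$ exists, and since $x, \varphi \in V_{\gamma _{n}}$ we have

$$\begin{aligned} F_{\Phi } ( x, \varphi ) \le F ( \gamma _{n} ) < G ( \gamma _{n} ) = \gamma _{n + 1} \le \sup _{j \in \omega } \gamma _{j} = H ( \xi ) = \kappa . \end{aligned}$$

Thus $\eta := F_{\Phi } ( x, \varphi )$ is an element of $\kappa \cap {\mathcal {C}}$ with $x, \varphi \in V_{\eta }$ and $V_{\eta } \models \Phi ^{{\mathcal {I}}} ( x, \varphi )$, which is the ordinal used in the proof.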

7 Conclusion

Some of the results of this paper are summarized in the following diagram:

In the diagram, a solid double arrow “$\Rightarrow $” from a system $\mathsf {S}$ to a system $\mathsf {T}$ means that $\mathsf {S}$ is a subsystem of $\mathsf {T}$ but $\mathsf {T}$ has a higher consistency strength than $\mathsf {S}$; a solid single arrow “$\rightarrow $” from $\mathsf {S}$ to $\mathsf {T}$ means that all the first-order theorems of $\mathsf {S}$ are also theorems of $\mathsf {T}$ but $\mathsf {T}$ has a higher consistency strength than $\mathsf {S}$; a solid single double-headed arrow “$\leftrightarrow $” means that the two systems have the same ${\mathcal {L}}_{\in }$-theorems; a dashed arrow “$\dasharrow $” from $\mathsf {S}$ to $\mathsf {T}$ means that $\mathsf {T}$ is the class-theoretic counterpart of the subsystem $\mathsf {S}$ of second-order arithmetic. The number(s) below each arrow refer(s) to the result(s) of the present paper from which the asserted relation between the systems connected by the arrow follows. This diagram shows that the order of the strengths of systems is quite different between second-order arithmetic and class theory. As we remarked in Remark 1.1, the same holds even when we assume $\mathrm {AC}$ or $\mathrm {GC}$. To conclude the paper, we raise three open problems.

(A)
In second-order arithmetic, we have $\mathsf {ACA}_{0} \vdash \Pi ^{1}_{2} \text {-} \mathrm {RFN} \rightarrow \Sigma ^{1}_{1} \text {-} \mathrm {DC}$. Does the corresponding statement hold in class theory? Namely, is it the case that $\mathsf {NBG} \vdash \Pi ^{1}_{2} \text {-} \mathrm {RFN} \rightarrow \Sigma ^{1}_{1} \text {-} \mathrm {DColl}$?
(B)
Does $\mathsf {ECA}$ prove $\Pi ^{1}_{\infty } \text {-} \mathrm {SRfn}$?
(C)
We have shown that $\Sigma ^{1}_{1} \text {-} \mathsf {Coll}$ and $\Sigma ^{1}_{1} \text {-} \mathsf {DColl}$ prove the same ${\mathcal {L}}_{\in }$-theorems. In second-order arithmetic, we know that $\Sigma ^{1}_{1} \text {-} {\mathsf {A}}{\mathsf {C}}$ and $\Sigma ^{1}_{1} \text {-} {\mathsf {D}}{\mathsf {C}}$ prove the same $\Pi ^{1}_{2}$-sentences. Does the same $\Pi ^{1}_{2}$-conservation hold between $\Sigma ^{1}_{1} \text {-} \mathsf {Coll}$ and $\Sigma ^{1}_{1} \text {-} \mathsf {DColl}$ in class theory?

Notes

As far as the author knows, the current trend of the study of subsystems of ${\mathsf {M}}{\mathsf {K}}$ started with Jäger’s work [13] on Feferman’s Operational Set Theory, in which he introduced a theory $\mathsf {NBG}_{< E_{0}}$ and initiated a proof-theoretic treatment of class theory. The study of subsystems of ${\mathsf {M}}{\mathsf {K}}$ had been by and large driven by proof-theoretic motivations for years since then, but we nowadays find more research on this subject from more purely set-theoretic interest, such as [10].
I only mean the addition of $\mathrm {AC}$ or $\mathrm {GC}$ to formal systems of class theory here. We will also consider Kripke–Platek systems over set theory in Sect. 5, and there are three different ways of adding $\mathrm {AC}$ or $\mathrm {GC}$ to such systems, namely, postulating it only for the ${\mathcal {U}}$-sets, only for the ${\mathcal {S}}$-sets, and both for ${\mathcal {U}}$-sets and ${\mathcal {S}}$-sets. If we count $\mathrm {AC}$ or $\mathrm {GC}$ among the axioms of $\mathsf {NBG}$, we accordingly need to add a corresponding axiom for the ${\mathcal {U}}$-sets to the Kripke–Platek systems; in this case, all the proofs in the present paper can be used to establish the corresponding results with $\mathrm {AC}$ or $\mathrm {GC}$ (for class theory) with no substantial change. In contrast, $\mathrm {AC}$ or $\mathrm {GC}$ for the ${\mathcal {S}}$-sets would make greater difference because they yield a choice or global wellordering on classes in terms of the canonical translation $\star $ (Sect. 5.1.1).
We remark that the acronyms ‘$\Sigma ^{1}_{n} \text {-} \mathrm {Sep}$’ and ‘$\Pi ^{1}_{n} \text {-} \mathrm {Sep}$’ are sometimes used to denote axioms of a completely different kind in the context of second-order arithmetic (e.g., [25]); the class-theoretic axioms corresponding to them are called $\Pi ^{1}_{n} \text {-} \mathrm {Red}$ and $\Sigma ^{1}_{n} \text {-} \mathrm {Red}$ in [23].
We make this stipulation because we will occasionally use the same terms for both systems and axioms (or axiom schemata), such as $\Pi ^{1}_{1} \text {-} {\mathsf {C}}{\mathsf {A}}$ and $\Pi ^{1}_{1} \text {-} \mathrm {CA}$, following the convention. In the present paper, we take a “system” to mean a set of (non-logical) axioms (and not the set of the theorems derivable from the axioms); by a “theory” we mean a set of sentences closed under logical consequence but will not often use this term to avoid confusion. While an axiom is a single sentence, both a schema and a system are a set of sentences, and we do not make a precise distinction between systems and schemata here; hence, precisely speaking, this stipulation is ambiguous, but we believe that it causes no confusion.
$\mathsf {ECA}_{0}^{+}$ is denoted by $\mathsf {NBG}_{\omega }$ in [7].
Recall that we take a “system” to denote a set of axioms (and not the set of the theorems derivable from the axioms); see fn 4.
The following argument can be carried out in both $\mathsf {NBG}$ and $\mathsf {ACA}_{0}$. Suppose $S \models \mathsf {T}$. If the number of logical symbols in the axioms of $\mathsf {T}$ is unbounded, then $S \models \mathsf {T}$ implies the existence of a full satisfaction class X for S, and we can thereby show that $\mathsf {T} \vdash \Phi $ implies $S \models _{X} \Phi [ f, g ]$ for all formulae $\Phi $ and variable assignments f and g (in the notation of Lemma 2.16 below) by induction on the length of derivation. Suppose otherwise. Then, there is a bound m on the number of logical symbols in the axioms of $\mathsf {T}$. Let us write $\mathsf {T} \vdash _{k} \Phi $ when there is a derivation of $\Phi $ from $\mathsf {T}$ in which only formulae with at most k logical symbols occur. We can show by partial cut-elimination that there is some n ($\ge m$) such that $\mathsf {T} \vdash \bot $ implies $\mathsf {T} \vdash _{n} \bot $. Take an n-satisfaction class Y for S. Then, we can show by induction on the length of derivation that $\mathsf {T} \vdash _{n} \Phi $ implies $S \models _{Y} \Phi $ for all $\Phi $ with at most n logical symbols, which entails the consistency of $\mathsf {T}$.
Precisely speaking, $\mathrm {LFP}$ literally asserts the existence of least closed points in terms of [14], but we can show in a parallel manner to [14, Lemma 2] that each least closed point of an X-positive elementary formula is a least fixed-point of the same formula provably in $\mathsf {NBG}$. The converse also holds provably in $\mathsf {LFP}_{0}$, which can be shown in a parallel manner to [14, Theorem 3], but the proof crucially makes use of class parameters allowed in the schema $\mathrm {LFP}$, and I do not know whether the converse in question also holds in $\mathsf {LFP}_{0}^{-}$ (or even weaker systems such as $\mathsf {NBG}$).
Let $\Pi _{n}^{\mathrm {ID}}$ denote the collection of ${\mathcal {L}}_{\mathrm {ID}}$-formulae corresponding to $\Pi _{n}$ in the Lévy hierarchy, in which the new vocabulary $J_{{\mathcal {A}}}$s are counted in $\Pi _{0}^{\mathrm {ID}}$. The ${\mathcal {L}}_{\mathrm {ID}}$-system ${\mathsf {I}}{\mathsf {D}}_{1} \! \! \upharpoonright _{n}$ is obtained from ${\mathsf {I}}{\mathsf {D}}_{1}$ by restricting the formulae $\Psi $ appearing in $\mathrm {ID2}$ (also known as the principle of fixed-point induction) to $\Pi _{n}^{\mathrm {ID}}$-formulae. It is known that ${\mathsf {I}}{\mathsf {D}}_{1} \! \! \upharpoonright _{n}$ is stronger than ${\mathsf {I}}{\mathsf {D}}_{1} \! \! \upharpoonright _{m}$ over arithmetic if $n > m \ge 2$, but nothing similar can be said about these systems over set theory: since $\widehat{{\mathsf {I}}{\mathsf {D}}}_{1}$ and ${\mathsf {I}}{\mathsf {D}}_{1} \! \upharpoonright _{2}$ have the same theorems over set theory, so do ${\mathsf {I}}{\mathsf {D}}_{1} \upharpoonright _{n}$ and ${\mathsf {I}}{\mathsf {D}}_{1} \upharpoonright _{m}$ for any $m, n \ge 2$; cf. Remark 5.20.
The anonymous referee suggests that $\Pi ^{1}_{2} \text {-} \mathrm {RFN}$, rather than $\Sigma ^{1}_{1} \text {-} \mathrm {DColl}$, should be called the class-theoretic counterpart of $\Sigma ^{1}_{1}$ dependent choice, which is another reasonable option. We do not yet know if $\Pi ^{1}_{2} \text {-} \mathrm {RFN}$ is equivalent to $\Sigma ^{1}_{1} \text {-} \mathrm {DColl}$ in $\mathsf {NBG}$, while we know that the latter implies the former in $\mathsf {NBG}$ (Lemma 4.9) and that they have the same ${\mathcal {L}}_{\in }$-theorems (Theorem 5.23).
Krähenbühl considered two equivalent formulations of $\Sigma ^{1}_{n}$ dependent choice for class theory, $\Sigma ^{1}_{n} \text {-} \mathrm {DC}'$ and $\Sigma ^{1}_{n} \text {-} \mathrm {DC}''$, which will be defined below. These two and my formulation ($\Sigma ^{1}_{n} \text {-} \mathrm {DC}$) are equivalent, but they imply $\mathrm {GC}$. We show in Proposition 2.30 that the choice-less “collection” version $\Sigma ^{1}_{n} \text {-} \mathrm {DColl}'$ of $\Sigma ^{1}_{n} \text {-} \mathrm {DC}'$ is equivalent to $\Sigma ^{1}_{n} \text {-} \mathrm {DColl}$, but we do not know whether the “collection” version of $\Sigma ^{1}_{n} \text {-} \mathrm {DC}''$ is also equivalent to the other two.
We actually need one extra (but easy) step to prove the implication from $\Sigma ^{1}_{n + 1} \text {-} \mathrm {DC}'$ to $\Sigma ^{1}_{n + 1} \text {-} \mathrm {DC}$: if $\forall x \forall X \exists Y \Phi ( x, X, Y )$, then $\Sigma ^{1}_{n + 1} \text {-} \mathrm {DC}'$ yields a class W with $\Phi ( z, ( W )^{z}, ( W )_{x} )$ for all z and $x = V_{{\mathsf {r}}{\mathsf {k}} ( z )} \cup \{ z \}$ in the same manner as Proposition 2.30; then, we further construct a class Z from W (by $\mathrm {ECA}$) so that $( Z )_{z} := ( W )_{x}$ for $x = V_{{\mathsf {r}}{\mathsf {k}} ( z )} \cup \{ z \}$.
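The last step of this footnote can be written out explicitly. This is only a sketch under an assumption that is not taken from the paper: we suppose that the section $( W )_{x}$ is coded as $\{ y \mid \langle x, y \rangle \in W \}$, and the pairing convention actually used may differ.

$$\begin{aligned} Z := \bigl \{ \langle z, y \rangle \mid \exists x \bigl ( x = V_{{\mathsf {r}}{\mathsf {k}} ( z )} \cup \{ z \} \wedge \langle x, y \rangle \in W \bigr ) \bigr \} . \end{aligned}$$

The defining formula is elementary with the class parameter W, so $\mathrm {ECA}$ yields Z, and under the stated coding we have $( Z )_{z} = ( W )_{x}$ for $x = V_{{\mathsf {r}}{\mathsf {k}} ( z )} \cup \{ z \}$. Note that the assignment $z \mapsto V_{{\mathsf {r}}{\mathsf {k}} ( z )} \cup \{ z \}$ is injective, since z is the only element of x of rank ${\mathsf {r}}{\mathsf {k}} ( z )$.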
In fact, $\Sigma ^{1}_{1} \text {-} \mathsf {DColl}_{0}$ derives $\mathrm {ETR} ( X )$ for some class wellorderings whose “order-types” are greater than $\Omega $. We do not get into the details here because it requires a notation system of class wellorderings, such as Jäger’s [13] for those below $E_{0}$, but we note that Krähenbühl [18] showed that, under the assumption of $\mathrm {GC}$, $\Sigma ^{1}_{1} \text {-} {\mathsf {D}}{\mathsf {C}}_{0}$ has the same $\Pi ^{1}_{2}$-theorems as $\mathsf {NBG} + \bigcup _{n \in \omega } \mathrm {ETR} ( \Omega ^{n} )$ does; we conjecture that the same holds for $\Sigma ^{1}_{1} \text {-} \mathsf {DColl}_{0}$ (without assuming $\mathrm {GC}$).
The assumption of $\mathrm {AC}$ is only necessary here in picking such $\kappa $ with ${\mathsf {c}}{\mathsf {f}} ( \kappa ) > \omega $.
In fact, we can show that $\mathsf {ECA} + \mathrm {AC}$ is equiconsistent with $\mathsf {ECA}$, and Theorem 3.11 follows from this equiconsistency and Lemma 3.9. However, the proof of the equiconsistency is more involved than the direct proof of Theorem 3.11 given below, and we leave it (and some other equiconsistency results about $\mathrm {AC}$) for another paper.
In the literature, such as [7], the codes of ${\mathcal {L}}_{\in }^{\infty }$-formulae are expressed with the Quine brackets in such a way as $\ulcorner \varphi \urcorner $, but we don’t follow this convention for simplicity.
The relation $\prec $ defined here is denoted by $\prec _{0}$ with the subscript “0” in [7], but we suppress it because we need not consider iteration of typed truths in the present paper.
There are several different ways of proving this. For instance, since $\mathsf {LFP}_{0}$ has the same theorems as $\Pi ^{1}_{1} \text {-} {\mathsf {C}}{\mathsf {A}}_{0}$ does, it proves the existence of a coded $\beta $-model, which is automatically a model of $\Pi ^{1}_{\infty } \text {-} \mathsf {RFN}$ and thus $\Pi ^{1}_{3} \text {-} \mathsf {RFN}_{0}$ in particular; to see another example of a proof, we note that $\Pi ^{1}_{\infty } \text {-} \mathsf {RFN}_{0}$ and $\Pi ^{1}_{\infty } \text {-} {\mathsf {T}}{\mathsf {I}}$ have the same theorems (see [16]), and that they are proof-theoretically equivalent to ${\mathsf {I}}{\mathsf {D}}_{1}$ and thus $\mathsf {LFP}_{0}^{-}$, whose consistency is known to be provable in $\mathsf {LFP}_{0}$.
In second-order arithmetic, $\Pi ^{1}_{1} \text {-} {\mathsf {C}}{\mathsf {A}}_{0}$ derives $\Sigma ^{1}_{1} \text {-} \mathrm {DC}$ (see [25, Lemma VII.6.6.3 and Theorem VII.6.9.4]), and $\mathsf {LFP}_{0}$ and $\Pi ^{1}_{1} \text {-} {\mathsf {C}}{\mathsf {A}}_{0}$ have the same theorems.
Gitman, Hamkins, and Johnstone constructed a model of ${\mathsf {M}}{\mathsf {K}} + \mathrm {GC}$ in which the axiom of $\Sigma ^{1}_{1}$ choice fails; this axiom is equivalent to $\Sigma ^{1}_{1} \text {-} \mathrm {Coll}$ under the assumption of $\mathrm {GC}$ (see [15]).
We remark that ${\mathsf {S}}{\mathsf {C}}_{1}$ is defined differently in [8], where it is defined as ${\mathsf {I}}{\mathsf {D}}_{1} + {\mathcal {L}}_{\mathrm {SC}} \text {-} \mathrm {Sep} + {\mathcal {L}}_{\mathrm {SC}} \text {-} \mathrm {Repl}$ augmented with $\mathrm {SC1}$ and $\mathrm {SC2}$, but the two definitions result in the same theory: Fact 5.1.1 is easily proved in ${\mathsf {S}}{\mathsf {C}}_{1}$ with the current definition, and thereby we can show by induction along $\prec _{{\mathcal {A}}}$ (using $\mathrm {SC2}$) that ${\mathsf {S}}{\mathsf {C}}_{1}$ proves the schema $\mathrm {ID2}$ (Sect. 2.5) extended for ${\mathcal {L}}_{\mathrm {SC}}$.
Precisely, the well-foundedness of a tree $( H )_{a}$ is expressed as $\langle \epsilon , a \rangle \in Acc ( \sqsubset )$, where $\epsilon $ denotes the empty sequence, and we can show in the same manner as Proposition 2.3 (using the properties of $ Acc ( \sqsubset )$ shown in [8, §7]) that $\langle \epsilon , a \rangle \in Acc ( \sqsubset )$ is equivalent to the well-foundedness of $( H )_{a}$ in the sense of Proposition 2.3, for all trees $( H )_{a}$ with $a \in I$.
Under the assumption of $( \mathrm {Prj} )$, we can easily construct a $\Delta ^{p}$-definable (in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{\min }_{p}$) partial surjection from $\mathsf {V}$ onto ${\mathcal {S}} \cup {\mathcal {U}}$. Hence, the entire domain of any model ${\mathfrak {M}}$ of ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}_{p}$ is projectible on $\mathsf {V}^{{\mathfrak {M}}}$ in the sense of [19, Ch.9] and projectible into $\mathsf {V}^{{\mathfrak {M}}}$ in the sense of [3, Ch.V].
As the following proof indicates, $( \Pi ^{1}_{0} \text {-} \mathrm {DColl} )^{\star }$ is actually provable in ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{\min }_{p} + {\mathcal {S}}\! = \! L ^{{\mathcal {S}}} ({\mathsf {V}} )+ \Sigma _{1}^{p} \text {-} \mathrm {Found}_{1} + \Sigma _{1}^{p} \text {-} \mathrm {Found}_{0}^{+} + \Delta _{0}^{p} \text {-} \mathrm {Sep}_{0}^{+} + \Delta _{0}^{p} \text {-} \mathrm {Repl}_{0}^{+}$, in other words, ${{\mathsf {K}}}{{\mathsf {P}}}{\mathbb {V}}^{1} + ( \mathrm {Prj} ) + {\mathcal {S}}= L ^{{\mathcal {S}}} ({\mathsf {V}} )$ (see Sect. 5.3 for its definition) with all its axiom schemata extended for ${\mathcal {L}}_{\mathrm {KP}} ( Pr )$.
The aim of the recursive definition here is to define $\upsilon _{\alpha }$ and $X_{\alpha }$; $\sigma _{\alpha }$ and $\mu _{\alpha }$ only play supplementary roles here, and their definitions could be incorporated into the definitions of the other two; we define them separately only for the sake of readability.
$\mathsf {KPu}$ is $\mathsf {KPU}^{+}$ [3, Ch. 1.2] with ${\mathsf {P}}{\mathsf {A}}$ as the theory of urelements augmented with a constant $\mathsf {N}$ for the set of urelements (viz., natural numbers) and the extended arithmetical induction schema for the entire language.
The axioms V1 and V2 jointly express that the set $ V $ is “inaccessible.” The inaccessibility in this sense is different from the inaccessibility in the sense we mean when we talk about $\mathsf {KPi}$ (see [20]) and the relevant systems. The axiom of inaccessibility included in $\mathsf {KPi}$ asserts that the entire universe (not any particular set) is inaccessible and that it is inaccessible in the sense of recursive inaccessibility. We can consider a Kripke–Platek set theory over ${\mathbb {V}}$ corresponding to $\mathsf {KPi}$, but the resulting theory is weaker than ${\mathsf {M}}{\mathsf {K}}$.
We use the uppercase “$ B $” instead of the lowercase “$\beta $”, since the term “$\beta $-model” is usually used to mean a model that is correct about well-foundedness, which is no longer equivalent to a model that is $\Sigma ^{1}_{1}$-correct in class theory.
The case $n = 0$ is an anomalous case because of the condition “$S \models \mathsf {NBG}$,” and both $\Pi ^{1}_{0} \text {-} \mathrm {RFN}^{+}$ and $\Pi ^{1}_{0} \text {-} \mathrm {RFN}^{B}$ are equivalent to $\mathrm {ACA}_{0}^{+}$ in $\mathsf {ACA}_{0}$ in second-order arithmetic, and, similarly, they are both equivalent to $\mathsf {ECA}_{0}^{+}$ in $\mathsf {NBG}$ in class theory by Lemma 2.21.
If we move the first “$\forall X$” to the place after “$( \exists \beta > \alpha )$”, namely, if we change it to
$$\begin{aligned} \forall \alpha ( \exists \beta > \alpha ) \forall X ( \forall x \in V_{\beta } ) \bigl ( \Phi ( x, X ) \leftrightarrow \langle V_{\beta }, V_{\beta + 1} \rangle \models \Phi ( x, X \cap V_{\beta } ) \bigr ), \end{aligned}$$
then this new principle is just inconsistent, since, for every $\kappa $, there are classes X and Y such that $X \ne Y$ (i.e., $\lnot \forall x ( x \in X \leftrightarrow x \in Y )$) but $X \cap V_{\kappa } = Y \cap V_{\kappa }$ (i.e., $( \forall x \in V_{\kappa } ) ( x \in X \leftrightarrow x \in Y )$).
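One concrete witness for this inconsistency is the following. It is an illustrative choice, not taken from the paper, and it assumes that the elementary formula $\Phi ( x, X ) :\equiv \exists y ( y \in X )$ is an admissible instance of the schema. Let $\beta > 0$ be given by the modified principle for $\alpha = 0$, and put $X := \{ \beta \}$. Since $\beta \notin V_{\beta }$, we have:

$$\begin{aligned} \Phi ( \emptyset , X ) \text { holds}, \qquad X \cap V_{\beta } = \emptyset , \qquad \langle V_{\beta }, V_{\beta + 1} \rangle \not \models \Phi ( \emptyset , X \cap V_{\beta } ) . \end{aligned}$$

As $\emptyset \in V_{\beta }$, this contradicts the biconditional for $x = \emptyset $. The choice of X depends on $\beta $, which is possible only because “$\forall X$” now comes after “$( \exists \beta > \alpha )$”.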
It is easily observed that $\Pi ^{1}_{n + 1} \text {-} \mathrm {Indes}$ implies the existence of a $\Pi ^{1}_{n}$-indescribable cardinal, since $\Pi ^{1}_{n}$-indescribability is $\Pi ^{1}_{n + 1}$-expressible.
In the terminology of [5], Proposition 6.6.2 means that $\Pi ^{1}_{1} \text {-} \mathrm {Indes}^{-}$ implies (in $\mathsf {NBG}$) the existence of a v-inaccessible cardinal, which becomes a strongly inaccessible cardinal (in the ordinary definition in terms of cardinal exponentiation) under the assumption of $\mathrm {AC}$ (which is needed only for defining the very notion of a strongly inaccessible cardinal). Hence, under $\mathrm {AC}$, $\Pi ^{1}_{1} \text {-} \mathrm {Indes}^{-}$ entails the existence of a strongly inaccessible cardinal.
In fact, we also have $\mathsf {NBG} + \Pi ^{1}_{n + 1} \text {-} \mathrm {RFN}^{+} \vdash Con ( \mathsf {ECA} + \Pi ^{1}_{n} \text {-} \mathrm {RFN}^{B} )$, but the proof, which requires new arguments for strong systems of classes, is left for another occasion. We conjecture that $\mathsf {NBG} + \Pi ^{1}_{n + 1} \text {-} \mathrm {SRfn} \vdash Con ( \mathsf {ECA} + \Pi ^{1}_{n} \text {-} \mathrm {SRfn} )$.

References

1. Afshari, B., Rathjen, M.: Reverse mathematics and well-ordering principles: a pilot study. Ann. Pure Appl. Logic 160, 231–237 (2009)
2. Avigad, J.: On the relationships between $\mathit{ATR}_{0}$ and ${\widehat{ID}}_{< \omega }$. J. Symb. Logic 61(3), 768–779 (1996)
3. Barwise, J.: Admissible Sets and Structures. Springer, Berlin (1975)
4. Barwise, J., Gandy, R., Moschovakis, Y.: The next admissible set. J. Symb. Logic 36, 108–120 (1971)
5. Blass, A., Dimitriou, I.M., Löwe, B.: Inaccessible cardinals without the axiom of choice. Fundam. Math. 194 (2007)
6. Drake, F.R.: Set Theory: An Introduction to Large Cardinals. North-Holland, Amsterdam (1974)
7. Fujimoto, K.: Classes and truths in set theory. Ann. Pure Appl. Logic 163, 1484–1523 (2012)
8. Fujimoto, K.: Truths, inductive definitions, and Kripke–Platek systems over set theory. J. Symb. Logic 83, 868–898 (2018)
9. Hájek, P., Pudlák, P.: Metamathematics of First-Order Arithmetic. Springer, Berlin (1993)
10. Hamkins, J.D., Woodin, W.H.: Open class determinacy is preserved by forcing. arXiv preprint arXiv:1806.11180 (2018)
11. Jäger, G.: Zur Beweistheorie der Kripke-Platek-Mengenlehre über den natürlichen Zahlen. Arch. Math. Logik Grundlagenforsch. 22, 121–139 (1982)
12. Jäger, G.: Theories for Admissible Sets: A Unifying Approach to Proof Theory. Bibliopolis, Naples (1986)
13. Jäger, G.: Full operational set theory with unbounded existential quantification and power set. Ann. Pure Appl. Logic 160, 33–52 (2009)
14. Jäger, G.: Short note: least fixed points versus least closed points. Arch. Math. Logic 60, 831–835 (2021)
15. Jäger, G., Krähenbühl, J.: $\Sigma ^{1}_{1}$ choice in a theory of sets and classes. In: Schindler, R. (ed.) Ways of Proof Theory, pp. 283–314. Ontos Verlag, Frankfurt (2010)
16. Jäger, G., Strahm, T.: Bar induction and $\omega $ model reflection. Ann. Pure Appl. Logic 97, 221–230 (1999)
17. Kanamori, A.: The Higher Infinite, 2nd edn. Springer, Berlin (2003)
18. Krähenbühl, J.: On the relationship between choice schemes and iterated class comprehension in set theory. PhD thesis, Universität Bern (2011)
19. Moschovakis, Y.: Elementary Induction on Abstract Structures. Number 77 in Studies in Logic and the Foundation of Mathematics. North Holland, Amsterdam (1974)
20. Pohlers, W.: Subsystems of set theory and second order number theory. In: Buss, S. (ed.) Handbook of Proof Theory, pp. 209–336. Elsevier, Amsterdam (1998)
21. Rathjen, M.: Fragments of Kripke–Platek set theory with infinity. In: Aczel, P., Simmons, H., Wainer, S. (eds.) Proof Theory, pp. 252–273. Cambridge University Press, Cambridge (1992)
22. Sato, K.: Relative predicativity and dependent recursion in second-order set theory and higher-order theories. J. Symb. Logic 79, 712–732 (2014)
23. Sato, K.: Full and hat inductive definitions are equivalent in $\mathit{NBG}$. Arch. Math. Logic 54, 75–112 (2015)
24. Simpson, S.G.: $\Sigma ^{1}_{1}$ and $\Pi ^{1}_{1}$ transfinite induction. In: Van Dalen, D., Lascar, D., Smiley, T.J. (eds.) Logic Colloquium ’80, Studies in Logic and the Foundation of Mathematics, pp. 239–253. North-Holland (1982)
25. Simpson, S.G.: Subsystems of Second Order Arithmetic. Cambridge University Press, Cambridge (2009)

Author information

Authors and Affiliations

University of Bristol, Bristol, UK
Kentaro Fujimoto

Corresponding author

Correspondence to Kentaro Fujimoto.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The author is very grateful to Kentaro Sato for valuable discussions on various topics related to the present paper. He would also like to thank Victoria Gitman for informing him of her joint work with Joel Hamkins and Thomas Johnstone, and Gerhard Jäger and Philip Welch for helpful comments. Lastly, he is most thankful to the anonymous referee for their extraordinarily careful reading, meticulous feedback, and constructive suggestions.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Fujimoto, K. A few more dissimilarities between second-order arithmetic and set theory. Arch. Math. Logic 62, 147–206 (2023). https://doi.org/10.1007/s00153-022-00829-3

Received: 04 November 2018
Accepted: 07 February 2022
Published: 09 July 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s00153-022-00829-3

Mathematics Subject Classification

03E30
