A few more dissimilarities between second-order arithmetic and set theory

Second-order arithmetic and class theory are second-order theories of mathematical subjects of foundational importance, namely, arithmetic and set theory. Despite the similarity in appearance, there turned out to be significant mathematical dissimilarities between them. The present paper studies various principles in class theory, from such a comparative perspective between second-order arithmetic and class theory, and presents a few new dissimilarities between them.


Introduction
The study of second-order set theory, also known as class theory, has recently been reinvigorated by a variety of motivations. In particular, the development of ordinal analysis and reverse mathematics brings new perspectives and techniques to the study of subsystems of Morse-Kelley theory MK, which has been a driving force of the recent trend of research on class theory. 1 In the course of the recent development of class theory, several significant dissimilarities between second-order arithmetic and class theory have been discovered. Both second-order arithmetic and class theory are second-order theories of mathematical subjects of foundational importance, namely, arithmetic and set theory. However, it turned out that the class-theoretic counterparts of some important theorems in second-order arithmetic fail in class theory, and some powerful tools and techniques in second-order arithmetic are not available in class theory. The existence of such dissimilarities, at the same time, draws attention to "non-trivial" similarities: even if the standard or known proofs of some theorems in second-order arithmetic are no longer valid in class theory, the corresponding theorems can sometimes be proven in class theory by different types of proofs. In the present paper, we present a few new such dissimilarities and "non-trivial" similarities.

1 As far as the author knows, the current trend of the study of subsystems of MK started with Jäger's work [13] on Feferman's Operational Set Theory, in which he introduced a theory $\mathrm{NBG}_{<E_0}$ and initiated a proof-theoretic treatment of class theory. The study of subsystems of MK was by and large driven by proof-theoretic motivations for years after that, but nowadays we find more research on this subject driven by purely set-theoretic interest, such as [10].

The author is very grateful to Kentaro Sato for valuable discussions on various topics related to the present paper. He would also like to thank Victoria Gitman for informing him of her joint work with Joel Hamkins and Thomas Johnstone, and Gerhard Jäger and Philip Welch for helpful comments. Lastly, he is most thankful to the anonymous referee for their extraordinarily careful reading, meticulous feedback, and constructive suggestions.

Kentaro Fujimoto (kentaro.fujimoto@bristol.ac.uk), University of Bristol, Bristol, UK
It is well known that the schema of ω-model reflection is equivalent to the schema of transfinite induction (also known as the schema of bar induction). In the notation of the standard textbook [25], the system $\Pi^1_\infty$-RFN of ω-model reflection and the system $\Pi^1_\infty$-TI of transfinite induction (also referred to as $\Pi^1_\infty$-BI in [16]) have exactly the same theorems in the language of second-order arithmetic. They are also proof-theoretically equivalent to the first-order system $\mathrm{ID}_1$ of inductive definitions as well as its second-order counterpart $\mathrm{LFP}^-_0$ (also referred to as $(\mathrm{ID}^2_1)_0$ in [20]). Transfinite induction concerns the notion of well-foundedness, and, while the notion of well-foundedness is $\Pi^1_1$-complete in second-order arithmetic, it is only elementary in class theory. Hence, the notion of well-foundedness is less robust in class theory than in second-order arithmetic, and the class-theoretic counterpart of $\Pi^1_\infty$-TI is naturally expected to be significantly weaker in the context of class theory than it is in the context of second-order arithmetic. The present paper shows that this is indeed the case.
In the present paper, we will mainly study the class-theoretic counterparts of the systems of transfinite induction and the ω-reflection principle, as well as some related principles. We will show that transfinite induction is quite a weak principle in class theory, as expected, and is not equivalent to the class-theoretic counterpart of ω-reflection. In fact, as we will show, the aforementioned three systems $\Pi^1_\infty$-RFN, $\Pi^1_\infty$-TI, and $\mathrm{ID}_1$ are pairwise inequivalent in class theory, while they are all equivalent in second-order arithmetic. In addition, among other results, we will give an analysis of the class-theoretic counterpart of the axiom of $\Sigma^1_1$ dependent choice, which we call $\Sigma^1_1$ dependent collection, in relation to subsystems of $\Pi^1_\infty$-RFN. As a corollary, we will obtain an alternative proof of Sato's theorem [22] that the class-theoretic counterpart ETR of the system of arithmetical transfinite recursion ATR is weaker than the class-theoretic counterpart of the system of $\Sigma^1_1$ choice. $\Sigma^1_1$ dependent collection will also be shown to be proof-theoretically equivalent to $\Pi^1_2$-RFN in class theory, but the proof is quite different from the known proof of the equivalence of their second-order arithmetical counterparts. To conclude the paper, we will briefly consider alternative types of reflection principles.

Remark 1.1 Neither the Axiom of Choice (AC) nor Global Choice (GC) is counted as a default axiom of class theory in the present paper; in particular, neither is included among the axioms of the Von Neumann-Bernays-Gödel theory NBG. However, the addition of AC or GC to NBG does not affect any of the proofs in the present paper, although assuming them would make some proofs simpler, and all the results on $\mathcal{L}_\in$-conservation, relative consistency, etc., in the present paper concerning class theory still hold even when we assume either AC or GC. 2

Basic systems
Let $\mathcal{L}_\in$ be the language of first-order set theory. The language $\mathcal{L}_2$ of second-order set theory, i.e., class theory, is a two-sorted language with variables $x, y, z, \ldots$ of the first sort ("first-order") and variables $X, Y, Z, \ldots$ of the second sort ("second-order"), whose non-logical symbols are a binary membership predicate $\in_{\mathrm{set}}$ between first-order entities and another binary membership predicate $\in_{\mathrm{class}}$ between first- and second-order entities. We assume that $\mathcal{L}_2$ possesses the equality symbol $=$ as a logical symbol only for first-order entities, i.e., sets, and the equality between two classes $Y$ and $Z$ is definitionally introduced by putting $Y = Z :\Leftrightarrow \forall z\,(z \in_{\mathrm{class}} Y \leftrightarrow z \in_{\mathrm{class}} Z)$. The relation $Y = Z$ thus defined is a congruence relation allowing substitution salva veritate. The equality and subset relations between sets and classes are defined in the obvious manner: $x = X :\Leftrightarrow \forall z\,(z \in_{\mathrm{set}} x \leftrightarrow z \in_{\mathrm{class}} X)$; $x \subset X :\Leftrightarrow \forall z\,(z \in_{\mathrm{set}} x \rightarrow z \in_{\mathrm{class}} X)$. For simplicity, we will identify $\in_{\mathrm{set}}$ and $\in_{\mathrm{class}}$ throughout the present paper whenever there is no danger of confusion.
For each natural number $n$, we standardly define collections $\Sigma^0_n$, $\Pi^0_n$, $\Sigma^1_n$, and $\Pi^1_n$ of $\mathcal{L}_2$-formulae in obvious analogy with those in second-order arithmetic: we start by identifying $\Sigma^0_0$ and $\Pi^0_0$ with the collection of $\mathcal{L}_2$-formulae with only bounded first-order quantifiers and no second-order quantifiers, which may contain free second-order variables as parameters ("class parameters"), and also identifying $\Sigma^1_0$ and $\Pi^1_0$ with the collection of elementary formulae, namely, formulae with no second-order quantifiers but possibly with class parameters; then, $\Sigma^0_{n+1}$, $\Pi^0_{n+1}$, $\Sigma^1_{n+1}$, and $\Pi^1_{n+1}$ are defined from $\Sigma^0_n$, $\Pi^0_n$, $\Sigma^1_n$, and $\Pi^1_n$ in the usual manner in terms of alternations of universal and existential quantifiers. We write $\Sigma^1_\infty = \bigcup_n \Sigma^1_n$ and $\Pi^1_\infty = \bigcup_n \Pi^1_n$. Given an $\mathcal{L}_2$-system T, we call an $\mathcal{L}_2$-formula a $\Delta^i_n$-formula ($i = 0, 1$) in T when it is equivalent in T both to some $\Sigma^i_n$-formula and to some $\Pi^i_n$-formula. In what follows, we occasionally abuse the notation and say that an $\mathcal{L}_2$-formula is $\Sigma^1_n$ or $\Pi^1_n$ when it is equivalent to some $\Sigma^1_n$- or $\Pi^1_n$-formula, respectively, in the system in question. By means of ordered pairs, we can show (in, say, NBG) that for all $\Sigma^1_n$-formulae $\varphi$, the result of prefixing a first-order or second-order existential quantifier, i.e., $\exists x\,\varphi$ or $\exists X\,\varphi$, is equivalent to a $\Sigma^1_n$-formula; the dual holds for $\Pi^1_n$.

2 I only mean the addition of AC or GC to formal systems of class theory here. We will also consider Kripke-Platek systems over set theory in Sect. 5, and there are three different ways of adding AC or GC to such systems, namely, postulating it only for the U-sets, only for the S-sets, or for both the U-sets and the S-sets. If we count AC or GC among the axioms of NBG, we accordingly need to add a corresponding axiom for the U-sets to the Kripke-Platek systems; in this case, all the proofs in the present paper can be used to establish the corresponding results with AC or GC (for class theory) with no substantial change. In contrast, AC or GC for the S-sets would make a greater difference because they yield a choice function or global well-ordering on classes in terms of the canonical translation (Sect. 5.1.1).

The Von Neumann-Bernays-Gödel class theory NBG consists of the standard first-order set-theoretic axioms of extensionality, pairing, union, powerset, and infinity, and the following four axioms regarding classes.
ECA: $\exists X\,\forall x\,(x \in X \leftrightarrow \varphi(x))$, for all elementary $\varphi$ with $X$ not free,
Class Separation: $\forall X\,\forall x\,\exists y\,(y = x \cap X)$,
Class Foundation: $\forall X\,[X \neq \emptyset \rightarrow (\exists x \in X)(\forall y \in x)(y \notin X)]$,

where ECA is an acronym for the Elementary Comprehension Axiom, and the (unique) class $X$ satisfying $\forall x\,(x \in X \leftrightarrow \varphi(x))$ will be denoted by $\{x \mid \varphi(x)\}$; furthermore, $x \cap X := \{z \mid z \in x \wedge z \in X\}$, $\mathrm{Fun}(X)$ expresses "$X$ is a function", and $X''x := \{u \mid (\exists v \in x)\,\langle v, u\rangle \in X\}$ (i.e., the image of $x$ under $X$). It is well known that NBG is finitely axiomatizable. Note that, as remarked above, neither the Axiom of Choice (AC) nor Global Choice (GC) is included among the axioms of NBG.
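The text announces four class axioms, and its closing clause explains $\mathrm{Fun}(X)$ and the image $X''x$, which are the ingredients of a replacement axiom for classes. As a minimal sketch, assuming the usual NBG formulation (this is our reading, not necessarily the author's exact display), that axiom can be written as:

```latex
% Class Replacement, in its usual NBG form (an assumption, not the paper's own display):
% the image of a set x under a class function X is again a set.
\[
  \mathrm{Class\ Replacement}:\quad
  \forall X\,\forall x\,\bigl[\,\mathrm{Fun}(X)\rightarrow\exists y\,(y = X''x)\,\bigr]
\]
```

Here $y = X''x$ is read via the equality between sets and classes defined earlier.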
We will occasionally consider adding the following axiom schemata to NBG:

where $\varphi$ and $\psi$ are any $\Sigma^1_n$- and $\Pi^1_n$-formula, respectively, with neither $X$ nor $w$ free; these axiom schemata for $\Pi^1_n$-formulae, such as $\Pi^1_n$-CA, are defined similarly. 3 Throughout the present paper, we stipulate that, whenever we define axioms, the universal closures of the displayed formulae are taken as the defined axioms; hence, $\varphi$ and $\psi$ above may contain other free variables, unless otherwise specified, and the names of the axioms above precisely mean the universal closures of the displayed formulae. All the axioms listed above are underivable from NBG for $n > 0$, whereas they are all derivable from NBG for $n = 0$; note that ECA is just the same as $\Sigma^1_0$-CA.

3 We remark that the acronyms '$\Sigma^1_n$-Sep' and '$\Pi^1_n$-Sep' are sometimes used to denote axioms of a completely different kind in the context of second-order arithmetic (e.g., [25]); the class-theoretic axioms corresponding to them are called $\Sigma^1_n$- and $\Pi^1_n$-Red in [23].

Proposition 2.1 The following are provable in NBG.
Proof Claim 1 is obvious. For claim 2, suppose $\neg\varphi(x)$ for some $\Sigma^1_n$ (or $\Pi^1_n$) formula $\varphi$ and some $x$. Take $a := \{z \in \mathrm{TC}(\{x\}) \mid \neg\varphi(z)\}$ ($\neq \emptyset$) by $\Pi^1_n$-Sep ($\Sigma^1_n$-Sep, resp.), where $\mathrm{TC}(y)$ denotes the transitive closure of $y$. By the foundation axiom, there is an $\in$-minimal $y \in a$. Hence, we have $\neg\varphi(y)$ but $\varphi(z)$ for all $z \in y$.
In analogy with second-order arithmetic, NBG corresponds to the system $\mathrm{ACA}_0$ of arithmetical comprehension, and (in one view) NBG + $\Sigma^1_\infty$-Sep + $\Sigma^1_\infty$-Repl corresponds to ACA. We keep the nomenclature NBG, following the long-standing convention, but we call the latter system ECA in this analogy with second-order arithmetic. We will also consider the following systems:

We will use sans serif fonts to denote systems and normal fonts for axioms and axiom schemata. 4

For all natural numbers $n \geq 1$ and $i, j \geq 0$, there is known to be a $\Sigma^0_n$ universal formula $\pi^0_{n,i,j}(v, u_1, \ldots, u_i, U_1, \ldots, U_j)$ with only the displayed variables free such that, for all $\Sigma^0_n$-formulae $\varphi(u_1, \ldots, u_i, U_1, \ldots, U_j)$ with only the displayed variables free, there is a (standard) natural number $e$ such that

Proof Fix any $\vec z$ and $\vec Z$. We can assume without loss of generality that $\varphi(\vec z, Y, \vec Z)$ is in the following prenex normal form:

where $\theta$ is $\Delta^0_0$. By meta-induction on $k$, $\varphi(\vec z, Y, \vec Z)$ is shown to be equivalent to

where $G_1, \ldots, G_k$ are distinct from each other and from $Y$ and $\vec Z$. By contracting the $G_i$'s into a single class (and the $x_i$'s and $y_i$'s into single sets, respectively), we have a $\Sigma^0_2$

We finally obtain the claim by contracting $G$ and $Y$ into one class.
This implies that, for each n > 0 and 1 where π = π 0 2,1,n+1 , if n is even, and π = ¬π 0 2,1,n+1 , if n is odd. Hence, for each n > 0, we have a 1 n universal formula in NBG, which will be denoted by π 1 n (e, x, X ). Note that we don't have to consider universal formulae containing more first-and/or second-order free variables because we can always contract multiple free variables into one variable by pairing in NBG. As a result, all the axioms listed above are finitely axiomatizable modulo NBG.

Well-foundedness and transfinite recursion
For a class $X$, we will write $x \prec_X y$ for $\langle x, y\rangle \in X$. We define a formula $\mathrm{Wf}(X)$ expressing the well-foundedness of $\prec_X$ and a formula $\mathrm{TI}_\varphi(X)$ asserting transfinite induction along $\prec_X$ with respect to $\varphi(x) \in \mathcal{L}_2$ (possibly with parameters) as follows:

The symbol $V$ above denotes the universe of sets, namely, the class $\{x \mid x = x\}$. For a collection $\Gamma$ of $\mathcal{L}_2$-formulae, we define the schema $\Gamma$-TI as follows:

Thereby we define the systems of $\Pi^1_n$ transfinite induction as follows ($n \in \mathbb{N}$): $\Pi^1_n\text{-}\mathsf{TI}_0 := \mathsf{NBG} + \Pi^1_n\text{-TI}$ and $\Pi^1_n\text{-}\mathsf{TI} := \mathsf{ECA} + \Pi^1_n\text{-TI}$.
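For orientation, the two formulas and the schema described above can be sketched as follows. This is a reconstruction under assumptions: it follows the standard second-order formulation and is chosen to agree with the way the proof of Proposition 2.3 speaks of a class $Z \neq V$ closed under $\prec_X$-progression; the author's own displays may differ in detail. The bounded quantifier $(\forall y \prec_X x)\,\psi$ abbreviates $\forall y\,(y \prec_X x \rightarrow \psi)$.

```latex
% Sketch only: standard formulations, assumed rather than quoted from the paper.
% Well-foundedness: every \prec_X-progressive class is the whole universe V.
\[
  \mathrm{Wf}(X) \;:\Leftrightarrow\;
  \forall Y\,\Bigl[\forall x\,\bigl((\forall y \prec_X x)(y \in Y) \rightarrow x \in Y\bigr)
  \rightarrow Y = V\Bigr]
\]
% Transfinite induction along \prec_X for a given formula \varphi.
\[
  \mathrm{TI}_\varphi(X) \;:\Leftrightarrow\;
  \forall x\,\bigl((\forall y \prec_X x)\,\varphi(y) \rightarrow \varphi(x)\bigr)
  \rightarrow \forall x\,\varphi(x)
\]
% The schema \Gamma-TI: induction along every well-founded class relation.
\[
  \Gamma\text{-}\mathrm{TI}:\quad
  \forall X\,\bigl[\mathrm{Wf}(X) \rightarrow \mathrm{TI}_\varphi(X)\bigr]
  \qquad \mbox{for all } \varphi \in \Gamma
\]
```

Under this reading, $\mathrm{Wf}(X)$ is transfinite induction for classes, and $\Gamma$-TI extends it to formulae in $\Gamma$ that need not define classes.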
By the existence of universal formulae, $\Pi^1_n\text{-}\mathsf{TI}_0$ is finitely axiomatizable for all $n \in \mathbb{N}$ (for $n = 0$, this holds simply because $\Pi^1_0$-TI is derivable from NBG). The notion of well-foundedness is known to be $\Pi^1_1$-complete in second-order arithmetic, but it is elementary in class theory, and the elementarity of well-foundedness causes a number of differences between second-order arithmetic and class theory, as seen in [7,22,23] for example. In the present paper, we adopt the following elementary expression of well-foundedness.

Proposition 2.3 For each class X , Wf (X ) is equivalent to the following in NBG:
Every non-empty set has a $\prec_X$-minimal element; in other words, there is no non-empty set $c$ such that

$(\forall x \in c)(\exists y \in c)(y \prec_X x)$. (1)

Take any $x$. If $x \notin c$, then the succedent of (2) trivially holds. Otherwise, there is $y \prec_X x$ such that $y \in c$ and thus $y \notin Y$, and the antecedent of (2) fails. For the converse, suppose $\neg\mathrm{Wf}(X)$. There is some class $Z \neq V$ such that

From (3), we will construct a pseudo ω-descending chain of $\prec_X$, by which we mean a function $f : \omega \to V$ such that

First, for each $x \notin Z$, we define

where $\mathrm{rk}(w)$ denotes the rank of a set $w$; that is, $g(x)$ is the set of sets $z$ with the least rank such that $z \prec_X x$ and $z \notin Z$; this is an application of Scott's trick, and $g(x)$ is a non-empty set by (3) for all $x \notin Z$. We thereby recursively define $f : \omega \to V$ so that

note that $f(0)$ is the set of $z \notin Z$ with the least rank. We let $c$ be the range of $f$. Since $f(n) \neq \emptyset$ for all $n \in \omega$, $c$ has no $\prec_X$-minimal element.
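The construction in the second half of this proof can be sketched explicitly. The following is one way to read it, under assumptions: the names $g$, $f$, $c$, $Z$ are those of the proof, but the displayed definitions are our reconstruction, and we take $c$ to be the union of the values of $f$, which is one way to read "the range of $f$".

```latex
% Sketch only; the formulas below are reconstructed, not quoted.
% g(x): the \prec_X-predecessors of x outside Z of least rank (Scott's trick makes this a set).
\[
  g(x) := \bigl\{\, z \;\bigm|\; z \prec_X x \,\wedge\, z \notin Z \,\wedge\,
     \forall w\,\bigl((w \prec_X x \wedge w \notin Z) \rightarrow
     \mathrm{rk}(z) \le \mathrm{rk}(w)\bigr) \,\bigr\}
\]
% f is defined by recursion on \omega; each value is a set by class replacement.
\[
  f(0) := \bigl\{\, z \;\bigm|\; z \notin Z \,\wedge\,
     \forall w\,(w \notin Z \rightarrow \mathrm{rk}(z) \le \mathrm{rk}(w)) \,\bigr\},
  \qquad
  f(n+1) := \bigcup \bigl\{\, g(x) \;\bigm|\; x \in f(n) \,\bigr\}
\]
% Each f(n) is non-empty and disjoint from Z, and every x in f(n) has a predecessor in f(n+1).
\[
  c := \bigcup_{n \in \omega} f(n)
  \quad\Longrightarrow\quad
  c \neq \emptyset \;\wedge\; (\forall x \in c)(\exists y \in c)(y \prec_X x)
\]
```

The point of working with sets of least-rank witnesses instead of single witnesses is that no choice principle is needed to pick the next element of the chain.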
We will denote von Neumann ordinals by lowercase Greek letters $\alpha, \beta, \ldots$, possibly with indices, and write $\alpha < \beta$ for $\alpha \in \beta$, viz., the canonical ordering of the ordinals. For a class $X$ and a set $a$, we define $(X)_a := \{x \mid \langle x, a\rangle \in X\}$. Then, for an $\mathcal{L}_2$-formula $\varphi(x, z, Z)$ possibly with parameters, we define

We thereby define the axiom ETR of elementary transfinite recursion, which is the class-theoretic version of Friedman's axiom ATR of arithmetical transfinite recursion, and its restriction ETR($\alpha$) to the set well-ordering $\{\langle\gamma, \beta\rangle \mid \gamma < \beta < \alpha\}$.
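To make the shape of ETR concrete, here is a hedged sketch modelled on the usual formulation of ATR in second-order arithmetic. The abbreviations $\mathrm{WO}(X)$ ("$\prec_X$ is a well-ordering"), $H_\varphi(X, Y)$, and $(Y)^{\prec_X a}$ are hypothetical names introduced for this illustration only; the paper's own definition may be phrased differently.

```latex
% Sketch only: notation below is illustrative, not the paper's.
% (Y)^{\prec_X a}: the part of Y constructed at stages strictly below a.
\[
  (Y)^{\prec_X a} := \bigl\{\, \langle x, b\rangle \;\bigm|\;
      \langle x, b\rangle \in Y \,\wedge\, b \prec_X a \,\bigr\}
\]
% H_\varphi(X,Y): Y is the hierarchy obtained by iterating \varphi along \prec_X.
\[
  H_\varphi(X, Y) \;:\Leftrightarrow\;
  \forall a\,\Bigl[(Y)_a = \bigl\{\, x \;\bigm|\; \varphi\bigl(x, a, (Y)^{\prec_X a}\bigr) \,\bigr\}\Bigr]
\]
% ETR: such a hierarchy exists along every class well-ordering, for elementary \varphi.
\[
  \mathrm{ETR}:\quad
  \forall X\,\bigl[\mathrm{WO}(X) \rightarrow \exists Y\, H_\varphi(X, Y)\bigr]
  \qquad \mbox{for all elementary } \varphi
\]
```

On this reading, ETR($\alpha$) is the instance in which $X$ is the set well-ordering of the ordinals below $\alpha$.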
Obviously, ETR implies ETR($\alpha$) for all $\alpha \in \mathrm{On}$, where On denotes the class of ordinals. We thus introduce four systems:

We can show, in a manner exactly parallel to second-order arithmetic, that both $\mathsf{ETR}_0$ and $\mathsf{ECA}^+_0$ are finitely axiomatizable; see [7, Theorem 90]. The third system $\mathsf{ECA}^+_0$ corresponds (in one sense) to the system $\mathsf{ACA}^+_0$ of ω-Turing jumps in second-order arithmetic. 5

Coded V-models
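We will modify the notion of coded ω-model in second-order arithmetic for class theory, and define the notion of coded V-model, by which we mean a class $S$ viewed as an $\mathcal{L}_2$-structure whose first-order domain is $V$ and whose second-order domain consists of the classes $(S)_z$ for sets $z$. The relativization $\varphi^S$ of an $\mathcal{L}_2$-formula $\varphi$ to $S$ is determined by the following clauses for quantifiers:

$(\forall z\,\varphi(z))^S :\Leftrightarrow \forall z\,\varphi^S(z)$
$(\forall Z\,\varphi(Z))^S :\Leftrightarrow \forall z\,\varphi^S((S)_z)$.

5 $\mathsf{ECA}^+_0$ is denoted by $\mathrm{NBG}_\omega$ in [7].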
The relativization $\varphi^S$ of an $\mathcal{L}_2$-formula $\varphi$ is always elementary, and thus, in particular, $\varphi^S$ holds in NBG for every instance $\varphi$ of $\Sigma^1_\infty$-Sep or $\Sigma^1_\infty$-Repl. Furthermore, for every elementary formula $\varphi$, $\varphi^S$ is identical with $\varphi$.
For all classes $X$ and $S$, we write $X \mathrel{\dot\in} S$ for $(\exists x)(X = (S)_x)$, which informally expresses that $X$ is a member of the second-order domain of the coded V-model $S$; hence, the coded V-model $S$ can be alternatively expressed as $\langle V, \{X \mid X \mathrel{\dot\in} S\}\rangle$. With this notation, $(\forall Z\,\varphi(Z))^S$ is equivalent to $(\forall Z \mathrel{\dot\in} S)\,\varphi^S(Z)$.
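As a small worked example of why relativization always yields an elementary formula, consider a formula with two class quantifiers in front of an elementary matrix $\theta$ (the particular formula is our own illustration). Each class quantifier turns into a set quantifier over indices, and membership in $(S)_y$ unfolds by the definition $(X)_a := \{x \mid \langle x, a\rangle \in X\}$:

```latex
% Illustration: relativizing two class quantifiers to a coded V-model S.
% \theta is assumed elementary; the existential clause is the dual of the universal one.
\[
  \bigl(\exists Y\,\forall Z\,\theta(x, Y, Z)\bigr)^{S}
  \;\Leftrightarrow\;
  \exists y\,\forall z\,\theta\bigl(x, (S)_y, (S)_z\bigr)
\]
% Atomic subformulas mentioning (S)_y unfold into statements about the single class S.
\[
  u \in (S)_y \;\Leftrightarrow\; \langle u, y\rangle \in S
\]
```

The right-hand side contains only set quantifiers and the class parameter $S$, so it is elementary even though the original formula has two alternating class quantifiers.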
We next consider a restriction of a coded V-model to a set. In what follows, we stipulate for simplicity that when we treat an ordered pair $\langle M, N\rangle$ of sets as an $\mathcal{L}_2$-structure with only the specification of the first-order domain $M$ and second-order domain $N$, the membership relations are always assumed to be standardly interpreted in the $\mathcal{L}_2$-structure unless otherwise specified. We make a parallel stipulation for set-sized $\mathcal{L}_\in$-structures: if a set $M$ is treated as an $\mathcal{L}_\in$-structure, the membership relation is standardly interpreted unless otherwise specified.
Given a coded V-model $S$ and a set $M$, we denote by $S{\upharpoonright}M$ the set-sized $\mathcal{L}_2$-structure $\langle M, \{(S)_z \cap M \mid z \in M\}\rangle$. For a set $x$ and a class $X$, ${}^{x}X$ denotes the class of (set) functions from $x$ to $X$. Then, given a set $M$, each $h \in {}^{\omega}M$ can be viewed as a first-order variable assignment on $S{\upharpoonright}M$ that assigns $h(i)$ to the $i$th first-order variable $u_i$, and also as a second-order variable assignment on $S{\upharpoonright}M$ that assigns $(S)_{h(j)} \cap M$ to the $j$th second-order variable $U_j$.
Let $\mathrm{Fml}_2$ be the (countable) set of codes of $\mathcal{L}_2$-formulae. For each $\mathcal{L}_2$-formula $\varphi$, $\mathrm{Fml}_2$ contains its code, which we will simply denote by $\varphi$; this notation neglects the distinction between formulae in the usual sense (as meta-theoretic syntactic entities) and their codes (which are only sets), but there should be no danger of confusion. Then, for all $f, g \in {}^{\omega}M$ and $\varphi \in \mathrm{Fml}_2$, we write $S{\upharpoonright}M \models \varphi[f, g]$ to mean that $\varphi$ is satisfied in the set-sized $\mathcal{L}_2$-structure $S{\upharpoonright}M$ under the first-order variable assignment $f$ and the second-order variable assignment $g$.
For a class $X$, let us write $X \mathrel{\dot\in} S{\upharpoonright}M$ for $\exists z\,(z \in M \wedge (S)_z = X)$; with this notation, the set-sized $\mathcal{L}_2$-structure $S{\upharpoonright}M$ can also be expressed as $\langle M, \{X \cap M \mid X \mathrel{\dot\in} S{\upharpoonright}M\}\rangle$. When some specific sets $x_1, \ldots, x_m \in M$ and classes $X_1, \ldots, X_n \mathrel{\dot\in} S{\upharpoonright}M$ are given, the relation $S{\upharpoonright}M \models \varphi(x_1, \ldots, x_m, X_1 \cap M, \ldots, X_n \cap M)$ (or $S{\upharpoonright}M \models \varphi(\vec x, \vec X)$ more simply) is defined in the obvious manner: that is, it means that $S{\upharpoonright}M \models \varphi[f, g]$ for all variable assignments $f$ and $g$ that assign

respectively. Now, the next notation is useful.

In the definition of $\varphi^{S{\upharpoonright}M}$, classes are not restricted to the set $M$, but this notation is justified by the following proposition, which can be shown in the standard way by induction on the complexity of $\varphi$; recall that equality between classes is not counted as a primitive predicate symbol of $\mathcal{L}_2$ but is defined in terms of $\in$.

Proposition 2.6 Let $\varphi(\vec u, U_1, \ldots, U_k)$ be an $\mathcal{L}_2$-formula with only the displayed variables free. NBG proves the following: for every coded V-model $S$, set $M$, $\vec x \in M$, and $X_1, \ldots, X_k \mathrel{\dot\in} S{\upharpoonright}M$,

recall that the "$\varphi$" to the right of "$\models$" is, precisely, the code of $\varphi$ belonging to $\mathrm{Fml}_2$.
The next is a variation of the Montague-Lévy reflection principle.
Note that, by Proposition 2.6, (4) is equivalent to the following:

Proof Let $\varphi_1, \ldots, \varphi_n$ be an enumeration of all the sub-formulae of $\varphi$. Take any coded V-model $S$. For each $1 \leq i \leq n$, let $\varphi_i(x_1, \ldots, x_{k_i}, X_1, \ldots, X_{m_i})$ contain at most the displayed variables free, and define a class function $G_i : V^{k_i + m_i} \to \mathrm{On}$ in the following manner: if $\varphi_i$ is of the form $\exists w\,\psi(w, \vec x, \vec X)$, then we set

if $\varphi_i$ is of another form, then we just put $G_i(\vec a, \vec z) = 0$; note that the $G_i$'s can be taken as classes because the $\varphi_i^S$'s are elementary. We thereby define class functions $F_i : \mathrm{On} \to \mathrm{On}$ ($1 \leq i \leq n$) and $F : \mathrm{On} \to \mathrm{On}$ as follows:

Then, we recursively define $F^0(\xi) := \xi$ and $F^{j+1}(\xi) := F(F^j(\xi))$. Finally, we define $H : \mathrm{On} \to \mathrm{On}$ by $H(\xi) := \sup_{j \in \omega} F^j(\xi)$. Now, for any given $\alpha$, we set $\beta := H(\alpha)$ and write $\gamma_j$ for $F^j(\alpha)$; hence, we have $\beta = \sup_{j < \omega} \gamma_j$. We can show (4) for all the $\varphi_i$'s in place of $\varphi$ by a routine induction; we only go through the crucial case here. Let $\varphi_i$ be $\exists Z\,\psi(Z, \vec x, \vec X)$ and take $\vec x \in V_\beta$ and $\vec X \mathrel{\dot\in} S{\upharpoonright}V_\beta$. Take the least $l < \omega$ such that $\vec x \in V_{\gamma_l}$ and $\vec X \mathrel{\dot\in} S{\upharpoonright}V_{\gamma_l}$. Let $z_j \in V_{\gamma_l}$ be such that

then $\psi^S(Z, \vec x, \vec X)$ for some $Z \mathrel{\dot\in} S{\upharpoonright}V_{\gamma_{l+1}}$ (and thus $Z \mathrel{\dot\in} S{\upharpoonright}V_\beta$). By the induction hypothesis, we obtain $\psi^{S{\upharpoonright}V_\beta}(Z, \vec x, \vec X)$ and thus $\varphi_i^{S{\upharpoonright}V_\beta}(\vec x, \vec X)$. The converse is obvious from the induction hypothesis.
By the standard trick, the last lemma implies the next one.
only with the displayed variables free, NBG proves the following: for all coded V-models $S$, classes $\vec Y \mathrel{\dot\in} S$, and ordinals $\alpha$, there exists an ordinal $\beta > \alpha$ such that

In order to express $\varphi^S$ for infinitely many $\varphi$'s at once, we need something like a satisfaction predicate for a coded V-model $S$. Let $S$ be a coded V-model. Then, each $h \in {}^{\omega}V$ can be viewed as a first-order variable assignment on $S$ that assigns $h(i)$ to the $i$th first-order variable $u_i$, as well as a second-order variable assignment on $S$ that assigns $(S)_{h(j)}$ to the $j$th second-order variable $U_j$. For each $h \in {}^{\omega}V$, $x \in V$, and $n \in \omega$, we define a new set function $h^{(x|n)} \in {}^{\omega}V$ by putting $h^{(x|n)}(m) = x$ if $n = m$, and $h^{(x|n)}(m) = h(m)$ if $n \neq m$. Finally, let $\mathrm{Fml}_2(n)$ be the subset of $\mathrm{Fml}_2$ that comprises the codes of $\mathcal{L}_2$-formulae with at most $n$ logical symbols; for notational convenience, we set $\mathrm{Fml}_2(\omega) = \mathrm{Fml}_2$.

Definition 2.9 Let $S$ be a coded V-model and $\alpha \leq \omega$. A class $X$ is said to be an $\alpha$-satisfaction class for $S$ if and only if $X \subset \mathrm{Fml}_2(\alpha) \times {}^{\omega}V \times {}^{\omega}V$ and the following holds for all $\varphi \in \mathrm{Fml}_2(\alpha)$ and $f, g \in {}^{\omega}V$:

We particularly call an $\omega$-satisfaction class for $S$ a full satisfaction class for $S$.
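The condition in Definition 2.9 is of the familiar Tarskian kind. As a hedged sketch, assuming the usual compositional clauses (the exact list of connectives and the precise display are our assumption), with codes written like the formulae they code and with $f$, $g$ the first- and second-order assignments described above:

```latex
% Sketch only: standard Tarskian clauses, assumed rather than quoted from the paper.
% Atomic clauses: g(j) is an index, and U_j is interpreted as the class (S)_{g(j)}.
\[
  \langle u_i \in u_j, f, g\rangle \in X \Leftrightarrow f(i) \in f(j),
  \qquad
  \langle u_i \in U_j, f, g\rangle \in X \Leftrightarrow f(i) \in (S)_{g(j)}
\]
% Propositional clauses.
\[
  \langle \neg\varphi, f, g\rangle \in X \Leftrightarrow \langle \varphi, f, g\rangle \notin X,
  \qquad
  \langle \varphi \wedge \psi, f, g\rangle \in X \Leftrightarrow
    \langle \varphi, f, g\rangle \in X \wedge \langle \psi, f, g\rangle \in X
\]
% Quantifier clauses: both kinds of quantifier range over sets, via the updated assignments.
\[
  \langle \exists u_n\,\varphi, f, g\rangle \in X \Leftrightarrow
    \exists x\,\langle \varphi, f^{(x|n)}, g\rangle \in X,
  \qquad
  \langle \exists U_n\,\varphi, f, g\rangle \in X \Leftrightarrow
    \exists x\,\langle \varphi, f, g^{(x|n)}\rangle \in X
\]
```

Since the second-order quantifier clause quantifies over set indices only, all clauses are elementary in $X$ and $S$.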
The next two propositions are standardly shown and we omit the proofs.

Proposition 2.10
Let S be a coded V-model and α ≤ ω. The following are provable in NBG.
1. If $X$ is an $\alpha$-satisfaction class for $S$, then $X \cap (\mathrm{Fml}_2(n) \times {}^{\omega}V \times {}^{\omega}V)$ is an $n$-satisfaction class for $S$ for all $n < \alpha$.
2. If classes $X$ and $Y$ are $\alpha$-satisfaction classes for the same $S$, then $X = Y$.

Proposition 2.11
For each (standard) natural number n, NBG proves the existence of an n-satisfaction class for any coded V-model S.
$\mathrm{Fml}_2$ with designated free variables $u_{i_0}, \ldots, u_{i_m}, U_{j_0}, \ldots, U_{j_k}$, which may contain other free variables. Then, for all sets $x_0, \ldots, x_m \in V$ and classes $X_0, \ldots, X_k \mathrel{\dot\in} S$,

We say that a class $S$ is a coded V-model of a (recursive) $\mathcal{L}_2$-system T, written $S \models \mathsf{T}$, when $S$ satisfies the set ($\subset \mathrm{Fml}_2$) of the codes of the axioms of T. Note that if T is finite, then $S \models \mathsf{T}$ does not necessitate the existence of a full satisfaction class for $S$. 6 The next is shown by induction on the complexity of $\varphi$ using Proposition 2.11.

Proposition 2.13
For each (standard) $\mathcal{L}_2$-formula $\varphi(\vec u, \vec U)$, NBG proves that, for every coded V-model $S$, $\vec x \in V$, and $\vec X \mathrel{\dot\in} S$,

recall again that the "$\varphi$" to the right of "$\models$" is precisely the code of $\varphi$. Hence, in particular, for every elementary formula $\varphi(x, X)$ and coded V-model $S$, we have

that is, elementary formulae are "absolute" for coded V-models.
Hence, although the official definition of $S \models \varphi$ is $\Delta^1_1$ in NBG (by Proposition 2.10), this proposition shows that $S \models \varphi$ is elementarily expressible for each (standard) $\mathcal{L}_2$-formula $\varphi$. It also follows that $S \models \mathsf{T}$ is elementarily expressible for every finite $\mathcal{L}_2$-system T.
The proof of the next proposition is essentially the same as the well known corresponding fact (about ACA + 0 ) in second-order arithmetic, and we omit the details.
Proposition 2.14
1. $\mathsf{ECA}^+_0$ proves the existence of a full satisfaction class for any coded V-model $S$.
2. $\mathsf{ECA}^+_0$ proves that for each class $Z$ there is a coded V-model $S$ of NBG with $Z \mathrel{\dot\in} S$.

Proof 1. For every $S$, ETR($\omega$) yields a class $X$ such that, for each $n < \omega$, $(X)_n$ is an $n$-satisfaction class, from which we can define a full satisfaction class for $S$. 2. It suffices to show the existence of a coded V-model of $\Sigma^0_1$-CA. Take any class $Z$. By ETR($\omega$), we construct $X$ such that $(X)_0 = Z$ and, for each $n < \omega$ and $e$,

The existence of a coded V-model of an $\mathcal{L}_2$-system T implies the consistency of T in NBG, in the same way as the existence of a coded ω-model of a system S of second-order arithmetic implies the consistency of S in $\mathsf{ACA}_0$. 7 As we will see below, the existence of a coded V-model has further implications.
We first observe that Lemma 2.8 and Propositions 2.6 and 2.13 above immediately imply the following.

Corollary 2.15
Let T be a finite L 2 -system. NBG proves that if there is a coded V-model of T, then there are class-many set models of T.
In this corollary, since T is finite, the condition of the existence of a coded V-model of T does not require the existence of a full satisfaction class for the coded V-model. The presence of a full satisfaction class has a stronger consequence.

Lemma 2.16 NBG proves the following: for each coded V-model $S$, if there is a full satisfaction class $X$ for $S$, then, for all ordinals $\alpha$, there is an ordinal $\beta > \alpha$ such that

Proof Let $X$ be a full satisfaction class for $S$. For each $\varphi \in \mathrm{Fml}_2$ and $f, g \in {}^{\omega}V$, we will write $S \models_X \varphi[f, g]$ for $\langle\varphi, f, g\rangle \in X$. It follows by Proposition 2.10.2 that

7 The following argument can be carried out in both NBG and $\mathsf{ACA}_0$. Suppose $S \models \mathsf{T}$. If the numbers of logical symbols of the axioms of T are unbounded, then $S \models \mathsf{T}$ implies the existence of a full satisfaction class $X$ for $S$, and we can thereby show that $\mathsf{T} \vdash \varphi$ implies $S \models_X \varphi[f, g]$ for all formulae $\varphi$ and variable assignments $f$ and $g$ (in the notation of Lemma 2.16 below) by induction on the length of derivation. Suppose otherwise. Then, there is a bound $m$ on the numbers of logical symbols of the axioms of T. Let us write $\mathsf{T} \vdash_k \varphi$ when there is a derivation of $\varphi$ from T in which only formulae with at most $k$ logical symbols occur. We can show by partial cut-elimination that there is some $n$ ($\geq m$) such that $\mathsf{T} \vdash \bot$ implies $\mathsf{T} \vdash_n \bot$. Take an $n$-satisfaction class $Y$ for $S$. Then, we can show by induction on the length of derivation that $\mathsf{T} \vdash_n \varphi$ implies $S \models_Y \varphi$ for all $\varphi$ with at most $n$ logical symbols, which entails the consistency of T.

Hence, it suffices to show that, for all $\alpha \in \mathrm{On}$, there is $\beta > \alpha$ such that

The proof idea is essentially the same as that of Lemma 2.7: instead of taking Skolem functions $G_i$ separately for finitely many formulae $\varphi_0, \ldots, \varphi_n$, we take a single "global" Skolem function $G$:

The wanted function $G$ is defined as follows: if $\varphi$ is of the form $\exists u_j\,\psi$, then we set

if $\varphi$ is of the form $\exists U_l\,\psi$, then we set

if $\varphi$ is of another form, then we just put $G(\varphi, f, g) = 0$. Then, we define a class function $F : \mathrm{On} \to \mathrm{On}$ as follows:

this is well defined, since $\mathrm{Fml}_2$ and ${}^{\omega}V_\xi$ are sets. The rest is parallel to the proof of Lemma 2.7; note that the ω-induction and ω-recursion involved in the remaining part are possible because $F$ is elementarily definable (and this is why we work with the

Lemma 2.17 NBG proves the following.

For every coded V-model S, if there is a full satisfaction class for S, then S |
For every standard n ∈ N and coded V-model S, S | 1 n -Sep + 1 n -Repl. Proof 1. Let X be a full satisfaction class for a coded V-model S. By Proposition 2.10.2, S | 1 ∞ -Repl is equivalent to the following: This is equivalent to an instance of 1 0 -Repl, which is derivable in NBG. The other claim S | there is a full satisfaction class X for S. We take Z to be the (elementary) class of ordinals β that satisfies (8). Clearly, Z is unbounded in On and S V β | T for every β ∈ Z . The closedness of Z can be shown by the standard Tarski-Vaught argument. Let γ = {β ξ } ξ<λ for a limit λ such that {β ξ } ξ<λ ⊂ Z and β η < β ζ for η < ζ < λ. . Take any f , g ∈ ω V γ . Then, for instance (the crucial case), Since ∃U j contains only finitely many free variables, there is some ξ < λ such that f , g ∈ ω V β ξ ; we will write β for β ξ . Now, in general, we can show by the standard argument that for all variable assignments p, q ∈ ω V and p , q ∈ ω V , if p, qand p , q coincide on all the free variables of , Hence, by (9) ] again by (9). We obtain , by the induction hypothesis, and thus

V-reflection
Definition 2.19 (V-Reflection) The schema $\Pi^1_n$-RFN of $\Pi^1_n$ V-reflection is defined as follows:

where $\varphi$ contains only the displayed variables free (and $S$ is not free in $\varphi$); the schema $\Sigma^1_n$-RFN is similarly defined. Since NBG is finitely axiomatizable by $\Pi^1_2$-sentences, we can drop the condition "$S \models \mathsf{NBG}$" when $n \geq 2$. We thereby define:

$\Pi^1_n\text{-}\mathsf{RFN}_0$ is finitely axiomatizable for all $n$ by the existence of a $\Pi^1_n$ universal formula (for $n \geq 1$) and Proposition 2.20.1 below (for $n = 0$), but $\Pi^1_n\text{-}\mathsf{RFN}$ and $\Pi^1_\infty\text{-}\mathsf{RFN}_0$ are not finitely axiomatizable.
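For orientation, a reflection schema of this kind can be sketched by analogy with ω-model reflection in second-order arithmetic. The display below is an assumption modelled on that analogy and on the conditions mentioned in the definition ("$S \models \mathsf{NBG}$", a parameter in $S$, and $S$ not free in $\varphi$); $\Gamma$ stands for the formula class in question, and the single class parameter $Y$ is a simplification of ours.

```latex
% Sketch only: \Gamma-reflection to coded V-models, assumed form.
% If \varphi(Y) holds, some coded V-model of NBG contains Y and satisfies \varphi(Y).
\[
  \Gamma\text{-}\mathrm{RFN}:\quad
  \forall Y\,\Bigl[\varphi(Y) \rightarrow
    \exists S\,\bigl(Y \mathrel{\dot\in} S \,\wedge\, S \models \mathsf{NBG}
      \,\wedge\, \varphi^{S}(Y)\bigr)\Bigr]
  \qquad \mbox{for all } \varphi \in \Gamma
\]
```

By Proposition 2.13, writing $S \models \varphi(Y)$ in place of $\varphi^S(Y)$ would make no difference for standard $\varphi$.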

Proposition 2.20
The following hold in NBG.
Proof Claim 1 follows from Proposition 2.13. Claim 2 is obvious. Claim 3 follows from claim 2 and the "downward absoluteness" of $\Pi^1_1$-formulae, in the sense that if $\varphi \in \Pi^1_1$ holds, then $\varphi^S$ holds in every coded V-model $S$ containing $\varphi$'s parameters.
The next can be shown in a manner parallel to [1, Lemma 3.4] (one direction was already shown in Proposition 2.14.2).

Proposition 2.22
Let F be any finite $\mathcal{L}_2$-system whose axioms are all $\Pi^1_n$. Then,

Proof $\Pi^1_n\text{-}\mathsf{RFN}_0 + \mathsf{F}$ proves ETR($\omega$) by Lemma 2.21 and the existence of a coded V-model of F, which implies the claim by Proposition 2.14.1 and Corollary 2.18.
Since each instance of $\Pi^1_n$-RFN is $\Pi^1_{n+1}$, we have the following.

Other systems
In this subsection, we will introduce a few more systems. For a second-order variable $X$, an $\mathcal{L}_2$-formula $\varphi$ is said to be $X$-positive when $X$ occurs only positively in $\varphi$. The axiom schemata of FP and LFP are thereby defined as follows.

for each $X$-positive elementary $\varphi$, possibly with parameters. FP asserts the existence of a fixed point of each $X$-positive elementary formula, and LFP asserts the existence of the least such fixed point. 8 The weaker variants $\mathrm{FP}^-$ and $\mathrm{LFP}^-$ are obtained by restricting the range of the $\varphi$'s above to the $X$-positive elementary formulae without class parameters (but possibly with set parameters). We thereby set

Note that $\mathrm{FP}^-$

In second-order arithmetic, FP is equivalent to ATR (due to Avigad [2]), and LFP is equivalent to $\Pi^1_1$-CA, but neither of the corresponding equivalences holds in class theory. The next is a remarkable theorem due to Sato.

Theorem 2.24 (Sato [23]). $\mathsf{NBG} \vdash \mathrm{LFP} \leftrightarrow \mathrm{FP}$ and $\mathsf{NBG} \vdash \mathrm{LFP}^- \leftrightarrow \mathrm{FP}^-$.
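The two schemata compared in this theorem can be sketched as follows, under the assumption that they take their usual form; for LFP we follow the remark in footnote 8 that the schema literally asserts the existence of a least closed point. Here $\varphi(x, X)$ is an $X$-positive elementary formula.

```latex
% Sketch only: usual forms of the fixed-point schemata, assumed rather than quoted.
% FP: some class X is a fixed point of the operator defined by \varphi.
\[
  \mathrm{FP}:\quad \exists X\,\forall x\,\bigl(x \in X \leftrightarrow \varphi(x, X)\bigr)
\]
% LFP: some class X is closed under \varphi and is included in every closed class.
\[
  \mathrm{LFP}:\quad \exists X\,\Bigl[\forall x\,\bigl(\varphi(x, X) \rightarrow x \in X\bigr)
    \,\wedge\, \forall Y\,\bigl(\forall x\,(\varphi(x, Y) \rightarrow x \in Y)
      \rightarrow X \subset Y\bigr)\Bigr]
\]
```

Read this way, the theorem says that over NBG the mere existence of fixed points already yields least ones, in contrast with the situation in second-order arithmetic described just before it.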
We next consider principles of class collection. For a collection $\Gamma$ of $\mathcal{L}_2$-formulae, the schema $\Gamma$-Coll of $\Gamma$-collection is defined as follows:

where $\varphi$ may have parameters. We also consider a parameter-free version of $\Gamma$-Coll: $\Gamma\text{-Coll}^-$ is obtained by restricting the above $\varphi$'s to $\Gamma$-formulae with no class parameters (but possibly with set parameters). We can easily show (by using pairing),

We thereby define

are finitely axiomatizable for every $n$. The system $\Sigma^1_1\text{-}\mathsf{Coll}_0 + \Sigma^1_\infty\text{-Ind}$ is extensively studied in [15,18]. In the presence of global choice, $\Sigma^1_1$-Coll implies the axiom $\Sigma^1_1$-AC of $\Sigma^1_1$ choice (see [15] for its definition), and $\Sigma^1_1\text{-Coll}^-$ implies the parameter-free version $\Sigma^1_1\text{-AC}^-$ without class parameters, but these implications fail without assuming GC, since $\Sigma^1_1\text{-AC}^-$ implies GC in NBG (see [7, Lemma 5]).

8 Precisely speaking, LFP literally asserts the existence of least closed points in the terminology of [14], but we can show in a manner parallel to [14, Lemma 2] that each least closed point of an $X$-positive elementary formula is a least fixed point of the same formula, provably in NBG. The converse also holds provably in $\mathsf{LFP}_0$, which can be shown in a manner parallel to [14, Theorem 3], but the proof crucially makes use of the class parameters allowed in the schema LFP, and I do not know whether the converse in question also holds in $\mathsf{LFP}^-_0$ (or even in weaker systems such as NBG).
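The contrast between collection and choice drawn above can be made concrete with a hedged sketch. Both displays are assumptions based on the usual formulations in the class-theoretic literature, not the paper's own: collection only gathers witnesses into the slices of one class, whereas choice must assign a witness to each $x$, which is why a global choice function is needed to pass from the former to the latter.

```latex
% Sketch only: usual forms, assumed rather than quoted; \varphi ranges over \Gamma.
% Collection: the slices (Z)_y of a single class Z contain a witness for every x.
\[
  \Gamma\text{-}\mathrm{Coll}:\quad
  \forall x\,\exists Y\,\varphi(x, Y) \rightarrow
  \exists Z\,\forall x\,\exists y\,\varphi\bigl(x, (Z)_y\bigr)
\]
% Choice: the slice (Z)_x is itself a witness for x.
\[
  \Gamma\text{-}\mathrm{AC}:\quad
  \forall x\,\exists Y\,\varphi(x, Y) \rightarrow
  \exists Z\,\forall x\,\varphi\bigl(x, (Z)_x\bigr)
\]
```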
Proof For claim 1, Avigad's [2] proof of ACA 0 ⊢ FP → ATR carries over to class theory as it is; see also [22, Proposition 29]. Claim 2 is proved in [7, Proposition 4].
We will consider some first-order extensions of ZF. Let us start with a few general definitions. For a first- or second-order language L including L ∈ , we occasionally consider extending the axiom schemata of separation and replacement for L: where ϕ is an arbitrary L-formula without b free; note that L 2 -Sep and L 2 -Repl are equivalent to 1 ∞ -Sep and 1 ∞ -Repl, respectively, in NBG. Next, given such L (⊃ L ∈ ), we set L(P 1 , . . . , P k ) to be the language obtained by adding fresh unary predicate symbols P 1 , . . . , P k to L: namely, L(P 1 , . . . , P k ) = L ∪ {P 1 , . . . , P k }. An inductive operator form is an L ∈ (P)-formula A(x, P) with at most one free variable in which P occurs only positively. Now, we define a first-order language L ID as an extension of L ∈ with unary predicates J A associated to each inductive operator form A(x, P). The L ID -system $\widehat{\mathsf{ID}}_1$ is defined as ZF + L ID -Sep + L ID -Repl plus the following axiom schema ID asserting that each J A is a fixed-point of A(x, P).
This axiom schema says that J A is a fixed-point of A. The L ID -system ID 1 is the strengthening of $\widehat{\mathsf{ID}}_1$ defined as ZF + L ID -Sep + L ID -Repl plus the following axiom schemata asserting that each J A is the least fixed-point of A(x, P) (cf. fn 8): where Φ(u) may contain parameters, and "û" indicates which variable each term t is substituted for; hence, A(x, Φ(û)) is obtained from A(x, P) by replacing each occurrence of Pt by Φ(t) (with renaming of bound variables as necessary to avoid collision). Since P occurs only positively in an inductive operator form A(x, P), The next lemma is the class-theoretic version of what is known as Aczel's trick.
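Before the lemma, the axioms described in the last two paragraphs can be sketched in their customary form. The labels ID1 and ID2 follow footnote 9, which refers to ID2 as the principle of fixed-point induction; $\Phi$ and $\Psi$ are assumed names for arbitrary $L_{ID}$-formulae, and the exact displayed forms in the paper may differ.

```latex
% Sketch of the standard axioms for inductive definitions; \Phi, \Psi are assumed names.
\begin{align*}
\text{(fixed point)}\quad & \forall x\,\bigl(J_A(x) \leftrightarrow A(x, J_A)\bigr),\\
\text{(ID1)}\quad & \forall x\,\bigl(A(x, J_A) \to J_A(x)\bigr),\\
\text{(ID2)}\quad & \forall x\,\bigl(A(x, \Phi(\hat{u})) \to \Phi(x)\bigr)
    \to \forall x\,\bigl(J_A(x) \to \Phi(x)\bigr),\\
\text{(monotonicity)}\quad & \forall u\,\bigl(\Phi(u) \to \Psi(u)\bigr)
    \to \forall x\,\bigl(A(x, \Phi(\hat{u})) \to A(x, \Psi(\hat{u}))\bigr).
\end{align*}
```

The last line is what positivity of $P$ in $A(x, P)$ yields. Together with ID1, applying ID2 to $\Phi(u) :\equiv A(u, J_A)$ gives $J_A(x) \to A(x, J_A)$, so the least-fixed-point axioms imply the fixed-point axiom.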

Lemma 2.26
Let B(x, y, P, Q) be an L ∈ (P, Q)-formula only with the displayed variables free in which P occurs only positively (but Q may occur negatively, and y may be empty). There is a 1 1 -formula (u, y, X ) only with the displayed variables free such that Furthermore, when B(x, y, P) contains no Q, there is a 1 1 -formula (u, y) only with the displayed variable free such that y, (û, y)) .

Hence, in particular, for each inductive operator form
Proof From a 1 1 -universal formula we can construct a 1 1 (universal) formula σ (e, x, y, X ) such that for each 1 1 Since σ is 1 1 and occur only positively in B, is also 1 1 provably in 1 1 -Coll 0 : by using 1 1 -Coll, we can always push any first-order quantifier prefixed to a 1 1 -formula within the existential class quantifier; note that if B contains no Q (which is a placeholder for a class parameter), is 1 1 in 1 1 -Coll − 0 for the same reason. Hence, there is some (standard) natural number e such that 1 1 -Coll 0 ∀x∀ y∀X (x, y, X ) ↔ σ (e, x, y, X ) .
We fix such e and put (u, y, X ) :⇔ ( e, u , y, X ); note that e is definable and so only has u, y, and X free. Hence, we have If B contains no class parameters, then nor do and , and thus 1 1 -Coll − 0 is sufficient to derive this equivalence.
Recall that the schemata FP − and LFP − allow set parameters. However, as the next proposition shows, forbidding set parameters (as well as class parameters) results in the same theories.

Proof Let Φ(x, y, X) be an X-positive 1 0 -formula with only the displayed variables free. We set Ψ(u, X) := ∀x∀y(u = ⟨x, y⟩ → Φ(x, y, (X) y )). Then, for each x and y, if X is a fixed-point (or least fixed-point) of Ψ, then (X) y is a fixed-point (least fixed-point, resp.) of Φ with y as a parameter; cf. [8, Theorem 5.2].

Now, we can show by an obvious model-theoretic argument (or partial cut-elimination) that $\widehat{\mathsf{ID}}_1$ and NBG + FP 0 have the same L ∈ -theorems, and so do ID 1 and NBG + LFP 0 . Hence, the next corollary follows (see fn 9).

In second-order arithmetic, the schema of $\Sigma^1_n$ dependent choice is defined as: This definition of (Z ) n cannot be straightforwardly generalized to our current setting, because we do not assume a global wellordering of the universe V. In fact, the axiom of $\Sigma^1_1$ dependent choice in the above form with any reasonable definition of (Z ) x implies $\Sigma^1_1$-AC (by treating X as a dummy variable) and thus the axiom of global choice GC [7, Lemma 5]. Hence, for the current setting, we adopt an alternative axiom schema that we call the $\Sigma^1_n$ dependent collection schema (see fn 10). We set and, for a collection Γ of L 2 -formulae, define the schema Γ-DColl of dependent collection as follows.
It is easy to see that NBG Thanks to universal formulae, 1 n -DColl 0 is finitely axiomatizable for all n ∈ N; note that, since 1 0 = 1 0 , 1 1 -DColl 0 and 1 0 -DColl 0 have exactly the same L 2 -theorems. The next is obvious by treating X above as a dummy variable. 9 Let ID n denote the collection of L ID -formulae corresponding to n in the Lévy hierarchy, in which the new vocabulary J A s are counted in ID 0 . The L ID -system ID 1 n is obtained from ID 1 by restricting the formulae appearing in ID2 (also known as the principle of fixed-point induction) to ID n -formulae. It is known that ID 1 n is stronger than ID 1 m over arithmetic if n > m ≥ 2, but nothing similar can be said about these systems over set theory: since ID 1 and ID 1 2 have the same theorems over set theory, so do ID 1 n and ID 1 m for any m, n ≥ 2; cf. Remark 5.20. 10 The anonymous referee suggests that 1 2 -RFN, rather than 1 1 -DColl, should be called the class-theoretic counterpart of 1 1 dependent choice, which is another reasonable option. We do not yet know if 1 2 -RFN is equivalent to 1 1 -DColl in NBG, while we know that the latter implies the former in NBG (Lemma 4.9) and that they have the same L ∈ -theorems (Theorem 5.23).

Proposition 2.29 For all n
We have a different, but equivalent, formulation of 1 n dependent collection (n ∈ N). 11 For a class X and a set x, let us define [[X ]] x := { y, z ∈ X | z ∈ x}. Then, an alternative formulation of 1 n dependent collection is given as follows.
As expected, $\Sigma^1_{n+1}$ dependent collection implies $\Sigma^1_{n+1}$ dependent choice under the assumption of GC.

Proposition 2.31
Let us define the schema of 1 n dependent choice as follows: Then, for all n ∈ N, NBG + GC 1 n+1 -DColl ↔ 1 n+1 -DC. Proof We work within NBG+GC. One direction is obvious. For the converse, suppose 1 n+1 -DColl. Consider the next schema (the "choice" version of 1 n+1 -DColl ): By the same argument as Proposition 2.30, we can show that 1 n+1 -DC and 1 n+1 -DC are equivalent in NBG. 12 Hence, it suffices to derive 1 n+1 -DC . 11 Krähenbühl considered two equivalent formulations of 1 n dependent choice for class theory, 1 n -DC and 1 n -DC , which will be defined below. These two and my formulation ( 1 n -DC) are equivalent, but they imply GC. We show in Proposition 2.30 that the choice-less "collection" version 1 n -DColl of 1 n -DC is equivalent to 1 n -DColl, but we do not know whether the "collection" version of 1 n -DC is also equivalent to the other two. 12 We actually need one extra (but easy) step to prove the implication from 1 for all z and x = V rk(z) ∪{z} in the same manner as Proposition 2.30; then, we further construct a class Z from W (by ECA) so that For a class W and a set w, let us define W w := {z | (∃u ∈ w)z ∈ (W ) u }: namely, W w = u∈w (W ) u . Our first goal is to derive the following intermediate schema: Then, we put We have derived 1 n+1 -DC . Our next goal is to show that 1 n+1 -DC implies 1 n+1 -DC ; this is shown by Krähenbühl [18], but let us rehearse it here for the reader's convenience. Suppose ∀x∀X ∃Y (x, X , Y ) for any ∈ 1 n+1 . Let us put As in second-order arithmetic, 1 1 -DColl 0 is a stronger system than NBG, whereas 1 1 -Coll 0 have the same L ∈ -theorems as NBG (see [7,Theorem 15]).

Proposition 2.32 Let
is obviously a class wellordering provably in NBG. Then, we have 1 1 -DColl 0 ETR( ). 13 13 In fact, 1 1 -DColl 0 derives ETR(X ) for some class wellorderings whose "order-types" are greater than . We do not get into the details here because it requires a notation system of class wellorderings, such as Jäger's [13] for those below E 0 , but we note that Krähenbühl [18] showed that, under the assumption of GC, 1 1 -DC 0 has the same 1 2 -theorems as NBG + n∈ω ETR( n ) does; we conjecture that the same holds for 1 1 -DColl 0 (without assuming GC).

Proof
We work within 1 1 -DColl 0 . Take any elementary (x, z, X ). Let We define a class function F by recursion on On so that F(α) is the set of all sets b with the least rank such that Hence, for each b ∈ F(α), we have (Z ) b = {x | (x, α, (W ) α } and thus We finally put W := W ∪ { x, a | a / ∈ On ∧ (z, a, ∅)} to treat the inessential case for (W ) a for a not belonging to the intended domain On of the wellordering . 1 1 -DColl 0 proves Con(NBG) and thus Con( 1 1 -Coll 0 ) because the aforementioned fact that NBG and 1 1 -Coll 0 have the same L ∈ -theorems is provable in 1 1 -DColl 0 (or even much weaker systems).

Transfinite induction
Since 1 ∞ -TI 0 is a sequential theory in the sense of [9, Ch. III, Definition 1.12] and derives ω-induction for every L 2 -formula, it follows from [9, Ch. III, Lemma 3.47] that 1 ∞ -TI 0 is reflexive and thus we have the following.

Proposition 3.1 1 ∞ -TI 0 proves the consistency of NBG and 1 n -TI 0 for each n.

Nonetheless, as we will see below, 1 ∞ -TI 0 and 1 ∞ -TI are still quite weak extensions of NBG.

with renaming of bound variables as necessary to avoid collision.
Proof The proof is parallel to that of the Montague-Lévy reflection principle. Let ψ 1 , . . . , ψ n be the enumeration of all the sub-formulae of ϕ. Then, for each 1 ≤ i ≤ n, let ψ i (z 1 , . . . , z m i ) contain only the displayed m i variables free, and we take the following L 2 -definable (not necessarily a class) function G i : V m i → On.
1. If ψ i is of the form ∃wθ(w, z, P 1 , . . . , P k ), then we set If ψ i is of another form, then G i (η) = 0; We thereby take the following L 2 -definable (again, not necessarily classes) functions F i : On → On (1 ≤ i ≤ n) and F : On → On: here, we essentially use 1 ∞ -Repl. By recursion (which also requires 1 ∞ -Repl as well as 1 ∞ -Ind), we set F 0 (ξ ) = ξ and F k+1 (ξ ) = F(F k (ξ )) and then define H : On → On by H(ξ ) := sup k<ω F k (ξ ). Now, for any given α, let β := H(α) (> α). It is routine to check by induction on the complexity of formulae that, for all 1 ≤ i ≤ n and for all z ∈ V β , . . . , k ). Proof We work within ECA. Suppose Wf (X ) and take any L 2 -formula . Then, by the last lemma, there exists α ∈ On such that V α , X , | TI P 2 (P 1 ), if and only if TI (X ).
Hence, it suffices to show that V α , X , | TI P 2 (P 1 ), i.e., Assume the antecedent. Let Z := (V α ∩ ) ∪ (V\V α ), which is a class by 1 ∞ -Sep. Then, it follows from the antecedent that The next follows from this theorem, Proposition 2.14, and Corollary 2.18.  Proof The claim follows from the proof of [7,Theorem 18], which actually establishes that, for each ∈ 1 0 , there exists some ∈ 1 1 such that Hence, the next theorem follows from Theorem 3.4 and Proposition 3.6.  Sato [22] showed that 1 1 -Coll even shows the consistency of ETR (i.e., ETR 0 + 1 ∞ -Sep + 1 ∞ -Repl), and we will give an alternative proof of this fact later in §5. We next consider 1 ∞ -TI 0 . It will be shown that the strength of 1 ∞ -TI 0 falls strictly between NBG and ECA.
As a preliminary step, we first observe that the consistency of 1 ∞ -TI 0 (+ 1 ∞ -Sep) can be proved relatively easily if we assume AC.
Proof ECA proves that there is an L 2 -formula Θ(α) of ordinals with V α ⊨ ZF such that {α ∈ On | Θ(α)} is closed unbounded in On; see [7, Corollary 27] (or Fact 3.10 below). Hence, there is an ordinal κ with cofinality greater than ω such that V κ ⊨ ZF (see fn 14). Let D be the set of V κ -definable sets with parameters from V κ . Then, ⟨V κ , D⟩ is a model of NBG + 1 ∞ -Sep + AC. We claim that ⟨V κ , D⟩ ⊨ 1 ∞ -TI. Take any set X ∈ D and suppose ⟨V κ , D⟩ ⊨ Wf (X ). This is equivalent to the non-existence of a pseudo ω-descending chain of ≺ X in V κ . Hence, ≺ X is indeed well-founded in V, since cf(κ) > ω and thus any pseudo ω-descending chain of ≺ X would be contained in V κ . We thereby obtain ⟨V κ , D⟩ ⊨ TI(≺ X ).
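The role of cf(κ) > ω in this proof can be made explicit by a short rank computation. The following is a sketch of the standard argument, in which the sequence $\langle x_n \mid n < \omega\rangle$ and the ordinals $\gamma$, $\delta$ are names introduced here for illustration; ordered pairs are assumed to be Kuratowski pairs.

```latex
% Sketch: an omega-sequence of elements of V_kappa is itself an element of V_kappa
% when cf(kappa) > omega. The names x_n, gamma, delta are illustrative.
\begin{align*}
& x_n \in V_\kappa \ \text{for all } n < \omega
  \;\Longrightarrow\;
  \gamma := \sup_{n<\omega}\bigl(\operatorname{rk}(x_n) + 1\bigr) < \kappa
  && \text{(since } \operatorname{cf}(\kappa) > \omega\text{)},\\
& \delta := \max(\gamma, \omega)
  \;\Longrightarrow\;
  \operatorname{rk}\bigl(\langle n, x_n\rangle\bigr) \le \delta + 1 \ \text{for all } n < \omega,\\
& \text{hence}\quad
  \{\langle n, x_n\rangle \mid n < \omega\} \in V_{\delta+3} \subseteq V_\kappa
  && \text{(since } \kappa \text{ is a limit ordinal above } \delta\text{)}.
\end{align*}
```

So a descending ω-chain of ≺ X taken in V already lies in V κ , which is why well-foundedness in the sense of the model agrees with well-foundedness in V.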
We need to eliminate the assumption of AC, but the above proof still gives guidance on how to achieve this: that is, we should aim to give a transitive model of NBG + 1 ∞ -Sep for which the notion of well-foundedness (of class orderings) is absolute (see fn 15). For that goal (and for other purposes later on), we introduce the theory TC of Tarskian typed truth defined in [7]; we will only repeat the necessary facts about TC and refer the reader to [7] for the proofs and details.
The language L T of TC is defined as L ∈ ∪ {T } for a unary predicate symbol T ("truth predicate"). Let L ∞ ∈ be the language obtained by adding constant symbols c a to L ∈ for all a ∈ V. We fix a coding of L ∞ ∈ in a sufficiently weak sub-theory of ZF, say, KPω. We will denote the classes of codes of L ∞ ∈ -formulae and L ∞ ∈ -sentences by Fml ∞ ∈ and St ∞ ∈ , respectively; we will also write Fml ∈ and St ∈ for the (countably infinite) sets of L ∈ -formulae and L ∈ -sentences, respectively. For each L ∈ -formula ϕ, Fml ∈ and Fml ∞ ∈ both contain its code and we will simply denote it by ϕ; this notation again neglects the distinction between formulae and their codes, but there should be no danger of confusion (see fn 16). By writing ϕ(u 1 , . . . , u k ) ∈ Fml ∞ ∈ we indicate that it codes an L ∞ ∈ -formula with only the displayed variables free, and, for any sets a 1 , . . . , a k , we simply write ϕ(a 1 , . . . , a k ) to denote the code of the L ∞ ∈ -formula ϕ(c a 1 , . . . , c a k ) obtained by substituting the constants c a i for the variables u i (1 ≤ i ≤ k); accordingly, T (ϕ(a 1 , . . . , a k )) expresses that ϕ(u 1 , . . . , u k ) is true of a 1 , . . . , a k .
The L T -system TC comprises ZF + L T -Sep + L T -Repl (Sect. 2.5) plus the axioms expressing Tarski's inductive clauses of the truth predicate for L ∞ ∈ , such as "a sentence c a ∈ c b is true, if and only if a is indeed a member of b", more precisely, the following four axioms: ϕ(a)) ,

14 The assumption of AC is only necessary here in picking such κ with cf(κ) > ω.

15 In fact, we can show that ECA + AC is equiconsistent with ECA, and Theorem 3.11 follows from this equiconsistency and Lemma 3.9. However, the proof of the equiconsistency is more involved than the direct proof of Theorem 3.11 given below, and we leave it (and some other equiconsistency results about AC) for another paper.

16 In the literature, such as [7], the codes of L ∞ ∈ -formulae are expressed with the Quine brackets in such a way as ϕ , but we don't follow this convention for simplicity.
It is shown in [7] that ECA and TC are mutually interpretable in such a way that the L ∈ -part is preserved; hence, they have the same L ∈ -theorems. Such an interpretation of ECA in TC is obtained by translating second-order quantifiers "∀X " into "for all codes of L ∞ ∈ -formulae with exactly one free variable" (i.e., "∀ϕ(x) ∈ Fml ∞ ∈ ") and also translating the membership relation "a ∈ X " into "the L ∞ ∈ -formula assigned to X is true of a" (i.e., "T (ϕ(a))"). We will denote this translation of L 2 into L T by I.
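The clauses of the translation I described above can be written schematically as follows. This is a sketch: $\phi_X$, the code assigned to the class variable $X$, is notation introduced here for illustration, and the clauses for atomic set-theoretic formulae, connectives, and first-order quantifiers are assumed to be the trivial ones.

```latex
% Sketch of the translation I from L_2 into L_T; \phi_X is illustrative notation.
\begin{align*}
(a \in X)^{I} &:\equiv\; T\bigl(\phi_X(a)\bigr),\\
(\forall X\, \theta)^{I} &:\equiv\; \forall \phi_X(x) \in \mathrm{Fml}^{\infty}_{\in}\;\, \theta^{I},\\
(a \in b)^{I} &:\equiv\; a \in b, \qquad
(\neg\theta)^{I} :\equiv \neg\,\theta^{I}, \qquad
(\theta \wedge \eta)^{I} :\equiv \theta^{I} \wedge \eta^{I}, \qquad
(\forall x\,\theta)^{I} :\equiv \forall x\,\theta^{I}.
\end{align*}
```

Because only the second-order vocabulary is reinterpreted, an $L_\in$-formula is translated to itself, which is the sense in which the $L_\in$-part is preserved.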
In TC, we can directly express that a transitive set M is an elementary sub-structure of V. Let M be a transitive set. Following the notation of [7], by St M ∈ we denote the set of codes of L ∞ ∈ -sentences only with constants c a from a ∈ M. If M is a model of a sufficiently strong theory such as KPω, then we have where σ M is (the code of) the ordinary relativization of σ to M. Since TC proves that all the axioms of ZF are true and that Let ϕ be an L T -formula and M a set. We inductively define the relativization of ϕ to M in the following obvious manner: Next The next fact [7, Lemma 29] will be used in the proof of our claim.

Fact 3.10
For any L T -formulae ϕ 1 , . . . , ϕ k , TC proves the following:

Proof It suffices to prove the claimed consistency in TC. We will work within TC. We first note that the I-translation Wf I (x) of Wf (X ) is an L T -formula that takes a code ϕ(u) ∈ Fml ∞ ∈ of an L ∞ ∈ -formula with only one free variable as its argument, and Wf I (ϕ(u)) precisely denotes: For readability, we will write y ≺ ϕ x for T (ϕ(⟨y, x⟩)).
Since V β ≺ V, all the relevant syntactic notions and operations concerning the codes of L ∞ ∈ are absolute for V β , and thus it follows that is a class in the sense of NBG I ; hence, we obtain V β | ∀x I (x).

Reflection
In contrast to 1 ∞ -TI, the schema 1 ∞ -RFN is rather strong, whereas they are equivalent in second-order arithmetic (in ACA 0 ).
The next follows from the last theorem and the finite axiomatizability of LFP 0 by $\Pi^1_3$-sentences; the proof is similar to those of Corollaries 4.5 and 4.6.

Corollary 4.8 $\Pi^1_3$-RFN 0 ⊢ Con(LFP).
This makes another contrast with second-order arithmetic; in second-order arithmetic, LFP rather proves the consistency of 1 3 -RFN 0 . 18 The proof of the next lemma is essentially parallel to that of the corresponding statement of [25, Theorem VIII.5.12] in second-order arithmetic. Proof We will work within 1 1 -DColl 0 . Let (z, U ) = ∀X ∃Y (z, X , Y , U ) be a 1 2formula where is elementary; we can similarly show the claim for formulae with more free variables. Suppose (z, U ) holds. We define By the supposition, we have ∀x∀X ∃Y (x, z, X , Y , U ), and thus 1 1 -DColl yields a class S with ∀x∃y (x, z, (S) x , (S) y , U ). Since (0, z, (S) 0 , (S) y , U ) ↔ (S) y = U for some y, we have U∈ S. It remains to show that S | (z, U ); recall that we can ignore the condition S | NBG when working with 1 n -RFN for n ≥ 2 (see Sect. 2.4). Take any X∈ S. Let b be such that (S) b = X , and put x = 1, b . Then, there exists some y such that (x, z, (S) x , (S) y , U ), which entails z, ((S) x ) b , (S) y , U . Since rk(b) < rk(x), we obtain z, (S) b , (S) y , U ; hence, S | (z, U ).

Corollary 4.10 $\Sigma^1_1$-DColl ⊢ Con(ETR).
It is known that LFP 0 1 1 -DC in second-order arithmetic, 19 but this fails in class theory, even if we add 1 ∞ -Sep and 1 ∞ -Repl. 18 There are several different ways of proving this. For instance, since LFP 0 has the same theorems as 1 1 -CA 0 does, it proves the existence of a coded β-model, which is automatically a model of 1 ∞ -RFN and thus 1 3 -RFN 0 in particular; to see another example of a proof, we note that 1 ∞ -RFN 0 and 1 ∞ -TI have the same theorems (see [16]), and that they are proof-theoretically equivalent to ID 1 and thus LFP − 0 , whose consistency is known to be provable in LFP 0 . 19 In second-order arithmetic, 1 1 -CA 0 derives 1 1 -DC (see [25, Lemma VII.6.6.3 and Theorem VII.6.9.4]), and LFP 0 and 1 1 -CA 0 have the same theorems. Since FP 0 is finitely axiomatizable by 1 2 -formulae, we would have a coded V-model of FP 0 in FP. Since FP 0 ETR, we would have FP Con(FP); a contradiction.

Remark 4.12
Sato has already proved LFP ⊬ $\Sigma^1_1$-Coll [23, Corollary 12], which entails the last corollary. Furthermore, Gitman, Hamkins, and Johnstone announced (by private communication with Gitman) a much stronger result: they have shown that even MK does not derive $\Sigma^1_1$-Coll (see fn 20). We have seen that 1 ∞ -RFN is a relatively strong axiom (schema), and it is stronger than some systems, such as 1 ∞ -TI and LFP, that are equivalent to or even stronger than it in second-order arithmetic.
It is shown in [8] that 1 1 -Coll has the same L ∈ -theorems as the first-order system SC 1 of stage comparison prewellorderings (of inductive definitions) and also that they have the same L ∈ -theorems as the Kripke-Platek system KPV over the set-theoretic universe V with respect to the canonical translation of L ∈ into the language L KP of KPV (see Sect. 5.1.1 for its definition). Using this fact, our claim will be shown in two steps. First, we will define a new system KPV p , which essentially is KPV augmented with a new predicate for a projection of the domain of sets, and then show that KPV p plus the "axiom of constructibility" S = L S (V), asserting that every set is constructible relative to the set of urelements, is interpretable in SC 1 in the way that the L ∈ -part is preserved (with respect to ). Second, we will show that 1 1 -DColl is also interpretable in KPV p + S = L S (V) in the way that preserves the L ∈ -part (with respect to ). The argument of the present section presupposes and makes use of the results of [8], and we refer the reader to [8] for the definitions and basic facts of the relevant systems.

KPV p + S = L S (V) and its interpretation in SC 1
Throughout this Sect. 5.1, we do not work with any second-order systems, and we identify "classes" with formulae without danger of confusion: hence, a class is always an abbreviation of some formula possibly with parameters in the present subsection. The definitions and notions we have so far made for classes (as second-order entities of class theory) carry over to classes in this sense; e.g., y ∈ (X ) x precisely means ( y, x ) for some formula that the "class" X denotes. We remark that we adopt some different notations from [8], where (X ) x is denoted by X x , for example.

The systems SC 1 and KPV
For the reader's convenience, we will repeat the definitions of the systems SC 1 and KPV and explain some basic facts about them necessary for the subsequent argument.
The language L SC of SC 1 is an extension of L ID (see Sect. 2.5) with additional unary predicates ≺ A associated to each inductive operator form A(x, P). Let us denote the class of ordered pairs by Pair, and, for each x ∈ Pair, we denote its first component by (x) 0 and the second by (x) 1 ; hence, ( a, b ) 0 = a and ( a, b ) y ) and also write ≺ A x for the class of ≺ A -predecessors of x, i.e., {y | y ≺ A x} (= (≺ A ) x , in other words). The L SC -system SC 1 is defined as ID 1 + L SC -Sep + L SC -Repl (see Sect. 2.5) plus the following axiom schemata of ≺ A for each inductive operator form A(x, P).
The axioms SC1 and SC2 express that ≺ A is the stage comparison (strict) prewellordering of the least fixed-point J A of the monotone operator induced by A; see [19] for the definitions of these notions (see also fn 21). SC 1 is a definitional extension of ID 1 (whether formulated over arithmetic or over set theory) [8, Theorem 9.4]; hence, in particular, SC 1 has the same L ∈ -theorems as ID 1 and thus as $\widehat{\mathsf{ID}}_1$ by Corollary 2.28.
The relation A is defined by x A y :⇔ A(x, ≺ A y ). We have the following basic facts concerning ≺ A and A ; see [8, §4] for their proofs.
The following definitions are made in for some a and inductive operator form A; X is called coinductive, if the negation of X is inductive, and it is hyperelementary, if it is both inductive and coinductive. SC 1 is strong enough to prove most of the basic properties of inductive and hyperelementary classes, such as those given in [19].
For an inductive operator form A and a set a, we set x ≺ A,a y :⇔ x, a ≺ A y, a , and x A,a y :⇔ x, a A y, a . This ≺ A,a strictly prewellorders (J A ) a ; hence, for an inductive class X = (J A ) a , the relation ≺ A,a prewellorders X . The way in which ≺ A,a prewellorders X depends on the choice of A and a, but the choice of the pair will not matter for our subsequent argument. So we let ≺ X and X denote ≺ A,a and A,a , respectively, for some fixed A and a that define X . We will use the following facts; their proofs are found in [8, §5].

Fact 5.2
For each inductive class X , the following are provable in SC 1 .

If a ∈ I then (H ) a = (Ȟ ) a (and thus (H ) a is hyperelementary for all a ∈ I ).
2. If X is hyperelementary, then X = (H ) a for some a ∈ I .

Fact 5.3 says that hyperelementary classes can be nicely coded by sets.
We next turn to the Kripke-Platek system KPV. In terms of [3], KPV is an extension of KPU + obtained by incorporating ZF as the theory of urelements and extending the axiom schemata of ZF for the entire language. We adopt the one-sorted formulation of KPU + : let L KP = {∈ 0 , ∈ 1 , U, V} (with equality as a logical symbol), where U is a unary predicate for urelements, ∈ 0 is the membership relation among urelements, ∈ 1 is the membership relation for sets, and V is a constant symbol for the set of urelements. We will write Sx for ¬ Ux to express set-hood. The Lévy hierarchy of formulae are introduced to L KP in an obvious manner: by S 0 we denote the least collection of L KP -formulae containing all L KP -atomics and closed under the Boolean connectives and bounded quantifiers (∀x ∈ 1 t) and (∃x ∈ 1 t) for L KP -terms t; S n , S n , S , and S are defined from S 0 in the standard manner. We express various sets and classes in the language L ∈ of (first-order) set theory, but L KP possesses two different membership relations ∈ 0 and ∈ 1 , and they have different intended domains U and U ∪ S. Hence, each set or class expressible in L ∈ can be expressed in two different ways in L KP depending on which structure, U, ∈ 0 or U ∪ S, ∈ 1 , is considered. For each set-theoretic notion, such as ordinals, subsets, functions, we will distinguish two different notions, in terms of U, ∈ 0 and of U ∪ S, ∈ 1 , by attaching prefixes "U-" and "S-"; for instance, a U-set means an urelement (an element of U), and an S-set means an element of S; a U-unordered pair of a and b, if (a, b ∈ 1 x) ∧ (∀y ∈ 1 x)(y = a ∨ y = b). 
If a set-theoretic notion or operation is given an abbreviation, such as On and ⟨·, ·⟩, we will distinguish the two different notions by attaching superscript U or S to them; for example, On U and On S denote the classes of U-ordinals and S-ordinals, respectively; ⟨u, v⟩ U denotes the U-ordered pair (defined in terms of U-unordered pairs) of u and v, while ⟨x, y⟩ S denotes the S-ordered pair of x and y. We will, however, sometimes abuse this notation and suppress the superscripts U and S for simplicity, when the intended notion is clear from the context. For an L KP -definable class X , such as S and On U , we write x ∈ X to mean that x is a member of the class, but it is merely an abbreviation of (x) for the L KP -formula defining X and the symbol "∈" here should not be confused with "∈ 0 " and "∈ 1 ".
Each L ∈ -sentence is canonically translated into L KP by restricting every quantifier to U and replacing the membership relation of L ∈ with ∈ 0 . Let us denote this canonical translation of L ∈ into L KP by . In other words, for each L ∈ -sentence σ , σ is the relativization of σ to the structure ⟨U, ∈ 0 ⟩. Thereby we first define a minimal L KP -system KPV min as the collection of the following axioms: where ψ is any S 0 -formula without b free. We also consider the following additional axiom schemata.
where ϕ is any formula without b free belonging to a collection of formulae (of a language including L KP ). The full system KPV is thereby defined as follows: We will use the same interpretation * of KPV in SC 1 as in [8]. The entire domain of KPV is interpreted by the direct sum of V and a certain inductive class M, say, (V × {0}) ∪ (M × {1}), so that every ordered pair ⟨x, 0⟩ represents some U-set, and an ordered pair ⟨x, 1⟩ represents an S-set when x ∈ M. The special inductive class M consists of the codes (in the sense of Fact 5.3) of hyperelementary well-founded trees; here, by a well-founded tree we mean the same thing as what Simpson [25] calls a suitable tree, which is defined as a tree T , in the sense that T is a non-empty class of finite sequences of sets closed under initial segments, which is strictly prewellordered by the canonical ordering T defined as x T y :⇔ "y is a proper initial segment of x". In sum, we put where M is indeed inductive, since the well-foundedness of (H ) a is uniformly expressible for all trees a ∈ I in terms of a certain inductive class Acc( ) (see fn 22). Then, the domain of quantifiers of L KP is interpreted as S * ∪ U * , and the * -interpretation of ∈ 0 is simply defined by ⟨x, 0⟩ ∈ * 0 ⟨y, 0⟩ :⇔ x ∈ y. Each hyperelementary suitable tree is intended to represent its Mostowski collapse. However, since we allow urelements in KPV, the notion of Mostowski collapse must be modified so as to accommodate urelements; each leaf of a suitable tree corresponds to an object with no ∈ 1 -member that is contained in the transitive closure of the Mostowski collapse of the tree, and we must somehow distinguish the cases where the leaf represents the S-emptyset and where it represents an urelement (i.e., a U-set), both of which have no ∈ 1 -member. For this purpose, we stipulate that, for a leaf u of a tree T , if u = ⟨u 0 , . . . , u k ⟩ ends with an element of the form u k = ⟨x, 0⟩ ∈ U * , then it represents the urelement that ⟨x, 0⟩ represents, and otherwise represents ∅ S . Informally, given a transitive model of ZF with domain D, which is regarded as the domain of urelements, if T ⊂ D is a suitable tree, we define the collapse m(T , s) of T at each s ∈ T by recursion along T so that thereby we let T represent the S-set m(T , ⟨⟩) (a member of V D in terms of [3]). Finally, to define the * -interpretations of ∈ 1 and =, we first take a special inductive relation B(a, b, u, v), which expresses, for each a, b ∈ M, u ∈ (H ) a , and v ∈ (H ) b , that the sub-tree of the suitable tree (H ) a below u is bisimilar to the sub-tree of (H ) b below v in a suitably modified sense, in which the notion of bisimulation is modified so as to distinguish the leaves representing the empty set from those representing urelements, so that suitable trees T 0 and T 1 are bisimilar (in the modified sense) if and only if m(T 0 , ⟨⟩) = m(T 1 , ⟨⟩). Hence, for a, b ∈ M, ∃z(⟨z⟩ ∈ (H ) b ∧ B(a, b, ⟨⟩, ⟨z⟩)) expresses that (H ) a is bisimilar to some immediate sub-tree of (H ) b and thus that m((H ) a , ⟨⟩) is a member of m((H ) b , ⟨⟩). Next, let A(a, b) express that ⟨⟨a, 0⟩⟩ is a leaf of the tree (H ) b , which means that the urelement represented by ⟨a, 0⟩ is a member of m((H ) b , ⟨⟩) (when b ∈ M). Thereby, we define Then, there are binary inductive relations P − = and P − ∈ 1 such that Finally, (x = y) * and (x ∈ 1 y) * are defined as P + = (x, y) and P + ∈ 1 (x, y), respectively. Then, we have the following fact [8, Theorem 7.19].

Fact 5.4 * is an interpretation of
For each L ∈ -formula ϕ(x 1 , . . . , x k ) only with the displayed variables free, we can show the following by straightforward induction on ϕ: this fact (16) is used in the proof of Fact 5.4. Hence, the next follows.
Fact 5.5 KPV has the same L ∈ -theorems as SC 1 with respect to .
We will need a certain generalization of (16). We first extend to a translation of L ∈ (P) into L KP (P) simply by putting P t :⇔ Pt, where P is intended to be a predicate of U-sets. We next also extend * to a translation of L KP (P) into L SC (P) again by putting P * t :⇔ Pt. Then, in particular, for each inductive operator form A(x, P), A (x, P) and (A ) * (x, P) are an L KP (P)-formula and an L SC -formula, respectively, in which P occurs only positively. Let ψ(u) be any L SC -formula with a distinguished free variable u (possibly with parameters). Then, for each L ∈ (P)-formula ϕ(x 1 , . . . , x k , P) only with the displayed variables free, we can show where (ϕ ) * x 0 , 0 , . . . , x k , 0 , { u, 0 | ψ(u)} is, precisely, the result of replacing each occurrence of an atomic formula Pt in (ϕ ) * by ∃u(t = u, 0 ∧ ψ(u)). This equivalence (17) is proved by the same induction on ϕ as (16) with a trivial additional case for the base step where ϕ is of the form Pt.
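As a small worked instance of the extended canonical translation, consider a single $L_\in(P)$-formula. The superscript $\star$ below is only a placeholder name for the canonical translation of $L_\in(P)$ into $L_{KP}(P)$, and the formula $\varphi$ is chosen here purely for illustration.

```latex
% Worked example; the superscript \star is a placeholder name for the canonical translation.
\begin{align*}
\varphi(x, P) &:\equiv\; \exists y\,(y \in x \wedge P y),\\
\varphi^{\star}(x, P) &:\equiv\; \exists y\,(U y \wedge y \in_0 x \wedge P y).
\end{align*}
```

The quantifier is restricted to urelements, $\in$ becomes $\in_0$, and the atomic formula $Py$ is left unchanged, as stipulated above; since $P$ still occurs only positively, positivity of an inductive operator form is preserved by the translation.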

The constructible universe relative to V
Let L S (V) be the S-class of constructible (S-) sets relative to V (or, from V, in terms of [3]); L S (V) is standardly built up from L S 0 (V) := V (∈ S); see [3, Ch. II] for the precise definition. By L S ξ (V) we will denote the ξ th stage of the construction of L S (V), in other words, the S-set of constructible sets of the level (or the "constructible rank") ξ in the constructible hierarchy relative to V. L S (V) is definable in KPV (in fact, in KPV min + S 1 -Found 1 ), and L S (V) ∪ U is an inner model of KPV provably in KPV. More precisely, for each L KP -formula ϕ, let ϕ L S (V) denote the result of restricting each quantifier in ϕ to L S (V) ∪ U (but keeping all the other vocabulary unchanged, in particular, interpreting U by itself and thus x ∈ Sx (⇔ x / ∈ U) as x ∈ L S (V)); then, for each axiom σ of KPV, σ L S (V) is provable in KPV.
Fact 5.4 is proved, in essence, by carrying out the proof of the Barwise-Gandy-Moschovakis theorem [4] within SC 1 , but the Barwise-Gandy-Moschovakis theorem bears richer implications. In particular, it shows that the companion of the inductive sets on a transitive infinite set A, as a Spector class on A, is the least admissible set containing A. Indeed, SC 1 can "see" this fact within it. Let us call the L KP -statement S = L S (V) the axiom of constructibility relative to V. The goal of the present subsubsection is to show the following.
Hence, it follows from Fact 5.4 that * is an To show this, we need to prove some preliminary results and use some facts implicit in the proof of Fact 5.4. Let A(x, P) be an inductive operator form. A is an L KP (P)-formula with every quantifier bounded (by V). Hence, by S -recursion (see [3, Ch. 1]), there exists, provably in KPV, a S -function I A : On S → S such that for readability. We thereby extend to a translation of L SC into L KP by putting note that both J A and ≺ A are S -predicates. We can standardly show that is an interpretation of SC 1 in KPV, and the notion of an inductive class in SC 1 is accordingly translated into KPV, namely, an inductive class in KPV is an S-class X such that X = (J A ) a = {u ∈ 1 V | u, a U ∈ J A } for some inductive operator form A and U-set a (∈ 1 V).
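As a finite toy illustration of the stages of an inductive definition and of comparing elements by their least stage, one may consider the following sketch. It is not the construction carried out in KPV: the universe is a finite Python collection, ordinals are replaced by natural numbers, and the names `stages`, `least_stage`, `stage_less`, `PREDECESSORS`, and `accessible` are hypothetical names chosen here. The operator used in the example defines the accessible part of a finite relation.

```python
from typing import Callable, Dict, FrozenSet, Hashable, Iterable, List

# An operator form A(x, P): decides whether x enters, given the set P of
# points collected at earlier stages. It is assumed monotone in P.
Operator = Callable[[Hashable, FrozenSet[Hashable]], bool]


def stages(universe: Iterable[Hashable], operator: Operator) -> List[FrozenSet[Hashable]]:
    """Return the stages I^0, I^1, ... with I^n = {x | A(x, I^{<n})}."""
    points = list(universe)
    result: List[FrozenSet[Hashable]] = []
    below: FrozenSet[Hashable] = frozenset()  # union of all earlier stages
    while True:
        current = frozenset(x for x in points if operator(x, below))
        if result and current == result[-1]:
            return result  # the previous stage is already a fixed point
        result.append(current)
        below = below | current


def least_stage(stage_list: List[FrozenSet[Hashable]]) -> Dict[Hashable, int]:
    """Map each point of the fixed point to the least n with x in I^n."""
    rank: Dict[Hashable, int] = {}
    for n, stage in enumerate(stage_list):
        for x in stage:
            rank.setdefault(x, n)
    return rank


def stage_less(rank: Dict[Hashable, int], x: Hashable, y: Hashable) -> bool:
    """Stage comparison restricted to the fixed point (an assumption of this sketch)."""
    return x in rank and y in rank and rank[x] < rank[y]


# Toy relation given by predecessor sets; 3 has a loop and 4 lies above it,
# so neither is accessible.
PREDECESSORS = {0: set(), 1: {0}, 2: {0, 1}, 3: {3}, 4: {3}}


def accessible(x: Hashable, below: FrozenSet[Hashable]) -> bool:
    """x enters once all of its predecessors have entered."""
    return PREDECESSORS[x] <= below


STAGES = stages(PREDECESSORS, accessible)
RANK = least_stage(STAGES)
```

In this example the fixed point is the accessible part {0, 1, 2}, and `stage_less` orders its elements by the stage at which they first appear.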
Since L S (V) ∪ U is an inner model of KPV, KPV (σ ) L S (V) for each axiom of SC 1 . Now, for each fixed α ∈ On S as a parameter, x ∈ 1 I α A is a S -predicate on V in KPV and thus I α . In particular, every inductive or coinductive class in the sense of KPV has the same meaning in S ∪ U and in L S (V) ∪ U.
To prove the subsequent Lemma 5.8, we need the following fact, which is implicit in the proof of [8,Lemma 7.17].
Fact 5.7 SC 1 proves the following: for every L SC -formula ϕ(x),

Namely, SC 1 proves the principle of induction along ∈ * 1 with respect not only to the * -translations of L KP -formulae but also to arbitrary L SC -formulae; recall that ∈ * 1 is defined in terms of bisimulations between well-founded trees.

Lemma 5.8 SC 1 proves the following.
Note that for every u, v ∈ U * , it follows by the definition of * that there are some x and y with u = x, 0 and v = y, 0 and ( u, v U Proof We work within SC 1 . For the claim 1, we first show that This is shown by induction on α (along ∈ * 1 ) using Fact 5.7; for each x, 1 (I <α A ) * ; the first implication holds by the definition of I α A (in KPV) and the fact that * is an interpretation of KPV in SC 1 ; the second obtains by the induction hypothesis and the fact that P occurs only positively in A; the third follows from (17); the fourth holds by the axiom ID. We next show that This is shown by induction on x along ≺ A ; given x ∈ J A , we have y, 0 ∈ (J A ) * for every y ≺ A x by the induction hypothesis, and, since A(x, ≺ A x ) holds by Fact 5.1.1, we can thereby infer the first implication obtains due to (17); the second holds because interprets SC 1 in KPV and * interprets KPV in SC 1 . The claim 1 follows from (18) and (19). For the claim 2, we first remark that SC 1 proves that if an L SC -definable relation ≺ satisfies SC1 and (some finite instances of) SC2 for A, then ≺ is identical (coextensive) with ≺ A . Now, put x ≺ y :⇔ x, y , 0 ∈ (≺ A ) * . It suffices to show that ≺ satisfies SC1 and SC2 for A. SC2 is readily verified: since ≺ A orders U-sets u ∈ J A by comparing the least S-ordinals α with u ∈ I α A , the well-foundedness of ≺ A derives from that of ∈ 1 , and thus the well-foundedness of ≺ follows from Fact 5.7. To verify SC1 for ≺, we first observe that x ≺ y is equivalent to since (SC1 ) * holds in SC 1 . By the claim 1 and the definition of ≺, (20) is equivalent to (17).
Hence, (16) is extended for arbitrary L SC -formulae in the following way.

Lemma 5.9
For every L SC -formula ϕ(x 1 , . . . , x k ),

In particular, for every definable inductive or coinductive class Q, such as I , H , Ȟ, M, and B, its -translation Q satisfies the following:

Next, we will show that the collapsing function m(T , s) can be adequately defined within KPV for the -translation of each hyperelementary suitable tree T .
We first canonically extend (restricted to L ∈ ) to a translation of L 2 in L KP by interpreting classes of second-order set theory into S-subsets of V. It is easily verified that the thus extended translation is an interpretation of ECA in KPV; see [8, §9.1].
For each L KP -formula Φ and S-set X ⊂ S V, let us define Given a collection Γ of L KP -formulae, we thereby define the schema Γ-TI as follows.
Γ-TI : (∀X ⊂ S V)(Wf(X) → TI_Φ(X)), for all Φ ∈ Γ; this is the KPV-counterpart of the class-theoretic axiom schema Γ-TI. Note that since KPV proves the translation of NBG, it follows from Proposition 2.3 that Wf (X ) is equivalent to the S 0 -statement that every non-empty U-set u with (∀v ∈ 0 u)v ∈ 1 X has a ≺ X -minimal element.
The next lemma can be proved in a parallel manner to Theorem 3.4 by using In particular, we have KPV   F(x, r ) such that if r is a well-founded S-set relation whose field a is an S-subset of V, then where pred r (x) := {y ∈ 1 a | y, x U ∈ 1 r } S , that is, the S-set of r -predecessors of x, which is an S-set by S 0 -Sep. Hence, in particular, KPV has a S -function m that satisfies (14) for every S-set suitable tree T ⊂ S V and its nodes s ∈ 1 T . This lemma implies that KPV proves the axiom Beta (suitably modified for KPV) restricted to well-founded S-set relations on V; also see Remark 5.19 below for the full-fledged version of the axiom Beta.
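For intuition, the collapsing recursion along the r-predecessors can be imitated on finite relations. The sketch below is an illustration under simplifying assumptions and not the function m of the text: it ignores urelements and suitable trees, treats a finite relation as a set of pairs (y, x) read as "y is an r-predecessor of x", and uses the hypothetical names `predecessors` and `mostowski_collapse`. For a finite relation, well-foundedness amounts to the absence of cycles, which the code checks.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, Set, Tuple

Pair = Tuple[Hashable, Hashable]


def predecessors(relation: Iterable[Pair], x: Hashable) -> FrozenSet[Hashable]:
    """pred_r(x): all y with (y, x) in the relation."""
    return frozenset(y for (y, z) in relation if z == x)


def mostowski_collapse(relation: Iterable[Pair]) -> Dict[Hashable, frozenset]:
    """Collapse each point x of the field to {collapse(y) | y in pred_r(x)}."""
    pairs = frozenset(relation)
    field = {point for pair in pairs for point in pair}
    done: Dict[Hashable, frozenset] = {}
    active: Set[Hashable] = set()  # points whose collapse is being computed

    def collapse(x: Hashable) -> frozenset:
        if x in done:
            return done[x]
        if x in active:
            # Revisiting an active point means the relation has a cycle.
            raise ValueError("relation is not well-founded")
        active.add(x)
        image = frozenset(collapse(y) for y in predecessors(pairs, x))
        active.remove(x)
        done[x] = image
        return image

    return {x: collapse(x) for x in field}


# A three-element linear order collapses to the von Neumann ordinals 0, 1, 2.
CHAIN = {("a", "b"), ("a", "c"), ("b", "c")}
COLLAPSED = mostowski_collapse(CHAIN)

# A non-extensional relation: "a" and "b" have the same (empty) predecessor
# set, so the collapse identifies them.
FORK = {("a", "c"), ("b", "c")}
FORK_COLLAPSED = mostowski_collapse(FORK)
```

The second example shows why the collapse is injective only on extensional relations: two points with the same predecessors receive the same image.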
Since interprets SC 1 in KPV, it follows from Fact 5.3 that for every x ∈ M , (H ) x and (Ȟ ) x are true of the same U-sets, and thus We are finally ready to prove Theorem 5.6.
Proof of Theorem 5.6 We will work within SC 1 . We have to show (S ⊂ L S (V)) * . Note that if x ∈ S * , then x = a, 1 for some a ∈ M, for which we also have a, 0 ∈ (M ) * by Lemma 5.9. Hence, by the last lemma, it suffices to show (∀x ∈ S * )∀a x = a, 1 → a, 1 = * (m((H ) a,0 , )) * . This will be shown by induction on x along ∈ * 1 using Fact 5.7. Recall that for each a, b ∈ M, an inductive relation B(a, b, u, v) expresses that the sub-tree of (H ) a below u ∈ (H ) a is bisimilar (in the aforementioned modified sense) to the sub-tree of (H ) b below v ∈ (H ) b , and m is so defined as to satisfy the following: Let us define an L SC -formula ψ(x, y) and an L KP -formula θ(x, y) as follows: Then, by (21) and the definition of m, we obtain Also recall that A(b, a) means that b, 0 is a leaf of the suitable tree (H ) a (see Sect. 5.1.1) when a ∈ M, and m is so defined as to satisfy the following as well: Now, fix x ∈ S * and let x = a, 1 , where a ∈ M and thus a, 0 ∈ (M ) * . Take any z ∈ S * ∪ U * . First suppose z ∈ S * and let z = b, 1 , where b ∈ M and thus b, 0 ∈ (M ) * . Then, z ∈ * 1 x means ψ(b, a), and we have Second suppose z ∈ U * and let z = b, 0 . Then, z ∈ * 1 x means A(b, a), and we have Finally, since the axiom of extensionality is true under the interpretation * , we thereby obtain x = * (m((H ) a,0 , )) * . The proof is completed.

Remark 5.13
One might well ask whether a parallel statement to Theorem 5.6 holds over arithmetic, namely, whether the arithmetical counterpart of the interpretation * of SC 1 over arithmetic in the Kripke-Platek theory over N (such as KPN or KPω) verifies the axiom of constructibility. The proof in this section cannot be applied to the arithmetical case as it stands, because the addition of the axiom Beta or S ∞ -TI to the Kripke-Platek theory over N increases consistency strength. Nonetheless, the answer to the question is affirmative. Lemma 5.10 is needed for the proof of Theorem 5.6 only in proving the existence of the Mostowski collapsing function m, and m need not be the collapsing function for arbitrary suitable trees but only for hyperelementary suitable trees. Then, to prove the existence of m for them, we only need to avail ourselves of transfinite induction along T for each hyperelementary suitable tree T . Now, in my definition in [8], the well-foundedness of a hyperelementary suitable tree (H ) a is defined in SC 1 in terms of the accessible part of (H ) a , and well-foundedness thus defined implies transfinite induction along (H ) a for arbitrary SC 1 -formulae; in fact, there is a 'universal' (coinductive) relation such that x, a y, a ↔ x (H ) a y, and the well-foundedness of (H ) a can be expressed in terms of the (inductive) accessible part of this uniformly for all a. Hence, transfinite induction along the hyperelementary well-founded relations (H ) a comes for free in SC 1 without the need to resort to set-theoretic axioms such as L SC -Repl.
The same argument can be carried out in KPV in terms of the -interpretation of the inductive predicates (because they are defined by ∈-recursion along ordinals in KPV and because their stage comparison prewellorderings are defined in terms of ordinals, transfinite induction along hyperelementary well-founded relations in KPV derives from the axiom schema of foundation), and we can thereby define the Mostowski collapsing function m for all hyperelementary suitable trees (H ) x in KPV. This argument can be straightforwardly adapted for a proof of the corresponding statement over arithmetic.

The axiom of projectibility
The Barwise-Gandy-Moschovakis theorem has one more important implication: the companion M of the inductive sets on a transitive infinite set A is also projectible, namely, there is a $\Sigma_1$-definable partial surjective function from some set belonging to M onto the entire universe M of the companion, which is called a projection of M; see [19, Ch. 9.D] or [3, Ch. V]. In the present Sect. 5.1.3, we consider adding to L KP a new predicate symbol for a projection of the universe of S-sets.
Let L KP (Pr) be a language extending L KP with one new binary predicate Pr. We then consider the following new axiom expressed in L KP (Pr).
This axiom asserts that Pr associates each S-set with some urelements (i.e., U-sets) so that the inverse of Pr gives a surjection from the range of Pr onto S. Let us call (Prj) the axiom of projectibility. 23 We extend the definition of the collections S 0 , S n , S n , S , and S of L KP -formulae to

Now, we will extend * to an interpretation of KPV p in SC 1 . We first define two inductive relations R and Ř as follows: namely, R(x, y) says that y = b, 0 (∈ U * ) for a ≺ M -minimal element b of M with x = * b, 1 , and Ř(x, y) expresses its negation. We then define Pr * (x, y) :⇔ R(x, y).

(15). Hence, we have R(x, y).
2. If R(x, y) for x, y ∈ S * ∪ U * , then y = b, 0 ∈ U * for some b ∈ M and P + = (x, b, 1 ), which implies x ∈ S * , since P + = satisfies the axioms of equality.
3. Let R(x 0 , y) and R(x 1 , y) for some y = b, 0 ∈ U * and b ∈ M. Then, b, 1 ∈ S * and thus P + = (x 0 , x 1 ), since P + = satisfies the axioms of equality.

Proposition 5.15 SC 1 proves the following:

Proof Take any x, y ∈ S * ∪ U * . First suppose Ř(x, y).

The next is the main result of this sub-subsection.

Lemma 5.16
The extended translation * is an interpretation of KPV p in SC 1 .
Proof We work within SC 1 . It immediately follows from Proposition 5.14 that SC 1 proves (Prj) * . We next show that for each p 0 -formula ϕ( x) of L KP (Pr), there are inductive relations P + ϕ ( x) and P − ϕ ( x) such that this is shown by induction on ϕ in a parallel manner to [8, Lemma 7.12], in which we use Proposition 5.15 for the additional base step where ϕ is an atomic formula of the form Pr(t 0 , t 1 ). Then, using (24), we can show in a parallel manner to [8, Lemmata 7.14 and 7.15] that the * -translations of ( p 0 -Sep 1 ) and ( p 0 -Coll 1 ) are provable in SC 1 . Finally, since = * (i.e., P + = ) is an equivalence relation, it is obvious from the definition of R that = * satisfies the axioms of equality with respect to Pr * , namely, The * -translations of the remaining axioms of KPV p can be verified in exactly the same manner as in [8].
Combining this with Theorem 5.6, we obtain the following.

Theorem 5.17 KPV p + S = L S (V) has the same L ∈ -theorems as SC 1 with respect to the canonical translation of L ∈ in L KP .

Remark 5.18 We have shown in Sect. 5.1.2 that * is an interpretation of KPV plus the assertion that every S-set is the Mostowski collapse of some hyperelementary suitable tree. Under this extra assumption, Pr becomes definable within KPV by a predicate stating that x is the Mostowski collapse of some suitable tree (H ) y for some U-set y ∈ M ; this definition of Pr involves -notions such as M and H , but they can be shown to be by the argument presented in Remark 5.13 (or by S ∞ -TI if we additionally postulate it).

Remark 5.19
KPV p proves the class-theoretic counterpart of Simpson's axiom of countability [25,Ch. VII. 3], which asserts that every S-set can be injectively mapped into V, by assigning the U-set v, 0 U U to each U-set member v of an S-set y and the following U-set to each S-set member x of y: note that the domain of the injection need not be a transitive hull of the S-set y because the existence of a transitive hull is provable for every S-set in KPV. It follows from this and Lemma 5.11 that KPV p proves the axiom Beta (suitably modified for KPV) unrestrictedly. Combined with Fact 5.5, it follows that KPV plus the axiom Beta has the same L ∈ -theorems as KPV. This makes an interesting dissimilarity between Kripke-Platek systems over V and over N, since the axiom Beta makes KPN or KPω a strictly stronger system (see [20] for example).

Remark 5.20
KPV plus the aforementioned class-theoretic counterpart of the axiom of countability directly interprets 1 1 -Coll, and this interpretation requires no instances of S ∞ -Found 1 ; cf. [8,Theorem 9.1]. This makes another dissimilarity between Kripke-Platek systems over V and over N, since the strength of the Kripke-Platek system KPu over N (either with or without the axiom of countability) essentially relies on the full axiom of foundation; restricting the axiom of foundation to any fixed complexity results in a weaker theory; cf. [12,21].

Interpretation of $\Sigma^1_1$-DColl in KPV p + S = L S (V)
In this subsection, we will show that KPV p + S = L S (V) proves the translation of $\Sigma^1_1$-DColl, which entails that SC 1 and $\Sigma^1_1$-DColl have the same L ∈ -theorems due to Theorem 5.17 and the aforementioned fact that this translation is an interpretation of ECA in KPV.
We will first show a variant of -recursion theorem. It is observed that all the basic principles proved in [3,Ch.I.4], such as -Separation and -Replacement, can be proved in KPV min p (in suitably modified forms where and are replaced by p and p ) in exactly the same way. For each U-class X (such as On U ), let us define S(X ) := {x ∈ 1 V | x ∈ X } (∈ S), which exists as an S-set by Since S(On U ) is an S-set, again by p -Replacement, we take the unique S-function f with domain S(On U ) such that (∀α ∈ 1 S(On U ))∃ f α A(α, f α , f (α)). Hence, again by (26), we have (∀α ∈ 1 S(On U ))A(α, f S(α) , f (α)), and thus f satisfies (25). Now, we are ready to prove our main claim.

Lemma 5.22 KPV
Proof We work within KPV p + S = L S (V). In this proof (and only in this proof), we use capital Roman letters X , Y , Z , …, to designate S-subsets of V (viz., -translations of classes) for readability. We first note that (X ) x and (X ) x are interpreted by as {y ∈ 1 V | y, x U ∈ 1 X } and { y, z U ∈ 1 X | rk U (z) < rk U (x)}, respectively, both of which exist as S-sets. Take any 1 0 -formula (u, X , Y ), and let us write for , which is a S 0 -formula. Suppose (∀u ∈ 1 V)∀X ∃Y (u, X , Y ). For each α ∈ On U , we will define an S-ordinal σ α , U-ordinals υ α ≥ α and μ α , and an S-set , by p -recursion along On U (Lemma 5.21) in the way that we will describe below.
When α is a limit U-ordinal, we set in fact, σ α and μ α can be arbitrary for a limit α. 25 Now, assume that α is a successor U-ordinal and α = β + 1 for some U-ordinal β. Let LH( f , ζ ) be an L KP -formula expressing that ζ is an S-ordinal and f is an S-set function with domain ζ + 1 such that f (η) = L S η (V) for each S-ordinal η ≤ ζ ; such LH( f , ζ ) can be taken as S (in KPV min + S 1 -Found). Then, firstly, we define σ α as the least S-ordinal ξ such that hence, we have (X α ) υ β = (X β ) υ β , and thus (X β ) u = (X β ) β = (X α ) β = (X α ) u for all u ∈ 0 (V α \V β ) U , since rk U (u) = β ≤ υ β , which implies We have completed the recursive definitions of σ α , μ α , υ α , and X α . It is easy to see that for each u ∈ 0 V U υ α , we have (X α ) u = (X γ ) u for all γ > α. Finally, we set X = α∈On U X α ; hence, we have We have to check (∀u ∈ 1 V)(∃v ∈ 1 V) (u, (X ) u , (X ) v ). Take any u ∈ 1 V and let Finally, the claim obtains, since we generally have (X ) υ β = (X β ) υ β and β ≤ υ β for all U-ordinals β.

24 As the following proof indicates, ( 1 0 -DColl) is actually provable in KPV min p + S = L S (V) +

25 The aim of the recursive definition here is to define υ α and X α ; σ α and μ α only play supplementary roles here, and their definitions could be incorporated into the definitions of the other two; we define them separately only for the sake of readability.
By combining this with results from [7,8,23], we obtain the next theorem.

Theorem 5.23
The following systems all have the same L ∈ -theorems: They still have the same L ∈ -theorems even if we assume AC or GC (Remark 1.1).
This and Corollary 4.5 give an alternative proof of the following result by Sato.

A digression-an urelement-free formulation of KPV
What KPV is to set theory is what KPu (see [11] for its definition) is to arithmetic, 26 and KPu has a urelement-free variant, namely, KPω, which theorizes the pure set part of KPu. In this subsection, we will introduce and briefly discuss a urelement-free variant of KPV, which will play an important role in the study of stronger systems of classes (and the author's future work on this subject). We will no longer consider the axiom of projectibility in what follows, and focus on KP-systems in the language L KP .
It is observed that the axiom (Prj) plays no role in the proof of Lemma 5.21, and S -recursion on On U is available in KPV min + S 1 -Found + 0 . Hence, in this system, we can define an S-set function f with domain S(On U ) such that, for each α ∈ On U , f (α) is an S-set function g α with domain S(V U α ) so that g λ := ξ<λ g ξ , if λ is a limit ordinal.
Let g := α g α , which is an S-set function with domain S(U) = V. We can show (by S 1 -Found + 0 ) that g is an injection. Let v denote the range of g. Then, g is an isomorphism between V and v in the sense that As before, for each L ∈ -formula ϕ, we define its relativization ϕ v ,∈ 1 as the result of replacing ∈ by ∈ 1 and also replacing each quantifier ∀x by ∀x ∈ 1 v . It follows from (31) that, for all L ∈ -formulae ϕ(x 1 , . . . , In particular, the system proves σ v ,∈ 1 for every axiom σ of ZF. We will prove further useful properties of v under the assumptions of some extra axioms. For the sake of readability, let us define the L KP -system KPV 1 as follows:

Proposition 5.25 KPV 1 proves that v is supertransitive, namely,

Proof The first conjunct (i.e., transitivity) is obvious from the definition of v . For the second, let x ∈ v and y ⊂ S x. By S 0 -Sep + 0 , we have We can easily check g(w) = y (using the transitivity of v ).
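Supertransitivity, in the usual sense of being transitive and closed under subsets of members, can be checked mechanically on hereditarily finite sets. The following is merely a finite illustration of the property and not of the S-set v itself; the function names `powerset`, `cumulative_level`, `is_transitive`, and `is_supertransitive` are hypothetical, and sets are modelled by nested `frozenset` values.

```python
from itertools import combinations


def powerset(s: frozenset) -> frozenset:
    """All subsets of s, each as a frozenset."""
    items = list(s)
    return frozenset(
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    )


def cumulative_level(n: int) -> frozenset:
    """The finite level V_n of the cumulative hierarchy (V_0 is empty)."""
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level


def is_transitive(v: frozenset) -> bool:
    """Every member of v is a subset of v."""
    return all(x <= v for x in v)


def is_supertransitive(v: frozenset) -> bool:
    """Transitive, and every subset of a member of v is a member of v."""
    return is_transitive(v) and all(powerset(x) <= v for x in v)


V4 = cumulative_level(4)

# The von Neumann ordinal 3 = {0, 1, 2} is transitive but not supertransitive:
# {1} is a subset of 2 = {0, 1} without being a member of 3.
ZERO = frozenset()
ONE = frozenset({ZERO})
TWO = frozenset({ZERO, ONE})
THREE = frozenset({ZERO, ONE, TWO})
```

The contrast between `V4` and `THREE` shows that supertransitivity is strictly stronger than transitivity, which is why the second conjunct in the proof above needs a separate argument.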

Proposition 5.26 KPV 1 proves that for every
where dom( f ) and ran( f ) are L ∈ -expressions denoting the domain and range of f respectively.
Proof Take any f ∈ S with some domain a ∈ 1 v , and suppose ran S ( f ) ⊂ v . We have f ⊂ S v , since v is transitive and closed under pairing. Let us write which exists as an S-set by S 0 -Sep 1 . Since g is bijective, we have Finally, by 0 -Sep + 0 and 0 -Repl + 0 , we obtain a U-set b such that and thus g(b) = ran S ( f ) ∈ 1 v again by (32).
These observations suggest the following urelement-free formulation of KPV.

Definition 5.27
Let L ∈ (V ) be L ∈ plus a new set constant V , and let V 0 , V n , and V n denote the collections of L ∈ (V )-formulae in the Lévy hierarchy modified so that the constant V is allowed to appear in V 0 . The L ∈ (V )-system KPV r , which corresponds to KPω r over ω (see [20] for its definition), consists of the axioms of extensionality, pairing, union, V 0 -separation, V 0 -collection, and V 0 -foundation, as well as the following axioms stating certain closure properties of V :

(V3) V is a non-empty transitive model of the axioms of pairing, union, and infinity, namely, the relativizations of these axioms to V are true.
The full system KPV is obtained by extending V 0 -foundation to V ∞ -foundation and strengthening the closure property of V by adding the following schemata.
where ϕ is an arbitrary V ∞ -formula without b free. 27
Proof The axioms of extensionality, pairing, union, and infinity are made true in V , ∈ by (V3). The axiom of foundation is true in V , ∈ due to V 0 -foundation and the transitivity of V . The powerset axiom in V , ∈ follows from (V1) and the transitivity of V . For each L ∈ -formula ϕ with all parameters from V , ϕ V ,∈ is V 0 and thus {x ∈ a | ϕ V ,∈ (x)} (⊂ a) exists for every a ∈ V by V 0 -separation, which belongs to V due to (V1) and the transitivity of V . Finally, suppose (∀x ∈ a)(∃!y ∈ V )ψ V ,∈ (x, y) for any a ∈ V and L ∈ -formula ψ with all parameters from V . Then, since ψ V ,∈ is V 0 , the set function f := { x, y ∈ a × V | ψ V ,∈ (x, y)} exists by V 0 -separation (as well as V 0 -collection and the axiom of pairing to ensure the existence of cartesian products), and thus ran( f ) belongs to V by (V2).
Since S -recursion along ∈ is available in KPV 1 , we can define the support function sp in KPV 1 (see [3, Ch. 1.6]) so that sp is a S -function, and we call x a pure set when x ∈ S ∧ sp(x) = ∅ S . Let A S be the class of pure sets, which is a S -predicate in KPV 1 . It is obvious that v ∈ A S and A S is transitive provably in KPV 1 . Now, we define a translation of L ∈ (V ) to L KP as follows:

Lemma 5.29 is an interpretation of KPV r plus V 1 -foundation in KPV 1 ; furthermore, it is an interpretation of KPV in KPV.
Proof We will work within KPV 1 . We have shown that v is transitive; since U, ∈ 0 is a model of the axioms of pairing, union, and infinity, so is v , ∈ 1 by (32); hence, (V3) holds. (V1) follows from Proposition 5.25, since v , ∈ 1 is a model of the axiom of powerset by (32). (V2) follows from Proposition 5.26. We can show in essentially the same manner as [3, Theorem II.1.5], using the S -definability of sp, that the remaining axioms of KPV r plus V 1 -foundation are preserved by ; for example, for where all the parameters of ϕ are taken from A S , then we have ∀x (∀y ∈ x)(y ∈ A S → ϕ (y)) → (x ∈ A S → ϕ (x)) , from which we obtain (∀x ∈ A S )ϕ (x) by S 1 -Found 1 , since A S is transitive and S (in KPV 1 ).
The composition * • of the two translations gives an interpretation of KPV in SC 1 . By interpreting sets as elements of V (i.e., ∀x → (∀x ∈ V )) and classes as subsets of V (i.e., ∀X → (∀x ⊂ V )), we have an interpretation of ECA in KPV ; let denote this interpretation. Then, the restriction of to L ∈ interprets ZF in KPV , and we can standardly extend it to an interpretation of ID 1 in KPV (in the same manner as the standard interpretation of ID 1 in KPω over arithmetic). Hence, KPV and the systems listed in Theorem 5.23 have the same L ∈ -theorems (with respect to the canonical translation ).
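The interpretation of ECA just described (sets as elements of V, classes as subsets of V) can be written out clause by clause. In the sketch below, the superscript $\tau$ is a placeholder name chosen here for that interpretation and is not the paper's notation; atomic formulae are assumed to be left unchanged and the translation is assumed to commute with the connectives.

```latex
\begin{align*}
  (x \in y)^{\tau} &:\equiv x \in y, \qquad
  (x \in X)^{\tau} :\equiv x \in X, \qquad
  (x = y)^{\tau} :\equiv x = y,\\
  (\neg\varphi)^{\tau} &:\equiv \neg\,\varphi^{\tau}, \qquad
  (\varphi \wedge \psi)^{\tau} :\equiv \varphi^{\tau} \wedge \psi^{\tau},\\
  (\forall x\,\varphi)^{\tau} &:\equiv (\forall x \in V)\,\varphi^{\tau},\\
  (\forall X\,\varphi)^{\tau} &:\equiv (\forall X \subset V)\,\varphi^{\tau}.
\end{align*}
```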

Remark 5.30
In our formulation of KPV and related systems, we define Sx :⇔ ¬ Ux, and thus S and U are disjoint. However, we may formulate these systems without postulating that S and U are disjoint. For example, we may introduce a separate predicate for S and allow the possibility that S and U overlap; alternatively, we may adopt two-sorted first-order logic in the formulation of those systems. In such a formulation that permits the overlap of S and U, we can also interpret KPV 1 in KPV r plus V 1 -foundation by translating Ux by x ∈ V , Sx by x = x, V by V , x ∈ 0 y by x, y ∈ V ∧ x ∈ y, and x ∈ 1 y by x ∈ y.

Other forms of reflection

1 n -RFN is a type of reflection principle that reflects an assertion about the entire universe (of sets and classes) onto a class structure (a coded V-model). In this section, we will briefly consider some alternative types of reflection principles and give some observations about them.

Reflection onto class structures
Let Γ be a collection of L 2 -formulae. We first consider a natural strengthening of Γ-RFN.
Γ-RFN + : ∀X ∃S (X ∈ S ∧ S ⊨ NBG ∧ ∀x(Φ(x, X ) ↔ S ⊨ Φ(x, X ))), for all Φ ∈ Γ, where Φ only contains the displayed variables free and S does not occur free in Φ; note that we need not consider formulae with more first- and/or second-order free

Remark 6.2 The proof of Proposition 6.1 can be carried out as it stands in second-order arithmetic. In addition, as Jäger and Strahm [16] showed, we have ACA 0 1 n+1 -RFN ↔ 1 n -TI in second-order arithmetic. Since we obviously have ACA 0 1 n -CA → 1 n -TI, it follows that ACA 0 1 n -RFN + ↔ 1 n -CA for all n ≥ 1. 29 There is also an intimate relationship between 1 n -CA and 1 n -RFN B in second-order arithmetic. For n = 1, 2, by [25, Theorem VII.

Hence, 1 n+1 -DColl implies the following.
This schema is the class-theoretic counterpart of the axiom of strong 1 n dependent choice in second-order arithmetic; see [25,Definition VII.6.1]. For the rest of the proof, we refer the reader to [25,Theorem VII.7.4].
As an immediate corollary, 1 ∞ -RFN is consistent relative to 1 2 -DColl 0 (actually, consistent relative to NBG plus strong 1 1 dependent collection).

Remark 6.5 In second-order arithmetic, 1 1 -CA 0 proves 1 1 -RFN B (by the argument of the Kleene basis theorem [25, Lemma VII.2.9]). However, the proof cannot be carried out in class theory. In general, by the result of Gitman, Hamkins, and Johnstone mentioned in Remark 4.12, we have 1 n -CA 0 1 m -RFN + for all n > 0 and m ≥ 1 in class theory, since 1 1 -RFN + 1 1 -Coll by Proposition 6.1.4. Nonetheless, as a matter of fact, 1 n -CA 0 and 1 n -RFN + 0 (n ≥ 1) still have the same L ∈ -theorems under the assumption of GC, and they are also equiconsistent even if GC is dropped. The proof of this fact will be given in the sequel paper to the present one.

29 The case n = 0 is an anomalous case because of the condition "S ⊨ NBG," and both 1 0 -RFN + and 1 0 -RFN B are equivalent to ACA + 0 in ACA 0 in second-order arithmetic, and, similarly, they are both equivalent to ECA + 0 in NBG in class theory by Lemma 2.21.

finite fragment S of ZF such that V α ⊨ S holds only for a limit ordinal > ω. Since (34) is a 1 1 -sentence, under the assumption of 1 1 -Indes − there is an ordinal κ such that V κ , V κ+1 satisfies S and (34) for the L ∈ (P)-formula ∧ S in the place of . Then, κ is a limit ordinal > ω, and thus V κ satisfies the axioms of ZF except the axiom of replacement. It suffices to show that V κ , V κ+1 satisfies Class Replacement. Let F be any function from some a ∈ V κ to V κ . Then, V κ , F ⊨ (a, P). Hence, there is some β with a ∈ V β such that V β ⊨ S and V β , F ∩ V β ⊨ (a, P). Since β is a limit ordinal, (a, P) also expresses in V β , F ∩ V β that P is a function with domain a, and thus F ∩ V β is a function with domain a and codomain V β . We have shown that κ is regular and V κ ⊨ ZF.
Such a cardinal is shown to be a 1 0 -indescribable cardinal in a parallel manner to [17,Lemma 6.1], in which we need not use AC, and thus a 1 1 -indescribable cardinal. So, a reflection principle asserting the existence of a model of a second-order formula of the form V κ , V κ+1 is too strong for the context of the study of the present paper. Hence, we weaken this condition to the existence of a model of the form V κ , s for some s ⊂ V κ+1 . This restriction leads to the following principles: here, as before, ∈ and only contains the displayed variables free (and without β free). By the existence of universal formulae, both 1 n -Rfn and 1 n -SRfn are finitely axiomatizable for every n > 0. 1 0 -Rfn is also finitely axiomatizable, but we need a little more argument. Let ∃Z (e, x, X , Z ) be a 1 1 -universal formula, where ∈ 1 0 . Then, on the one hand, 1 0 -Rfn proves ∀X ∀Z (∀e ∈ ω)∀x (e, x, X , Z ) → ∃β(∃s ⊂ V β+1 ) ω, x ∈ V β ∧ X ∩V β , Z ∩V β ∈ s ∧ V β , s | NBG + (e, x, X ∩V β , Z ∩V β ) ; on the other hand, since ∃Z is a 1 1 universal formula in any model V β , s of NBG, the 1 1 -sentence (35) implies each instance of 1 0 -Rfn. The next proposition is obvious. Proposition 6.7 1. NBG + 1 ∞ -SRfn  -RFN B ), but the proof is left for another occasion, which requires new arguments for strong systems of classes. We conjecture that NBG + 1 n+1 -SRfn Con(ECA + 1 n -SRfn).
Proof Recall that I is an interpretation of ECA in TC (see Sect. 3) and that ECA and TC have the same L ∈ -theorems. Hence, it suffices to show that TC ( 1 ∞ -SRfn) I . We work within TC. Let (x, X ) ∈ L 2 . Take any ϕ(u) ∈ Fml ∞ ∈ (which corresponds to the parameter "X "). By Fact 3.10, since NBG consists of only finitely many axioms, there exists β such that ϕ(u) ∈ V β , (NBG I ) V β , and recall that, for each variable z, all the occurrences of z ∈ X in (x, X ) are replaced by T (ϕ(z)) in I (x, ϕ). Take I −1 (V β ) (see Sect. 3), and let I −1 (V β ) = (V β , s). Let us write X for {z | T (ϕ(z))}. Since V β | ZF, each syntactic notion or operation on L ∞ ∈ is absolute for V β ; therefore, in particular, we have (Fml ∞ ∈ ) V β = Fml ∞ ∈ ∩ V β and ∀x ∈ V β T (ϕ(x)) ↔ V β | T (ϕ(x)) . Hence, by (10), we obtain Hence, the I-translation of the instance of 1 ∞ -SRfn for holds in TC.
Proof Again, we will show the claimed consistency in TC and work within TC.
We first take an L T -definable closed unbounded class C of ordinals ζ such that (NBG I ) V ζ . Next, for each (code of) L 2 -formula (x, X ), we define a function F : V × V → On as follows:

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.