Classes, why and how

This paper presents a new approach to the class-theoretic paradoxes. In the first part of the paper, I will distinguish classes from sets, describe the function of class talk, and present several reasons for postulating type-free classes. This involves applications to the problem of unrestricted quantification, reduction of properties, natural language semantics, and the epistemology of mathematics. In the second part of the paper, I will present some axioms for type-free classes. My approach is loosely based on the Gödel–Russell idea of limited ranges of significance. It is shown how to derive the second-order Dedekind–Peano axioms within that theory. I conclude by discussing whether the theory can be used as a solution to the problem of unrestricted quantification. In an appendix, I prove the consistency of the class theory relative to Zermelo–Fraenkel set theory.

This foundation was found in modern axiomatic set theory, which has its roots in the work of Cantor.
A number of authors (Maddy 1983; Lavine 1994) have argued that there were at least two different notions of class in the literature, and that only one of them is prone to paradox (Gödel 1983). Following Lavine (1994, p. 63), we may call them the logical and the combinatorial notion of class, respectively.
According to the logical notion, a class may be defined as the extension of a concept or predicate, or, to use Russell's words, ''as all the terms satisfying some propositional function''. 1 Such classes are associated with some kind of definition or rule that tells us in a principled way whether an object belongs to the class or not. This is the notion of class that was championed by Frege, Peano and Russell. 2 Extensions of concepts had been part of logic since antiquity; they can be found in the works of Leibniz and are explicit in the Port-Royal Logic (Bochenski 2002, pp. 302-303). It is this fact that allowed Frege, dialectically, to assume that a reduction of number theory to class theory is sufficient to establish his thesis that arithmetic is a branch of logic (Heck 2011b, p. 126).
According to the combinatorial notion, on the other hand, classes are obtained from some well-defined objects such as the natural numbers by 'enumerating' their members in an arbitrary way. Such classes exist independently of our ability to provide a defining condition or rule that characterizes their members. Arguably, this is the notion that was adopted by Cantor and Zermelo and that underlies our modern iterative concept of set.
The difference between the concept of class as given by a rule and the concept of class freed from such restriction was an important factor in the controversy about the axiom of choice. This axiom states that we can select one element out of each of a family of (non-empty) classes and collect them into one class. As Bernays (1983) remarks, the axiom of choice ''is an immediate application of the combinatorial concepts in question.'' On the logical notion of class, on the other hand, it is doubtful whether a class satisfying the requirements set out in the axiom of choice can always be found. 3
In what follows, we will call the combinatorial classes sets and the logical ones classes. The (logical) notion of class motivates what is commonly referred to as ''the naïve calculus'', which consists of the naïve or unrestricted comprehension axiom scheme, which postulates the existence of a class corresponding to each predicate, and the axiom of extensionality, which states that two classes are identical if they have the same members. Of course, as Russell's paradox shows, the naïve calculus is inconsistent. The standard approach to the class-theoretic paradoxes is to be found in the theory of types, which originated with Russell (1903a, 1908). In a nutshell, what happens here is that one abandons the idea of a general or unrestricted variable and replaces it with a series of variables differentiated as to type. While Cantor never explicitly laid down the principles that he was working with, it has been argued that the naïve comprehension axiom scheme was not among them and that the notion of set was never subject to the paradoxes (e.g., Lavine 1994). By contrast, one can make a case that the class-theoretic paradoxes are still unsolved. For instance, Gödel says about the type-theoretic approach that ''it cannot satisfy the condition of including the concept of concept which applies to itself or the universe of all classes that belong to themselves. To take such a hierarchy as the theory of concepts is an example of trying to eliminate the intensional paradoxes in an arbitrary manner.'' (Wang 1996, p. 278) The aim of this paper is to provide reasons for developing a type-free theory of classes and to indicate one way in which this might be done.
1 A propositional function is a function that yields a proposition when given an argument, and one might think of them as being abstracted from propositions which are primarily given. In particular, a propositional function is not to be confused with the predicate (i.e., formula with one free variable) expressing it.
2 Two important remarks are in order. First, while Frege and Russell were logicists, Peano was not (despite frequent claims to the contrary). See Kennedy (1963). Second, there are important differences between Frege's and Peano-Russell's notion of class. For example, while for Russell classes are ''composed of terms'', for Frege the elements of a class do not seem to be constitutive of it. See Lavine (1994, pp. 63-64). Our own account will remain neutral between the two.
3 See Russell's (1973) discussion of how to divide an infinity of boots into two classes.

The function of class talk
In a series of papers, Parsons (1982, 1983a) has argued that the introduction of the notion of class answers a general need to generalize on predicate places in our language (where 'predicate' means formula with one free variable). For example, consider the usual (first-order) principle of mathematical induction. This consists in all sentences of the form
$$\varphi(0) \land \forall n\,(\varphi(n) \to \varphi(n+1)) \to \forall n\,\varphi(n)$$
where $\varphi(x)$ is a predicate applying to numbers. The introduction of class terms, governed by the comprehension axiom scheme, allows us to replace the expression $\varphi(t)$ by the materially equivalent $t \in \{u \mid \varphi(u)\}$, where $\{u \mid \varphi(u)\}$ occupies an object position and is therefore open to (objectual) quantification. Hence, the notion of class allows us to finitely axiomatize the induction schema by the single statement
$$\forall y\,(0 \in y \land \forall n\,(n \in y \to n+1 \in y) \to \forall n\; n \in y)$$
Of course, in mathematical contexts the demand for generalising predicate places is met to a considerable extent by sets. But the notion of class allows us in addition to generalize every predicate in the language of set theory. This cannot be done by sets themselves because some predicates of the language of set theory, such as 'x is an ordinal', have extensions that are ''too big to be sets''. Examples of the use of classes in set theory include the formulation of certain schemata as single statements, such as the axiom schemes of separation and replacement, or reflection principles. 4 There are many other uses as well that are often eliminable but seem heuristically indispensable, for example, in connection with the study of elementary embeddings of the universe of sets into some inner model of ZFC (see Uzquiano 2003, section 2).
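The passage from the induction schema to a single class-quantified statement, described at the start of this section, can be spelled out as a short worked derivation. The sketch below assumes the comprehension biconditional in the simple form $t \in \{u \mid \varphi(u)\} \leftrightarrow \varphi(t)$ for number-theoretic predicates; the labels (Comp) and (Ind) are introduced here purely for reference and do not appear in the original.

```latex
\begin{align*}
\text{(Comp)}\quad
  & \forall t\,\bigl(t \in \{u \mid \varphi(u)\} \leftrightarrow \varphi(t)\bigr)\\
\text{(Ind)}\quad
  & \forall y\,\bigl(0 \in y \land \forall n\,(n \in y \to n+1 \in y)
      \to \forall n\; n \in y\bigr)\\
\text{(Ind) with } y := \{u \mid \varphi(u)\}\quad
  & 0 \in \{u \mid \varphi(u)\}
      \land \forall n\,\bigl(n \in \{u \mid \varphi(u)\}
      \to n+1 \in \{u \mid \varphi(u)\}\bigr)\\
  & \quad \to \forall n\; n \in \{u \mid \varphi(u)\}\\
\text{by (Comp)}\quad
  & \varphi(0) \land \forall n\,(\varphi(n) \to \varphi(n+1))
      \to \forall n\,\varphi(n)
\end{align*}
```

Each instance of the schema is thus recovered by instantiating the bound class variable with the class term for the predicate in question and then eliminating membership via comprehension.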
The function of class talk brings the notion of class into proximity with the notions of truth and truth-of (satisfaction). This was stressed by Parsons: just as the notion of class answers a need to generalize predicate places, so does the notion of truth answer a need to generalize sentence places (cf. Quine 1970). Moreover, Parsons (1983b) observes that the notion of satisfaction can be seen as a means to generalize predicate places as well, and that the usual predicative theories of satisfaction and classes are mutually interpretable. In Schindler (2015, 2017) it is shown that even impredicative theories of classes can be interpreted in (type-free) theories of satisfaction. Given the similar functions of the notions of truth and class, and the interpretability results just mentioned, this suggests that someone who already has a broadly deflationary understanding of the notions of truth and satisfaction should probably have a deflationary understanding of the notion of class as well.
While I find the idea of a deflationary account of classes intriguing, it is rather tangential to our present purposes and I won't pursue it any further here. But let me make the following remark. That classes were merely introduced to fulfill a particular function does not imply a nominalistic account of classes, at least if one subscribes to Quine's criterion of ontological commitment. On the contrary, classes are introduced so that we can objectually quantify over entities that would otherwise be in predicate position. However, a deflationary account of classes may help us argue that classes are ''thin'' objects in the sense of Linnebo (2012), where ''thin'' is taken in the sense that ''very little is required for their existence''. But this is a task for another paper.

Reasons for postulating type-free classes
The literature is full of interesting attempts to overcome the restrictions imposed by the theory of types. For an overview, I refer the reader to Cantini (2009). There are various reasons why one may be interested in a type-free theory of classes. For instance, Feferman (1977, 1984), Muller (2001) and others are interested in a set- or class-theoretic foundation of category theory. The problem here is that there are certain categories that are very natural to think about, such as the category of all sets, the category of all groups or the category of all categories, that cannot be formed in standard set theory. [For a recent overview, see Schulman (2008).] In what follows, I will list four more reasons. My own interests are mainly with the first and last of them.

Unrestricted quantification
There are certain contexts in logic and philosophy where we intend our quantifiers to range over absolutely everything whatsoever, or at least to be unrestricted, for example when we say that everything is self-identical or that the empty set has no members. Presented with a counterexample, we would not regard it open to the defendant to dismiss the counterexample on the ground that it is not in the domain of quantification. Not only does the possibility of unrestricted quantification seem plausible; its denial seems to border on the incoherent. If someone claims that one cannot quantify over everything, they seem to imply at the same time that there is something one cannot quantify over (Williamson 2003).
Footnote 4 continued: (collections) at all (as remarked in footnote 2, Frege did not think of extensions as being constituted by their members). Moreover, if type-free classes are collections, then set theory is certainly not about them.
Despite this, the coherence of unrestricted quantification has been doubted. For an overview of this debate, see Rayo and Uzquiano (2006). One objection is related to a principle that was first discussed (but not endorsed) by Cartwright (1994), and is nowadays known as the All-in-One Principle: the objects in a domain of discourse make up a set or some set-like object.
In modern semantics, for example, the domain of discourse is usually taken to be a set. However, according to standard set theory there is no universal set. This causes problems, in particular, when one tries to interpret set-theoretic talk itself. It seems natural to assume that when a set theorist talks about sets, she is (at least sometimes) talking about all sets. The proposal that we should trade in standard set theory for a theory that admits a universal set, such as Quine's New Foundations (Quine 1980), has not been met with enthusiasm, because this theory does not seem to embody any intuitive picture of sets.
One popular defence of unrestricted quantification makes use of the theory of types. On this account, interpretations are not (first-order) objects but higher-order entities. But this defence is not unproblematic; see Sect. 4 below and Linnebo (2006, pp. 154-156). Therefore, one may think it is preferable to treat the domain of quantification as a (first-order) object. As Linnebo points out, there is no reason to assume that this object needs to be a set. Hence, one possible solution to the problem of unrestricted quantification consists in replacing or supplementing set theory by a theory of classes that allows for a universal class. This proposal is not unproblematic itself, because theories with a universal class are incompatible with the axiom of separation, which seems necessary for semantics. I believe, however, that this problem can be dealt with and will return to it in Sect. 6.

Reduction of properties
Another area where classes might be useful is metaphysics: one might try to reduce properties or universals to classes. An influential account of this sort was given by Lewis (1986, chap. 1.5). However, there are good reasons, mainly in connection with the semantics of natural language (see below), to assume that properties need to be type-free. 5 Another motive for self-membered properties was suggested by Allen (2016, pp. 28-31). One classical problem confronting property theory is Bradley's Regress argument. This argument can be described as follows. Assume that a instantiates the universal or property F. This relation of instantiation is itself a universal, say $I_1$. Now, one might ask what connects a, F and $I_1$. This will be another instantiation relation, $I_2$. But then we may ask what connects a, F, $I_1$ and $I_2$. This will be another instantiation relation, $I_3$. And so on. Whether this regress is vicious or not is a hotly debated topic.
Whatever the outcome, one might try to simplify the hierarchy of instantiation relations required by the regress. There are at least two options: one could treat $I_1, I_2, I_3, \ldots$ as instances of a single multigrade relation $I'$ (where a relation is multigrade if the number of entities it relates can vary); or, one could treat $I_1, I_2, I_3, \ldots$ as so-called inexactly resembling instances of a single instantiation relation $I^*$ (where instances of a relation inexactly resemble each other if the resemblance is not exact similarity). Either way, $I'$ and $I^*$ need to be able to self-instantiate.

Natural language semantics
Classes (properties, concepts) have been applied in the analysis of natural language semantics (Montague 1974). However, there are many intuitively valid inferences that cannot be reconstructed in a typed framework due to the lack of self-exemplifying properties; this has motivated a good deal of research into type-free theories of properties (Bealer 1982; Menzel 1986, 1993; Orilia 1999; Chierchia and Turner 1988). For example, consider the inference from
1. Everything has the property of being self-identical
to
2. Socrates has the property of being self-identical
and the inference from (1) to
3. The property of being red has the property of being self-identical
The intuitive soundness of both inferences requires not only the existence of the property of being red but also that the quantifier in (1) ranges over both Socrates and the property in question. Hence, this inference cannot be captured in a typed language.

Reduction of mathematics
Last, but not least, one might be interested in a theory of classes (properties) for the very same reason for which Frege and Russell were originally drawn to it, namely, the ''reduction of mathematics to logic''. It has often been claimed that logicism is dead, but several reformed versions of logicism have emerged in recent decades. One should mention here, on the one hand, the works of Bealer (1982), Cocchiarella (1986), Landini (2004), and Orilia (1991), which are based on type-free theories of properties, and, on the other hand, the works of the Neo-Fregean school, which are based on abstraction principles. For a technical overview of the latter, see Burgess (2005). For philosophical discussion, see Hale and Wright (2001), Heck (2011a), and Cook (2009). The Neo-Fregean project originated with the discovery of Frege's Theorem, namely, that the second-order Dedekind-Peano axioms for arithmetic can be derived, in second-order logic, from what is known as ''Hume's Principle'', namely
$$\forall F\,\forall G\,(\#F = \#G \leftrightarrow \mathrm{Eq}(F, G))$$
This principle states that the number of Fs is identical to the number of Gs if and only if the Fs and Gs are equinumerous (i.e., can be put in a one-one correspondence). Now, one might be sceptical about the analyticity of Hume's Principle or whether class theory should be counted as part of logic. But such reductions may still be seen as answering Frege's question: How are numbers given to us? The problem of epistemic access to abstract objects has been emphasized by Benacerraf (1983). How can we have knowledge of abstract objects, such as numbers, when we have no causal interactions with such objects? Wright's idea is that an agent who is capable of second-order reasoning but has no knowledge of number theory could stumble upon Hume's Principle, say, in a dream and decide to use terms of the form $\#F$ in accordance with it. Then the claim is that the agent thereby acquires a concept of number without significant epistemological presupposition.
Similarly, one may claim that the concept of class (property) is acquired without significant epistemological presupposition. We ''nominalize'' predicates in order to generalize predicate places, and that's all there is to class (property) talk. However, if we want to reduce mathematics to a theory of classes, then type-free classes are called for, because we need to initiate a boot-strapping process in order to generate enough objects that can serve as proxies for mathematical objects.
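For reference, the equinumerosity relation Eq(F, G) that figures in Hume's Principle above is usually defined in pure second-order logic along the following lines. This is the standard textbook definition, given here as a reminder and not as anything specific to the present paper; the relation variable $S$ is a name chosen here to avoid a clash with the symbol R used later.

```latex
\[
\mathrm{Eq}(F,G) \;:\leftrightarrow\;
\exists S\,\Bigl[
  \forall x\,\bigl(Fx \to \exists! y\,(Gy \land Sxy)\bigr)
  \;\land\;
  \forall y\,\bigl(Gy \to \exists! x\,(Fx \land Sxy)\bigr)
\Bigr]
\]
```

On this definition the right-hand side of Hume's Principle contains no occurrence of the number operator, which is what allows the principle to be read as introducing terms of the form $\#F$.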

Ranges of significance
The purpose of the present section is to motivate a novel approach to the paradoxes that is loosely based on some remarks that Gödel made in (Gödel 1983) about Russell's theory of types. Recall that a propositional function is a function that yields a proposition when given an argument. According to Russell's theory, every propositional function $\varphi(x)$ has ''in addition to its range of truth a range of significance, i.e., a range within which x must lie if $\varphi(x)$ is to be a proposition at all, whether true or false'' (Russell 1903b, p. 523). More generally, the range of significance of a function is the collection of arguments for which said function is defined (i.e., has a value), and the range of significance of a propositional function is the collection of arguments for which the function yields a proposition. The idea of a range of significance need not be tied to the notion of propositional function. Gödel applies it to concepts, 6 but of course one can also apply it to predicates.
There are several ways in which the notion of a range of significance can be interpreted on a pre-theoretical level. The literature on philosophy of language provides many examples of grammatically well-formed sentences that, for some reason or other, do not express a proposition or lack a definite truth value. Many of these examples may be taken as instances of an object's being a singular point of the relevant predicate or propositional function. For example, one may think that in the case of a category mistake (e.g., ''The number 2 is green''), the object denoted by the name lies outside the range of significance of the predicate. Of course, one may simply treat such a sentence as false and its negation as true (perhaps for reasons of technical simplicity, e.g., in order to stay classical). On a more narrow understanding, one may think that in all and only those cases where the application of a predicate to a name yields a paradoxical sentence (e.g., ''This sentence is false''), the object denoted by the name lies outside the range of significance of the predicate.
As Gödel remarks, the idea that every propositional function has a range of significance that need not exhaust the entire universe ''brings in a new idea for the solution of the paradoxes, especially suited to their intensional form'', which ''consists in blaming the paradoxes not on the axiom that every propositional function defines a concept or class, but on the assumption that every concept gives a meaningful proposition, if asserted for any arbitrary object or objects as arguments'' (1983, p. 466). He adds that ''[t]he obvious objection that every concept can be extended to all arguments, by defining another one which gives a false proposition whenever the original one was meaningless, can easily be dealt with by pointing out that the concept ''meaningfully applicable'' need not itself be always meaningfully applicable'' (otherwise Grelling's paradox would ensue).
For reasons that I do not want to enter into here, Russell thought that ranges of significance form types such that whenever a propositional function is significant for some argument x, and y belongs to the same type as x, then that function is significant for the argument y as well. This means that
1. whenever a propositional function is significant for some argument x, its range of significance is identical with the type of x;
2. sameness of type is an equivalence relation and, therefore, types are mutually exclusive; and
3. if two functions are both significant for some argument x, then they must have exactly the same range of significance.
The types are then divided into orders (yielding the ramified theory of types), but this further complication need not interest us here. Unfortunately, the theory of types suffers from expressive limitations that have often been pointed out in the literature. For example, Gödel remarks that ''[w]hat makes the above principle particularly suspect, however, is that its very assumption makes its formulation as a meaningful proposition impossible, because x and y must then be confined to definite ranges of significance which are either the same or different, and in both cases the statement does not express the principle or even part of it.'' (Gödel 1983, p. 466)
It should be observed that Russell's idea that every propositional function has a range of significance is logically independent of the assumption that the ranges of significance form types. One might therefore consider the possibility of construing classes based on the first but without the second assumption. In the remainder of this paper, I wish to develop the theory of classes in this direction. This approach is inspired by Gödel's remark that:
It is not impossible that the idea of limited ranges of significance could be carried out without the above restrictive principle [i.e. that the ranges of significance form types]. It might even turn out that it is possible to assume every concept to be significant everywhere except for certain ''singular points'' or ''limiting points'', so that the paradoxes appear as something analogous to dividing by zero. Such a system would be most satisfactory in the following respect: our logical intuitions would then remain correct up to certain minor corrections, i.e., they could then be considered to give an essentially correct, only somewhat 'blurred' picture of the real state of affairs. Unfortunately the attempts made in this direction have failed so far; on the other hand, the impossibility of this scheme has not been proved either, in spite of the strong inconsistency results of Kleene and Rosser. (Gödel 1983, pp. 466-467)
The following general picture emerges. Let $U$ be the universe of all objects, and $\varphi(x)$ be some propositional function. $\varphi(x)$ has a range of significance, $R(\varphi)$, which is a subset of $U$. If $\varphi(x)$ has singular points, then $R(\varphi)$ is a proper subset of $U$. For every object a in $R(\varphi)$, $\varphi(a)$ is meaningful, that is, true or false. $\varphi(x)$ thereby determines two classes, the extension $\{a \in R(\varphi) \mid \varphi(a)\}$ and anti-extension $\{a \in R(\varphi) \mid \neg\varphi(a)\}$ of $\varphi(x)$, whose union coincides with $R(\varphi)$, that is,
$$\{a \in R(\varphi) \mid \varphi(a)\} \cup \{a \in R(\varphi) \mid \neg\varphi(a)\} = R(\varphi)$$
Gödel mentions Church's (inconsistent) system (Church 1932) as an interesting attempt to carry out these ideas. Another possibility is to use some non-classical logic, such as the Weak or Strong Kleene logics. This route faces the notorious problem that the material conditional is not well behaved in these logics. One might therefore consider the following alternative route, which retains classical logic.
Again, let $U$ be the universe of all objects, and $\varphi(x)$ be some propositional function. As before, $\varphi(x)$ has a range of significance, $R(\varphi)$, which is a subset of $U$. If $\varphi(x)$ has singular points, then $R(\varphi)$ is a proper subset of $U$. This time, however, we treat $\varphi(a)$ as meaningful (i.e., true or false) for every object a in $U$. As before, $\varphi(x)$ determines two classes, the extension $\{a \in R(\varphi) \mid \varphi(a)\}$ and anti-extension $\{a \in R(\varphi) \mid \neg\varphi(a)\}$ of $\varphi(x)$, whose union coincides with $R(\varphi)$. The difference from the previous picture is that the classes $\{a \in R(\varphi) \mid \varphi(a)\}$ and $\{a \in R(\varphi) \mid \neg\varphi(a)\}$ may ''underspill'': if a is an object outside the range of $\varphi(x)$, then either $\varphi(a)$ or $\neg\varphi(a)$ will be true; but since a is a singularity of $\varphi(x)$, it is neither an element of $\{a \in R(\varphi) \mid \varphi(a)\}$ nor of $\{a \in R(\varphi) \mid \neg\varphi(a)\}$. It is the latter route that will be followed in the remainder of this paper. 7
For technical convenience, I will modify the picture above in two ways. First, I will treat classes as extensions of predicates (i.e., formulas with one free variable) rather than propositional functions (or concepts) because I am not aware of any suitable theory of propositional functions (concepts). Second, instead of assigning ranges of significance to predicates, I will directly assign them to classes. This saves us the trouble of introducing names for predicates and function symbols for syntactic operations on predicates. From a technical point of view, this does not seem to make too much of a difference, because to every predicate $\varphi(x)$ there corresponds a unique class abstract $\{x \mid \varphi(x)\}$. The class abstract can therefore serve as some form of Gödel code for the predicate. In the informal presentation, I will nevertheless talk frequently (as a form of shorthand) of the range of significance of $\varphi$ instead of that of $\{x \mid \varphi(x)\}$.
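The two pictures sketched in this section can be set side by side in symbols. The following is only a summary under stated assumptions: the abbreviations $\mathrm{Ext}$ and $\mathrm{Anti}$ are labels introduced here for the extension and anti-extension and are not part of the official notation.

```latex
\begin{align*}
\mathrm{Ext}(\varphi) &:= \{a \in R(\varphi) \mid \varphi(a)\}, \qquad
\mathrm{Anti}(\varphi) := \{a \in R(\varphi) \mid \neg\varphi(a)\}\\[4pt]
\text{both pictures:}\quad
  & R(\varphi) \subseteq U, \qquad
    \mathrm{Ext}(\varphi) \cup \mathrm{Anti}(\varphi) = R(\varphi)\\[4pt]
\text{first picture:}\quad
  & a \notin R(\varphi) \;\Rightarrow\;
    \varphi(a) \text{ is neither true nor false}\\[4pt]
\text{second picture:}\quad
  & a \notin R(\varphi) \;\Rightarrow\;
    \bigl(\varphi(a) \lor \neg\varphi(a)\bigr)
    \text{ holds, yet } a \notin \mathrm{Ext}(\varphi)
    \text{ and } a \notin \mathrm{Anti}(\varphi)
\end{align*}
```

The last line is the ''underspill'' phenomenon: classical bivalence is kept for every object in $U$, while membership in the two classes is confined to the range of significance.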

A type-free theory of classes
The language of the theory that we are going to present is an ordinary one-sorted first-order language with identity. It contains a binary relation symbol, $\in$, for membership in a class. One of the expressive limitations of the theory of types is that it cannot express that some object is not in the range of significance of some propositional function (or predicate). In order not to fall prey to the same objection, we will introduce a primitive binary relation symbol, $R$, into our language. We may read $xRy$ as ''x is in the range of significance of y'' or ''x is not a singular point (singularity) of y''. Let total(x) abbreviate the formula $\forall z\; zRx$. If x is total, then x has an unrestricted range of significance (i.e., has no singular points). According to the theory that we are going to present, every predicate determines a class. We will therefore assume that our language contains a class term $\{u \mid \varphi\}$ for every formula $\varphi$ containing the free variable u. Since we are aiming at a type-free system, $\varphi$ is allowed to contain $\in$, $R$ and other class terms. Moreover, it may contain other free variables as parameters.
A remark on notation. I will use $\varphi, \psi$ for well-formed formulas, u, v, x, y, z for variables, and s, t for arbitrary terms. Some special symbols will be introduced as we go along. $\varphi(t/x)$ denotes the result of substituting t for all free occurrences of x in $\varphi$. Instead of $\neg\, s \in t$ we will also write $s \notin t$. The usual conventions for the use of brackets apply.
The axioms can be divided into three groups. The first group consists of 'conceptual' axioms that describe the general relation between a class and its range of significance. These axioms are directly suggested by the picture provided in the previous section (i.e., that every predicate, together with its range of significance, determines an extension and anti-extension in the indicated way). The second group of axioms describes the relation between the range of significance of a predicate and the logical form of that predicate. They are based on the natural assumption that classes/ranges of significance should be closed under the algebraic operations corresponding to the logical operations on predicates. The third group contains only one axiom expressing the widespread idea that the paradoxes are to be blamed on some form of circularity or non-well-foundedness, a view that goes back at least to the days of Russell. These axioms belong to the pure theory of classes, i.e., the part that deals with classes of classes; at the end of this section, we will discuss an axiom for the applied theory of classes, i.e., classes of individuals or urelements.
Our first and most basic axiom scheme is a relativized form of naïve comprehension and follows immediately from the picture presented in the previous section. The axiom states that whenever x is in the range of significance of the predicate $\varphi$ (or equivalently: whenever x is not a singularity of $\varphi$), then x is an element of the class $\{u \mid \varphi\}$ if and only if $\varphi(x/u)$ holds. That is:
Axiom Class Comprehension $\forall x\,\bigl(xR\{u \mid \varphi\} \to (x \in \{u \mid \varphi\} \leftrightarrow \varphi(x/u))\bigr)$
where $\varphi$ is any formula and x is free for u in $\varphi$.
Notice that $\varphi$ may contain free variables besides u. These should be bound by universal quantifiers. A similar remark applies to the other axioms below.
The axiom scheme is completely general and topic-neutral. We can insert any formula in place of $\varphi$, including the predicates $u = u$, $u \notin u$ and $uRu$.
It is easily seen that the Axiom of Class Comprehension is consistent. Since it is a universally quantified conditional, we can make it vacuously true. In this framework, Russell's paradox is simply transformed into the theorem that the Russell class $r := \{u \mid u \notin u\}$ does not lie in its own range of significance: Carrying out the usual reasoning, we convince ourselves that $rRr \to (r \in r \leftrightarrow r \notin r)$, from which we simply conclude that $\neg\, rRr$. No contradiction ensues.
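The reasoning about the Russell class can be laid out step by step. The sketch below assumes that the Axiom of Class Comprehension has the form its prose statement suggests, namely $xR\{u \mid \varphi\} \to (x \in \{u \mid \varphi\} \leftrightarrow \varphi(x/u))$; the step numbers are labels added here for reference.

```latex
\begin{align*}
&\text{(0)}\quad r := \{u \mid u \notin u\}
  && \text{definition}\\
&\text{(1)}\quad rRr \to \bigl(r \in r \leftrightarrow r \notin r\bigr)
  && \text{Class Comprehension with } \varphi := u \notin u,\ x := r\\
&\text{(2)}\quad \neg\bigl(r \in r \leftrightarrow r \notin r\bigr)
  && \text{classical propositional logic}\\
&\text{(3)}\quad \neg\, rRr
  && \text{from (1) and (2) by modus tollens}
\end{align*}
```

The antecedent $rRr$ is what absorbs the contradiction: the paradoxical biconditional is only ever asserted conditionally, so the derivation ends in a theorem about the range of significance of $r$ and not in an inconsistency.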
Our second axiom, which also follows from the picture provided in the previous section, states that if x is a singular point of y, then x is not an element of y:
Axiom Singularity $\forall x\,\forall y\,(\neg\, xRy \to x \notin y)$
In conjunction with the Axiom of Class Comprehension, the Axiom of Singularity implies:
$$\forall x\,\bigl(x \in \{u \mid \varphi\} \to \varphi(x/u)\bigr)$$
This is a very useful theorem. If we know that x is an element of the class y, then we can deduce that x satisfies the defining condition of y. Moreover, this theorem rules out that some classes ''overspill'': it is not possible that the class $\{u \mid \varphi\}$ contains some objects that are not $\varphi$s. We adopt the following version of extensionality, according to which two classes are identical if they have the same range of significance and the same members. (The other direction follows from the logical laws of identity.)
Axiom Extensionality $\forall x\,\forall y\,\bigl(R(x) = R(y) \land \forall z\,(z \in x \leftrightarrow z \in y) \to x = y\bigr)$
Here, $R(x) = R(y)$ is shorthand for $\forall z\,(zRx \leftrightarrow zRy)$. The reason for imposing this condition is as follows. Assume it is possible to define a class w such that $w = \{u \mid u \notin u \land u = w\}$. (Such self-referential classes cannot be defined in the present formalism, but one may muse about extensions of the system in which this is possible.) It is easy to prove, using the first two axioms, that $w \notin w$. Hence, w has no members. Now assume that the class $\emptyset := \{u \mid u \neq u\}$ has an unrestricted range of significance. (This will actually follow from our other axioms.) Hence, if we identified classes with the same members, we would get that $\emptyset = w$. But then w would have an unrestricted range of significance as well, which we have just ruled out. (It should be noted that, as things stand, the ordinary axiom of extensionality is consistent with our theory as well.) The Axiom of Extensionality (in either form) will not be used in any of the theorems below. The reason to include it, apart from conceptual considerations, is merely to highlight that it can be included without leading to triviality.
This seems noteworthy because there are well-known problems for adding axioms of extensionality to non-classical logics that contain naïve comprehension (Field 2008, pp. 296-298).
It is perhaps interesting to remark that, given the first three axioms, we can characterize classes with an abstraction principle (scheme) stating that the class of $\varphi$s is identical to the class of $\psi$s if and only if $\varphi$ and $\psi$ have the same range of significance and are satisfied by exactly the same objects. This abstraction principle is a theorem of our theory. In contrast to ordinary abstraction principles, the class terms in this scheme also appear on the right-hand side of the biconditional. Of course, this is a side-effect of my decision to use classes, instead of predicates, as the second relatum of the R relation. If predicates were used instead, the abstraction terms would only occur on the left-hand side of the abstraction principle.

Our next group of axioms deals with the relation between the range of significance of a predicate and the logical form of that predicate. They are based on the natural assumption that classes/ranges of significance should be closed under the algebraic operations corresponding to the logical operations on predicates. For example, if the number 2 lies in the range of significance of "is green", then it should lie within the range of "is not green" as well; if Aristotle lies in the range of significance of the predicates "is Greek" and "is a philosopher", then Aristotle should also lie within the range of "is a Greek philosopher".

Axiom Negation, Connectives
We will adopt similar axioms for atomic predicates. For example, consider the atomic predicate $u \in t$, where t denotes some class. Of what objects should we say that they lie in the range of significance of $u \in t$? Intuitively, t simply is $\{u \mid u \in t\}$. Hence, the following seems natural: if x is an object that already lies in the range of significance of t, then x lies in the range of $u \in t$ as well. 8

Axiom Membership, Identity
The axioms introduced so far are compatible with every predicate having an empty range of significance. (Note that all of them are universally quantified conditionals.) They are therefore trivially consistent. In order to get our theory off the ground, we need some axioms that ensure that some predicates have non-empty ranges. The following axiom stipulates that the (class determined by the) predicate $u = u$ has an unrestricted range of significance. Recall that total(x) abbreviates the formula $\forall z\, zRx$.

Axiom Self-Identity $\mathrm{total}(\{u \mid u = u\})$
The reason for postulating this axiom is clear, given our motives. We want to design a theory in which models with a universal domain are available. Instead of adopting the Axiom of Self-Identity, we could stipulate that the empty class $\emptyset := \{u \mid u \neq u\}$ is total. Given that ranges of significance are preserved under negation, it does not matter which one we choose. The totality of one follows from the totality of the other.
I find the Axiom of Self-Identity fairly innocuous. First, the predicate $x = x$ (just as any other tautological predicate) is stable (i.e., its extension is fixed on every interpretation of the non-logical primitives). Second, the predicate $x = x$ does not contain the membership symbol $\in$, and should therefore be admissible. One might compare this line of argument to how the T-schema is restricted in formal theories of truth. The sentences without the truth predicate are always assumed to be admissible instances of the T-schema.
Before presenting the last axiom of the pure theory of classes, let me mention some straightforward consequences of the axioms introduced so far. I hope this will help the reader to get a better feeling for the theory.
First, observe that, as desired, the universal class $V := \{u \mid u = u\}$ contains every class, including itself and the Russell class r. For by the Axiom of Class Comprehension, we have $xRV \to (x \in V \leftrightarrow x = x)$. By the Axiom of Self-Identity, we know that $xRV$ for every x. Hence, $x \in V$ for every x.
Second, since the Russell class r is contained in the universal class but not vice versa (as the reader may easily verify), we can conclude (by usual laws of identity) that $V \neq r$. This is in stark contrast to the traditional theories of "proper classes" (i.e., theories of the Morse-Kelley variety), which do not distinguish the two.
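The verification left to the reader can be sketched as follows. The derivation uses only the totality of V and the theorem scheme (Out), $\forall x\,(x \in \{u \mid \varphi\} \to \varphi(x/u))$, as stated in Appendix 1; the numbering of the steps is mine.

```latex
% Sketch (requires amsmath): r is a member of V, V is not a member of r, hence V \neq r.
\begin{align*}
1.\;& r \in V \ \wedge\ V \in V
    && \text{every class is a member of } V\\
2.\;& V \in r \to V \notin V
    && \text{(Out) with } \varphi := u \notin u,\ x := V\\
3.\;& V \notin r
    && \text{from 1 and 2}\\
4.\;& V = r \to V \in r
    && \text{substitution of identicals in } r \in V\\
5.\;& V \neq r
    && \text{from 3 and 4}
\end{align*}
```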
For illustrative purposes, I show (3). The other items are proved in a similar fashion. By the Axiom of Membership and the totality of $t_1$, $\forall x\,(xR\{u \mid u \in t_1\})$. Similarly, we have $\forall x\,(xR\{u \mid u \in t_2\})$. Therefore, by the Axiom of Connectives, we have $\forall x\,(xR\{u \mid u \in t_1 \vee u \in t_2\})$, which means that $t_1 \cup t_2$ is total.
Using item (7), we can successively generate the finite ordinals in the usual von Neumann style. That is, we let $0 := \emptyset$, $1 := \{0\}$, $2 := \{0, 1\}$, $3 := \{0, 1, 2\}$ and so on. It is easily seen that all these classes are total. However, we are not yet able to collect these classes into one.
By item (8), it follows that total classes are closed under adjunction. This means that our theory relatively interprets adjunctive set theory (i.e., existence of the empty set plus closure under adjunction), which in turn interprets Robinson arithmetic (roughly, Peano arithmetic without the induction scheme). 9 It should also be noted that our theory interprets the system known as NF$_2$, whose axioms are extensionality, existence of the empty set, and closure under complements, intersections, and singletons (Forster 2001).
This brings us to the most interesting axiom. It expresses the widespread idea that paradoxes are to be blamed on some form of circularity or non-wellfoundedness. More precisely, the axiom states that if x is a singularity of some class, then x itself has singular points:

Axiom Circularity $\forall x\,(\exists y\,\neg xRy \to \exists z\,\neg zRx)$

For a typical example, consider the Russell class. The Russell class has a singular point (namely, the Russell class itself), and that singular point has a singular point itself (namely, the Russell class). And similarly for the Burali-Forti paradox: the class of all ordinals is a singular point of (the class defined by) the predicate 'x is an ordinal'. We need not assume that all paradoxes stem from such a simple type of circularity. Perhaps there are classes x, y such that x is a singular point of y and vice versa. We may also imagine an infinitely descending chain of classes $x_1, x_2, x_3, \ldots$ such that every class in that sequence is a singular point of its immediate predecessor. 10

The Axiom of Circularity is logically equivalent to the claim

$\forall x\,(\forall z\, zRx \to \forall y\, xRy)$

In words: whenever x is total, then x lies in the range of significance of every class (i.e., x is not a singularity of any class). Since there are total classes (in fact, infinitely many: our proxies for the natural numbers $0, 1, 2, \ldots$ are all total), this means that no predicate has an empty range of significance (in fact, every predicate has infinitely many objects in its range). Thus I believe that the Axiom of Circularity actually captures to some extent Gödel's idea that we may assume every predicate to be significant for most arguments.

The Axiom of Circularity is quite remarkable. It justifies impredicative class formation in the sense that it entitles us (in conjunction with the first two axioms) to form classes of total classes at will.
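The equivalence between the two formulations is a matter of contraposition and quantifier duality. The sketch below assumes that the Axiom of Circularity has the form $\forall x\,(\exists y\,\neg xRy \to \exists z\,\neg zRx)$, which is the form verified for the model in Appendix 2 (Proposition 7.6).

```latex
% Sketch (requires amsmath): the Axiom of Circularity and its "totality" reformulation.
\begin{align*}
      & \forall x\,(\exists y\,\neg xRy \to \exists z\,\neg zRx)\\
\iff\;& \forall x\,(\neg\exists z\,\neg zRx \to \neg\exists y\,\neg xRy)
      && \text{contraposition}\\
\iff\;& \forall x\,(\forall z\, zRx \to \forall y\, xRy)
      && \text{quantifier duality}\\
\iff\;& \forall x\,(\mathrm{total}(x) \to \forall y\, xRy)
      && \text{definition of total}
\end{align*}
```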
For every predicate $\varphi$, the following is a theorem of our theory:

$\forall x\,(\mathrm{total}(x) \to (x \in \{u \mid \varphi\} \leftrightarrow \varphi(x/u)))$ (SOC)

from which the above claim follows by existential weakening. The Axiom of Circularity boosts the mathematical strength of our theory significantly. It allows us, using suitable definitions, to derive the second-order Dedekind-Peano axioms for arithmetic within our theory. Let us define

$\omega := \{x \mid \mathrm{total}(x) \wedge \forall y\,(0 \in y \wedge \forall z \in y\,(S(z) \in y) \to x \in y)\}$,

where 0 and S(z) are defined as above. This states that $\omega$ is the class of total classes that are contained in every inductive class, where a class is inductive if it contains 0 and is closed under successor. This is the usual von Neumann definition of natural numbers; we have only added the condition that the natural numbers must be total.

9 A proof of that result (i.e., that Robinson arithmetic is interpretable in adjunctive set theory) can be found in Visser (2009), who also gives a short history of the result.

10 It would therefore be more appropriate to call the Axiom of Circularity 'the Axiom of Non-Well-Foundedness'. The reason I did not choose this name is two-fold. First, the name 'Axiom of Non-Well-Foundedness' could easily lead to a confusion with the Axiom of Anti-Foundation in non-well-founded set theories. Second, non-well-foundedness can be seen as some form of unfolding of circularity.
For illustrative purposes, let us show that $\omega$ actually contains all natural numbers (as defined above). First, we have already seen that 0 (the empty class) is total. Trivially, 0 is contained in every inductive class. Hence, 0 satisfies the defining condition for being a natural number. But then the scheme (SOC) above allows us to conclude that 0 is indeed an element of $\omega$. Next, let us show that $\omega$ is closed under successors. So let $x \in \omega$. Then by (Out) we know that x satisfies the defining condition of $\omega$. Hence x is total and contained in every inductive class. But then it trivially follows that its successor, S(x), must also be contained in every inductive class. Moreover, we have seen above that whenever x is total, so is its successor. Hence, S(x) satisfies the defining condition of being a natural number, and since it is total we can conclude that $S(x) \in \omega$, by another application of (SOC). A complete derivation of the second-order Dedekind-Peano axioms can be found in "Appendix 1".
The theory presented above is consistent. A proof is given in ''Appendix 2''. It needs to be stressed that the axioms presented here are only basic axioms that can and should be extended by additional ones that increase the expressive (mathematical) power of the theory even more.
So far we have only considered classes of classes, that is, pure classes. One of the main motives for developing a theory of classes lies in its application to some given domain. So let us assume that our language contains additional predicates applying to objects other than classes, such as people, stones, numbers or sets, and let us introduce a distinguished predicate, U (for urelements), applying to these objects. Then we may adopt the following axiom, which states that every urelement is in the range of significance of every class. 11

Axiom Urelements $\forall x\,(Ux \to \forall y\, xRy)$

Now let T be some first-order theory not containing the symbols $\in$, R, U nor any of the abstraction terms. If T is formulated in the language of set theory, we can simply work with two copies of $\in$. Let $T^U$ be the theory resulting from T by relativizing all quantifiers in the axioms of T to the predicate U. Moreover, if T contains axiom schemata (such as induction or replacement), extend these so that $\in$, R, U and the abstraction terms are allowed to occur in the instances of the schemata. Then it is easily seen that $T^U$, conjoined with our axioms for classes, implies that

$\exists y\,\forall x\,(Ux \to (x \in y \leftrightarrow \varphi))$

(This follows simply from the Axioms of Urelements and Class Comprehension.) Hence, $T^U$ together with the theory of classes interprets the second-order version of T.
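The derivation of comprehension over the urelements from the two axioms just mentioned can be sketched as follows. The form of Class Comprehension used in step 2 is an assumption on my part, read off from the instances that appear in the text; y is assumed not to occur free in $\varphi$.

```latex
% Sketch (requires amsmath): second-order comprehension over the urelements.
\begin{align*}
1.\;& Ux \to xR\{u \mid \varphi\}
    && \text{Urelements with } y := \{u \mid \varphi\}\\
2.\;& xR\{u \mid \varphi\} \to (x \in \{u \mid \varphi\} \leftrightarrow \varphi(x/u))
    && \text{Class Comprehension}\\
3.\;& Ux \to (x \in \{u \mid \varphi\} \leftrightarrow \varphi(x/u))
    && \text{from 1 and 2}\\
4.\;& \forall x\,(Ux \to (x \in \{u \mid \varphi\} \leftrightarrow \varphi(x/u)))
    && \text{generalization}\\
5.\;& \exists y\,\forall x\,(Ux \to (x \in y \leftrightarrow \varphi(x/u)))
    && \text{existential generalization}
\end{align*}
```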
It is possible to go further. For example, we may add an axiom to the effect that whenever x is a class containing only urelements, then x is in the range of every class. This would allow us to interpret the third-order version of T. This process can be iterated. We can add an axiom to the effect that whenever x is a class of classes of urelements, then x is in the range of every class, which gives us fourth-order T, and so on for every finite order. Hence, we can embed the full type hierarchy over T into our theory of classes.
That this can be done in a type-free theory of classes is something I take as a minimal adequacy result. We have claimed that a type-free theory needs to be developed because of certain expressive limitations of type theory. But then the replacing theory should be at least as expressive as type theory.
There are some theories T that are inconsistent with full second-order comprehension, e.g., abstraction principles for ordinals conceived as sui generis objects (Florio and Leach-Krouse 2017). In such cases, one can weaken the Axiom of Urelements in several ways, if desired. For example, let $P_1, P_2, \ldots$ be the predicates of T. Then one could replace the Axiom of Urelements with the following schema:

$\forall x\,(Ux \to xR\{u \mid P_i u\})$
In this case, one only obtains comprehension for predicates that are first-order definable in T, that is, a predicative comprehension principle.

Unrestricted quantification
Following a suggestion of Linnebo (2006), I have mentioned that one way to approach the problem of unrestricted quantification is by adopting a theory with a universal class or property. An objection that is frequently raised against such proposals is that theories with universal classes are incompatible with the axiom scheme of separation,

$\forall x\,\exists y\,\forall z\,(z \in y \leftrightarrow z \in x \wedge \varphi)$

This axiom states that given a class x, we can collect all members of x that satisfy some property $\varphi$ into one class y. But if x is the universal class, and $\varphi$ is the Russell predicate $z \notin z$, then y cannot exist on pain of contradiction. Hence, if we admit a universal class we lose separation. However, it seems that we need separation in order to comply with the following semantic principle:

For any domain of interpretation d and any predicate $\varphi(z)$ in the language, it should be possible to specify an interpretation such that for all individuals $x \in d$, a predicate letter 'P' applies to x if and only if $\varphi(x/z)$.
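The clash between separation and a universal class is the familiar Russell argument. As a sketch, write V for a universal class (so that $\forall z\,(z \in V)$) and $y_0$ for a witness of the relevant separation instance; both names are introduced here for illustration only.

```latex
% Sketch (requires amsmath): separation plus a universal class V is inconsistent.
\begin{align*}
1.\;& \exists y\,\forall z\,(z \in y \leftrightarrow z \in V \wedge z \notin z)
    && \text{separation with } x := V,\ \varphi := z \notin z\\
2.\;& \forall z\,(z \in y_0 \leftrightarrow z \notin z)
    && \text{witness } y_0 \text{ for 1, using } \forall z\,(z \in V)\\
3.\;& y_0 \in y_0 \leftrightarrow y_0 \notin y_0
    && \text{instantiate } z := y_0\\
4.\;& \bot
    && \text{classical propositional logic}
\end{align*}
```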
The problem emerges because in a model-theoretic semantics the semantic value of 'P' needs to be an object. In order that the above principle is satisfied, we need to assign the class $\{z \in d \mid \varphi(z)\}$ to 'P'. And this in turn requires the axiom of separation.
I believe, however, that the quasi-Gödelian strategy adopted in the present paper allows us to formulate a satisfactory response to this objection. 12 For suppose that the domain of our interpretation consists of a class d, and let 'P' be a predicate symbol that we want to interpret by a predicate in our language. If we take the idea that every predicate (concept, propositional function) has a range of significance seriously, then it seems reasonable to demand that a predicate be chosen that is significant for all elements in d. I am not sure whether it is plausible to insist that it must be possible to interpret 'P' by some predicate that is not significant for some objects in the domain of interpretation. Indeed, the type-theoretic defense can be seen as a special case of this. After all, Russell's theory diverges from Gödel's only insofar as a further condition is imposed on the ranges of significance, namely, that they form types. (Recall our discussion in Sect. 4.) Hence we could replace the above semantic principle by the following one:

For any domain of interpretation d and any predicate $\varphi(z)$ that is significant for all objects in d, it should be possible to specify an interpretation such that for all objects $x \in d$, a predicate letter 'P' applies to x if and only if $\varphi(x/z)$.
This demand can be met in the theory of classes presented in this paper. For if $\varphi(z)$ is a predicate and d is a class such that all members of d are in the range of (the class determined by) $\varphi(z)$, then for all $x \in d$ we will have $x \in \{z \in d \mid \varphi(z)\}$ if and only if $\varphi(x/z)$. Hence we can assign the class $\{z \in d \mid \varphi(z)\}$ as extension to the predicate letter 'P'.
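The reasoning behind this claim can be sketched step by step. The sketch treats $\{z \in d \mid \varphi(z)\}$ as an abbreviation of $\{z \mid z \in d \wedge \varphi(z)\}$ and assumes that the Axioms of Membership and Connectives work as described informally in Sect. 5 (and as they are applied in Appendix 1); these readings are assumptions, not quotations of the axioms.

```latex
% Sketch (requires amsmath). Hypotheses: x \in d and x R {z | phi(z)}.
\begin{align*}
1.\;& xRd
    && \text{contrapositive of Singularity, from } x \in d\\
2.\;& xR\{z \mid z \in d\}
    && \text{Membership, from 1}\\
3.\;& xR\{z \mid z \in d \wedge \varphi(z)\}
    && \text{Connectives, from 2 and } xR\{z \mid \varphi(z)\}\\
4.\;& x \in \{z \mid z \in d \wedge \varphi(z)\} \leftrightarrow (x \in d \wedge \varphi(x/z))
    && \text{Class Comprehension, from 3}\\
5.\;& x \in \{z \mid z \in d \wedge \varphi(z)\} \leftrightarrow \varphi(x/z)
    && \text{from 4 and } x \in d
\end{align*}
```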
How severe is the restriction imposed by the suggested principle? One may argue about this, but I do not think that it is too severe. Notice, for instance, that whenever the universe d contains only urelements, then any predicate in the language can be assigned as interpretation to 'P'. This still holds true if d contains, in addition, total classes. Only if d contains classes that are non-total are we not free to choose arbitrary predicates. (For example, if d contains the Russell class r, then we cannot interpret 'P' by the Russell predicate $u \notin u$.) However, we can still assign to 'P' any predicate that is total, such as the predicate $x = x$.

Conclusion
I have listed a number of desiderata for a type-free theory of classes. Let us now see how the theory proposed in this paper fares with respect to them. The main function of class talk is that it enables us to generalize on predicate places in our language. Second- and higher-order quantifiers provide a means to do so directly. Our class theory can be used for the same purpose. For example, if our class theory is applied to set theory, then we can express the axioms of separation and replacement by single sentences. In addition, our theory allows us to generalize predicate places that cannot be generalized in type theory.
Another possible application for a type-free theory of classes is to serve as a foundation of category theory. In order to be applied like this, we need at least be able to form the class of all sets and the class of all functions between sets, as well as the power class of the class of all sets and the class of all functions between these classes (see Muller 2001). This is possible if our class theory is combined with set theory and the Axiom of Urelements is iterated in the way described. How successful such a class-theoretic foundation of category theory is from a philosophical point of view is, of course, a difficult question which demands further discussion.
The problem of unrestricted quantification was cited as a main motive for a type-free theory of classes. In the previous section, I have argued that the idea of limited ranges of significance provides a response to one of the main objections against a universal domain, namely, the problem of separation.
What about the reduction of properties (universals) to classes and the corresponding analysis of natural language semantics? Obviously, this cannot be answered unless we are given a formal theory of properties (universals). But I think that the prospects here are not too bad either (assuming, of course, that one can deal effectively with the problem that classes seem to be more coarse-grained than properties, perhaps by following David Lewis' strategy). For instance, consider the property of being a property, which applies to all properties including itself. This could be modelled by the class of all classes. There are good reasons to believe that properties are closed under the algebraic operations corresponding to the logical operations of negation, conjunction, etc. These operations can be performed on classes as well. Moreover, consider again the inference from 1. Everything has the property of being self-identical to and the inference from (1) to 3. The property of being red has the property of being self-identical. In our theory, both inferences can be carried out if talk of properties is appropriately replaced by talk of classes.
Finally, we have seen that the pure theory of classes allows for an interpretation of the second-order Dedekind-Peano axioms of arithmetic (i.e., $Z_2$). If the theory is extended with an appropriate axiom for forming total power classes of total classes, it is even possible to interpret $Z_\omega$, that is, the union of n-th order arithmetic for every $n \in \omega$, which is roughly equivalent to Zermelo set theory (Zermelo-Fraenkel set theory without replacement and foundation). What this means for the philosophy of mathematics is an altogether different question. I have indicated that the naïve concept of class is acquired without significant epistemological presupposition, and therefore might be used in a project similar to the Neo-Fregean one. But the paradoxes force us to regiment the notion of class, and whether the regimentation proposed here preserves the epistemological status of the naïve notion is clearly in need of further discussion. This, however, is left for another occasion.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix 1: Derivation of the Dedekind-Peano axioms
In this appendix we show that, given suitable definitions, the second-order Dedekind-Peano axioms can be derived within the pure theory of classes. The proof makes heavy use of the following theorem schemata, namely

$\forall x\,(\mathrm{total}(x) \to (x \in \{u \mid \varphi\} \leftrightarrow \varphi(x/u)))$ (SOC)

and

$\forall x\,(x \in \{u \mid \varphi\} \to \varphi(x/u))$, (Out)

both of which were discussed in Sect. 5.
Definition 7.1 (Natural numbers.) Let $0 := \emptyset$ and $n + 1 := S(n)$. Let

$\omega := \{x \mid \mathrm{total}(x) \wedge \forall y\,(0 \in y \wedge \forall z \in y\,(S(z) \in y) \to x \in y)\}$

This is the usual von Neumann definition of finite ordinals; the only difference is that we have smuggled the condition total(x) into the defining predicate. This allows us to prove that $\omega$ actually contains 0 and is closed under successor, and that induction holds:

Proposition 7.2
1. $0 \in \omega \wedge \forall x\,(x \in \omega \to S(x) \in \omega)$
2. $y \subseteq \omega \wedge 0 \in y \wedge \forall z \in \omega\,(z \in y \to S(z) \in y) \to \omega \subseteq y$
3. (Induction) $\varphi(0) \wedge \forall x \in \omega\,(\varphi(x) \to \varphi(S(x))) \to \forall x \in \omega\,\varphi$

Proof Ad 1. We already know that 0 is total. Since trivially 0 is a member of every inductive class, (SOC) implies $0 \in \omega$. Now let $x \in \omega$. By (Out), x is total and a member of every inductive class. It is easily seen that S(x) is also a member of every inductive class. Moreover, we know that if x is total, then so is S(x). Thus by (SOC), $S(x) \in \omega$.

Ad 2. Let $x \in \omega$ to show that $x \in y$. By (Out) we know that x is a member of every inductive class. So it suffices to show that y is inductive. Obviously, $0 \in y$. Now let $z \in y$. Since $y \subseteq \omega$, also $z \in \omega$. But then the third condition on y yields $S(z) \in y$. So y is inductive, and we are done.
Ad 3. Consider $a := \{x \mid x \in \omega \wedge \varphi(x)\}$. Note that $x \in a$ implies $x \in \omega$ by (Out). So $a \subseteq \omega$. We will apply induction in the sense of the previous proposition (2). Since 0 is total, from $\varphi(0)$ (and the fact that $0 \in \omega$) we conclude $0 \in a$ by (SOC). Now let $y \in \omega \wedge y \in a$. By (Out), y is total and $\varphi(y)$. By assumption, $\varphi(S(y))$. Since y is total, S(y) is total too. By (1) and $y \in \omega$ we conclude $S(y) \in \omega$. Thus from (SOC) and totality of S(y) we conclude $S(y) \in a$. So by (2), $\omega \subseteq a$. So by (Out), every member of $\omega$ has the predicate $\varphi$. $\Box$

The following propositions establish the transitivity and irreflexivity of the natural numbers. Here, x is transitive if and only if for all $y \in x$, we have $y \subseteq x$.
Proposition 7.3
1. $\forall x \in \omega\,(x \text{ is transitive})$
2. $\forall x \in \omega\,(x \notin x)$

Proof Ad 1. By induction. 0 (which is the empty set) is obviously transitive. Assume $n \in \omega$, n transitive. Let $y \in x \wedge x \in n + 1$. By (Out), $x \in n \vee x = n$. If $x \in n$ then $y \in n$ by induction hypothesis (IH). If $x = n$ then also $y \in n$. Thus $y \in n$. By the Axiom of Singularity, y is in the range of significance of n and therefore y is in the range of significance of $\{z \mid z \in n\}$ by the Axiom of Membership. By the Axioms of Identity and Connectives, y is also in the range of $\{z \mid z \in n \vee z = n\} =: n + 1$. Then $y \in n$ implies $y \in n + 1$.

Ad 2. By induction. Obvious for $x = 0$. Let $n \in \omega$ with $n \notin n$. Assume $n + 1 \in n + 1$. Then $n + 1 \in n \vee n + 1 = n$ by (Out). Also, by definition of $n + 1$ and totality of n, $n \in n + 1$. But then $n \in n$ (in the first case by the transitivity of n, in the second by substitution), contradicting the IH. $\Box$

Now we are in a position to prove the successor axioms:

Proof Ad 1. One can show that 0 has no elements but $x + 1$ has at least one element. Therefore they cannot be identical by Leibniz' law.

Ad 2. Assume $x + 1 = y + 1$. Then $x \in y + 1$ and $y \in x + 1$ (because $x \in x + 1$ and $y \in y + 1$). But then (applying the definition of successor and using the totality of the natural numbers) $(x \in y \wedge y \in x) \vee x = y$. By Proposition 7.3(1),

Finally, we show that comprehension holds.
Corollary 7.5 (Comprehension) For any $\varphi$ which does not contain y free, we have $\exists y \subseteq \omega\ \forall x \in \omega\,(x \in y \leftrightarrow \varphi)$.
Proof Let $\varphi$ be given. Let $x \in \omega$. If $x \in \{u \mid u \in \omega \wedge \varphi\}$ then $\varphi(x/u)$ by (Out) and Conjunction Elimination. Conversely, if $\varphi(x/u)$ holds, then since $x \in \omega$ (which implies totality of x) we get $x \in \{u \mid u \in \omega \wedge \varphi\}$ by (SOC). Therefore, for all $x \in \omega$, $x \in \{u \mid u \in \omega \wedge \varphi\} \leftrightarrow \varphi(x/u)$. Clearly, $\{u \mid u \in \omega \wedge \varphi\} \subseteq \omega$. $\Box$

Notice that if $x, y \in \omega$ then the ordered pair (x, y) is total. (See our discussion in the middle of Sect. 5.) A slight modification of the above proof shows that we can also have comprehension for binary relations (indeed, n-ary relations) over natural numbers. Hence, Dedekind's famous result implies that our theory of classes interprets full second-order arithmetic.

Appendix 2: Consistency proof
We will work in Zermelo-Fraenkel set theory with Urelements. The use of the urelements can be eliminated but is assumed here for technical convenience. Before starting with the formal construction of the model, let me sketch the underlying idea (without laying claim to completeness or accuracy).
The domain of our model will consist of all sets of rank $\leq \omega$ that can be constructed in the cumulative hierarchy starting with a countable set of urelements. Hence, the set of natural numbers and each of its subsets live within the model. There are four subsets of the domain that will be relevant: (1) the urelements; (2) the finite sets; (3) the co-finite sets; (4) the infinite sets that are not co-finite. The objects in categories (1)-(3) will represent the total classes; the sets in category (4) will represent the non-total or proper classes. The range of significance of a total class will consist of the entire domain; the range of a proper class will consist of the set of total classes. In other words, the singularities of a proper class will comprise all and only the proper classes. Therefore, the Axiom of Circularity will be true in this model. We will associate (identify) each set in category (3) with exactly one urelement. Self-membership is then achieved by stipulating that a class is contained in itself iff it contains the urelement associated with it.

Now let us consider the Axiom of Class Comprehension. This axiom states that, for every predicate $\varphi(x)$, the class associated with that predicate comprises exactly those objects that are in the range of $\varphi(x)$ and satisfy $\varphi(x)$. In particular, given how we have defined the ranges of significance in our model, if $\varphi(x)$ is not total, then the class associated with it must comprise exactly the total classes satisfying $\varphi(x)$. But any collection of total classes can be represented by a collection of finite sets, because every co-finite set is represented by an urelement. Since such a collection has rank $\leq \omega$, it lives within our model, and we will use it to validate the Axiom of Class Comprehension.
Moreover, it is easy to see that, in this model, the total classes are closed under all relevant algebraic operations postulated by the axioms: for example, they are closed under complementation because the complement of a finite set is a co-finite set and vice versa; they are closed under singletons because the singleton of a finite set is finite, and the singleton of a co-finite set will be represented by the singleton of the urelement associated with it, which is a finite set as well.
In this short sketch of the main idea, I have swept several problems under the rug, and we need to make certain adjustments in order to deal with them. In particular, according to the Axiom of Extensionality, there may be co-extensional classes that are not identical because they do not have the same range of significance. In order to deal with this problem, we will have to add copies of the finite and co-finite sets to our model and declare them to be non-total. Secondly, we need to allow proper classes to be elements of total classes. However, proper classes can have rank $\omega$, and there are no objects in our domain of rank $\omega + 1$. For cardinality reasons, it is not possible to introduce an urelement for each proper class that could serve as its proxy. Fixing this problem makes the proof more complicated.
We will now start with the formal construction. Let U be a countably infinite set of urelements. Let t, p be two urelements not contained in U. Let $U_T = \{(u, t) \mid u \in U\}$. The objects in $U_T$ will be used to model membership between classes of equal rank. Let $V_\omega[U_T]$ be the smallest set X such that (a) $U_T \subseteq X$, and (b) whenever $x_1, \ldots, x_n \in X$ then $(\{x_1, \ldots, x_n\}, t) \in X$.

The collection of total classes, T, consists of all x such that
1. $x \in V_\omega[U_T]$, or
2. $\exists y_1, \ldots, y_n \in V_\omega[U_T]$ such that $x = (V_\omega[U_T] \setminus \{y_1, \ldots, y_n\}, t)$

Note that the objects satisfying (1) have finite rank and that the ones satisfying (2) are such that their first components are co-finite in $V_\omega[U_T]$.

The collection of proper (i.e., non-total) classes, P, consists of all x such that
3. $x = (y, p)$ for some $y \subseteq V_\omega[U_T]$

The domain, D, of our model consists of $T \cup P$. Notice that, as I said above, P contains a 'copy' of each finite and co-finite set. Given an object $x = (a, i) \in D$ with $i \in \{t, p\}$, we call a the set-component of x and i the index of x. If the set-component of x is an urelement, we call (by abuse of language) x itself an urelement. We let $r_1((a, i)) = a$. Given $x, y \in D$, we write $x \in_1 y$ if and only if x is an element of the set-component of y, that is, if and only if $x \in r_1(y)$. Observe that $x \in_1 y$ can only obtain if x has finite rank.
The ranges of significance are defined as follows. If x is total, i.e., if $x \in T$, then the range of x consists of the entire domain, D, that is, $yRx$ iff $y \in D$. If x is not total, i.e., if $x \in P$, then the range of x consists of the set of total classes, T, that is, $yRx$ iff $y \in T$. So there are only two different kinds of ranges. This definition ensures that the Axiom of Circularity is true on our model:

Proposition 7.6 $\forall x \in D\,(\exists y \in D\ \neg xRy \to \exists z \in D\ \neg zRx)$.

Proof Let $x \in D$ and assume that there is a $y \in D$ such that $\neg xRy$. By definition of R, this implies that $x \in P$ (for if x were in T, x would be R-related to every $y \in D$).
But if $x \in P$, then x is R-related to all and only the objects in T. Since P is nonempty, there must be a $z \in P$ such that $\neg zRx$. $\Box$

Let $D^*$ be the set of all total classes with infinite rank (i.e., the co-finite sets) and let f be a bijection between $D^*$ and $U_T$. We define a relation $\sim$ on our domain D that will serve as our interpretation for identity between classes. Let $x \sim y$ iff $x = y$ or $x = f(y)$ or $y = f(x)$. That is, two objects are equivalent iff they are either identical or one is the urelement associated with the other. (Notice that the last two cases can only occur when both objects are total and one of them is co-finite.) Clearly, $\sim$ is an equivalence relation.
Next we define a binary relation E on D that will serve as our interpretation for class membership. We will make sure that x and f(x) turn out to be co-extensional in the sense of E (whenever x is total). We will achieve self-membership by stipulating that $xEx$ iff $f(x) \in_1 x$, that is, if the set-component of x contains the urelement associated with x. Note that, according to our axioms, while proper classes may be self-membered, only total classes are forced to be. We will produce a 'minimal' model in which only total classes will be self-membered. Indeed, we will produce a model in which no proper class contains any other proper class. (Again, while our axioms allow that proper classes contain proper classes, they don't force it.)

E is defined as follows. If $x, y \in P$ then $\neg xEy$. If $x \in P$ and $y \in T \setminus (D^* \cup U_T)$ then $\neg xEy$. If $x \in P$ and $y \in D^*$ then $xEy$ and $xEf(y)$. If $x \in T$ then $xEy$ if and only if

$x \in_1 y \ \vee\ x \in_1 f^{-1}(y) \ \vee\ f(x) \in_1 y \ \vee\ f(x) \in_1 f^{-1}(y)$

In words: in our model, no proper class is E-contained in any other proper class. A proper class is never contained in any finite total class. Every proper class is E-contained in any infinite total class and in their corresponding urelements. Finally, if x is total, then x is an E-member of y iff the set-component of x (or its corresponding urelement, if it exists) is an $\in$-member of the set-component of y (or of the (infinite) set corresponding to y, if y is an urelement). Now it is easily seen that the Axiom of Singularity holds in our model.
Proposition 7.7 $\forall x, y \in D\,(\neg xRy \to \neg xEy)$.
Proof If $x$ is not in the range of $y$ then $y$ is a proper class. But since $x$ is not contained in some range, it must be a proper class too. However, if $x, y \in P$ then $\neg xEy$ by definition of $E$. $\square$
Moreover, the definitions so far ensure that the Axiom of Extensionality is true in our model.
Proposition 7.8 $\forall x, y \in D\,(R(x) = R(y) \land \forall z \in D\,(zEx \leftrightarrow zEy) \to x \approx y)$.
Proof Assume that the antecedent holds. Then either both $x, y$ are total or both are proper. There are several cases to distinguish.
Assume first that neither $x$ nor $y$ is an urelement. It is sufficient to show that the set-components of $x, y$ are identical, i.e., that $\forall z\,(z \in_1 x \leftrightarrow z \in_1 y)$. Let $z \in_1 x$. Clearly then, $z$ has finite rank. Hence $z \in V_\omega[U_T]$. So, $z$ is total. Thus we can deduce $zEx$. By assumption, $zEy$ as well. Since $y$ is not an urelement, this means that $z \in_1 y$ or $f(z) \in_1 y$. The latter cannot obtain because $f(z)$ is undefined, as $z$ is either an urelement or a total class of finite rank. Thus $z \in_1 y$. The argument for the other direction is completely symmetrical. If $x$ is an urelement but $y$ is not, one can run a similar argument to show that $x = f(y)$. If both $x, y$ are urelements, one can run a similar argument to show that $f^{-1}(x) = f^{-1}(y)$. $\square$
The following proposition states that two objects that are treated as equal are $E$-contained in the same objects.
Proposition 7.9 $\forall x, y, z \in D\,(x \approx y \land xEz \to yEz)$.
Proof If $x = y$ this is trivial. So assume w.l.o.g. that $x = f(y)$, which means that $x$ is an urelement and both $x, y$ are total. The definition of $xEz$ implies $x \in_1 z$ or $x \in_1 f^{-1}(z)$ or $f(x) \in_1 z$ or $f(x) \in_1 f^{-1}(z)$. The third and fourth disjunct cannot obtain since $x$ is an urelement and so $f(x)$ is not defined. Hence (since $x = f(y)$) we have $f(y) \in_1 z$ or $f(y) \in_1 f^{-1}(z)$. By definition of $E$, $yEz$. $\square$
Let $L$ be the language of our class theory. In order to distinguish the membership symbol of $L$ from that of our metatheory (ZF), we will denote the former by $\varepsilon$. If $\varphi$ is an $L$-formula without any class abstracts, let $\varphi^*$ be obtained from $\varphi$ by replacing all occurrences of $x \mathrel{\varepsilon} y$ by $xEy$, all occurrences of $x = y$ by $x \approx y$, and all occurrences of $\forall x\,\psi$ by $\forall x \in D\,\psi^*$.
Let us check that the law of substitutivity holds:
Proposition 7.10 If $\varphi$ is a formula of $L$ then $\forall x, y \in D\,(x \approx y \land \varphi^*(x) \to \varphi^*(y))$.
Proof By induction on the build-up of $\varphi^*$. If $\varphi^*(x)$ is of the form $x \approx s$ then this follows from the fact that $\approx$ is an equivalence relation. If $\varphi^*(x)$ is of the form $xEs$ this follows from Proposition 7.9. Let $\varphi^*(x)$ be of the form $sEx$. The claim is trivial if $x = y$. So assume w.l.o.g. that $y = f(x)$, so $y$ is an urelement and both $y, x$ are total ($x$ has infinite rank). If $s \in P$ then the claim follows immediately from the definition of $E$. So assume $s \in T$. Then $sEx$ implies $s \in_1 x$ or $s \in_1 f^{-1}(x)$ or $f(s) \in_1 x$ or $f(s) \in_1 f^{-1}(x)$. We can exclude the second and fourth disjunct (because if $f(x)$ is defined, $f^{-1}(x)$ is not defined). Hence $s \in_1 x$ or $f(s) \in_1 x$. Since $x = f^{-1}(y)$, we have $s \in_1 f^{-1}(y)$ or $f(s) \in_1 f^{-1}(y)$, hence $sEy$ by definition of $E$.
If $\varphi^*(x)$ is of the form $xRs$ or $sRx$ this follows from the definition of $R$. The other clauses follow from the induction hypothesis. $\square$
In order to finish the specification of our model, we need to interpret the class abstracts. This is done in a two-step construction. In the first step, we define an interpretation for all class abstracts that our axiomatic theory of classes proves to be total. In a second step, we extend this interpretation to cover the remaining class abstracts. For the sake of simplicity, we assume that $L$ contains $\neg, \land$ as its only logical connectives.
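With $\neg$ and $\land$ as the only connectives, the translation $\varphi \mapsto \varphi^*$ for abstract-free formulas, introduced before Proposition 7.10, can be written out clause by clause. The display below is only a sketch of that recursion: the symbols $\varepsilon$ (object-language membership) and $\approx$ (interpreted identity), the metavariables $\psi, \chi$, and the clause that leaves $R$-atoms unchanged are notational assumptions read off from the informal description, not an official definition.

```latex
% Requires amsmath. E and \approx are the relations on D defined in the text.
\begin{align*}
(x \mathrel{\varepsilon} y)^* &:= x \mathrel{E} y \\
(x = y)^*                     &:= x \approx y \\
(x \mathrel{R} y)^*           &:= x \mathrel{R} y
    && \text{(assumed: $R$-atoms are left unchanged)} \\
(\neg\psi)^*                  &:= \neg\,\psi^* \\
(\psi \land \chi)^*           &:= \psi^* \land \chi^* \\
(\forall x\,\psi)^*           &:= \forall x \in D\;\psi^*
\end{align*}
```

Read this way, the induction in the proof of Proposition 7.10 has one atomic case per atomic clause, and the clauses for $\neg$, $\land$ and $\forall$ are the ones handled by the induction hypothesis.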
First step. First, we need to capture the set of class abstracts that our axiomatic theory proves to be total. We'll call them p-terms. They are defined by the following simultaneous recursive definition: