Abstract
This paper presents a new approach to the class-theoretic paradoxes. In the first part of the paper, I will distinguish classes from sets, describe the function of class talk, and present several reasons for postulating type-free classes. This involves applications to the problem of unrestricted quantification, reduction of properties, natural language semantics, and the epistemology of mathematics. In the second part of the paper, I will present some axioms for type-free classes. My approach is loosely based on the Gödel–Russell idea of limited ranges of significance. It is shown how to derive the second-order Dedekind–Peano axioms within that theory. I conclude by discussing whether the theory can be used as a solution to the problem of unrestricted quantification. In an appendix, I prove the consistency of the class theory relative to Zermelo–Fraenkel set theory.
1 Sets versus classes
Russell’s paradox of the class of all non-self-membered classes was first discovered in connection with Frege’s Grundgesetze (Frege 1964), where Frege sought to establish the logicist thesis that arithmetic is a branch of logic. The paradox caused (together with other paradoxes such as Cantor’s and Burali-Forti’s) what some have called the “third foundational crisis of mathematics”, which prompted the search for a firm foundation of mathematics (Fraenkel and Bar-Hillel 1958, pp. 14–15). This foundation was found in modern axiomatic set theory, which has its roots in the work of Cantor.
A number of authors (Maddy 1983; Lavine 1994) have argued that there were at least two different notions of class in the literature, and that only one of them is prone to paradox (Gödel 1983). Following Lavine (1994, p. 63), we may call them the logical and the combinatorial notion of class, respectively.
According to the logical notion, a class may be defined as the extension of a concept or predicate, or, to use Russell’s words, “as all the terms satisfying some propositional function”.^{Footnote 1} Such classes are associated with some kind of definition or rule that tells us in a principled way whether an object belongs to the class or not. This is the notion of class that was championed by Frege, Peano and Russell.^{Footnote 2} Extensions of concepts had been part of logic since antiquity; they can be found in the works of Leibniz and are explicit in the Port-Royal Logic (Bochenski 2002, pp. 302–303). It is this fact that allowed Frege, dialectically, to assume that a reduction of number theory to class theory is sufficient to establish his thesis that arithmetic is a branch of logic (Heck 2011b, p. 126).
According to the combinatorial notion, on the other hand, classes are obtained from some well-defined objects such as the natural numbers by ‘enumerating’ their members in an arbitrary way. Such classes exist independently of our ability to provide a defining condition or rule that characterizes their members. Arguably, this is the notion adopted by Cantor and Zermelo, and it underlies our modern iterative concept of set.
The difference between the concept of class as given by a rule and the concept of class freed from such restriction was an important factor in the controversy about the axiom of choice. This axiom states that we can select one element out of each of a family of (nonempty) classes and collect them into one class. As Bernays (1983) remarks, the axiom of choice “is an immediate application of the combinatorial concepts in question.” On the logical notion of class, on the other hand, it is doubtful whether a class satisfying the requirements set out in the axiom of choice can always be found.^{Footnote 3}
In what follows, we will call the combinatorial classes sets and the logical ones classes. The (logical) notion of class motivates what is commonly referred to as “the naïve calculus”, which consists in the naïve or unrestricted comprehension axiom scheme, which postulates the existence of a class corresponding to each predicate, and the axiom of extensionality, which states that two classes are identical if they have the same members. Of course, as Russell’s paradox shows, the naïve calculus is inconsistent. The standard approach to the class-theoretic paradoxes is to be found in the theory of types, which originated with Russell (1903a, 1908). In a nutshell, what happens here is that one abandons the idea of a general or unrestricted variable and replaces it with a series of variables differentiated as to type.
While Cantor never laid down explicitly the principles that he was working with, it has been argued that the naïve comprehension axiom scheme was not among them and that the notion of set was never subject to the paradoxes (e.g., Lavine 1994). By contrast, one can make a case that the class-theoretic paradoxes are still unsolved. For instance, Gödel says about the type-theoretic approach that “it cannot satisfy the condition of including the concept of concept which applies to itself or the universe of all classes that belong to themselves. To take such a hierarchy as the theory of concepts is an example of trying to eliminate the intensional paradoxes in an arbitrary manner.” (Wang 1996, p. 278) The aim of this paper is to provide reasons for developing a type-free theory of classes and to indicate one way in which this might be done.
2 The function of class talk
In a series of papers, Parsons (1982, 1983a, b) has argued that the introduction of the notion of class answers a general need to generalize on predicate places in our language (where ‘predicate’ means formula with one free variable). For example, consider the usual (first-order) principle of mathematical induction. This consists in all sentences of the form
\[\bigl (\varphi (0)\wedge \forall x\,(\varphi (x)\rightarrow \varphi (x+1))\bigr )\rightarrow \forall x\,\varphi (x)\]
where \(\varphi (x)\) is a predicate applying to numbers. The introduction of class terms, governed by the comprehension axiom scheme, allows us to replace the expression \(\varphi (t)\) by the materially equivalent \(t\in \{ u\mid \varphi (u)\}\), where \(\{ u\mid \varphi (u)\}\) occupies an object position and is therefore open to (objectual) quantification. Hence, the notion of class allows us to finitely axiomatize the induction schema by the single statement
\[\forall X\,\bigl [\bigl (0\in X\wedge \forall x\,(x\in X\rightarrow x+1\in X)\bigr )\rightarrow \forall x\, x\in X\bigr ]\]
Of course, in mathematical contexts the demand for generalising predicate places is met to a considerable extent by sets. But the notion of class allows us in addition to generalize every predicate in the language of set theory. This cannot be done by sets themselves because some predicates of the language of set theory, such as ‘x is an ordinal’, have extensions that are “too big to be sets”. Examples of the use of classes in set theory include the formulation of certain schemata as single statements, such as the axiom schemes of separation and replacement, or reflection principles.^{Footnote 4} There are many other uses as well that are often eliminable but seem heuristically indispensable, for example, in connection with the study of elementary embeddings of the universe of sets into some inner model of ZFC (see Uzquiano 2003, section 2).
The function of class talk brings the notion of class into proximity with the notions of truth and truth-of (satisfaction). This was stressed by Parsons: just as the notion of class answers a need to generalize predicate places, so does the notion of truth answer a need to generalize sentence places (cf. Quine 1970). Moreover, Parsons (1983b) observes that the notion of satisfaction can be seen as a means to generalize predicate places as well, and that the usual predicative theories of satisfaction and classes are mutually interpretable. In Schindler (2015, 2017) it is shown that even impredicative theories of classes can be interpreted in (type-free) theories of satisfaction. The similar functions of the notions of truth and class, together with the interpretability results just mentioned, suggest that someone who already has a broadly deflationary understanding of the notions of truth and satisfaction should probably have a deflationary understanding of the notion of class as well.
While I find the idea of a deflationary account of classes intriguing, it is rather tangential to our present purposes and I won’t pursue it any further here. But let me make the following remark. That classes were merely introduced to fulfill a particular function does not imply a nominalistic account of classes, at least if one subscribes to Quine’s criterion of ontological commitment. On the contrary, classes are introduced so that we can objectually quantify over entities that would otherwise be in predicate position. However, a deflationary account of classes may help us argue that classes are “thin” objects in the sense of Linnebo (2012), where “thin” is taken in the sense that “very little is required for their existence”. But this is a task for another paper.
3 Reasons for postulating type-free classes
The literature is full of interesting attempts to overcome the restrictions imposed by the theory of types. For an overview, I refer the reader to Cantini (2009). There are various reasons why one may be interested in a type-free theory of classes. For instance, Feferman (1977, 1984), Muller (2001) and others are interested in a set- or class-theoretic foundation of category theory. The problem here is that there are certain categories that are very natural to think about, such as the category of all sets, the category of all groups or the category of all categories, that cannot be formed in standard set theory. [For a recent overview, see Schulman (2008).] In what follows, I will list four more reasons. My own interests are mainly with the first and last of them.
3.1 Unrestricted quantification
There are certain contexts in logic and philosophy where we intend our quantifiers to range over absolutely everything whatsoever, or at least to be unrestricted, for example when we say that everything is self-identical or that the empty set has no members. Presented with a counterexample, we would not regard it as open to the defendant to dismiss the counterexample on the ground that it is not in the domain of quantification. The possibility of unrestricted quantification not only seems plausible; its denial seems to border on the incoherent. If someone claims that one cannot quantify over everything, they seem to imply at the same time that there is something one cannot quantify over (Williamson 2003).
Despite this, the coherence of unrestricted quantification has been doubted. For an overview of this debate, see Rayo and Uzquiano (2006). One objection is related to a principle that was first discussed (but not endorsed) by Cartwright (1994), and is nowadays known as the
All-in-One Principle: The objects in a domain of discourse make up a set or some set-like object.
In modern semantics, for example, the domain of discourse is usually taken to be a set. However, according to standard set theory there is no universal set. This causes problems, in particular, when one tries to interpret set-theoretic talk itself. It seems natural to assume that when a set theorist talks about sets, she is (at least sometimes) talking about all sets. The proposal that we should trade in standard set theory for a theory that admits a universal set, such as Quine’s New Foundations (Quine 1980), has not been met with enthusiasm, because this theory does not seem to embody any intuitive picture of sets.
One popular defence of unrestricted quantification makes use of the theory of types. On this account, interpretations are not (first-order) objects but higher-order entities. But this defence is not unproblematic; see Sect. 4 below and Linnebo (2006, pp. 154–156). Therefore, one may think it is preferable to treat the domain of quantification as a (first-order) object. As Linnebo points out, there is no reason to assume that this object needs to be a set. Hence, one possible solution to the problem of unrestricted quantification consists in replacing or supplementing set theory by a theory of classes that allows for a universal class. This proposal is not unproblematic itself, because theories with a universal class are incompatible with the axiom of separation, which seems necessary for semantics. I believe, however, that this problem can be dealt with and will return to it in Sect. 6.
3.2 Reduction of properties
Another area where classes might be useful is metaphysics: one might try to reduce properties or universals to classes. An influential account of this sort was given by Lewis (1986, chap. 1.5). However, there are good reasons, mainly in connection with the semantics of natural language (see below), to assume that properties need to be type-free.^{Footnote 5}
Another motive for self-membered properties was suggested by Allen (2016, pp. 28–31). One classical problem confronting property theory is Bradley’s Regress argument. This argument can be described as follows. Assume that a instantiates the universal or property F. This relation of instantiation is itself a universal, say \(I_1\). Now, one might ask what connects a, F and \(I_1\)? This will be another instantiation relation, \(I_2\). But then we may ask what connects \(a, F, I_1\) and \(I_2\)? This will be another instantiation relation, \(I_3\). And so on. Whether this regress is vicious or not is a hotly debated topic.
Whatever the outcome, one might try to simplify the hierarchy of instantiation relations required by the regress. There are at least two options: one could treat \(I_1, I_2, I_3, \ldots\) as instances of a single multigrade relation \(I'\) (where a relation is multigrade if the number of entities it relates can vary); or, one could treat \(I_1, I_2, I_3, \ldots\) as so-called inexactly resembling instances of a single instantiation relation \(I^{*}\) (where instances of a relation inexactly resemble each other if the resemblance is not exact similarity). Either way, \(I'\) and \(I^{*}\) need to be able to self-instantiate.
3.3 Natural language semantics
Classes (properties, concepts) have been applied in the analysis of natural language semantics (Montague 1974). However, there are many intuitively valid inferences that cannot be reconstructed in a typed framework due to the lack of self-exemplifying properties; this has motivated a good deal of research into type-free theories of properties (Bealer 1982; Menzel 1986, 1993; Orilia 1999; Chierchia and Turner 1988). For example, consider the inference from

1. Everything has the property of being self-identical
to
2. Socrates has the property of being self-identical
and the inference from (1) to
3. The property of being red has the property of being self-identical
The intuitive soundness of both inferences requires not only the existence of the property of being red but also that the quantifier in (1) ranges over both Socrates and the property in question. Hence, this inference cannot be captured in a typed language.
3.4 Reduction of mathematics
Last, but not least, one might be interested in a theory of classes (properties) for the very same reason for which Frege and Russell were originally drawn to it, namely, the “reduction of mathematics to logic”. It has often been claimed that logicism is dead, but several reformed versions of logicism have emerged in recent decades. One should mention here, on the one hand, the works of Bealer (1982), Cocchiarella (1986), Landini (2004), and Orilia (1991), which are based on type-free theories of properties, and, on the other hand, the works of the Neo-Fregean school, which are based on abstraction principles. For a technical overview of the latter, see Burgess (2005). For philosophical discussion, see Hale and Wright (2001), Heck (2011a), and Cook (2009). The Neo-Fregean project originated with the discovery of Frege’s Theorem, namely, that the second-order Dedekind–Peano axioms for arithmetic can be derived, in second-order logic, from what is known as “Hume’s Principle”, namely
\[\#F=\#G\leftrightarrow F\approx G\]
This principle states that the number of Fs is identical to the number of Gs if and only if the Fs and Gs are equinumerous (i.e., can be put in a one-one correspondence).
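For reference, equinumerosity is usually spelled out in pure second-order logic. The following is a standard rendering, offered as a sketch; the symbol \(\approx\) and the relation variable \(S\) are notational assumptions rather than the paper's own:
\[F\approx G\;:\leftrightarrow \;\exists S\,\bigl [\forall x\,(Fx\rightarrow \exists !y\,(Gy\wedge Sxy))\wedge \forall y\,(Gy\rightarrow \exists !x\,(Fx\wedge Sxy))\bigr ]\]
On this definition, the equinumerosity condition in Hume's Principle is a purely second-order formula that makes no use of the operator \(\#\).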
Now, one might be sceptical about the analyticity of Hume’s Principle or about whether class theory should be counted as part of logic. But such reductions may still be seen as answering Frege’s question: How are numbers given to us? The problem of epistemic access to abstract objects has been emphasized by Benacerraf (1983). How can we have knowledge of abstract objects, such as numbers, when we have no causal interactions with such objects? Wright’s idea is that an agent who is capable of second-order reasoning but has no knowledge of number theory could stumble upon Hume’s Principle, say, in a dream and decide to use terms of the form \(\#F\) in accordance with it. Then the claim is that the agent thereby acquires a concept of number without significant epistemological presupposition.
Similarly, one may claim that the concept of class (property) is acquired without significant epistemological presupposition. We “nominalize” predicates in order to generalize predicate places, and that’s all there is to class (property) talk. However, if we want to reduce mathematics to a theory of classes, then type-free classes are called for, because we need to initiate a bootstrapping process in order to generate enough objects that can serve as proxies for mathematical objects.
4 Ranges of significance
The purpose of the present section is to motivate a novel approach to the paradoxes that is loosely based on some remarks that Gödel made in (Gödel 1983) about Russell’s theory of types. Recall that a propositional function is a function that yields a proposition when given an argument. According to Russell’s theory, every propositional function \(\varphi (x)\) has “in addition to its range of truth a range of significance, i.e., a range within which x must lie if \(\varphi (x)\) is to be a proposition at all, whether true or false” (Russell 1903b, p. 523). More generally, the range of significance of a function is the collection of arguments for which said function is defined (i.e., has a value), and the range of significance of a propositional function is the collection of arguments for which the function yields a proposition. The idea of a range of significance need not be tied to the notion of propositional function. Gödel applies it to concepts,^{Footnote 6} but of course one can also apply it to predicates.
There are several ways in which the notion of a range of significance can be interpreted on a pre-theoretical level. The literature on philosophy of language provides many examples of grammatically well-formed sentences that, for some reason or other, do not express a proposition or lack a definite truth value. Many of these examples may be taken as instances of an object’s being a singular point of the relevant predicate or propositional function. For example, one may think that in the case of a category mistake (e.g., “The number 2 is green”), the object denoted by the name lies outside the range of significance of the predicate. Of course, one may simply treat such a sentence as false and its negation as true (perhaps for reasons of technical simplicity, e.g., in order to stay classical). On a narrower understanding, one may think that in all and only those cases where the application of a predicate to a name yields a paradoxical sentence (e.g., “This sentence is false”), the object denoted by the name lies outside the range of significance of the predicate.
As Gödel remarks, the idea that every propositional function has a range of significance that need not exhaust the entire universe “brings in a new idea for the solution of the paradoxes, especially suited to their intensional form”, which “consists in blaming the paradoxes not on the axiom that every propositional function defines a concept or class, but on the assumption that every concept gives a meaningful proposition, if asserted for any arbitrary object or objects as arguments” (1983, p. 466). He adds that “[t]he obvious objection that every concept can be extended to all arguments, by defining another one which gives a false proposition whenever the original one was meaningless, can easily be dealt with by pointing out that the concept ‘meaningfully applicable’ need not itself be always meaningfully applicable” (otherwise Grelling’s paradox would ensue).
For reasons that I do not want to enter into here, Russell thought that ranges of significance form types such that whenever a propositional function is significant for some argument x, and y belongs to the same type as x, then that function is significant for the argument y as well. This means that
1. whenever a propositional function is significant for some argument x, its range of significance is identical with the type of x;
2. sameness of type is an equivalence relation and, therefore, types are mutually exclusive; and
3. if two functions are both significant for some argument x, then they must have exactly the same range of significance.
The types are then divided into orders (yielding the ramified theory of types), but this further complication need not interest us here. Unfortunately, the theory of types suffers from expressive limitations that have often been pointed out in the literature. For example, Gödel remarks that “[w]hat makes the above principle particularly suspect, however, is that its very assumption makes its formulation as a meaningful proposition impossible, because x and y must then be confined to definite ranges of significance which are either the same or different, and in both cases the statement does not express the principle or even part of it.” (Gödel 1983, p. 466)
It should be observed that Russell’s idea that every propositional function has a range of significance is logically independent of the assumption that the ranges of significance form types. One might therefore consider the possibility of construing classes based on the first but without the second assumption. In the remainder of this paper, I wish to develop the theory of classes in this direction. This approach is inspired by Gödel’s remark that:
It is not impossible that the idea of limited ranges of significance could be carried out without the above restrictive principle [i.e. that the ranges of significance form types]. It might even turn out that it is possible to assume every concept to be significant everywhere except for certain “singular points” or “limiting points”, so that the paradoxes appear as something analogous to dividing by zero. Such a system would be most satisfactory in the following respect: our logical intuitions would then remain correct up to certain minor corrections, i.e., they could then be considered to give an essentially correct, only somewhat ‘blurred’ picture of the real state of affairs. Unfortunately the attempts made in this direction have failed so far; on the other hand, the impossibility of this scheme has not been proved either, in spite of the strong inconsistency results of Kleene and Rosser. (Gödel 1983, pp. 466–467)
The following general picture emerges. Let U be the universe of all objects, and \(\varphi (x)\) be some propositional function. \(\varphi (x)\) has a range of significance, \(R(\varphi )\), which is a subset of U. If \(\varphi (x)\) has singular points, then \(R(\varphi )\) is a proper subset of U. For every object a in \(R(\varphi )\), \(\varphi (a)\) is meaningful (that is, true or false). \(\varphi (x)\) thereby determines two classes, the extension \(\{a\in R(\varphi )\mid \varphi (a)\}\) and anti-extension \(\{a\in R(\varphi )\mid \lnot \varphi (a)\}\) of \(\varphi (x)\), whose union coincides with \(R(\varphi )\), that is,
\[\{a\in R(\varphi )\mid \varphi (a)\}\cup \{a\in R(\varphi )\mid \lnot \varphi (a)\}=R(\varphi ).\]
Gödel mentions Church’s (inconsistent) system (Church 1932) as an interesting attempt to carry out these ideas. Another possibility is to use some nonclassical logic, such as the Weak or Strong Kleene logics. This route faces the notorious problem that the material conditional is not well behaved in these logics. One might therefore consider the following alternative route, which retains classical logic.
Again, let U be the universe of all objects, and \(\varphi (x)\) be some propositional function. As before, \(\varphi (x)\) has a range of significance, \(R(\varphi )\), which is a subset of U. If \(\varphi (x)\) has singular points, then \(R(\varphi )\) is a proper subset of U. This time, however, we treat \(\varphi (a)\) as meaningful (i.e., true or false) for every object a in U. As before, \(\varphi (x)\) determines two classes, the extension \(\{a\in R(\varphi )\mid \varphi (a)\}\) and anti-extension \(\{a\in R(\varphi )\mid \lnot \varphi (a)\}\) of \(\varphi (x)\), whose union coincides with \(R(\varphi )\). The difference from the previous picture is that the classes \(\{a\in R(\varphi )\mid \varphi (a)\}\) and \(\{a\in R(\varphi )\mid \lnot \varphi (a)\}\) may “underspill”: if a is an object outside the range of \(\varphi (x)\), then either \(\varphi (a)\) or \(\lnot \varphi (a)\) will be true; but since a is a singularity of \(\varphi (x)\), it is neither an element of \(\{a\in R(\varphi )\mid \varphi (a)\}\) nor of \(\{a\in R(\varphi )\mid \lnot \varphi (a)\}\).
It is the latter route that will be followed in the remainder of this paper.^{Footnote 7} For technical convenience, I will modify the picture above in two ways. First, I will treat classes as extensions of predicates (i.e., formulas with one free variable) rather than propositional functions (or concepts) because I am not aware of any suitable theory of propositional functions (concepts). Second, instead of assigning ranges of significance to predicates, I will directly assign them to classes. This saves us the trouble of introducing names for predicates and function symbols for syntactic operations on predicates. From a technical point of view, this does not seem to make too much of a difference, because to every predicate \(\varphi (x)\) there corresponds a unique class abstract \(\{x\mid \varphi (x)\}\). The class abstract can therefore serve as some form of Gödel code for the predicate. In the informal presentation, I will nevertheless talk frequently (as a form of shorthand) of the range of significance of \(\varphi\) instead of that of \(\{x\mid \varphi (x)\}\).
5 A type-free theory of classes
The language of the theory that we are going to present is an ordinary one-sorted first-order language with identity. It contains a binary relation symbol, \(\in\), for membership in a class. One of the expressive limitations of the theory of types is that it cannot express that some object is not in the range of significance of some propositional function (or predicate). In order not to fall prey to the same objection, we will introduce a primitive binary relation symbol, R, into our language. We may read xRy as “x is in the range of significance of y” or “x is not a singular point (singularity) of y”. Let total(x) abbreviate the formula \(\forall z\, zRx\). If x is total, then x has an unrestricted range of significance (i.e., has no singular points). According to the theory that we are going to present, every predicate determines a class. We will therefore assume that our language contains a class term \(\{u\mid \varphi \}\) for every formula \(\varphi\) containing the free variable u. Since we are aiming at a type-free system, \(\varphi\) is allowed to contain \(\in , R\) and other class terms. Moreover, it may contain other free variables as parameters.
A remark on notation. I will use \(\varphi , \psi\) for well-formed formulas, u, v, x, y, z for variables, and s, t for arbitrary terms. Some special symbols will be introduced as we go along. \(\varphi (t/x)\) denotes the result of replacing all free occurrences of x in \(\varphi\) by t. Instead of \(\lnot \, s\in t\) we will also write \(s\notin t\). The usual conventions for the use of brackets apply.
The axioms can be divided into three groups. The first group consists of ‘conceptual’ axioms that describe the general relation between a class and its range of significance. These axioms are directly suggested by the picture provided in the previous section (i.e., that every predicate, together with its range of significance, determines an extension and anti-extension in the indicated way). The second group of axioms describes the relation between the range of significance of a predicate and the logical form of that predicate. They are based on the natural assumption that classes/ranges of significance should be closed under the algebraic operations corresponding to the logical operations on predicates. The third group contains only one axiom expressing the widespread idea that the paradoxes are to be blamed on some form of circularity or non-well-foundedness, a view that goes back at least to the days of Russell. These axioms belong to the pure theory of classes, i.e., the part that deals with classes of classes; at the end of this section, we will discuss an axiom for the applied theory of classes, i.e., classes of individuals or urelements.
Our first and most basic axiom scheme is a relativized form of naïve comprehension and follows immediately from the picture presented in the previous section. The axiom states that whenever x is in the range of significance of the predicate \(\varphi\) (or equivalently: whenever x is not a singularity of \(\varphi\)), then x is an element of the class \(\{ u\mid \varphi \}\) if and only if \(\varphi (x/u)\) holds. That is:
\[\forall x\,\bigl (xR\{ u\mid \varphi \}\rightarrow (x\in \{ u\mid \varphi \}\leftrightarrow \varphi (x/u))\bigr )\]
Notice that \(\varphi\) may contain free variables besides u. These should be bound by universal quantifiers. A similar remark applies to the other axioms below.
The axiom scheme is completely general and topic-neutral. We can insert any formula in place of \(\varphi\), including the predicates \(u=u, u\notin u\) and uRu.
It is easily seen that the Axiom of Class Comprehension is consistent: since it is a universally quantified conditional, we can make it vacuously true by interpreting R as the empty relation. In this framework, Russell’s paradox is simply transformed into the theorem that the Russell class \(r:=\{u\mid u\notin u\}\) does not lie in its own range of significance. Carrying out the usual reasoning, we convince ourselves that
\[rRr\rightarrow (r\in r\leftrightarrow r\notin r),\]
from which we simply conclude that \(\lnot \, rRr\). No contradiction ensues.
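The “usual reasoning” can be spelled out in three steps. The following is a sketch that uses only the relativized comprehension scheme as described above, instantiated with \(u\notin u\) for \(\varphi\) and \(r\) for \(x\); the step numbering is added for illustration:
\[\begin{aligned}
&(1)\quad rRr\rightarrow (r\in r\leftrightarrow r\notin r) &&\text{comprehension with } \varphi := u\notin u,\ x := r\\
&(2)\quad \lnot \,(r\in r\leftrightarrow r\notin r) &&\text{classical logic}\\
&(3)\quad \lnot \, rRr &&\text{from (1) and (2) by modus tollens}
\end{aligned}\]
In the naïve calculus the biconditional in (1) is asserted outright, which is what produces the contradiction. Here it is available only under the hypothesis \(rRr\), and that hypothesis is refuted rather than assumed.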
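Our second axiom, which also follows from the picture provided in the previous section, states that if x is a singular point of y, then x is not an element of y:
\[\forall x\,\forall y\,(\lnot \, xRy\rightarrow x\notin y)\]
In conjunction with the Axiom of Class Comprehension, the Axiom of Singularity implies:
\[\forall x\,(x\in \{u\mid \varphi \}\rightarrow \varphi (x/u))\]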
This is a very useful theorem. If we know that x is an element of the class y, then we can deduce that x satisfies the defining condition of y. Moreover, this theorem rules out that some classes “overspill”: it is not possible that the class \(\{u\mid \varphi \}\) contains some objects that are not \(\varphi\)s.
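The derivation of this theorem is short. The following sketch writes \(c\) for the class term \(\{u\mid \varphi \}\) (an abbreviation introduced only for this illustration) and assumes the two axioms in the form described above:
\[\begin{aligned}
&(1)\quad x\in c &&\text{assumption}\\
&(2)\quad \lnot \, xRc\rightarrow x\notin c &&\text{Singularity}\\
&(3)\quad xRc &&\text{from (1) and (2) by contraposition}\\
&(4)\quad xRc\rightarrow (x\in c\leftrightarrow \varphi (x/u)) &&\text{Class Comprehension}\\
&(5)\quad \varphi (x/u) &&\text{from (1), (3) and (4)}
\end{aligned}\]
Discharging the assumption gives \(x\in c\rightarrow \varphi (x/u)\). The converse is not derivable in general: if x is a singular point of \(c\), then \(\varphi (x/u)\) may hold although \(x\notin c\).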
We adopt the following version of extensionality, according to which two classes are identical if they have the same range of significance and the same members. (The other direction follows from the logical laws of identity.)
\[\forall x\,\forall y\,\bigl [\bigl (R(x)=R(y)\wedge \forall z\,(z\in x\leftrightarrow z\in y)\bigr )\rightarrow x=y\bigr ]\]
Here, \(R(x)=R(y)\) is shorthand for \(\forall z\,(zRx\leftrightarrow zRy)\). The reason for imposing this condition is as follows. Assume it is possible to define a class w such that \(w=\{u\mid u\notin u\wedge u=w\}\). (Such self-referential classes cannot be defined in the present formalism, but one may muse about extensions of the system in which this is possible.) It is easy to prove, using the first two axioms, that \(w\notin w\). Hence, w has no members, and it follows by Class Comprehension that w is a singular point of itself. Now assume that the class \(\varnothing :=\{u\mid u\ne u\}\) has an unrestricted range of significance. (This will actually follow from our other axioms.) Hence, if we identified classes with the same members, we would get that \(\varnothing =w\). But then w would have an unrestricted range of significance as well, which we have just ruled out. (It should be noted that, as things stand, the ordinary axiom of extensionality is consistent with our theory as well.)
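The claim that \(w\) cannot have an unrestricted range of significance can be checked step by step. The following is a sketch, assuming that a class \(w\) with \(w=\{u\mid u\notin u\wedge u=w\}\) exists and using Class Comprehension and Singularity as described above:
\[\begin{aligned}
&(1)\quad w\in w\rightarrow (w\notin w\wedge w=w) &&\text{members satisfy the defining condition}\\
&(2)\quad w\notin w &&\text{from (1)}\\
&(3)\quad wRw\rightarrow (w\in w\leftrightarrow (w\notin w\wedge w=w)) &&\text{Class Comprehension with } x := w\\
&(4)\quad w\notin w\wedge w=w &&\text{from (2) and the laws of identity}\\
&(5)\quad \lnot \, wRw &&\text{from (2), (3) and (4)}\\
&(6)\quad \lnot \,\textit{total}(w) &&\text{from (5) and the definition of } \textit{total}
\end{aligned}\]
Any member x of \(w\) would have to satisfy \(x=w\), so (2) also shows that \(w\) has no members. Since \(w\) is a singular point of itself while \(\varnothing\) is assumed to be total, identifying the two on the basis of membership alone would be contradictory.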
The Axiom of Extensionality (in either form) will not be used in any of the theorems below. The reason to include it, apart from conceptual considerations, is merely to highlight that it can be included without leading to triviality. This seems noteworthy because there are well-known problems for adding axioms of extensionality to nonclassical logics that contain naïve comprehension (Field 2008, pp. 296–298).
It is perhaps interesting to remark that, given the first three axioms, we can characterize classes with the following abstraction principle (scheme), which states that the class of \(\varphi\)s is identical to the class of \(\psi\)s if and only if \(\varphi\) and \(\psi\) have the same range of significance and are satisfied by exactly the same objects:
The above abstraction principle is a theorem of our theory. In contrast to ordinary abstraction principles, in the above scheme the class terms appear also on the right-hand side of the biconditional. Of course, this is a side-effect of my decision to use classes, instead of predicates, as the second relatum of the R relation. If predicates were used instead, the abstraction terms would only occur on the left-hand side of the abstraction principle.
Our next group of axioms deals with the relation between the range of significance of a predicate and the logical form of that predicate. They are based on the natural assumption that classes/ranges of significance should be closed under the algebraic operations corresponding to the logical operations on predicates. For example, if the number 2 lies in the range of significance of “is green”, then it should lie within the range of “is not green” as well; if Aristotle lies in the range of significance of the predicates “is Greek” and “is a philosopher”, then Aristotle should also lie within the range of “is a Greek philosopher”.
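A possible rendering of the two closure conditions just illustrated, offered as a sketch. The names Axiom of Negation and Axiom of Connectives are taken from later passages; the exact formulation, in particular letting \(\circ\) stand for an arbitrary binary connective, is an assumption:

\[
\begin{aligned}
&\forall x\,\bigl(xR\{u\mid \varphi \}\rightarrow xR\{u\mid \lnot \varphi \}\bigr) && \text{(Negation)}\\
&\forall x\,\bigl(xR\{u\mid \varphi \}\wedge xR\{u\mid \psi \}\rightarrow xR\{u\mid \varphi \circ \psi \}\bigr) && \text{(Connectives)}
\end{aligned}
\]

With \(\varphi\) read as "u is green", the first line carries the number 2 from the range of "is green" to the range of "is not green".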
We will adopt similar axioms for atomic predicates. For example, consider the atomic predicate \(u\in t\), where t denotes some class. Of what objects should we say that they lie in the range of significance of \(u\in t\)? Intuitively, t simply is \(\{u\mid u\in t\}\). Hence, the following seems natural: if x is an object that already lies in the range of significance of t, then x lies in the range of \(u\in t\) as well.^{Footnote 8}
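For the membership predicate, the condition just stated can be sketched as follows. The name Axiom of Membership is taken from later passages; the universally quantified conditional form is an assumption based on the prose:

\[
\forall x\,\bigl(xRt\rightarrow xR\{u\mid u\in t\}\bigr)
\]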
The axioms introduced so far are compatible with every predicate having an empty range of significance. (Note that all of them are universally quantified conditionals.) They are therefore trivially consistent. In order to get our theory off the ground, we need some axioms that ensure that some predicates have nonempty ranges. The following axiom stipulates that the (class determined by the) predicate \(u=u\) has an unrestricted range of significance. Recall that total(x) abbreviates the formula \(\forall z\, zRx\).
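Unfolding the abbreviation total(x), the axiom described here can be written as follows (a sketch based on the prose; the name Axiom of Self-Identity is the one used in the following paragraphs):

\[
\operatorname{total}(\{u\mid u=u\}),\quad \text{that is,}\quad \forall z\,\bigl(zR\{u\mid u=u\}\bigr)
\]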
The reason for postulating this axiom is clear, given our motives. We want to design a theory in which models with a universal domain are available. Instead of adopting the Axiom of Self-Identity, we could stipulate that the empty class \(\varnothing :=\{u\mid u\ne u\}\) is total. Given that ranges of significance are preserved under negation, it does not matter which one we choose. The totality of one follows from the totality of the other.
I find the Axiom of Self-Identity fairly innocuous. First, the predicate \(x=x\) (just as any other tautological predicate) is stable (i.e., its extension is fixed on every interpretation of the nonlogical primitives). Second, the predicate \(x=x\) does not contain the membership symbol \(\in\), and should therefore be admissible. One might compare this line of argument to how the T-schema is restricted in formal theories of truth. The sentences without the truth predicate are always assumed to be admissible instances of the T-schema.
Before presenting the last axiom of the pure theory of classes, let me mention some straightforward consequences of the axioms introduced so far. I hope this will help the reader to get a better feeling for the theory.
First, observe that, as desired, the universal class \(V:=\{u\mid u=u\}\) contains every class, including itself and the Russell class r. For by the Axiom of Class Comprehension, we have
By the Axiom of Self-Identity, we know that xRV for every x. Hence, \(x\in V\) for every x.
Second, since the Russell class r is contained in the universal class but not vice versa (as the reader may easily verify), we can conclude (by usual laws of identity) that \(V\ne r\). This is in stark contrast to the traditional theories of “proper classes” (i.e., theories of the Morse–Kelley variety), which do not distinguish the two.
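The argument for \(x\in V\) can be laid out step by step. This is a sketch which assumes that the relevant instance of Class Comprehension is the conditional shown in the first line:

\[
\begin{aligned}
&xRV\rightarrow (x\in V\leftrightarrow x=x) && \text{(Class Comprehension, with } \varphi \text{ as } u=u\text{)}\\
&\forall x\,(xRV) && \text{(totality of } V\text{)}\\
&x\in V\leftrightarrow x=x && \text{(from the first two lines)}\\
&x\in V && \text{(since } x=x\text{)}
\end{aligned}
\]

Instantiating x with V and with r yields \(V\in V\) and \(r\in V\), respectively.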
Next, we notice that total classes are closed under the following operations. Assume that \(t_1, \ldots , t_n\) are total. Then:

1. \(\{t_1, \ldots , t_n\} :=\{u\mid u=t_1\vee \ldots \vee u=t_n\}\) is total.
2. \((t_1, t_2):=\{\{t_1\}, \{t_1, t_2\}\}\) is total.
3. \(t_1\cup t_2 :=\{u\mid u\in t_1\vee u\in t_2\}\) is total.
4. \(t_1\cap t_2 :=\{u\mid u\in t_1\wedge u\in t_2\}\) is total.
5. \(t_1\setminus t_2 :=\{u\mid u\in t_1\wedge u\notin t_2\}\) is total.
6. \(\overline{t}_1 :=\{u\mid u\notin t_1\}\) is total.
7. \(S(t_1) := t_1\cup \{t_1\}\) is total.
8. \(t_1\cup \{t_2\}\) is total.
For illustrative purposes, I show (3). The other items are proved in a similar fashion. By the Axiom of Membership and the totality of \(t_1\), \(\forall x (xR\{u\mid u\in t_1\})\). Similarly, we have \(\forall x (xR\{u\mid u\in t_2\})\). Therefore, by the Axiom of Connectives, we have
\[
\forall x\,(xR\{u\mid u\in t_1\vee u\in t_2\}),
\]
which means that \(t_1\cup t_2\) is total.
Using item (7), we can successively generate the finite ordinals in the usual von Neumann style. That is, we let \(0:=\varnothing\), \(1:=\{0\}\), \(2:=\{0, 1\}\), \(3:=\{0,1,2\}\) and so on. It is easily seen that all these classes are total. However, we are not yet able to collect these classes into one.
By item (8), it follows that total classes are closed under adjunction. This means that our theory relatively interprets adjunctive set theory (=existence of empty set plus closure under adjunction), which in turn interprets Robinson arithmetic (roughly, Peano arithmetic without the induction scheme).^{Footnote 9}
It should also be noted that our theory interprets the system known as \({\mathsf {NF}}_2\), whose axioms are extensionality, existence of the empty set, and closure under complements, intersections, and singletons (Forster 2001).
This brings us to the most interesting axiom. It expresses the widespread idea that paradoxes are to be blamed on some form of circularity or nonwellfoundedness. More precisely, the axiom states that if x is a singularity of some class, then x itself has singular points:
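In symbols, the axiom as described can be rendered as follows. This is a sketch which assumes that "x is a singularity of y" is expressed by \(\lnot xRy\):

\[
\forall x\,\forall y\,\bigl(\lnot xRy\rightarrow \exists z\,\lnot zRx\bigr)
\]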
For a typical example, consider the Russell class. The Russell class has a singular point (namely, the Russell class itself), and that singular point has a singular point itself (namely, the Russell class). And similarly for the Burali-Forti paradox: the class of all ordinals is a singular point of (the class defined by) the predicate ‘x is an ordinal’. We need not assume that all paradoxes stem from such a simple type of circularity. Perhaps there are classes x, y such that x is a singular point of y and vice versa. We may also imagine an infinitely descending chain of classes \(x_1, x_2, x_3, \ldots\) such that every class in that sequence is a singular point of its immediate predecessor.^{Footnote 10}
The Axiom of Circularity is logically equivalent to the claim
In words: whenever x is total, then x lies in the range of significance of every class (i.e., x is not a singularity of any class). Since there are total classes (in fact, infinitely many ones: our proxies for the natural numbers \(0, 1, 2, \ldots\) are all total), this means that no predicate has an empty range of significance (in fact, every predicate has infinitely many objects in its range). Thus I believe that the Axiom of Circularity actually captures to some extent Gödel’s idea that we may assume every predicate to be significant for most arguments.
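The equivalent claim paraphrased here can be sketched as follows, assuming that the Axiom of Circularity has the form \(\forall x\,\forall y\,(\lnot xRy\rightarrow \exists z\,\lnot zRx)\) and recalling that total(x) abbreviates \(\forall z\, zRx\):

\[
\forall x\,\bigl(\operatorname{total}(x)\rightarrow \forall y\, xRy\bigr)
\]

The equivalence is by contraposition: for fixed x and y, \(\lnot xRy\rightarrow \exists z\,\lnot zRx\) holds if and only if \(\forall z\, zRx\rightarrow xRy\) does.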
The Axiom of Circularity is quite remarkable. It justifies impredicative class formation in the sense that it entitles us (in conjunction with the first two axioms) to form classes of total classes at will. For every predicate \(\varphi\), the following is a theorem of our theory:
(The usual conditions on the variables apply.) This can be seen as follows. The Axiom of Circularity implies
Thus by the Axiom of Class Comprehension we get
from which the above claim follows by existential weakening.
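The three steps of this argument can be laid out as follows. The sketch assumes that Class Comprehension has the conditional form \(xR\{u\mid \varphi \}\rightarrow (x\in \{u\mid \varphi \}\leftrightarrow \varphi (x/u))\) and that y does not occur free in \(\varphi\); the last line is the scheme referred to as (SOC) later in the text:

\[
\begin{aligned}
&\forall x\,\bigl(\operatorname{total}(x)\rightarrow xR\{u\mid \varphi \}\bigr) && \text{(Circularity)}\\
&\forall x\,\bigl(\operatorname{total}(x)\rightarrow (x\in \{u\mid \varphi \}\leftrightarrow \varphi (x/u))\bigr) && \text{(Class Comprehension)}\\
&\exists y\,\forall x\,\bigl(\operatorname{total}(x)\rightarrow (x\in y\leftrightarrow \varphi (x/u))\bigr) && \text{(existential weakening)}
\end{aligned}
\]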
The Axiom of Circularity boosts the mathematical strength of our theory significantly. It allows us, using suitable definitions, to derive the second-order Dedekind–Peano axioms for arithmetic within our theory. Let us define
where 0 and S(z) are defined as above. This states that \(\omega\) is the class of total classes that are contained in every inductive class, where a class is inductive if it contains 0 and is closed under successor. This is the usual von Neumann definition of natural numbers; we have only added the condition that the natural numbers must be total.
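Spelled out from this description, the definition can be written as follows (a reconstruction from the prose; the choice of bound variables is an assumption):

\[
\omega :=\bigl\{x\mid \operatorname{total}(x)\wedge \forall y\,\bigl(0\in y\wedge \forall z\,(z\in y\rightarrow S(z)\in y)\rightarrow x\in y\bigr)\bigr\}
\]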
For illustrative purposes, let us show that \(\omega\) actually contains all natural numbers (as defined above). First, we have already seen that 0 (the empty class) is total. Trivially, 0 is contained in every inductive class. Hence, 0 satisfies the defining condition for being a natural number. But then the scheme (SOC) above allows us to conclude that 0 is indeed an element of \(\omega\). Next, let us show that \(\omega\) is closed under successors. So let \(x\in \omega\). Then by (Out) we know that x satisfies the defining condition of \(\omega\). Hence x is total and contained in every inductive class. But then it trivially follows that its successor, S(x), must also be contained in every inductive class. Moreover, we have seen above that whenever x is total, so is its successor. Hence, S(x) satisfies the defining condition of being a natural number, and since it is total we can conclude that \(S(x)\in \omega\), by another application of (SOC). A complete derivation of the second-order Dedekind–Peano axioms can be found in “Appendix 1”.
The theory presented above is consistent. A proof is given in “Appendix 2”. It needs to be stressed that the axioms presented here are only basic axioms that can and should be extended by additional ones that increase the expressive (mathematical) power of the theory even more.
So far we have only considered classes of classes, that is pure classes. One of the main motives for developing a theory of classes lies in its application to some given domain. So let us assume that our language contains additional predicates applying to objects other than classes, such as people, stones, numbers or sets, and let us introduce a distinguished predicate, U (for urelements), applying to these objects. Then we may adopt the following axiom which states that every urelement is in the range of significance of every class.^{Footnote 11}
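In symbols, the axiom can be sketched as follows. This unrelativized form is an assumption suggested by the variant \(\forall x\, (Ux\rightarrow \forall y\,(Cy\rightarrow xRy))\) given in the accompanying footnote:

\[
\forall x\,\bigl(Ux\rightarrow \forall y\, xRy\bigr)
\]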
Now let T be some first-order theory not containing the symbols \(\in , R, U\) nor any of the abstraction terms. If T is formulated in the language of set theory, we can simply work with two copies of \(\in\). Let \(T^U\) be the theory resulting from T by relativizing all quantifiers in the axioms of T to the predicate U. Moreover, if T contains axiom schemata (such as induction or replacement), extend these so that \(\in , R, U\) and the abstraction terms are allowed to occur in the instances of the schemata. Then it is easily seen that \(T^U\), conjoined with our axioms for classes, implies that
(This follows simply from the Axioms of Urelements and Class Comprehension.) Hence, \(T^U\) together with the theory of classes interprets the second-order version of T.
It is possible to go further. For example, we may add an axiom to the effect that whenever x is a class containing only urelements, then x is in the range of every class. This would allow us to interpret the third-order version of T. This process can be iterated. We can add an axiom to the effect that whenever x is a class of classes of urelements, then x is in the range of every class, which gives us fourth-order T, and so on for every finite order. Hence, we can embed the full type hierarchy over T into our theory of classes.
That this can be done in a type-free theory of classes is something I take as a minimal adequacy result. We have claimed that a type-free theory needs to be developed because of certain expressive limitations of type theory. But then the replacing theory should be at least as expressive as type theory.
There are some theories T that are inconsistent with full second-order comprehension, e.g., abstraction principles for ordinals conceived as sui generis objects (Florio and Leach-Krouse 2017). In such cases, one can weaken the Axiom of Urelements in several ways, if desired. For example, let \(P_1, P_2, \ldots\) be the predicates of T. Then one could replace the Axiom of Urelements with the following schema:
In this case, one only obtains comprehension for predicates that are first-order definable in T, that is, a predicative comprehension principle.
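The comprehension principle over urelements, and the way it follows from the two axioms mentioned, can be sketched as follows. The sketch assumes that the Axiom of Urelements reads \(\forall x\,(Ux\rightarrow \forall y\, xRy)\), that Class Comprehension has the conditional form \(xR\{u\mid \varphi \}\rightarrow (x\in \{u\mid \varphi \}\leftrightarrow \varphi (x/u))\), and that y does not occur free in \(\varphi\):

\[
\begin{aligned}
&\forall x\,\bigl(Ux\rightarrow xR\{u\mid \varphi \}\bigr) && \text{(Urelements)}\\
&\forall x\,\bigl(Ux\rightarrow (x\in \{u\mid \varphi \}\leftrightarrow \varphi (x/u))\bigr) && \text{(Class Comprehension)}\\
&\exists y\,\forall x\,\bigl(Ux\rightarrow (x\in y\leftrightarrow \varphi (x/u))\bigr) && \text{(existential weakening)}
\end{aligned}
\]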
6 Unrestricted quantification
Following a suggestion of Linnebo (2006), I have mentioned that one way to approach the problem of unrestricted quantification is by adopting a theory with a universal class or property. An objection that is frequently raised against such proposals is that theories with universal classes are incompatible with the axiom scheme of separation,
This axiom states that given a class x, we can collect all members of x that satisfy some property \(\varphi\) into one class y. But if x is the universal class, and \(\varphi\) is the Russell predicate \(z\notin z\), then y cannot exist on pain of contradiction.
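The scheme and the resulting contradiction can be sketched as follows (a standard formulation of separation, assumed here to match the one intended; y must not occur free in \(\varphi\)):

\[
\forall x\,\exists y\,\forall z\,\bigl(z\in y\leftrightarrow z\in x\wedge \varphi (z)\bigr)
\]

Taking x to be the universal class V and \(\varphi (z)\) to be \(z\notin z\), and using \(\forall z\,(z\in V)\), the scheme yields a y with

\[
\forall z\,(z\in y\leftrightarrow z\notin z),\qquad \text{hence}\qquad y\in y\leftrightarrow y\notin y .
\]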
Hence, if we admit a universal class we lose separation. However, it seems that we need separation in order to comply with the following semantic principle:
For any domain of interpretation d and any predicate \(\varphi (z)\) in the language, it should be possible to specify an interpretation such that for all individuals \(x\in d\), a predicate letter ‘P’ applies to x if and only if \(\varphi (x/z)\).
The problem emerges because in a model-theoretic semantics the semantic value of ‘P’ needs to be an object. In order that the above principle is satisfied, we need to assign the class \(\{z\in d\mid \varphi (z)\}\) to ‘P’. And this in turn requires the axiom of separation.
I believe, however, that the quasi-Gödelian strategy adopted in the present paper allows us to formulate a satisfactory response to this objection.^{Footnote 12} For suppose that the domain of our interpretation consists of a class d, and let ‘P’ be a predicate symbol that we want to interpret by a predicate in our language. If we take the idea that every predicate (concept, propositional function) has a range of significance seriously, then it seems reasonable to demand that a predicate be chosen that is significant for all elements in d. I am not sure whether it is plausible to insist that it must be possible to interpret ‘P’ by some predicate that is not significant for some objects in the domain of interpretation. Indeed, the type-theoretic defense can be seen as a special case of this. After all, Russell’s theory diverges from Gödel’s only insofar as a further condition is imposed on the ranges of significance, namely, that they form types. (Recall our discussion in Sect. 4.) Hence we could replace the above semantic principle by the following one:
For any domain of interpretation d and any predicate \(\varphi (z)\) that is significant for all objects in d, it should be possible to specify an interpretation such that for all objects \(x\in d\), a predicate letter ‘P’ applies to x if and only if \(\varphi (x/z)\).
This demand can be met in the theory of classes presented in this paper. For if \(\varphi (z)\) is a predicate and d is a class such that all members of d are in the range of (the class determined by) \(\varphi (z)\), then for all \(x\in d\) we will have \(x\in \{z\in d\mid \varphi (z)\}\) if and only if \(\varphi (x/z)\). Hence we can assign the class \(\{z \in d\mid \varphi (z)\}\) as extension to the predicate letter ‘P’.
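A sketch of why the biconditional holds for every \(x\in d\). It assumes that \(\{z\in d\mid \varphi (z)\}\) abbreviates \(\{z\mid z\in d\wedge \varphi (z)\}\), that singular points are expressed by \(\lnot xRy\), and that the Axioms of Membership, Connectives and Class Comprehension have the conditional forms used elsewhere in the text:

\[
\begin{aligned}
&x\in d && \text{(assumption)}\\
&xRd && \text{(contrapositive of Singularity)}\\
&xR\{z\mid z\in d\} && \text{(Membership)}\\
&xR\{z\mid \varphi (z)\} && \text{(hypothesis on } d\text{)}\\
&xR\{z\mid z\in d\wedge \varphi (z)\} && \text{(Connectives)}\\
&x\in \{z\mid z\in d\wedge \varphi (z)\}\leftrightarrow x\in d\wedge \varphi (x/z) && \text{(Class Comprehension)}\\
&x\in \{z\in d\mid \varphi (z)\}\leftrightarrow \varphi (x/z) && \text{(from the first and sixth lines)}
\end{aligned}
\]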
How severe is the restriction imposed by the suggested principle? One may argue about this, but I do not think that it is too severe. Notice, for instance, that whenever the universe d contains only urelements, then any predicate in the language can be assigned as interpretation to ‘P’. This still holds true if d contains, in addition, total classes. Only if d contains classes that are non-total are we not free to choose arbitrary predicates. (For example, if d contains the Russell class r, then we cannot interpret ‘P’ by the Russell predicate \(u\notin u\).) However, we can still assign to ‘P’ any predicate that is total, such as the predicate \(x=x\).
7 Conclusion
I have listed a number of desiderata for a type-free theory of classes. Let us now see how the theory proposed in this paper fares with respect to them. The main function of class talk is that it enables us to generalize on predicate places in our language. Second- and higher-order quantifiers provide a means to do so directly. Our class theory can be used for the same purpose. For example, if our class theory is applied to set theory, then we can express the axioms of separation and replacement by single sentences. In addition, our theory allows us to generalize predicate places that cannot be generalized in type theory.
Another possible application for a type-free theory of classes is to serve as a foundation of category theory. In order to be applied like this, we need at least be able to form the class of all sets and the class of all functions between sets, as well as the power class of the class of all sets and the class of all functions between these classes (see Muller 2001). This is possible if our class theory is combined with set theory and the Axiom of Urelements is iterated in the way described. How successful such a class-theoretic foundation of category theory is from a philosophical point of view is, of course, a difficult question which demands further discussion.
The problem of unrestricted quantification was cited as a main motive for a type-free theory of classes. In the previous section, I have argued that the idea of limited ranges of significance provides a response to one of the main objections against a universal domain, namely, the problem of separation.
What about the reduction of properties (universals) to classes and the corresponding analysis of natural language semantics? Obviously, this cannot be answered unless we are given a formal theory of properties (universals). But I think that the prospects here are not too bad either (assuming, of course, that one can deal effectively with the problem that classes seem to be more coarse-grained than properties, perhaps by following David Lewis’ strategy). For instance, consider the property of being a property, which applies to all properties including itself. This could be modelled by the class of all classes. There are good reasons to believe that properties are closed under the algebraic operations corresponding to the logical operations of negation, conjunction, etc. These operations can be performed on classes as well. Moreover, consider again the inference from

1. Everything has the property of being self-identical
to
2. Socrates has the property of being self-identical
and the inference from (1) to
3. The property of being red has the property of being self-identical
In our theory, both inferences can be carried out if talk of properties is appropriately replaced by talk of classes.
Finally, we have seen that the pure theory of classes allows for an interpretation of the second-order Dedekind–Peano axioms of arithmetic (i.e., \({\mathsf {Z}}_2\)). If the theory is extended with an appropriate axiom for forming total power classes of total classes, it is even possible to interpret \({\mathsf {Z}}_\omega\), that is, the union of nth-order arithmetic for every \(n\in \omega\), which is roughly equivalent to Zermelo set theory (Zermelo–Fraenkel set theory without replacement and foundation). What this means for the philosophy of mathematics is an altogether different question. I have indicated that the naïve concept of class is acquired without significant epistemological presupposition, and therefore might be used in a project similar to the Neo-Fregean one. But the paradoxes force us to regiment the notion of class, and whether the regimentation proposed here preserves the epistemological status of the naïve notion is clearly in need of further discussion. This, however, is left for another occasion.
Notes
A propositional function is a function that yields a proposition when given an argument, and one might think of them as being abstracted from propositions which are primarily given. In particular, a propositional function is not to be confused with the predicate (i.e., formula with one free variable) expressing it.
Two important remarks are in order. First, while Frege and Russell were logicists, Peano was not (despite frequent claims to the contrary). See Kennedy (1963). Second, there are important differences between Frege’s and Peano–Russell’s notion of class. For example, while for Russell classes are “composed of terms”, for Frege the elements of a class do not seem to be constitutive of it. See Lavine (1994, pp. 63–64). Our own account will remain neutral between the two.
See Russell’s (1973) discussion of how to divide an infinity of boots into two classes.
Some philosophers have objected to interpreting second-order variables as ranging over classes (often understood as one additional layer of sets) because set theory is supposed to be a theory about all set-like entities, e.g., Boolos (1984). But classes, in the logical sense, need not be understood as set-like entities (collections) at all (as remarked in footnote 2, Frege did not think of extensions as being constituted by their members). Moreover, if type-free classes are collections, then set theory is certainly not about them.
Of course, one may deny that properties are reducible to classes. But given the formal similarities between classes and properties mentioned in Sect. 2, advances in formal theories of classes and the analysis of the class-theoretic paradoxes can guide us in developing formal theories of properties.
See Crocco (2006) for more on Gödel’s account of concepts.
This should not be understood as a judgement in favour of a classical solution. A non-classical solution may be preferable if one could come up with a natural system that has sufficient proof-theoretic strength. My own attempts in this direction have failed so far.
Indeed, it seems natural to strengthen the Axioms of Negation, Connectives, Membership and Identity to a biconditional. This is consistent with the other axioms.
A proof of that result (i.e., that Robinson arithmetic is interpretable in adjunctive set theory) can be found in Visser (2009), who also gives a short history of the result.
It would therefore be more appropriate to call the Axiom of Circularity ‘the Axiom of Non-Well-Foundedness’. The reason I did not choose this name is twofold. First, the name ‘Axiom of Non-Well-Foundedness’ could easily lead to a confusion with the Axiom of Anti-Foundation in non-well-founded set theories. Second, non-well-foundedness can be seen as some form of unfolding of circularity.
It would perhaps be more natural if we also introduced a predicate, C, applying to all classes, and reformulated all class axioms slightly, so that e.g. the Axiom of Urelements becomes \(\forall x\, (Ux\rightarrow \forall y\,(Cy\rightarrow xRy))\). I leave this as an exercise for the reader.
An alternative approach is formulated in Linnebo (2006). The article describes a process whereby more and more properties are “individuated”. At any stage of this process, the semantic principle stated above is validated because \(\varphi (x)\) is understood in accordance with the notion of property application that has been constructed by this stage. What cannot be done is to complete this process of property individuation and then apply the semantic principle. But the article attempts (whether successfully or not) to view this process as incompletable.
References
Allen, S. R. (2016). A critical introduction to properties. London: Bloomsbury.
Bealer, G. (1982). Quality and concept. Oxford: Clarendon Press.
Benacerraf, P. (1983). Mathematical truth. In P. Benacerraf & H. Putnam (Eds.), Philosophy of mathematics (2nd ed., pp. 403–420). Cambridge: Cambridge University Press.
Bernays, P. (1983). On platonism in mathematics. In P. Benacerraf & H. Putnam (Eds.), Philosophy of mathematics (2nd ed., pp. 258–271). Cambridge: Cambridge University Press.
Bochenski, J. M. (2002). Formale logik. Freiburg (Breisgau): Alber.
Boolos, G. (1984). To be is to be a value of a variable (or to be some value of some variables). Journal of Philosophy, 81, 430–449.
Burgess, J. P. (2005). Fixing Frege. Princeton: Princeton University Press.
Cantini, A. (2009). Paradoxes, self-reference and truth in the 20th century. In D. Gabbay & J. Woods (Eds.), Handbook of the history of logic (Vol. 5, pp. 875–1013). Amsterdam: Elsevier.
Cartwright, R. (1994). Speaking of everything. Noûs, 28, 1–20.
Chierchia, G., & Turner, R. (1988). Semantics and property theory. Linguistics and Philosophy, 11, 261–302.
Church, A. (1932). A set of postulates for the foundation of logic. Annals of Mathematics, 33, 346–366.
Cocchiarella, N. B. (1986). Frege, Russell and logicism: A logical reconstruction. In L. Haaparanta & J. Hintikka (Eds.), Frege synthesized (pp. 197–252). Dordrecht: Reidel.
Cook, R. T. (2009). New waves on an old beach: Fregean philosophy of mathematics today. In O. Bueno & O. Linnebo (Eds.), New waves in philosophy of mathematics (pp. 13–34). Basingstoke: Palgrave Macmillan.
Crocco, G. (2006). Gödel on concepts. History and Philosophy of Logic, 27, 171–191.
Feferman, S. (1977). Categorical foundations and foundations of category theory. In R. E. Butts & J. Hintikka (Eds.), Logic, foundations of mathematics and computability theory (pp. 149–169). Dordrecht: Reidel.
Feferman, S. (1984). Towards useful type-free theories, I. Journal of Symbolic Logic, 49, 75–111.
Field, H. (2008). Saving truth from paradox. New York: Oxford University Press.
Florio, S., & Leach-Krouse, G. (2017). What Russell should have said to Burali-Forti. The Review of Symbolic Logic. https://doi.org/10.1017/S1755020316000484.
Forster, T. (2001). Church’s set theory with a universal set. In C. A. Anderson & M. Zelëny (Eds.), Logic, meaning and computation: Essays in memory of Alonzo Church. Dordrecht: Kluwer.
Fraenkel, A. A., & Bar-Hillel, Y. (1958). Foundations of set theory. Amsterdam: North Holland.
Frege, G. (1964). Grundgesetze der Arithmetik. Hildesheim: Olms.
Gödel, K. (1983). Russell’s mathematical logic. In P. Benacerraf & H. Putnam (Eds.), Philosophy of mathematics (2nd ed., pp. 447–469). Cambridge: Cambridge University Press.
Hale, B., & Wright, C. (2001). The reason’s proper study. Oxford: Oxford University Press.
Heck, R., Jr. (2011a). Frege’s theorem. Oxford: Oxford University Press.
Heck, R., Jr. (Ed.). (2011b). Julius Caesar and basic law V. In Frege’s theorem (pp. 111–126). Oxford: Oxford University Press.
Kennedy, H. C. (1963). The mathematical philosophy of Giuseppe Peano. Philosophy of Science, 30, 262–266.
Landini, G. (2004). Logicism’s ‘insolubilia’ and their solution by symbolic logic. In G. Link (Ed.), One hundred years of Russell’s paradox (pp. 373–398). Berlin: De Gruyter.
Lavine, S. (1994). Understanding the infinite. Cambridge: Harvard University Press.
Lewis, D. (1986). On the plurality of worlds. Oxford: Blackwell Publishers.
Linnebo, O. (2006). Sets, properties, and unrestricted quantification. In A. Rayo & G. Uzquiano (Eds.), Absolute generality (pp. 149–178). Oxford: Oxford University Press.
Linnebo, O. (2012). Metaontological minimalism. Philosophy Compass, 7, 139–151.
Maddy, P. (1983). Proper classes. Journal of Symbolic Logic, 48, 113–139.
Menzel, C. (1986). A complete type-free ‘second order’ logic and its philosophical foundations. Stanford: Stanford University: Center for the Study of Language and Information.
Menzel, C. (1993). The proper treatment of predication in fine-grained intensional logic. Philosophical Perspectives, 7, 61–87.
Montague, R. (1974). Formal philosophy. New Haven and London: Yale University Press.
Muller, F. A. (2001). Sets, classes and categories. British Journal for the Philosophy of Science, 52, 539–573.
Orilia, F. (1991). Type-free property theory, exemplification, and Russell’s paradox. Notre Dame Journal of Formal Logic, 32, 432–447.
Orilia, F. (1999). Predication, analysis and reference. Bologna: CLUEB.
Parsons, C. (1982). Objects and logic. The Monist, 65, 491–516.
Parsons, C. (Ed.). (1983a). The liar paradox. In Mathematics in philosophy (pp. 221–267). Ithaca, NY: Cornell University Press.
Parsons, C. (Ed.). (1983b). Sets and classes. In Mathematics in philosophy (pp. 209–220). Ithaca, NY: Cornell University Press.
Quine, W. V. O. (1970). Philosophy of logic. Cambridge: Harvard University Press.
Quine, W. V. O. (Ed.). (1980). New foundations for mathematical logic. In From a logical point of view (pp. 80–101). Cambridge, MA: Harvard University Press.
Rayo, A., & Uzquiano, G. (Eds.). (2006). Absolute generality. Oxford: Oxford University Press.
Russell, B. (1903a). On denoting. Mind, 14, 479–493.
Russell, B. (1903b). The principles of mathematics. Abingdon: Routledge and Kegan.
Russell, B. (1908). Mathematical logic as based on the theory of types. American Journal of Mathematics, 30, 222–262.
Russell, B. (1973). On some difficulties in the theory of transfinite numbers and order types. Essays in analysis (pp. 135–164). London: Allen and Unwin.
Schindler, T. (2015). A disquotational theory of truth as strong as \(\text{Z}^{-}_2\). Journal of Philosophical Logic, 44, 395–410.
Schindler, T. (2017). Some notes on truth and comprehension. Journal of Philosophical Logic, 1–31. https://doi.org/10.1007/s1099201794341.
Schulman, M. A. (2008). Set theory for category theory. arXiv:0810.1279v2
Uzquiano, G. (2003). Plural quantification and classes. Philosophia Mathematica, 11, 67–81.
Visser, A. (2009). Cardinal arithmetic in the style of Baron von Münchhausen. Review of Symbolic Logic, 2, 570–589.
Wang, H. (1996). A logical journey. From Gödel to philosophy. Cambridge: MIT Press.
Williamson, T. (2003). Everything. Philosophical Perspectives, 17, 415–465.
Acknowledgements
This work was supported by the research project “Reference patterns of paradox” by the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG) and written while the author held a Junior Research Fellowship by Clare College, University of Cambridge. This paper was presented to audiences in Bristol, Buenos Aires, Cambridge, Munich, New York, and Oxford. I thank the attendees for their feedback, in particular Tim Button, Roy Cook, Hartry Field, Volker Halbach, Leon Horsten, Oystein Linnebo, Chris Menzel, Beau Madison Mount, Graham Priest, Stanislav Speranski, and Rob Trueman. I thank Neil Barton for written comments on an earlier draft of this paper, and an anonymous referee for further helpful comments. Special thanks go to my longtime collaborators Lavinia Picollo and Timo Beringer for many fruitful discussions.
Appendices
Appendix 1: Derivation of the Dedekind–Peano axioms
In this appendix we show that, given suitable definitions, the second-order Dedekind–Peano axioms can be derived within the pure theory of classes. The proof makes heavy use of the following theorem schemata, namely
and
both of which were discussed in Sect. 5.
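For reference, the two schemata, labelled (SOC) and (Out) in the proofs below, can be sketched as follows. The universally quantified form of (SOC) given here is an assumption, chosen to match the way the scheme is applied in the proofs:

\[
\begin{aligned}
&\text{(SOC)} && \forall x\,\bigl(\operatorname{total}(x)\rightarrow (x\in \{u\mid \varphi \}\leftrightarrow \varphi (x/u))\bigr)\\
&\text{(Out)} && \forall x\,\bigl(x\in \{u\mid \varphi \}\rightarrow \varphi (x/u)\bigr)
\end{aligned}
\]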
Definition 7.1
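(Natural numbers.) Let \(0:=\varnothing\) and \(n+1:=S(n)\). Let
\[
\omega :=\bigl\{x\mid \operatorname{total}(x)\wedge \forall y\,\bigl(0\in y\wedge \forall z\,(z\in y\rightarrow S(z)\in y)\rightarrow x\in y\bigr)\bigr\}.
\]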
This is the usual von Neumann definition of finite ordinals; the only difference is that we have smuggled the condition total(x) into the defining predicate. This allows us to prove that \(\omega\) actually contains 0 and is closed under successor, and that induction holds:
Proposition 7.2

1.
\(0\in \omega \wedge \forall x(x\in \omega \rightarrow S(x)\in \omega )\)

2.
\(y\subseteq \omega \wedge 0\in y\wedge \forall z\in \omega \, (z\in y\rightarrow S(z)\in y)\rightarrow w\subseteq y\)

3.
(Induction) \(\varphi (0)\wedge \forall x\in \omega \,(\varphi (x)\rightarrow \varphi (S(x)))\rightarrow \forall x\in \omega \,\varphi (x).\)
Proof
Ad 1. We already know that 0 is total. Since trivially 0 is a member of every inductive class, (SOC) implies \(0\in \omega\). Now let \(x\in \omega\). By (Out), x is total and a member of every inductive class. It is easily seen that S(x) is also a member of every inductive class. Moreover, we know that if x is total, then so is S(x). Thus by (SOC), \(S(x)\in \omega\).
Ad 2. Let \(x\in \omega\); we show that \(x\in y\). By (Out) we know that x is a member of every inductive class. So it suffices to show that y is inductive. Obviously, \(0\in y\). Now let \(z\in y\). Since \(y\subseteq \omega\), also \(z\in \omega\). But then the third condition on y yields \(S(z)\in y\). So y is inductive, and we are done.
Ad 3. Consider \(a:=\{x\mid x\in \omega \wedge \varphi (x)\}\). Note that \(x\in a\) implies \(x\in \omega\) by (Out). So \(a\subseteq \omega\). We will apply induction in the sense of part (2). Since 0 is total, from \(\varphi (0)\) (and the fact that \(0\in \omega\)) we conclude \(0\in a\) by (SOC). Now let \(y\in \omega \wedge y\in a\). By (Out), y is total and \(\varphi (y)\). By assumption, \(\varphi (S(y))\). Since y is total, S(y) is total too. By (1) and \(y\in \omega\) we conclude \(S(y)\in \omega\). Thus from (SOC) and totality of S(y) we conclude \(S(y)\in a\). So by (2), \(\omega \subseteq a\). So by (Out), every member of \(\omega\) satisfies \(\varphi\). \(\square\)
The following propositions establish the transitivity and irreflexivity of the natural numbers. Here, x is transitive if and only if for all \(y\in x\), we have \(y\subseteq x\).
Proposition 7.3

1.
\(\forall x\in \omega \,(x\text{ is transitive})\)

2.
\(\forall x\in \omega \,(x\notin x).\)
Proof
Ad 1. By induction. 0 (which is the empty set) is obviously transitive. Assume \(n\in \omega\), n transitive. Let \(y\in x\wedge x\in n+1\). By (Out), \(x\in n\vee x=n\). If \(x\in n\) then \(y\in n\) by induction hypothesis (IH). If \(x=n\) then also \(y\in n\). Thus \(y\in n\). By the Axiom of Singularity, y is in the range of significance of n and therefore y is in the range of significance of \(\{z\mid z\in n\}\) by the Axiom of Membership. By the Axioms of Identity and Connectives, y is also in the range of \(\{z\mid z\in n\vee z=n\}=: n+1\). Then \(y\in n\) implies \(y\in n+1\).
Ad 2. By induction. Obvious for \(x=0\). Let \(n\in \omega\) with \(n\notin n\). Assume \(n+1\in n+1\). Then \(n+1\in n\vee n+1=n\) by (Out). Also, by definition of \(n+1\) and totality of n, \(n\in n+1\). But then \(n\in n\), contradicting the IH. \(\square\)
Now we are in a position to prove the successor axioms:
Proposition 7.4

1.
\(\forall x\in \omega \,(0\ne x+1)\)

2.
\(\forall x,y\in \omega \,(x+1=y+1\rightarrow x=y).\)
Proof
Ad 1. One can show that 0 has no elements but \(x+1\) has at least one element. Therefore they cannot be identical by Leibniz’ law.
Ad 2. Assume \(x+1=y+1\). Then \(x\in y+1\) and \(y\in x+1\) (because \(x\in x+1\) and \(y\in y+1\)). But then (applying the definition of successor and using the totality of the natural numbers) \((x\in y\wedge y\in x)\vee x=y\). By Proposition 7.3(1), \(x\in x\vee x=y\). So \(x=y\) by Proposition 7.3(2). \(\square\)
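The properties established in Propositions 7.3 and 7.4 can be checked mechanically for the first few numerals. The following is only a sketch under a strong simplifying assumption: it works with ordinary hereditarily finite sets (Python `frozenset`s) rather than with typefree classes, so totality and ranges of significance are not modelled at all. The helper names `succ`, `numerals` and `is_transitive` are my own and do not come from the paper.

```python
def succ(n: frozenset) -> frozenset:
    """Successor S(n) = {z | z in n or z = n}, i.e. n together with n itself."""
    return n | frozenset([n])


def numerals(count: int) -> list:
    """Return the first `count` von Neumann numerals, starting with 0 (the empty set)."""
    result = [frozenset()]
    while len(result) < count:
        result.append(succ(result[-1]))
    return result[:count]


def is_transitive(x: frozenset) -> bool:
    """x is transitive iff every member of x is also a subset of x."""
    return all(y <= x for y in x)


nums = numerals(6)
print(all(is_transitive(n) for n in nums))        # transitivity, cf. Prop. 7.3(1)
print(all(n not in n for n in nums))              # irreflexivity, cf. Prop. 7.3(2)
print(all(succ(n) != frozenset() for n in nums))  # 0 is not a successor, cf. Prop. 7.4(1)
print(len({succ(n) for n in nums}) == len(nums))  # successor is injective on this sample
```

Such a finite check is of course no substitute for the inductive proofs; it merely illustrates what the propositions assert.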
Finally, we show that comprehension holds.
Corollary 7.5
(Comprehension) For any \(\varphi\) which does not contain y free, we have \(\exists y\subseteq \omega \,\forall x\in \omega \,(x\in y\leftrightarrow \varphi )\).
Proof
Let \(\varphi\) be given. Let \(x\in \omega\). If \(x\in \{u\mid u\in \omega \wedge \varphi \}\) then \(\varphi (x/u)\) by (Out) and Conjunction Elimination. Conversely, if \(\varphi (x/u)\) holds, then since \(x\in \omega\) (which implies totality of x) we get \(x\in \{u\mid u\in \omega \wedge \varphi \}\) by (SOC). Therefore, for all \(x\in \omega\), \(x\in \{u\mid u\in \omega \wedge \varphi \}\leftrightarrow \varphi (x/u)\). Clearly, \(\{u\mid u\in \omega \wedge \varphi \}\subseteq \omega\). \(\square\)
Notice that if \(x, y\in \omega\) then the ordered pair (x, y) is total. (See our discussion in the middle of Sect. 5.) A slight modification of the above proof shows that we can also have comprehension for binary relations (indeed, n-ary relations) over natural numbers. Hence, Dedekind’s famous result implies that our theory of classes interprets full second-order arithmetic.
Appendix 2: Consistency proof
We will work in Zermelo–Fraenkel set theory with urelements. The use of the urelements can be eliminated but is assumed here for technical convenience. Before starting with the formal construction of the model, let me sketch the underlying idea (without laying claim to completeness or accuracy).
The domain of our model will consist of all sets of rank \(\leqslant \omega\) that can be constructed in the cumulative hierarchy starting with a countable set of urelements. Hence, the set of natural numbers and each of its subsets live within the model. There are four subsets of the domain that will be relevant: (1) the urelements; (2) the finite sets; (3) the cofinite sets; (4) the infinite sets that are not cofinite. The objects in categories (1)–(3) will represent the total classes; the sets in category (4) will represent the non-total or proper classes. The range of significance of a total class will consist of the entire domain; the range of a proper class will consist of the set of total classes. In other words, the singularities of a proper class will comprise all and only the proper classes. Therefore, the Axiom of Circularity will be true in this model. We will associate (identify) each set in category (3) with exactly one urelement. Self-membership is then achieved by stipulating that a class is contained in itself iff it contains the urelement associated with it.
Now let us consider the Axiom of Class Comprehension. This axiom states that, for every predicate \(\varphi (x)\), the class associated with that predicate comprises exactly those objects that are in the range of \(\varphi (x)\) and satisfy \(\varphi (x)\). In particular, given how we have defined the ranges of significance in our model, if \(\varphi (x)\) is not total, then the class associated with it must comprise exactly the total classes satisfying \(\varphi (x)\). But any collection of total classes can be represented by a collection of finite sets, because every cofinite set is represented by an urelement. Since such a collection has rank \(\leqslant \omega\), it lives within our model, and we will use it to validate the Axiom of Class Comprehension.
Moreover, it is easy to see that, in this model, the total classes are closed under all relevant algebraic operations postulated by the axioms: for example, they are closed under complementation because the complement of a finite set is a cofinite set and vice versa; they are closed under singletons because the singleton of a finite set is finite, and the singleton of a cofinite set will be represented by the singleton of the urelement associated with it, which is a finite set as well.
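The closure claims of this sketch can be made concrete with a small data structure. What follows is an illustration under assumptions of my own, not part of the formal construction: a finite or cofinite subset of some fixed infinite universe is stored as a flag plus a finite set of items, and the class name `FinCofin` and its methods are hypothetical. Urelement proxies and the indices t, p are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FinCofin:
    """A finite or cofinite subset of a fixed infinite universe.

    If `cofinite` is False, the set is exactly `items`;
    if `cofinite` is True, the set is the universe minus `items`.
    """
    cofinite: bool
    items: frozenset

    def __contains__(self, x) -> bool:
        return (x in self.items) != self.cofinite

    def complement(self) -> "FinCofin":
        # Complementation swaps "finite" and "cofinite".
        return FinCofin(not self.cofinite, self.items)

    def intersect(self, other: "FinCofin") -> "FinCofin":
        if self.cofinite and other.cofinite:
            # (U - A) & (U - B) = U - (A | B): still cofinite.
            return FinCofin(True, self.items | other.items)
        if self.cofinite:
            # (U - A) & B = B - A: finite.
            return FinCofin(False, other.items - self.items)
        if other.cofinite:
            return FinCofin(False, self.items - other.items)
        return FinCofin(False, self.items & other.items)


def singleton(x) -> FinCofin:
    """The singleton of any object is a finite set."""
    return FinCofin(False, frozenset([x]))


small = FinCofin(False, frozenset({0, 2, 4}))
all_but_zero = FinCofin(True, frozenset({0}))
print(small.intersect(all_but_zero))
print(7 in small.complement())
```

Every operation returns another `FinCofin` value, which is the computational counterpart of the observation that the finite and cofinite sets are closed under complement, intersection and singleton formation.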
In this short sketch of the main idea, I have swept several problems under the rug, and we need to make certain adjustments in order to deal with them. First, according to the Axiom of Extensionality, there may be coextensional classes that are not identical because they do not have the same range of significance. In order to deal with this problem, we will have to add copies of the finite and cofinite sets to our model and declare them to be non-total. Secondly, we need to allow proper classes to be elements of total classes. However, proper classes can have rank \(\omega\), and there are no objects in our domain of rank \(\omega +1\). For cardinality reasons, it is not possible to introduce an urelement for each proper class that could serve as its proxy. Fixing this problem makes the proof more complicated.
We will now start with the formal construction. Let U be a countably infinite set of urelements. Let t, p be two urelements not contained in U. Let \(U_T=\{(u, t)\mid u\in U\}\). The objects in \(U_T\) will be used to model membership between classes of equal rank. Let \(V_\omega [U_T]\) be the smallest set X such that
(\(\alpha\)) \(U_T\subseteq X\), and
(\(\beta\)) whenever \(x_1, \ldots , x_n\in X\) then \((\{x_1, \ldots , x_n\}, t)\in X\).
The collection of total classes, T, consists of all x such that

1.
\(x\in V_\omega [U_T]\), or

2.
\(\exists y_1,\ldots , y_n\in V_\omega [U_T]\) such that
$$x=(V_\omega [U_T]\setminus \{y_1, \ldots , y_n\}, t)$$
Note that the objects satisfying (1) have finite rank and that the ones satisfying (2) are such that their first components are cofinite in \(V_\omega [U_T]\).
The collection of proper (i.e., non-total) classes, P, consists of all x such that

3.
\(x=(y, p)\) for some \(y\subseteq V_\omega [U_T]\)
The domain, D, of our model consists of \(T\cup P\). Notice that, as I said above, P contains a ‘copy’ of each finite and cofinite set.
Given an object \(x=(a, i)\in D\) with \(i\in \{t,p\}\), we call a the set-component of x and i the index of x. If the set-component of x is an urelement, we call (by abuse of language) x itself an urelement. We let \(\sigma _1((a, i))=a\). Given \(x, y\in D\), we write \(x\in _1 y\) if and only if x is an element of the set-component of y, that is if and only if \(x\in \sigma _1(y)\). Observe that \(x\in _1 y\) can only obtain if x has finite rank.
The ranges of significance are defined as follows. If x is total, i.e., if \(x\in T\), then the range of x consists of the entire domain, D, that is, yRx iff \(y\in D\). If x is not total, i.e., if \(x\in P\), then the range of x consists of the set of total classes, T, that is, yRx iff \(y\in T\). So there are only two different kinds of ranges. This definition ensures that the Axiom of Circularity is true on our model:
Proposition 7.6
\(\forall x\in D\,(\exists y\in D\, \lnot \, xRy\rightarrow \exists z\in D\, \lnot \, zRx).\)
Proof
Let \(x\in D\) and assume that there is a \(y\in D\) such that \(\lnot \, xRy\). By definition of R, this implies that \(x\in P\) (for if x were in T, x would be R-related to every \(y\in D\)). But if \(x\in P\), then x is R-related to all and only the objects in T. Since P is nonempty, there must be a \(z\in P\) such that \(\lnot \, zRx\). \(\square\)
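The argument only uses the two-valued shape of the ranges, so it can be replayed by brute force on a tiny stand-in for the domain. The sketch below assumes a hypothetical four-element domain with two objects indexed t and two indexed p; the real domain D is infinite, and the names `in_range` and `circularity_holds` are mine.

```python
# Toy domain: objects are (name, index) pairs, index "t" (total) or "p" (proper).
T = {("a", "t"), ("b", "t")}
P = {("c", "p"), ("d", "p")}
D = T | P


def in_range(y, x) -> bool:
    """yRx: y lies in the range of significance of x.

    The range of a total x is all of D; the range of a proper x is T.
    """
    return y in D if x in T else y in T


def circularity_holds() -> bool:
    """Check: for all x, (exists y, not xRy) implies (exists z, not zRx)."""
    for x in D:
        outside_some_range = any(not in_range(x, y) for y in D)
        has_singularity = any(not in_range(z, x) for z in D)
        if outside_some_range and not has_singularity:
            return False
    return True


print(circularity_holds())  # prints True
```

The total objects never trigger the antecedent, and each proper object has the other proper objects (and itself) as singularities, exactly as in the proof.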
Let \(D^*\) be the set of all total classes with infinite rank (i.e., the cofinite sets) and let f be a bijection between \(D^*\) and \(U_T\). We define a relation on our domain D that will serve as our interpretation for identity between classes. Let \(x\equiv y\) iff \(x=y\) or \(x=f(y)\) or \(y=f(x)\). That is, two objects are equivalent iff they are either identical or one is the urelement associated with the other. (Notice that the last two cases can only occur when both objects are total and one of them is cofinite.) Clearly, \(\equiv\) is an equivalence relation.
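That \(\equiv\) is an equivalence relation depends on two features of f: it is injective, and no value of f is itself an argument of f (urelements are not cofinite sets). A small brute-force check makes this visible. The object names and the dictionary standing in for f are hypothetical placeholders, and the domain is a finite toy rather than D.

```python
from itertools import product


def make_equiv(f: dict):
    """Build x ~ y iff x = y, or x = f(y), or y = f(x), for a partial map f."""
    def equiv(x, y) -> bool:
        return x == y or f.get(y) == x or f.get(x) == y
    return equiv


def is_equivalence(rel, domain) -> bool:
    """Brute-force test of reflexivity, symmetry and transitivity on `domain`."""
    reflexive = all(rel(x, x) for x in domain)
    symmetric = all(rel(y, x) for x, y in product(domain, repeat=2) if rel(x, y))
    transitive = all(
        rel(x, z)
        for x, y, z in product(domain, repeat=3)
        if rel(x, y) and rel(y, z)
    )
    return reflexive and symmetric and transitive


objects = ["cof1", "cof2", "u1", "u2", "fin1", "prop1"]
f_good = {"cof1": "u1", "cof2": "u2"}  # injective, values are not keys
f_bad = {"cof1": "u1", "cof2": "u1"}   # not injective

print(is_equivalence(make_equiv(f_good), objects))  # prints True
print(is_equivalence(make_equiv(f_bad), objects))   # prints False
```

With the non-injective map, the two "cofinite" objects are each equivalent to the shared urelement but not to one another, so transitivity fails; this is why f is required to be a bijection.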
Next we define a binary relation E on D that will serve as our interpretation for class membership. We will make sure that x and f(x) turn out to be coextensional in the sense of E (whenever x is total). We will achieve self-membership by stipulating that xEx iff \(f(x)\in _1 x\), that is, if the set-component of x contains the urelement associated with x. Note that, according to our axioms, while proper classes may be self-membered, only total classes are forced to be. We will produce a ‘minimal’ model in which only total classes will be self-membered. Indeed, we will produce a model in which no proper class contains any other proper class. (Again, while our axioms allow that proper classes contain proper classes, they don’t force it.) E is defined as follows.
If \(x, y\in P\) then \(\lnot \, xEy\).
If \(x\in P\) and \(y\in T\setminus (D^*\cup \, U_T)\) then \(\lnot \, xEy\).
If \(x\in P\) and \(y\in D^*\) then xEy and xEf(y).
If \(x \in T\) then xEy if and only if
$$x\in _1 y\ \text{ or }\ x\in _1 f^{-1}(y)\ \text{ or }\ f(x)\in _1 y\ \text{ or }\ f(x)\in _1 f^{-1}(y).$$
In words: in our model, no proper class is E-contained in any other proper class. A proper class is never contained in any finite total class. Every proper class is E-contained in any infinite total class and in their corresponding urelements. Finally, if x is total, then x is an E-member of y iff the set-component of x (or its corresponding urelement, if it exists) is an \(\in\)-member of the set-component of y (or of the (infinite) set corresponding to y, if y is an urelement).
Now it is easily seen that the Axiom of Singularity holds in our model.
Proposition 7.7
\(\forall x, y\in D\, (\lnot \, xRy\rightarrow \lnot \, xEy).\)
Proof
If x is not in the range of y then y is a proper class. But since x is not contained in some range, it must be a proper class too. However, if \(x,y\in P\) then \(\lnot \, xEy\) by definition of E.\(\square\)
Moreover, the definitions so far ensure that the Axiom of Extensionality is true in our model.
Proposition 7.8
\(\forall x, y\in D\,(R(x)=R(y)\wedge \forall z\in D\, (zE x\leftrightarrow zEy)\rightarrow x\equiv y).\)
Proof
Assume that the antecedent holds. Then either both x, y are total or both are proper. There are several cases to distinguish.
Assume first that neither x nor y is an urelement. It is sufficient to show that the set-components of x, y are identical, i.e., that \(\forall z\, (z\in _1 x\leftrightarrow z\in _1 y)\). Let \(z\in _1 x\). Clearly then, z has finite rank. Hence \(z\in V_\omega [U_T]\). So, z is total. Thus we can deduce zEx. By assumption, zEy as well. Since y is not an urelement, this means that \(z\in _1 y\) or \(f(z)\in _1 y\). The latter cannot obtain because f(z) is undefined as z is either an urelement or a total class of finite rank. Thus \(z\in _1 y\). The argument for the other direction is completely symmetrical.
If x is an urelement but y is not, one can run a similar argument to show that \(x=f(y)\). If both x, y are urelements, one can run a similar argument to show that \(f^{-1}(x)=f^{-1}(y)\). \(\square\)
The following proposition states that two objects that are treated as equal are Econtained in the same objects.
Proposition 7.9
\(\forall x, y, z\in D(x\equiv y\wedge xEz\rightarrow yEz).\)
Proof
If \(x=y\) this is trivial. So assume w.l.o.g. that \(x=f(y)\), which means that x is an urelement and both x, y are total. The definition of xEz implies \(x\in _1 z\) or \(x\in _1 f^{-1}(z)\) or \(f(x)\in _1 z\) or \(f(x)\in _1 f^{-1}(z)\). The third and fourth disjunct cannot obtain since x is an urelement and so f(x) is not defined. Hence (since \(x=f(y)\)) we have \(f(y)\in _1 z\) or \(f(y)\in _1 f^{-1}(z)\). By definition of E, yEz. \(\square\)
Let \({\mathcal {L}}\) be the language of our class theory. In order to distinguish the membership symbol of \({\mathcal {L}}\) from that of our metatheory (ZF), we will denote the former by \(\varepsilon\). If \(\varphi\) is an \({\mathcal {L}}\)-formula without any class abstracts, let \(\varphi ^*\) be obtained from \(\varphi\) by replacing all occurrences of \(x\, \varepsilon \, y\) by xEy, all occurrences of \(x=y\) by \(x\equiv y\), and all occurrences of \(\forall x\,\psi\) by \(\forall x\in D\,\psi ^*\).
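The translation \(\varphi \mapsto \varphi ^*\) is a straightforward recursion on the structure of the formula. As a rough sketch, it can be written as a function on a small syntax tree. The constructor names (`Mem`, `Eq`, `Not`, `And`, `Forall`) and the ASCII rendering, in which `~` stands for \(\equiv\), are choices of mine for illustration; only negation and conjunction are included as connectives, and class abstracts are excluded.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Mem:          # object-language membership: left eps right
    left: str
    right: str


@dataclass(frozen=True)
class Eq:           # object-language identity: left = right
    left: str
    right: str


@dataclass(frozen=True)
class Not:
    body: "Formula"


@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Forall:
    var: str
    body: "Formula"


Formula = Union[Mem, Eq, Not, And, Forall]


def star(phi: Formula) -> str:
    """Render the translation of phi into the metatheory as a string."""
    if isinstance(phi, Mem):
        return f"{phi.left} E {phi.right}"            # eps becomes E
    if isinstance(phi, Eq):
        return f"{phi.left} ~ {phi.right}"            # = becomes the relation ~
    if isinstance(phi, Not):
        return f"not ({star(phi.body)})"
    if isinstance(phi, And):
        return f"({star(phi.left)} and {star(phi.right)})"
    if isinstance(phi, Forall):
        return f"forall {phi.var} in D: ({star(phi.body)})"  # quantifiers are bounded by D
    raise TypeError(f"not a formula: {phi!r}")


print(star(Forall("x", Not(Eq("x", "x")))))
```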
Let us check that the law of substitutivity holds:
Proposition 7.10
If \(\varphi\) is a formula of \({\mathcal {L}}\) then \(\forall x, y\in D\,(x\equiv y\wedge \varphi ^*(x)\rightarrow \varphi ^*(y)).\)
Proof
By induction on the build-up of \(\varphi ^*\). If \(\varphi ^*(x)\) is of the form \(x\equiv s\) then this follows from the fact that \(\equiv\) is an equivalence relation. If \(\varphi ^*(x)\) is of the form xEs this follows from Proposition 7.9.
Let \(\varphi ^*(x)\) be of the form sEx. The claim is trivial if \(x=y\). So assume w.l.o.g. that \(y=f(x)\), so y is an urelement and both y, x are total (x has infinite rank). If \(s\in P\) then the claim follows immediately from the definition of E. So assume \(s\in T\). Then sEx implies \(s\in _1 x\) or \(s\in _1 f^{-1}(x)\) or \(f(s)\in _1 x\) or \(f(s)\in _1 f^{-1}(x)\). We can exclude the second and fourth disjunct (because if f(x) is defined, \(f^{-1}(x)\) is not defined). Hence \(s\in _1 x\) or \(f(s)\in _1 x\). Since \(x=f^{-1}(y)\), we have \(s\in _1 f^{-1}(y)\) or \(f(s)\in _1 f^{-1}(y)\), hence sEy by definition of E.
If \(\varphi ^*(x)\) is of the form xRs or sRx this follows from the definition of R. The other clauses follow from the induction hypothesis.\(\square\)
In order to finish the specification of our model, we need to interpret the class abstracts. This is done in a two-step construction. In the first step, we define an interpretation for all class abstracts that our axiomatic theory of classes proves to be total. In a second step, we extend this interpretation to cover the remaining class abstracts. For the sake of simplicity, we assume that \({\mathcal {L}}\) contains \(\lnot ,\wedge\) as its only logical connectives.
First step. First, we need to capture the set of class abstracts that our axiomatic theory proves to be total. We’ll call them \(\pi\)-terms. They are defined by the following simultaneous recursive definition:

1.
\(u=u\in \pi\) and \(\{u\mid u=u\}\in \pi\)

2.
If \(s\in \pi\) then \(u=s\in \pi\) and \(\{u\mid u=s\}\in \pi\)

3.
If \(s\in \pi\) then \(u\, \varepsilon \, s\in \pi\) and \(\{u\mid u\, \varepsilon \, s\}\in \pi\)

4.
If \(\varphi \in \pi\) then \(\lnot \varphi \in \pi\) and \(\{u\mid \lnot \varphi \}\in \pi\)

5.
If \(\varphi ,\psi \in \pi\) then \(\varphi \wedge \psi \in \pi\) and \(\{u\mid \varphi \wedge \psi \}\in \pi\)

6.
Nothing else is a \(\pi\)-term.
Now we define a mapping (interpretation) \({}^+\) of the set of \(\pi\)-terms into our domain D. In fact, notice that \({}^+\) maps the \(\pi\)-terms into T.

1.
\(\{u\mid u=u\}^+=(V_\omega [U_T], t)\)

2.
\(\{u\mid u=s\}^+=(\{s^+\}, t)\)

3.
\(\{u\mid u\, \varepsilon \, s\}^+=s^+\)

4.
\(\{u\mid \lnot \varphi \}^+=(V_\omega [U_T]\setminus \sigma _1(\{u\mid \varphi \}^+), t)\)

5.
\(\{u\mid \varphi \wedge \psi \}^+=(\sigma _1(\{u\mid \varphi \}^+)\cap \sigma _1(\{u\mid \psi \}^+), t)\)
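The five clauses amount to a recursive evaluator whose values are always finite or cofinite, which is why \({}^+\) lands in T. The following sketch makes two simplifying assumptions that depart from the formal construction: the index t is dropped, so only set-components are computed, and a cofinite value may occur directly as an element instead of being replaced by its associated urelement. Formula bodies are encoded as nested tuples; this encoding and the name `interpret` are hypothetical.

```python
def interpret(phi):
    """Return the set-component of {u | phi}^+ as a pair (cofinite, items).

    (False, items) stands for the finite set `items`;
    (True, items) stands for the universe minus `items`.
    """
    kind = phi[0]
    if kind == "self_id":   # clause 1: {u | u = u} is the whole universe
        return (True, frozenset())
    if kind == "eq":        # clause 2: {u | u = s} is the singleton of s^+
        return (False, frozenset([interpret(phi[1])]))
    if kind == "mem":       # clause 3: {u | u eps s} is s^+ itself
        return interpret(phi[1])
    if kind == "not":       # clause 4: complement within the universe
        cofinite, items = interpret(phi[1])
        return (not cofinite, items)
    if kind == "and":       # clause 5: intersection of the two set-components
        (c1, a), (c2, b) = interpret(phi[1]), interpret(phi[2])
        if c1 and c2:
            return (True, a | b)
        if c1:
            return (False, b - a)
        if c2:
            return (False, a - b)
        return (False, a & b)
    raise ValueError(f"not a pi-term: {phi!r}")


universe = ("self_id",)
empty = ("not", universe)        # {u | not (u = u)}
one = ("eq", empty)              # {u | u = empty}
print(interpret(empty))
print(interpret(("and", ("not", one), universe)))
```

Since each clause returns either a finite set or the complement of one, no \(\pi\)-term can be interpreted by an infinite set that fails to be cofinite.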
Second step. We extend \({}^+\) to an interpretation that maps all class abstracts of \({\mathcal {L}}\) into D. In order to cope with parameters that may occur in a class abstract, we will augment \({\mathcal {L}}\) with a constant term for each object in D that is not in the range of the function \({}^+\). If a is such an object, we write \(\overline{a}\) for the corresponding constant and let \({\overline{a}}^+=a\).
We recursively expand \({}^+\) as follows. Let \(\varphi (u, y_1, \ldots , y_n)\) be an \({\mathcal {L}}\)-formula with all free variables displayed and containing no class abstracts or individual constants. Let \(s_1, \ldots , s_n\) be a sequence of class terms or individual constants (of lower complexity than \(\varphi\)) and assume that \(s_1^+,\ldots , s_n^+\in D\) are already defined. We define the interpretation for \(\{u\mid \varphi (u, s_1, \ldots , s_n)\}\) as follows. If \(\{u\mid \varphi (u, s_1, \ldots , s_n)\}\) is a \(\pi\)-term, then \(\{u\mid \varphi (u, s_1, \ldots , s_n)\}^+\) is defined as above. Otherwise, we let \(\{u\mid \varphi (u, s_1, \ldots , s_n)\}^+\) be the object
$$(\{u\in V_\omega [U_T]\mid \varphi ^*(u, s_1^+, \ldots , s_n^+)\},p).$$
Note that this is well-defined, i.e., \((\{u\in V_\omega [U_T]\mid \varphi ^*(u, s_1^+, \ldots , s_n^+)\},p)\in D\), because the set-component is a subset of \(V_\omega [U_T]\).
Now we are in a position to show that the Axiom of Class Comprehension holds in our model.
Proposition 7.11
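Let \(\varphi (u, y_1, \ldots , y_n)\) be a formula of \({\mathcal {L}}\) and \(s_1, \ldots , s_n\) be a sequence of class abstracts or individual constants. Assume \(dR\{u\mid \varphi (u, s_1, \ldots , s_n)\}^+\). Then
$$dE\{u\mid \varphi (u, s_1, \ldots , s_n)\}^+\leftrightarrow \varphi ^*(d, s_1^+, \ldots , s_n^+).$$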
Proof
Let \(c:=\{u\mid \varphi (u, s_1, \ldots , s_n)\}\). Assume first that c is not a \(\pi\)-term. Hence
$$c^+=(\{u\in V_\omega [U_T]\mid \varphi ^*(u, s_1^+, \ldots , s_n^+)\},p)\in P.$$
Assume first that \(dEc^+\). Since \(c^+\in P\), the definition of E implies \(d\in T\). By definition of E we have \(d\in _1 c^+\) or \(d\in _1 f^{-1}(c^+)\) or \(f(d)\in _1 c^+\) or \(f(d)\in _1 f^{-1}(c^+)\). The second and fourth case can be ruled out because \(f^{-1}(c^+)\) is undefined as \(c^+\) is a proper class. If \(d\in _1 c^+\), the definition of \(c^+\) implies \(\varphi ^*(d, s_1^+, \ldots , s_n^+)\). If \(f(d)\in _1 c^+\), then \(\varphi ^*(f(d), s_1^+, \ldots , s_n^+)\). Since \(d\equiv f(d)\), Proposition 7.10 implies \(\varphi ^*(d, s_1^+, \ldots , s_n^+)\).
For the other direction, assume \(\varphi ^*(d, s_1^+, \ldots , s_n^+)\). Assume w.l.o.g. that d has finite rank, i.e., \(d\in V_\omega [U_T]\) (if not, we can work with f(d) and rely on Proposition 7.10). Then by definition of \(c^+\) we can deduce \(d\in _1 c^+\) and therefore \(dEc^+\).
Now assume that c is a \(\pi\)-term. Hence \(c^+\in T\). We prove the claim by induction on the complexity of a \(\pi\)-term. We only show two cases and leave the rest to the reader.
1. Assume that \(\varphi\) has the form \(u=u\). Hence \(\varphi ^*\) has the form \(u\equiv u\). Of course this holds for any object in D. Moreover, \(c^+=(V_\omega [U_T], t)\). Since the set-component of \(c^+\) is infinite, we have \(dEc^+\) for any \(d\in P\) by definition of E. Moreover, it is easily checked that \(dEc^+\) for any \(d\in T\).
2. Let us check that \(xE\{u\mid \varphi \wedge \psi \}^+\) iff \(\varphi ^*(x)\wedge \psi ^*(x)\). Note that \(\{u\mid \varphi \}, \{u\mid \psi \}\) must be \(\pi\)-terms as well. Assume first that \(x\in P\). (The case \(x\in T\) is proved in a similar way.) Assume \(xE\{u\mid \varphi \wedge \psi \}^+\). Then \(\{u\mid \varphi \wedge \psi \}^+\) must be cofinite and hence both \(\{u\mid \varphi \}^+, \{u\mid \psi \}^+\) must be cofinite as well. Therefore, by definition of E, \(xE\{u\mid \varphi \}^+\) and \(xE\{u\mid \psi \}^+\). By induction hypothesis \(\varphi ^*(x)\wedge \psi ^*(x)\). The other direction follows by a similar argument. \(\square\)
Next, we verify the Axiom of Negation.
Proposition 7.12
\(\forall x\in D\, (xR\{u\mid \varphi \}^+\rightarrow xR\{u\mid \lnot \varphi \}^+).\)
Proof
If \(\{u\mid \varphi \}\) is a \(\pi\)-term, then so is \(\{u\mid \lnot \varphi \}\). Hence both \(\{u\mid \varphi \}^+, \{u\mid \lnot \varphi \}^+\) are in T and therefore have the same range of significance. Similarly, if \(\{u\mid \varphi \}\) is not a \(\pi\)-term, then neither is \(\{u\mid \lnot \varphi \}\), hence both \(\{u\mid \varphi \}^+, \{u\mid \lnot \varphi \}^+\) are in P and therefore have the same range of significance. \(\square\)
Note that the above argument shows that we can strengthen the Axiom of Negation to a biconditional. In a similar manner, one proves the Axioms of Connectives, the Axiom of Identity, and the Axiom of Membership. Finally, we check the Axiom of Self-Identity.
Proposition 7.13
\(\forall x\in D\, xR\{u\mid u=u\}^+.\)
Proof
By definition, \(\{u\mid u=u\}^+=(V_\omega [U_T], t)\in T\). Hence any object in D is in the range of it. \(\square\)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Schindler, T. Classes, why and how. Philos Stud 176, 407–435 (2019). https://doi.org/10.1007/s1109801710222
DOI: https://doi.org/10.1007/s1109801710222
Keywords
 Classes
 Sets
 Second-order arithmetic
 Unrestricted quantification
 Logicism