1 Intuitionistic arithmetic and intuitionistic finite set theory

It has long been known that (classical) Peano arithmetic is, in some strong sense, “equivalent” to the variant of (classical) Zermelo–Fraenkel set theory (including choice) in which the axiom of infinity is replaced by its negation. The intended model of the latter is the set of hereditarily finite sets. The connection between the theories is so tight that they may be taken as notational variants of each other.

Our purpose here is to develop and establish a constructive version of this. We present an intuitionistic theory of the hereditarily finite sets, and show that it is definitionally equivalent to Heyting Arithmetic HA, in a sense to be made precise.Footnote 1

We also include a brief comparison to the classical counterparts of our results. Our main results carry over without modification (although some of them, such as the decidability of identity, are not needed in the classical context). The result is, we believe, an improvement on the present understanding of the classical situation.

Our main target theory, the intuitionistic small set theory SST presented in the next section, is remarkably simple, and intuitive. It has just one non-logical primitive, for membership, and four straightforward axioms. We locate our theory within intuitionistic mathematics generally.

Logical preamble: Unless explicitly indicated otherwise, all reasoning here, concerning both object languages and metalanguages, is strictly intuitionistic and is representable in weak subtheories of the intuitionistic set theory IZF (Myhill’s Constructive Set Theory \(\textsf{CST}\) [24], Aczel’s Constructive Zermelo–Fraenkel Set Theory [4], Beeson [9], 162–166, McCarty [18], 54–62, Aczel and Rathjen [6, 7], and, indeed, in our own SST). Accordingly, \(\vdash \) refers always to formal derivability, in the relevant language, defined by the rules of Heyting’s first-order predicate logic (see Troelstra [36] and Troelstra and van Dalen [37], 36–50).

Definition 1.1

Heyting Arithmetic or HA is the formal system of first-order arithmetic adopted as standard (e.g., Troelstra and van Dalen [37], 126–131) for the formalization of elementary arithmetic in intuitionism. In most of the following sections, HA is assumed to contain, for each primitive recursive number-theoretic function f, a distinguished symbol \(f_\textrm{a}\) and, among its axioms, familiar defining equations for f. HA is complete with respect to primitive recursive functions and basic facts involving them at least in that, for each primitive recursive function \(\lambda x \lambda y. \, f(x, y)\), and natural numbers n and m,

$$\begin{aligned} \text {\textsf {HA} }\vdash f_\textrm{a}(\overline{n}, \overline{m}) \ = \ \overline{f(n, m)}, \end{aligned}$$

where \(\overline{n}\) is the canonical numeral in the language of HA for n.

If one wants to go to subtheories of HA, the foregoing approach of including function symbols for all primitive recursive functions is not necessarily a germane one. It would be quite interesting to study small intuitionistic set theories corresponding to systems of bounded arithmetic. Presumably, many of the results of this paper would apply to them, too. At several places in the paper, the authors will raise interesting questions (many of them due to a referee of this paper) about subsystems of SST and their relationship to ones of HA.

2 The small set theory SST

Definition 2.1

(Intuitionistic Small Set Theory SST) The language of the intuitionistic small set theory SST is the standard first-order language of Zermelo–Fraenkel or ZF set theory, featuring both \(\in \) and \(=\) as primitives. The \(=\) sign is subject to standard logical laws of identity. The nonlogical axioms of SST are these:

  1. Extensionality: \(\forall x \forall y \ (\forall z\ (z \in x \leftrightarrow z \in y) \rightarrow x = y).\)

  2. Empty Set: \(\exists x \forall y\,y\notin x\). Our symbols for the empty set will be the familiar \(\emptyset \) and 0, as context demands.

  3. y-Successor of x: \(\forall x \forall y \exists z \forall u \ (u \in z \leftrightarrow (u \in x \vee u = y)).\) Classical set theorists have called this operation both ‘adjunction’ and ‘adduction’ ([15]). Our unofficial (and eliminable) notation for the y-successor of x is

    $$\begin{aligned} x \cup \{ y \}. \end{aligned}$$

    Please note that, when writing \(x \cup \{ y \}\) we do not presume thereby that an operation of binary union exists over the class of all sets.

  4. Adduction: For any formula \(\phi (x)\) in the language of set theory, possibly featuring set parameters, if \(\phi (0),\) and if

    $$\begin{aligned} \forall x \forall y ((y \notin x \wedge \phi (x) \wedge \phi (y)) \rightarrow \phi (x \cup \{y\})), \end{aligned}$$

    then \(\forall x\,\phi (x).\)

We will occasionally note a variant on this:

Weak Adduction: For any formula \(\phi (x)\) in the language of set theory—featuring perhaps set parameters—if \(\phi (0),\) and if

$$\begin{aligned} \forall x \forall y (\phi (x) \rightarrow \phi (x \cup \{y\})), \end{aligned}$$

then \(\forall x\,\phi (x).\)

We shall soon see that Weak Adduction is indeed weaker than Adduction.

As usual, we take \(x \subseteq y\) as an abbreviation of \(\forall z(z \in x \rightarrow z \in y)\).
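As an informal illustration only (it plays no role in the formal development), the intended model can be sketched in Python by representing hereditarily finite sets as nested `frozenset` values. The names `EMPTY`, `adjoin` and `subset` are assumptions of this sketch, and Python's built-in equality stands in for Extensionality.

```python
# Hereditarily finite sets modelled as nested frozensets (a modelling assumption).
EMPTY = frozenset()

def adjoin(x, y):
    """Return the y-successor of x, written x ∪ {y} in the text."""
    return x | frozenset([y])

def subset(x, y):
    """x ⊆ y, unfolded as: every element of x is an element of y."""
    return all(z in y for z in x)

one = adjoin(EMPTY, EMPTY)   # {0}
two = adjoin(one, one)       # {0, {0}}

# Extensionality is built into frozenset equality: neither the order of
# adjunctions nor a repeated adjunction changes the resulting set.
a = adjoin(adjoin(EMPTY, one), two)
b = adjoin(adjoin(adjoin(EMPTY, two), one), two)
```

In this model every set arises from `EMPTY` by finitely many calls to `adjoin`, which is the informal content of the Adduction axiom.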

Historical Antecedents: Some people call the theory axiomatized by (2) and (3) Adjunctive Set Theory, \(\textsf{AS}\).

  1. In [33], Szmielew and Tarski announce the interpretability of Robinson Arithmetic \(\textsf{Q}\) in \(\textsf{AS}\) plus extensionality (the authors’ (1), (2), (3)). See also [35, p. 34].

  2. A proof of the Szmielew-Tarski result is given by Collins and Halpern in [10].

  3. Montagna and Mancini, in [20], give an improvement of the Szmielew-Tarski result. They prove that \(\textsf{Q}\) can be interpreted in an extension of \(\textsf{AS}\) in which one stipulates the functionality of empty set and adjoining of singletons.

  4. In Appendix III of [23], Mycielski, Pudlák and Stern provide the ingredients of the interpretation of \(\textsf{Q}\) in \(\textsf{AS}\). See also [22].

  5. A new proof of the interpretability of Q in AS is given in [40] by Visser. A very nice presentation of the converse interpretability of (an extension of) \(\textsf{AS}\) plus extensionality in \(\textsf{Q}\), is given by Nelson in [25]. This is an interpretation with absolute identity.

  6. Damnjanovic [11] shows that this theory (as well as our axioms (1), (2), and (3)) is mutually interpretable with Q.Footnote 2

  7. Using strictly intuitionistic logic, Previale [28] has given a set of axioms that extends those of the present paper by including primitive notions for y-predecessor of x in addition to y-successor of x, and explicit axioms governing the transitive closure of a set.

  8. Jeon [13] examines the bi-interpretability of subtheories of finitary CZF and Heyting Arithmetic.

Proposition 2.2

\(\forall x \forall z ( x \subseteq z \rightarrow z \notin x).\)

Proof

By Adduction. Let \(\phi (x)\) be \(\forall z ( x \subseteq z \rightarrow z \notin x).\) The base case, where \(x=\emptyset \), is immediate.

Induction hypothesis: \(y \notin x \ \wedge \ \phi (x) \ \wedge \ \phi (y)\). We need to show \(\phi (x\cup \{y\})\). So, suppose (i) \( x\cup \{y\} \subseteq z.\) We need to show \(z \notin x\cup \{y\}\). We therefore suppose that \(z \in x\cup \{y\}\). Then, (ii) either \(z \in x \text { or }z=y.\) But we have \(x \subseteq x\cup \{y\}\) and \(x\cup \{y\} \subseteq z\), so \(x \subseteq z\). By the induction hypothesis \(\phi (x)\), \(z \notin x\). Hence, \(z=y\) (disjunctive syllogism on (ii)).

Then, (i) is \(x\cup \{y\} \subseteq y\). This implies that \(y \in y\). However, that contradicts the induction hypothesis \(\phi (y)\), since \(y \subseteq y\). \(\square \)

Corollary 2.3

\(\forall x\,x \notin x\).

Remark 2.4

Weak Adduction wouldn’t suffice to prove \(\forall x\,x \notin x\). This can be seen as follows. In the set-theoretic world of hereditarily finite sets, HF, take the collection of all finite pointed accessible directed graphs (apgs for short) (see Aczel [5, p. 4]), and identify two such apgs if they are bisimilar. This gives rise to a model of ZFC bereft of the axioms of Foundation and Infinity (see Aczel [5, Sect. 3]). In this model Weak Adduction holds since any set can be obtained from the empty set by applying the operation \(x\mapsto x\cup \{y\}\) finitely many times. However, this is also a model of the Anti-Foundation axiom, refuting \(\forall x\,x \notin x\).

3 T-sets

In order to single out the naturals in SST, we introduce the notion of T-set.

Definition 3.1

(Predecessor, T-set)

  1. We say that a set x has a predecessor if and only if \(\exists y \in x\;x = y \cup \{y\}\).

  2. A set x is a T-set if and only if

    (a) x is a transitive set of transitive sets, and

    (b) either \(x = \emptyset \) or x and all of its non-zero elements have predecessors.

    When x is a T-set, we write \(\textrm{T}(x)\).
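For readers who like to experiment, the definition of T-set can be transcribed into a small Python check over nested `frozenset` values. This is a sketch in the intended model of hereditarily finite sets, not part of the formal theory; the helper names `succ`, `numeral`, `is_transitive`, `has_predecessor` and `is_T_set` are assumptions of the sketch.

```python
EMPTY = frozenset()

def succ(x):
    """x ∪ {x}."""
    return x | frozenset([x])

def numeral(n):
    """The n-fold successor of the empty set (hypothetical helper)."""
    x = EMPTY
    for _ in range(n):
        x = succ(x)
    return x

def is_transitive(x):
    return all(z in x for y in x for z in y)

def has_predecessor(x):
    """Some y in x satisfies x = y ∪ {y}."""
    return any(x == succ(y) for y in x)

def is_T_set(x):
    # Clause (a): x is a transitive set of transitive sets.
    clause_a = is_transitive(x) and all(is_transitive(y) for y in x)
    # Clause (b): x is empty, or x and its non-zero elements have predecessors.
    clause_b = x == EMPTY or (
        has_predecessor(x)
        and all(has_predecessor(y) for y in x if y != EMPTY)
    )
    return clause_a and clause_b
```

For example, `is_T_set(numeral(3))` holds, whereas `frozenset([numeral(1)])`, that is \(\{\{0\}\}\), fails clause (a).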

Proposition 3.2

  1. \(\textrm{T}(\emptyset )\).

  2. Whenever \(\textrm{T}(x)\), \(\textrm{T}(x \cup \{x\})\) as well.

  3. The class of all T-sets is itself inductive: for all formulae \(\phi \), if \(\phi (0)\) and if \(\phi (x \cup \{x\}),\text { whenever }\textrm{T}(x) \text { and } \phi (x),\) then, \(\text {for every T-set }x, \ \phi (x).\)

  4. For all T-sets x, either \(x = 0\), \(x = \{ 0 \}\), or \(\{ 0 \} \in x\).

Proof

Items 1 and 2 are straightforward from the definition.

For item 3: we prove that the class of all T-sets is inductive. First, we assume that \(\phi (\emptyset )\) and that \(\phi (x \cup \{x\})\) whenever \(\textrm{T}(x)\) and \(\phi (x)\). The goal is to prove, by induction in SST, that, for all sets x, \(\psi (x)\), wherein \(\psi (x)\) is the following conjunctive expression.

$$\begin{aligned} \forall y \in x (\textrm{T}(y) \rightarrow \phi (y)) \wedge (\textrm{T}(x) \rightarrow \phi (x)). \end{aligned}$$

Clearly, once we prove \(\forall x\,\psi (x)\), we are home and dry.

The base case \(\psi (\emptyset )\) is immediate. So we make the relevant additional assumptions for adduction in SST, briefly,

$$\begin{aligned} \psi (a),\text { and }\psi (b). \end{aligned}$$

In other words, we are assuming that

$$\begin{aligned}&\forall y \in a(\textrm{T}(y) \rightarrow \phi (y)) \wedge (\textrm{T}(a) \rightarrow \phi (a)), \text { plus } \\&\forall y \in b(\textrm{T}(y) \rightarrow \phi (y)) \wedge (\textrm{T}(b) \rightarrow \phi (b)). \end{aligned}$$

The goal is to prove from all these that

$$\begin{aligned} \psi (a \cup \{ b \}), \end{aligned}$$

that is,

  1. \(\forall y \in (a \cup \{ b \}) (\textrm{T}(y) \rightarrow \phi (y))\), and

  2. \(\textrm{T}(a \cup \{ b \} ) \rightarrow \phi (a \cup \{ b \})\).

Ad (1): When \(y \in (a \cup \{ b \})\), either \(y \in a\) or \(y = b\). In either case, the assumptions show that

$$\begin{aligned} \textrm{T}(y) \rightarrow \phi (y). \end{aligned}$$

Ad (2): To show that \(\textrm{T}(a \cup \{ b \} ) \rightarrow \phi (a \cup \{ b \})\), assume that \(\textrm{T}(a \cup \{ b \} )\). By definition of T-set, it follows that \(a \cup \{ b \}\) has a predecessor z such that

$$\begin{aligned}&z \in (a \cup \{ b \}), \\&\textrm{T}(z), \text { and } \\&a \cup \{ b \} = z \cup \{ z \}. \end{aligned}$$

By the proof of (1), immediately above,

$$\begin{aligned} \phi (z). \end{aligned}$$

From the original inductive hypothesis at the start of the proof, we know that

$$\begin{aligned} \phi (z \cup \{ z \}). \end{aligned}$$

Therefore,

$$\begin{aligned} \phi (a \cup \{ b \}), \end{aligned}$$

and we are finished with (2) and with the proof of induction over the T-sets.

Item 4 follows immediately from 3. \(\square \)

4 Internal natural numbers

The class of T-sets is our choice for canonical natural numbers internal to SST. Between T-sets, the relation of membership \(\in \) will serve, within the set theory, as the “less than” relation < between natural numbers internal to SST. Our ultimate goals with regard to HA are to

  1. Develop a recognizable theory of hereditarily finite and decidable sets within SST,

  2. Demonstrate that, within the latter theory, Heyting Arithmetic HA is soundly interpreted,

  3. Show that SST is soundly interpreted within HA, and

  4. Show that SST and HA are definitionally equivalent (in the sense of Sect. 12).

We prove all this intuitionistically.

Definition 4.1

(Class of natural numbers) Let \(\mathbb {N}\) be the SST-internal class of T-sets. Note that \(\mathbb {N}\) is a proper class in SST, and not a set.

For \(x \in \mathbb {N}\), Sx—the natural number successor of x—will stand for \(x \cup \{x\}\). As Kirby [15] has emphasized, successor Sx on the internal natural numbers is the diagonal of the ‘y-successor of x’ operation.

Induction on \(\mathbb {N}\) can now be formulated

$$\begin{aligned}{}[\phi (0)\wedge \forall x\in \mathbb {N}( \phi (x) \rightarrow \phi (\textrm{S}x))] \rightarrow \forall x\in \mathbb {N}\;\phi (x). \end{aligned}$$

Proposition 4.2

These are provable within \(\textsf{SST}\):

  1. \(0 \in \mathbb {N}\).

  2. Whenever \(x \in \mathbb {N}\), then \(\textrm{S}x \in \mathbb {N}\).

  3. For \(x, y \in \mathbb {N}\ (\textrm{S}x = \textrm{S}y \rightarrow x = y).\)

  4. For \(x, y \in \mathbb {N}\ (x \in y \vee x = y \vee y \in x).\)

  5. For \(x, y\in \mathbb {N}\ (x = y \vee x \ne y)\).

Proof

Ad 3: If x and \(y \in \mathbb {N}\), while \(\textrm{S}x = \textrm{S}y\), then either \(x = y\) or \(x \in y \in x\). But the latter, since x is transitive, yields \(x\in x\), which is impossible by Corollary 2.3.

Ad 4: This is a straightforward nested induction on \(\mathbb {N}\), essentially the same as the proof of the corresponding theorem in HA (see Proposition 2.8 (v) in Troelstra and van Dalen [37], Volume 1, p. 124).

Ad 5: Immediately from 4 by Corollary 2.3. \(\square \)

5 Finiteness and decidability

The usual notion of finiteness in set theory is based on that of 1-1 correspondence with particular sets. So one first has to look at the notion of function in set theory. Since sets in SST are closed under the operation

$$\begin{aligned} x, y \mapsto (x \cup \{ y \}), \end{aligned}$$

there are Kuratowski pairs, triples, quadruples, and so on, for all sets. As one would expect, functions for SST are sets of pairs, or of triples, or of quadruples, etc. that satisfy a uniqueness condition on the last component.

Crucially, one has to show in \(\textsf{SST}\) that if \(\langle a,b\rangle \) denotes the Kuratowski pair \(\{\{a\},\{a,b\}\}\), then a and b are uniquely determined by \(\langle a,b\rangle \), that is to say,

$$\begin{aligned} \text{if } \langle a,b\rangle =\langle a',b'\rangle \text{ then } a=a'\;\wedge \;b=b'. \end{aligned} \qquad (1)$$

The proofs one finds in books on classical set theory rely on the instance \(x\in y\,\vee \,x\notin y\) of excluded middle, which isn’t guaranteed intuitionistically. An intuitionistic proof can be found, e.g., in [6, Proposition 3.1] or [7, 4.1.1]. As will turn out later in Propositions 5.6 and 5.8, however, SST proves excluded middle for atomic formulas anyway, and so the classical proof works just fine in SST.
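The unique-decomposition property (1) can be illustrated in the intended model with a short Python sketch over nested `frozenset` values. The functions `kpair` and `unpair` are names assumed for this illustration; the case split on the number of elements relies on equality being decidable, which Python provides for free.

```python
EMPTY = frozenset()

def kpair(a, b):
    """The Kuratowski pair <a, b> = {{a}, {a, b}}."""
    return frozenset([frozenset([a]), frozenset([a, b])])

def unpair(p):
    """Recover (a, b) from p = <a, b>; raise ValueError if p is not a pair."""
    if len(p) == 1:                    # <a, a> collapses to {{a}}
        (s,) = p
        if len(s) != 1:
            raise ValueError("not a Kuratowski pair")
        (a,) = s
        return a, a
    if len(p) == 2:
        s, t = sorted(p, key=len)      # s should be {a}, t should be {a, b}
        if len(s) != 1 or len(t) != 2 or not s <= t:
            raise ValueError("not a Kuratowski pair")
        (a,) = s
        (b,) = t - s
        return a, b
    raise ValueError("not a Kuratowski pair")

one = frozenset([EMPTY])
two = frozenset([EMPTY, one])
pool = [EMPTY, one, two, frozenset([one])]
```

Since `unpair` inverts `kpair`, two pairs are equal exactly when their components agree, which is the content of (1).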

Definition 5.1

(Finiteness) A set is (internally) finite whenever it is in bijective correspondence with some natural number (i.e., a T-set).

We shall write \(x\equiv n\) if there exists a bijection between x and n.

5.1 Lemma on cardinality

Lemma 5.2

(on the uniqueness of finite cardinality) If \(m, n \in \mathbb {N}\) stand in bijective correspondence, then \(m = n\).

Proof

The proof of [7, Corollary 8.2.4] works in SST. \(\square \)

Definition 5.3

For any finite set x, \(|x |\) is its unique cardinality, the unique \(n \in \mathbb {N}\) such that n and x are in bijective correspondence.

Theorem 5.4

(universal finiteness) Every set is finite.

Proof

(by adduction in SST) \(\emptyset \) is certainly finite. Assume that x is a finite set and that \(y \notin x\). Then, \(x \cup \{y\}\) is finite (the clause \(y \notin x\) is really used here). By adduction in SST, every set is finite. \(\square \)
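The inductive step of this proof has a simple computational reading, sketched here in Python for the intended model of nested `frozenset` values. The function `count` (a name assumed for this sketch) peels off one element y with \(y \notin \textit{rest}\), recursively obtains a bijection from the rest onto a T-set n, and sends y to the one new element of \(\textrm{S}n = n \cup \{n\}\), namely n itself.

```python
EMPTY = frozenset()

def succ(x):
    return x | frozenset([x])

def numeral(n):
    """The n-fold successor of the empty set (hypothetical helper)."""
    x = EMPTY
    for _ in range(n):
        x = succ(x)
    return x

def count(x):
    """Return (n, f): n is |x| as a T-set, f a bijection from x onto n (as a dict)."""
    if not x:
        return EMPTY, {}
    y = next(iter(x))
    rest = x - frozenset([y])      # x = rest ∪ {y} with y not in rest
    n, f = count(rest)
    g = dict(f)
    g[y] = n                       # n is the new element of S(n) = n ∪ {n}
    return succ(n), g

x = frozenset([numeral(1), numeral(3), frozenset([numeral(2)])])
n, f = count(x)
```

The side condition \(y \notin \textit{rest}\) is what guarantees that `g` remains injective, mirroring the remark in the proof that the clause \(y \notin x\) is really used.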

Note that the previous proof fully utilizes adduction. Our guess is that with weak adduction we do not get decidability of elementhood.

5.2 Decidability for \(\Delta _0\)-predicates

In intuitionists’ parlance, a predicate or formula that obeys the principle of excluded middle is often said to be decidable.

Definition 5.5

(Decidability for set identity) A set x is decidable for identity just in case \(\forall y\, (x = y \vee x \ne y)\).

Proposition 5.6

Every set is decidable for set identity.

Proof

Let x and y be sets. \(\{ x, y \}\) is a finite set by Theorem 5.4, and thus stands in bijective correspondence with a natural number. As we have seen in Proposition 4.2, identity on the natural numbers is decidable. Hence, \(x = y\) or \(x \ne y\). \(\square \)

Definition 5.7

(Decidability for membership) A set x is decidable for membership just in case, \(\forall y\, (y \in x \vee y \notin x)\).

Proposition 5.8

(Decidability for membership) All sets are decidable for membership.

Proof

(by adduction in SST) First, for any set y, either \(y \in 0\) or \(y \notin 0\). Second, assume that x is decidable for membership.Footnote 3 Consider \(x \cup \{y\}.\) Because x is assumed decidable for membership, for any set a, either \(a \in x\) or \(a \notin x\). If \(a \in x\), then \(a \in (x \cup \{y\})\). On the other hand, when \(a \notin x\), we know that, since y is decidable for identity, either \(a = y\) or \(a \ne y\). If the first, then \(a \in (x \cup \{y\})\). If the second, \(a \notin (x \cup \{y\})\). \(\square \)

Corollary 5.9

Every set is finite and decidable for both \(\in \) and \(=\).

Definition 5.10

A formula of SST is said to be \(\Delta _0\) if all quantifiers in it are of bounded form, that is to say, of either form \(\forall x\in a\) or \(\exists y\in b\).

Theorem 5.11

Any \(\Delta _0\) formula \(\varphi \) is decidable in \(\textsf{SST}\), that is, \(\textsf{SST}\) proves \(\varphi \,\vee \,\lnot \varphi \).

Proof

The proof proceeds by metainduction on the buildup of \(\varphi \). The atomic cases are dealt with in Propositions 5.6 and 5.8. Suppose \(\varphi \) is of the form \(\forall x\mathop {\in }a\,\theta (x)\). Metainductively we have that SST proves \(\theta (x)\,\vee \,\lnot \theta (x)\). Working in SST, we can now use weak adduction on a to prove \(\forall x\mathop {\in }a\,\theta (x)\;\vee \; \lnot \forall x\mathop {\in }a\,\theta (x)\). This is clearly the case for \(a=0\). Suppose \(a=b\,\cup \,\{c\}\) and inductively that \(\forall x\mathop {\in }b\,\theta (x)\;\vee \; \lnot \forall x\mathop {\in }b\,\theta (x)\). Note that we also have \(\theta (c)\,\vee \,\lnot \theta (c)\). Now, \(\forall x\mathop {\in }b\,\theta (x)\) and \(\theta (c)\) yield \(\forall x\mathop {\in }a\,\theta (x)\) whereas in all the other possible cases \(\lnot \forall x\mathop {\in }a\,\theta (x)\) ensues.

The other cases are handled in a similar vein. \(\square \)
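Read computationally, the metainduction describes a terminating evaluator for \(\Delta _0\) formulas. The following Python sketch makes this concrete in the intended model; the tuple encoding of formulas and the function name `holds` are assumptions of the sketch, and variables are looked up in an environment `env` mapping names to nested `frozenset` values.

```python
EMPTY = frozenset()

def holds(phi, env):
    """Decide a Delta_0 formula, given as a nested tuple, under the assignment env."""
    op = phi[0]
    if op == "in":                       # atomic case: membership
        return env[phi[1]] in env[phi[2]]
    if op == "eq":                       # atomic case: identity
        return env[phi[1]] == env[phi[2]]
    if op == "not":
        return not holds(phi[1], env)
    if op == "and":
        return holds(phi[1], env) and holds(phi[2], env)
    if op == "or":
        return holds(phi[1], env) or holds(phi[2], env)
    if op == "imp":
        return (not holds(phi[1], env)) or holds(phi[2], env)
    if op in ("all", "ex"):              # bounded quantifier over env[bound]
        _, var, bound, body = phi
        results = (holds(body, {**env, var: elem}) for elem in env[bound])
        return all(results) if op == "all" else any(results)
    raise ValueError(f"unknown connective: {op!r}")

# "x is transitive": for all y in x, for all z in y, z in x.
TRANSITIVE = ("all", "y", "x", ("all", "z", "y", ("in", "z", "x")))
one = frozenset([EMPTY])
two = frozenset([EMPTY, one])
```

Every call returns a truth value after finitely many steps because each quantifier ranges over the finitely many elements of a given set; an unbounded quantifier admits no such loop.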

At this point it is perhaps in order to point out that \(\textsf{SST}\) does not prove \(\varphi \,\vee \,\lnot \varphi \) for all formulas. This will follow from results in Sects. 15 and 16, e.g., Corollary 15.1 (i).

6 Axioms for general set theory

Now, one can prove that sets in SST satisfy several of the familiar axioms of Zermelo–Fraenkel set theory with choice:

  1. Pairing,

  2. Arbitrary Union,

  3. Decidable Separation,

  4. Replacement,

  5. Power Set,

  6. Strong Collection,

  7. \(\in \)-Induction, and

  8. Foundation.

Since SST entails that all sets are finite, it will not admit the existence of an infinite set (if it is consistent). So, of course, we do not include the axiom of infinity in this list.

The theory axiomatized by the principles just listed constitutes the intuitionistic analogue to a classical theory of Wilhelm Ackermann, from 1937, named allgemeine Mengenlehre, ‘General Set Theory’. Intuitionistic proofs of these principles, in an hereditarily finite setting, appear also in Previale [28].

Recall that in Sect. 2, we noted that our notation \(x \cup \{ y \}\) does not presuppose that an operation of binary union exists over the class of all sets. Here we establish that such an operation does exist:

Theorem 6.1

(Binary Union) \(\forall x \forall y \exists z \forall w(w \in z \leftrightarrow (w \in x \vee w \in y))\)

Proof

We proceed by (weak) Adduction. Let \(\phi (x)\) be \(\forall y \exists z \forall w(w \in z \leftrightarrow (w \in x \vee w \in y))\). Clearly, \(\phi (\emptyset )\). Assume \(\phi (a)\). Then, for any y, \(a \cup y\) exists. Let b be any set. Then, for any y, \((a \cup \{b\}) \cup y\) is \(a \cup (y \cup \{b\})\). The latter exists, by the induction hypothesis. \(\square \)

Theorem 6.2

(Arbitrary Union) If x is a set, so is \(\bigcup x\), its union.

Proof

Using (weak) adduction, we see that

$$\begin{aligned} \bigcup (a \cup \{ b \}) \end{aligned}$$

is just the binary union of \(\bigcup a\) with b. \(\square \)
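The two proofs above amount to recursive definitions of union in terms of adjunction alone. A minimal Python sketch in the intended model follows; `adjoin`, `decompose`, `binary_union` and `big_union` are names assumed here, and `decompose` uses Python's set difference only to present a non-empty set as \(a \cup \{b\}\).

```python
EMPTY = frozenset()

def adjoin(x, y):
    return x | frozenset([y])

def decompose(x):
    """Write a non-empty x as a ∪ {b} with b not in a."""
    b = next(iter(x))
    return x - frozenset([b]), b

def binary_union(x, y):
    """x ∪ y, following (a ∪ {b}) ∪ y = a ∪ (y ∪ {b})."""
    if not x:
        return y
    a, b = decompose(x)
    return binary_union(a, adjoin(y, b))

def big_union(x):
    """⋃x, following ⋃(a ∪ {b}) = (⋃a) ∪ b."""
    if not x:
        return EMPTY
    a, b = decompose(x)
    return binary_union(big_union(a), b)

one = adjoin(EMPTY, EMPTY)
two = adjoin(one, one)
family = frozenset([one, two, frozenset([two])])
```

Apart from `decompose`, the only set-building operation used is `adjoin`, in line with the fact that both theorems need only weak Adduction.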

Definition 6.3

(Decidable over x) A formula \(\psi (y)\) is decidable over the set x just in case

$$\begin{aligned} \forall y \in x \ (\psi (y) \vee \lnot \psi (y)). \end{aligned}$$

Theorem 6.4

(Decidable Separation) If x is a set and \(\psi (y)\) is decidable over x, then the collection \(\{ y \in x: \psi (y) \}\) is also a set.

Proof

By (weak) Adduction. Let \(\psi \) be decidable over \(a \cup \{ b \}\). Because of decidability, either (i) \(\psi (b)\) or (ii) \(\lnot \psi (b)\). By the induction hypothesis,

$$\begin{aligned} \{ x \in a: \psi (x) \} \end{aligned}$$

is a set. If (i), then

$$\begin{aligned} \{ x \in (a \cup \{ b \}): \psi (x) \} \end{aligned}$$

is just

$$\begin{aligned} \{ x \in a: \psi (x) \} \cup \{ b \}. \end{aligned}$$

If (ii),

$$\begin{aligned} \{ x \in (a \cup \{ b \}): \psi (x) \} = \{ x \in a: \psi (x) \}. \end{aligned}$$

\(\square \)
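The case distinction in this proof is exactly a filter defined by recursion on adjunction. As a hedged sketch in the intended model (the names `separate`, `adjoin` and `decompose` are assumptions, and the Python predicate `psi` stands in for a formula decidable over x):

```python
EMPTY = frozenset()

def adjoin(x, y):
    return x | frozenset([y])

def decompose(x):
    """Write a non-empty x as a ∪ {b} with b not in a."""
    b = next(iter(x))
    return x - frozenset([b]), b

def separate(x, psi):
    """{ y in x : psi(y) } for a predicate psi that returns True or False on x."""
    if not x:
        return EMPTY
    a, b = decompose(x)
    kept = separate(a, psi)                        # induction hypothesis
    return adjoin(kept, b) if psi(b) else kept     # cases (i) and (ii)

one = adjoin(EMPTY, EMPTY)
two = adjoin(one, one)
three = adjoin(two, two)
```

Decidability of \(\psi \) over x is what licenses the `if ... else` branch; without it there would be no way to choose between cases (i) and (ii).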

Corollary 6.5

If \(\psi (x)\) is \(\Delta _0\), then for every set a, \(\{x\in a:\psi (x)\}\) is a set.

Proof

By Theorems 5.11 and 6.4. \(\square \)

Theorem 6.6

(Strong Collection) For any formula \(\phi (x, y)\), if \(\forall x \mathop {\in }a\, \exists y \ \phi (x, y)\) then there is a set c such that \(\forall x \mathop {\in }a\, \exists y \mathop {\in }c \, \phi (x, y)\) and \(\forall y\mathop {\in }c \,\exists x \mathop {\in }a \, \phi (x, y)\).

Proof

We use weak Adduction on a with the formula

$$\begin{aligned} \psi (a):= & {} \forall x \mathop {\in }a\, \exists y \, \phi (x, y)\rightarrow \exists z\,\theta (a,z), \end{aligned}$$

where \(\theta (a,z):=\forall x \mathop {\in }a\, \exists y \mathop {\in }z\, \phi (x, y)\,\wedge \forall y\mathop {\in }z\,\exists x\mathop {\in }a\, \phi (x,y)\).

Obviously, \(\psi (0)\). Assuming \(\psi (a)\), one has to show \(\psi (a\cup \{b\})\) for all b. So suppose \(\forall x \in a \cup \{b\}\,\exists y \phi (x, y)\). Owing to \(\psi (a)\) there exists c such that

$$\begin{aligned} \forall x \mathop {\in }a \,\exists y \mathop {\in }c\, \phi (x, y) \,\wedge \,\forall y\mathop {\in }c\exists x\mathop {\in }a \,\phi (x,y). \end{aligned}$$

Also, there is a d such that \(\phi (b,d)\). Thus \(\theta (a\cup \{b\}, c\cup \{d\})\), and hence \(\psi (a\cup \{b\})\). \(\square \)

Definition 6.7

(Class function) A formula \(\phi (x, y)\) (which may contain other parameters) is a class function just in case, for all x, there is a unique y such that \(\phi (x, y)\). We then also write \(G_{\phi }(x)\) to refer to the unique y such that \(\phi (x, y)\).

A formula \(\phi (x, y)\) (which may contain other parameters) is a class function on a set a just in case, for all \(x\in a\), there is a unique y such that \(\phi (x, y)\).

Theorem 6.8

(Replacement) A class function on a set is a set function. The image of a set under a class function is again a set.

Proof

Suppose \(\forall x\mathop {\in }a\,\exists ! y\, \phi (x,y)\). Thus, \(\forall x\mathop {\in }a\, \exists ! z\,\exists y\,[ \phi (x,y)\,\wedge \,z=\langle x,y\rangle ]\). By Strong Collection, \(\{\langle x,y\rangle :x\in a\;\wedge \; \phi (x,y)\}\) is a set. The latter set is also a function on a. \(\square \)

Definition 6.9

(Power set) For a set x, the power set of x is the collection

$$\begin{aligned} \{y: y \subseteq x \}, \end{aligned}$$

the collection of all subsets of x.

Theorem 6.10

(Power Set) If x is a set, so is the power set of x.

Proof

By (weak) Adduction. The power set of \(a \cup \{ b \}\) is the union of the power set of a (call it P(a)) together with the collection

$$\begin{aligned} \{ x \cup \{ b \}: x \in P(a) \}. \end{aligned}$$

The latter is a set by Replacement. \(\square \)
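The recursion behind this proof can be sketched in Python for the intended model. The names `power_set`, `adjoin` and `decompose` are assumptions of this illustration; the base case \(P(\emptyset ) = \{\emptyset \}\) is left implicit in the proof and is spelled out here.

```python
EMPTY = frozenset()

def adjoin(x, y):
    return x | frozenset([y])

def decompose(x):
    """Write a non-empty x as a ∪ {b} with b not in a."""
    b = next(iter(x))
    return x - frozenset([b]), b

def power_set(x):
    """P(a ∪ {b}) = P(a) ∪ { s ∪ {b} : s in P(a) }, with P(0) = {0}."""
    if not x:
        return frozenset([EMPTY])
    a, b = decompose(x)
    p_a = power_set(a)
    return p_a | frozenset(adjoin(s, b) for s in p_a)

one = adjoin(EMPTY, EMPTY)
two = adjoin(one, one)
three = adjoin(two, two)
P = power_set(three)
```

The comprehension `adjoin(s, b) for s in p_a` plays the role of Replacement applied to the class function \(x \mapsto x \cup \{b\}\).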

Theorem 6.11

(\(\in \)-Induction) Assume, for any formula \(\phi (x)\) and set x,

$$\begin{aligned} (\forall y \in x\,\phi (y)) \rightarrow \phi (x). \end{aligned}$$

It follows that

$$\begin{aligned} \forall x\,\phi (x). \end{aligned}$$

Proof

Proof by Adduction. First, assume that

$$\begin{aligned} \forall x( (\forall y \mathop {\in }x\,\phi (y)) \rightarrow \phi (x)). \end{aligned}$$

Then, for Adduction, assume that

  1. \(\forall x \mathop {\in }a\,\phi (x)\, \wedge \, \phi (a)\) and

  2. \(\forall x \mathop {\in }b\,\phi (x)\, \wedge \, \phi (b).\)

It suffices to show that

  (i) \(\forall x \in (a \cup \{ b \}) \ \phi (x)\) and

  (ii) \(\phi (a \cup \{ b \})\).

Ad (i): This follows immediately from 1, 2, and the definition of \(a \cup \{ b \}\).

Ad (ii): This now holds by the initial assumption, \(\forall x( (\forall y \mathop {\in }x\,\phi (y)) \rightarrow \phi (x))\), applied to (i).

\(\square \)

The previous result allows us to prove the common axiom of foundation in classical set theories, stating that each non-empty set has an element that is disjoint from the set.

Theorem 6.12

(Foundation) \(\forall x \,[\exists y\,y\mathop {\in }x \rightarrow \exists y\in x \,\forall z\in y\,\lnot z \in x ]\).

Proof

Let a be any inhabited set. Suppose \((1)\;\forall y\mathop {\in }a \,\exists z\mathop {\in }y\,z\mathop {\in }a\). Then, \((2)\;\;\;\forall u\,[\forall v\mathop {\in }u\, v\notin a \rightarrow u\notin a]\), because in view of (1), \(\forall v\mathop {\in }u\, v\notin a\) and \(u\in a\) yield \(\exists v\in u \,v\in a\). Whence, by \(\in \)-Induction, \(\forall u\,u\notin a\). But this is ridiculous since a is inhabited. Thus (1) is false, so by classical logic for \(\Delta _0\) formulas (Theorem 5.11), we must have \(\exists y\in a \,\forall z\in y\,z\notin a\). \(\square \)

6.1 The axiom of choice

Definition 6.13

(Base) A set x is a base if, for all sets y and r, whenever

$$\begin{aligned} \forall a \in x \, \exists b \in y\;\langle a,b\rangle \in r, \end{aligned}$$

there is a function f with domain x such that

$$\begin{aligned} \forall a \in x\, [f(a)\in y\;\wedge \; \langle a,f(a)\rangle \in r]. \end{aligned}$$

Theorem 6.14

(Axiom of Choice) Every set is a base.

Proof

Use Adduction. Clearly, 0 is a base. Suppose c is a base. We want to show that \(c\,\cup \,\{d\}\) is a base for every set \(d\notin c\). Suppose \(\forall a \in c\,\cup \,\{d\} \, \exists b \in y\;\langle a,b\rangle \in r.\) Since c is a base, there is a function g with domain c such that

$$\begin{aligned} \forall a \in c\,[g(a)\in y\,\wedge \, \langle a,g(a)\rangle \in r]. \end{aligned}$$

Also, there is \(b_0\in y\) such that \(\langle d,b_0\rangle \in r\). Hence, with \(f:=g\cup \{ \langle d,b_0\rangle \}\) we have a function (because \(d\notin c\)) satisfying \(\forall a \in c\,\cup \,\{d\}\,[f(a)\in y\,\wedge \, \langle a,f(a)\rangle \in r].\) \(\square \)
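The proof builds the choice function one adjoined element at a time, which can be mimicked in Python for the intended model. In this sketch, which simplifies the set-theoretic encoding, the relation `r` is a Python set of tuples rather than a set of Kuratowski pairs and the function is returned as a `dict`; the names `choice_function` and `decompose` are assumptions.

```python
EMPTY = frozenset()

def decompose(x):
    """Write a non-empty x as c ∪ {d} with d not in c."""
    d = next(iter(x))
    return x - frozenset([d]), d

def choice_function(x, y, r):
    """Assuming every a in x has some b in y with (a, b) in r, return such a choice."""
    if not x:
        return {}
    c, d = decompose(x)
    g = choice_function(c, y, r)               # c is a base by the induction hypothesis
    b0 = next(b for b in y if (d, b) in r)     # a witness for d; fails if none exists
    f = dict(g)
    f[d] = b0                                  # f = g ∪ {<d, b0>}, a function since d is not in c
    return f

one = frozenset([EMPTY])
two = frozenset([EMPTY, one])
x = frozenset([EMPTY, one])
y = frozenset([one, two])
r = {(EMPTY, one), (one, two), (one, one)}
f = choice_function(x, y, r)
```

The search for `b0` terminates because y is finite and membership in `r` is decidable in this model.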

7 Recursion on membership

Definition of functions and class functions by \(\in \)-Recursion is a central tool of set theory. It is available in SST, too. The proofs are similar to those in [6, 11.2] and [7, 19.2], where they are carried out in intuitionistic Kripke-Platek set theory for \(\Sigma \) predicates.

Theorem 7.1

(Definition by \(\in \) Recursion in \(\textsf{SST}\)) Let \(\textbf{x}:=x_1,\ldots ,x_n\). If G is a total \((n+2)\)–ary class function, that is,

$$\begin{aligned} \forall \textbf{x} yz\exists !u\,G(\textbf{x},y,z) =u, \end{aligned}$$

then there is a total \((n+1)\)–ary class function F such that

$$\begin{aligned} \forall \textbf{x}y\,[F(\textbf{x},y)=G(\textbf{x},y,(F(\textbf{x},z)\mid z\in y))], \end{aligned} \qquad (2)$$

where \((F(\textbf{x},z)\mid z\in y):= \{\langle z,F(\textbf{x},z)\rangle : z\in y\}\).

For the avoidance of doubt, the claim is that for every formula \(G'(\textbf{x},y,z,u)\), we can effectively find a formula \(F'(\textbf{x},y,v)\) such that \(\textsf{SST}\) proves the following: whenever \(\forall \textbf{x} yz\exists !u\,G'(\textbf{x},y,z,u)\) then \(\forall \textbf{x}\,y\,\exists !v\,F'(\textbf{x},y,v)\) and (2) hold, where \(G(\textbf{x},y,z)\) denotes the unique u such that \(G'(\textbf{x},y,z,u)\) and \(F(\textbf{x},y)\) denotes the unique v such that \(F'(\textbf{x},y,v)\).

Proof

Let \(\textrm{dom}(f)\) denote the domain of f. Let \(\Phi (f,\textbf{x})\) be the formula

$$\begin{aligned}{}[f \text{ is } \text{ a } \text{ function}]\wedge [\textrm{dom}(f) \text{ is } \text{ transitive}]\wedge [\forall y\in \textrm{dom}(f)\,(f(y)=G(\textbf{x},y,f\restriction y))]. \end{aligned}$$

Set

$$\begin{aligned} \psi (\textbf{x},y,f)=[\Phi (f,\textbf{x})\wedge y\in \textrm{dom}(f)]. \end{aligned}$$

Claim \(\quad \forall \textbf{x},y\,\exists f\,\psi (\textbf{x},y,f)\).

Proof of Claim: By \(\in \) induction on y. Suppose \(\forall u\,{\in }\,y\,\exists g\,\psi ( \textbf{x},u,g)\). By Strong Collection, that is Theorem 6.6, we find a set A such that \(\forall u\,{\in }\,y\,\exists g\,{\in }\,A \,\psi (\textbf{x},u, g)\) and \(\forall g\,{\in }\,A\exists u\,{\in }\,y\,\psi (\textbf{x},u,g)\). Let \(f_0=\bigcup \{g:g\in A\}\). Since for all \(g\in A\), dom(g) is transitive we have that dom(\(f_0\)) is transitive. We want to show that \(f_0\) is a function. But it is readily shown by another induction that if \(g_0,g_1\in A\), then \(\forall x \in \textrm{dom}(g_0)\cap \textrm{dom}(g_1)[g_0(x)=g_1(x)]\). Therefore \(f_0\) is a function. Moreover, if \(u\in y\), then \(u\in \textrm{dom}(f_0)\). Hence, by our general assumption, there exists a \(u_0\) such that \(G(\textbf{x},y,(f_0(u)\mid u\in y))=u_0\). Set \(f=f_0\cup \{\langle y,u_0\rangle \}\). Then f is a function, too, and \(\textrm{dom}(f)\) is transitive since all \(u\mathop {\in }y\) are in \(\textrm{dom}(f_0)\). We also have \(\forall w\,{\in }\,\textrm{dom}(f)[f(w)=G(\textbf{x},w,f\restriction w)]\), confirming the claim.

Now define F by

$$\begin{aligned} F(\textbf{x},y)=w \text{ iff } \exists f[\psi (\textbf{x},y,f)\wedge f(y)=w]. \end{aligned}$$

\(\square \)
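In the intended model, the recursion equation of Theorem 7.1 can be run directly. The following Python sketch omits the parameters \(\textbf{x}\) and represents the restriction \((F(z)\mid z\in y)\) as a `dict`; the names `rec`, `rank_step` and `rank` are assumptions of the sketch, and the rank function is used only as a familiar example of a G.

```python
EMPTY = frozenset()

def succ(x):
    return x | frozenset([x])

def numeral(n):
    """The n-fold successor of the empty set (hypothetical helper)."""
    x = EMPTY
    for _ in range(n):
        x = succ(x)
    return x

def rec(G, y):
    """F(y) = G(y, (F(z) | z in y)), with the restriction given as a dict."""
    restriction = {z: rec(G, z) for z in y}
    return G(y, restriction)

def rank_step(y, f):
    """G for the rank function: the union of the successors of the values of f."""
    out = EMPTY
    for v in f.values():
        out = out | succ(v)
    return out

def rank(x):
    return rec(rank_step, x)
```

The recursion terminates because membership is well founded on hereditarily finite sets. The proof above obtains the same effect inside SST by gluing together approximating functions with transitive domains.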

Corollary 7.2

There is a class function \(\textrm{TC}\) such that

$$\begin{aligned} \forall a[\textrm{TC}(a)=a\cup \bigcup \{\textrm{TC}(x):x\in a\}]. \end{aligned}$$

Moreover, we have

  1. \(\textrm{TC}(\emptyset ) \ = \ \emptyset .\)

  2. For all sets x, y, and z, \(z \in \textrm{TC}(x \cup \{ y \})\) if and only if

    $$\begin{aligned} z \in \textrm{TC}(x) \vee z \in \textrm{TC}(y) \vee z = y. \end{aligned}$$

  3. Whenever x is a set, so is \(\textrm{TC}(x)\).

Proof

The existence of \(\textrm{TC}\) is a consequence of Theorem 7.1. 1 and 2 are immediate from the definitions. \(\square \)
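Items 1 and 2 are easy to check experimentally in the intended model. The Python sketch below transcribes the defining equation of \(\textrm{TC}\) for nested `frozenset` values; the names `tc`, `adjoin` and `pool` are assumptions of the illustration.

```python
EMPTY = frozenset()

def adjoin(x, y):
    return x | frozenset([y])

def tc(a):
    """TC(a) = a ∪ ⋃{ TC(x) : x in a }."""
    result = set(a)
    for x in a:
        result |= tc(x)
    return frozenset(result)

one = adjoin(EMPTY, EMPTY)
two = adjoin(one, one)
pool = [EMPTY, one, two, frozenset([two])]
```

Item 2 then reads `tc(adjoin(x, y)) == tc(x) | tc(y) | frozenset([y])`, which can be confirmed for every x and y drawn from `pool`.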

N.B. Items 1 and 2 in the preceding corollary are axioms in the system of [28].

Proposition 7.3

(Definition by \(\textrm{TC}\)–Recursion) Under the assumptions of Theorem 7.1 there is an \((n+1)\)–ary class function F such that

$$\begin{aligned} \forall \textbf{x} y[F(\textbf{x},y)= G(\textbf{x},y,(F(\textbf{x},z)\mid z\in \textrm{TC}(y)))]. \end{aligned}$$

Proof

Let \(\theta (f,\textbf{x},y)\) be the formula

$$\begin{aligned}{}[f \text{ is } \text{ a } \text{ function}]\wedge [\textrm{dom}(f)=\textrm{TC}(y)]\wedge [\forall u\,{\in }\,\textrm{dom}(f) [f(u)=G(\textbf{x},u,f\restriction \textrm{TC}(u))]]. \end{aligned}$$

Prove by \(\in \)–induction that \(\forall y\exists !f\,\theta (f,\textbf{x},y)\). Suppose \(\forall v\,{\in }\,y\,\exists !g\,\theta (g,\textbf{x},v)\). We then have

$$\begin{aligned} \forall v\,{\in }\,y\exists ! a \exists g[\theta (g,\textbf{x},v)\wedge G(\textbf{x},v,g)=a]. \end{aligned}$$

By Replacement there is a function h such that \(\textrm{dom}(h)=y\) and

$$\begin{aligned} \forall v\,{\in }\,y\,\exists g\left[ \theta (g,\textbf{x},v)\wedge G(\textbf{x},v,g)=h(v)\right] . \end{aligned}$$

Applying Strong Collection to \(\forall v\,{\in }\,y\,\exists !g\,\theta (g,\textbf{x},v)\) also provides us with a set A such that \(\forall v\,{\in }\,y\,\exists g\,{\in }\,A\,\theta (g,\textbf{x},v)\) and \(\forall g\,{\in }\,A\,\exists v\,{\in }\,y\, \theta (g,\textbf{x},v)\). Now let \(f=(\bigcup \{g:g\in A\})\cup h\). Then \(\theta (f,\textbf{x},y)\) holds. \(\square \)

Definition 7.4

A set x is said to be transitive if whenever \(y\in x\) and \(z\in y\), then \(z\in x\). In an intuitionistic setting, ordinals are defined as transitive sets all of whose elements are transitive sets.

Let \(\textrm{Ord}\) be the class of ordinals. Then \(x\in \textrm{Ord}\) is the \(\Delta _0\) formula expressing that x is an ordinal. As per tradition, we use variables \(\alpha ,\beta ,\gamma , \ldots \) to range over ordinals.

In general, showing that ordinals are linearly ordered, that is, \(\alpha \in \beta \,\vee \, \alpha =\beta \,\vee \,\beta \in \alpha \), requires classical logic. As it turns out, it suffices that \(\Delta _0\) formulas are decidable, which is guaranteed by Theorem 5.11.

Corollary 7.5

There is a class function \(\textrm{rank}\) assigning a rank to every set. That is to say,

$$\begin{aligned} \textrm{rank}(x)= & {} \bigcup \{\textrm{rank}(y)+1:y\in x\}\end{aligned}$$

where \(u+1:=u\cup \{u\}\). Moreover, \(\textrm{rank}(x)\) is always an ordinal. For ordinals \(\alpha \) one has \(\alpha =\textrm{rank}(\alpha )\).

Proof

Clearly, this class function exists by Theorem 7.1. Moreover, one can easily verify by \(\in \)-Induction that \(\textrm{rank}(x)\) is a transitive set consisting of transitive sets, thus \(\textrm{rank}(x)\) is always an ordinal.

Furthermore, one shows by induction on ordinals that \(\alpha =\textrm{rank}(\alpha )\).

\(\square \)
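The rank function and the notion of ordinal can be illustrated on concrete sets. The sketch below again assumes that hereditarily finite sets are modelled as nested `frozenset` values; the names `succ`, `ordinal`, `rank`, `is_transitive`, and `is_ordinal` are hypothetical helpers introduced for this illustration.

```python
def succ(u):
    """u + 1 := u ∪ {u}."""
    return u | frozenset([u])


def ordinal(n):
    """The von Neumann ordinal with n elements."""
    a = frozenset()
    for _ in range(n):
        a = succ(a)
    return a


def rank(x):
    """rank(x) = ⋃{rank(y) + 1 : y ∈ x}."""
    r = frozenset()
    for y in x:
        r = r | succ(rank(y))
    return r


def is_transitive(x):
    """Every element of x is a subset of x."""
    return all(y <= x for y in x)


def is_ordinal(x):
    """A transitive set all of whose elements are transitive."""
    return is_transitive(x) and all(is_transitive(y) for y in x)
```

For instance, `rank(ordinal(3)) == ordinal(3)` reflects the claim \(\alpha =\textrm{rank}(\alpha )\), while the singleton of `ordinal(1)` has rank `ordinal(2)` without itself being an ordinal.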

Theorem 7.6

In \(\textsf{SST}\), ordinals are linearly ordered. Moreover, the class of ordinals coincides with the class of natural numbers and all ordinals are cardinals in that they cannot be put in one-one correspondence with a smaller ordinal.

Proof

One proves \(\forall \beta \,(\alpha \in \beta \,\vee \, \alpha =\beta \,\vee \,\beta \in \alpha )\) by induction on \(\alpha \). Given \(\beta \), we have \(\exists \xi \in \alpha \,(\xi =\beta \,\vee \,\beta \in \xi )\) or \(\lnot \exists \xi \in \alpha \,(\xi =\beta \,\vee \,\beta \in \xi )\) by classical logic for \(\Delta _0\) formulas. In the first case, \(\beta \in \alpha \), since \(\alpha \) is transitive. In the second case we get \(\forall \xi \in \alpha \,\xi \in \beta \) from the induction hypothesis. Thus \(\alpha \subseteq \beta \). If \(\alpha \ne \beta \), then, by Foundation, there exists \(\xi _0\in \beta \) such that \(\xi _0\notin \alpha \) and \(\forall \delta \in \xi _0\; \delta \in \alpha \). The induction hypothesis also implies that \(\alpha \subseteq \xi _0\). Whence, \(\alpha =\xi _0\), so \(\alpha \in \beta \). (Note that we used classical logic for \(\Delta _0\) formulas several times.)

It is clear that every natural is an ordinal. Let \(\varphi (x)\) be the formula \((\exists n\,{\in }\,\mathbb {N})\,\textrm{rank}(x)=n\). To show \(\forall x\,\varphi (x)\) we use Adduction. Clearly, \(\varphi (0)\). Now suppose \(\varphi (x)\) and \(\varphi (y)\) hold. Then \(\textrm{rank}(x)=n\) and \(\textrm{rank}(y)=m\) for some naturals m and n. Note that

$$\begin{aligned} \textrm{rank}(x\cup \{y\})= & {} \bigcup \{\textrm{rank}(a)+1:\, a\in x\cup \{y\}\} \nonumber \\= & {} \bigcup \{\textrm{rank}(a)+1:\, a\in x\}\,\cup \,(\textrm{rank}(y)+1) \nonumber \\= & {} \textrm{rank}(x)\,\cup \,(\textrm{rank}(y)+1) \;=\; n\,\cup \,(m+1). \end{aligned}$$
(3)

We have \(n= m\) or \(n\in m\) or \(m\in n\) by Proposition 4.2. Thus, in view of (3), \(\textrm{rank}(x\cup \{y\})=n+1\) or \(\textrm{rank}(x\cup \{y\})=m+1\) or \(\textrm{rank}(x\cup \{y\})=n\), respectively, whence \(\textrm{rank}(x\cup \{y\})\) is a natural.

Since \(\textrm{rank}(\alpha )=\alpha \) holds for ordinals \(\alpha \) by Corollary 7.5, all ordinals are naturals.

It follows from the proof of Theorem 8.2.2 in [7] (which carries over to \(\textsf{SST}\)) that a natural cannot be put in one-one correspondence with a smaller natural. \(\square \)

Corollary 7.7

\(\textsf{SST}\) refutes the Infinity Axiom.

Proof

Arguing in \(\textsf{SST}\) (towards a contradiction), suppose there is an inhabited set a such that \(\forall x\in a\,\exists y\in a\,x\in y\). Let \(n=\textrm{rank}(a)\). Notice that n is a natural by Corollary 7.5 and Theorem 7.6. As a is inhabited, \(n\ne 0\), and hence \(n=k+1\) for some natural k. Moreover, \(k=\textrm{rank}(x)\) for some \(x\in a\). By assumption, there exists \(y\in a\) such that \(x\in y\). But then \(k=\textrm{rank}(x)<\textrm{rank}(y)<\textrm{rank}(a)=k+1\), which is impossible. \(\square \)

Sometimes we just want to define a partial class function where the recursion variable ranges over \(\mathbb {N}\) but not the entire universe. This can be arranged by modifying Theorem 7.1.

Theorem 7.8

(Definition by Recursion on \(\mathbb {N}\) in \(\textsf{SST}\)) Let \(\textbf{x}=x_1,\ldots ,x_r\), \(\textbf{m}=m_1,\ldots ,m_s\). If \(G'\) is a partial \((r+s+2)\)–ary class function such that

$$\begin{aligned} \forall \textbf{x}\, \forall \textbf{m}\in \mathbb {N}\,\forall n\in \mathbb {N}\,\forall z\,\exists !u\,G'(\textbf{x},\textbf{m}, n,z) =u, \end{aligned}$$

then there is a total \((r+s+1)\)–ary class function F such that

$$\begin{aligned} \forall \textbf{x}\,\forall \textbf{m}\in \mathbb {N}\,\forall n\in \mathbb {N}\,[F(\textbf{x},\textbf{m},n)=G'(\textbf{x},\textbf{m},n,(F(\textbf{x},\textbf{m},k)\mid k\in n))]. \end{aligned}$$

Proof

From \(G'\) one obtains a total class function G by declaring that \(G(\textbf{x},\textbf{y},u,z)=0\) in case some member of \(\textbf{y},u\) is not in \(\mathbb {N}\) and \(G(\textbf{x},\textbf{y},u,z)= G'(\textbf{x},\textbf{y},u,z)\) otherwise. This is possible as elementhood in \(\mathbb {N}\) is a \(\Delta _0\) property (e.g. by Theorem 7.6), and hence decidable. So one can apply Theorem 7.1 to G to obtain the desired F. \(\square \)
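Recursion on \(\mathbb {N}\) in this course-of-values form can be sketched in executable terms. The code below is an illustration only: it models naturals as Python integers, keeps a single parameter `x` in place of the parameter lists, and passes the earlier values \((F(x,k)\mid k\in n)\) to `G` as a tuple. The names `rec_on_naturals`, `add`, `mul`, and `fib` are hypothetical.

```python
def rec_on_naturals(G):
    """Return F with F(x, n) = G(x, n, (F(x, 0), ..., F(x, n - 1)))."""

    def F(x, n):
        values = []
        for k in range(n):
            # Each new value may consult all earlier ones.
            values.append(G(x, k, tuple(values)))
        return G(x, n, tuple(values))

    return F


# Addition: x + 0 = x and x + (n + 1) = (x + n) + 1.
add = rec_on_naturals(lambda x, n, hist: x if n == 0 else hist[-1] + 1)

# Multiplication: x * 0 = 0 and x * (n + 1) = (x * n) + x.
mul = rec_on_naturals(lambda x, n, hist: 0 if n == 0 else add(hist[-1], x))

# A recursion that uses two earlier values (the parameter is ignored).
fib = rec_on_naturals(lambda _, n, hist: n if n < 2 else hist[-1] + hist[-2])
```

Passing the whole history, rather than only the predecessor value, mirrors the fact that the recursion consults \(F\) at every \(k\in n\).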

Corollary 7.9

As a consequence of the previous theorem, addition and multiplication of naturals and indeed all primitive recursive functions on \(\mathbb {N}\) can be defined in \(\textsf{SST}\).

The upshot is that Heyting Arithmetic has a “canonical” interpretation in \(\textsf{SST}\) once the collection of natural numbers is equated with that of the ordinals of \(\textsf{SST}\) and the successor function is interpreted as the familiar successor function \(\alpha \mapsto \alpha \cup \{\alpha \}\) on ordinals. Note that this successor function in combination with the defining equations for addition and multiplication uniquely determines the latter in \(\textsf{SST}\).

8 Proving Gödel’s incompleteness theorems

Because of its expressive power, SST would be a perfect place to carry out proofs of Gödel’s two Incompleteness Theorems. For one thing, since SST represents notions and demonstrates facts pertaining to finite sets, functions, and (especially) sequences as such, little or no encoding of sequences of symbols or sequences of sequences by numbers would be required. Note that Gödel originally developed his theorems using set-theoretic coding. It was perhaps only John von Neumann’s questioning that made him invent Gödel numbering.

Indeed, as in Sect. 1 through 9 of Świerczkowski [34], 6–35, full proofs of both Incompleteness Theorems may be carried out in SST without hand-waving or gaps.Footnote 4 These realizations formed the metalogical framework for Paulson’s automated proofs of the Incompleteness Theorems (see Paulson [27]). An elegant use of set-theoretic coding is also made by Zambella [41].

9 A self-interpretation of \(\textsf{SST}\)

Ackermann’s bijection between the naturals and the hereditarily finite sets furnishes a self-interpretation of \(\textsf{SST}\) in the intuitionistic context, too.

Definition 9.1

Let us define: \(\textrm{Set}(0)=0\) and

$$\begin{aligned} \textrm{Set}(2^{n_1}+2^{n_2}+\cdots +2^{n_k})= & {} \{\textrm{Set}(n_1),\ldots ,\textrm{Set}(n_k)\}\end{aligned}$$
(4)

whenever \(n_1>n_2>\cdots >n_k\). The definition of \(\textrm{Set}\) in \(\textsf{SST}\) falls under the scope of Theorem 7.8. More precisely, one would first show that every natural number, save 0, has a unique binary expansion as in (4), and then use Ackermann’s primitive recursive \(\textrm{Bit}\) function on naturals such that, for all natural numbers n and i,

$$\begin{aligned} \textrm{Bit}(n,i) = {\left\{ \begin{array}{ll} 1 &{} \text {if the } i \text {th bit in the binary notation for } n \text { is } 1 \\ 0 &{} \text {otherwise,} \end{array}\right. } \end{aligned}$$

where the ith bit of n is 1 if \(2^i\) occurs in the binary expansion of n and 0 otherwise. In terms of \(\textrm{Bit}\), one can define \(\textrm{Set}\) by

$$\begin{aligned} \textrm{Set}(n)\,=\,\{\textrm{Set}(i):\textrm{Bit}(n,i)=1\}. \end{aligned}$$

Note that \(\textrm{Bit}(n,i)=1\) entails that \(i<n\), so it is a proper recursion.
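The definition of \(\textrm{Set}\) in terms of \(\textrm{Bit}\) translates almost literally into code. The following sketch assumes naturals are Python integers and sets are nested `frozenset` values; the names `bit` and `set_of` are illustrative stand-ins for \(\textrm{Bit}\) and \(\textrm{Set}\).

```python
def bit(n, i):
    """Bit(n, i): 1 if 2**i occurs in the binary expansion of n, else 0."""
    return (n >> i) & 1


def set_of(n):
    """Set(n) = {Set(i) : Bit(n, i) = 1}."""
    # Only positions below n.bit_length() can carry a 1, and each such i is < n.
    return frozenset(set_of(i) for i in range(n.bit_length()) if bit(n, i) == 1)
```

For example, `set_of(5)` has exactly the two elements `set_of(0)` and `set_of(2)`, since \(5=2^2+2^0\).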

Lemma 9.2

(\(\textsf{SST}\)) \(\textrm{Set}\) furnishes a bijection from \(\mathbb {N}\) onto the universe V.

Proof

For injectivity, use induction on \(n+m\) (or course-of-values induction) to show that \(\textrm{Set}(n)=\textrm{Set}(m)\) yields \(n=m\). Suppose \(\textrm{Set}(n)=\textrm{Set}(m)\). Thus, for every i with \(\textrm{Bit}(n,i)=1\) there exists j such that \(\textrm{Bit}(m,j)=1\) and \(\textrm{Set}(i)=\textrm{Set}(j)\). Moreover, \(\textrm{Bit}(n,i)=1\) and \(\textrm{Bit}(m,j)=1\) entail that \(i<n\) and \(j<m\) so that inductively we have \(i=j\). Likewise, for every j with \(\textrm{Bit}(m,j)=1\) there exists i such that \(\textrm{Bit}(n,i)=1\) and \(\textrm{Set}(j)=\textrm{Set}(i)\), so that inductively we get \(j=i\). As a result, \(n=m\).

For surjectivity we use Adduction. Given sets a and b such that \(b\notin a\), suppose there exist \(n_0\) and i such that \(\textrm{Set}(n_0)=a\) and \(\textrm{Set}(i)=b\). Since \(b\notin a\), we have \(\textrm{Bit}(n_0,i)=0\), and hence \(\textrm{Set}(n_0+2^i)=\textrm{Set}(n_0)\,\cup \,\{\textrm{Set}(i)\}=a\,\cup \,\{b\}\). \(\square \)

Corollary 9.3

(\(\textsf{SST}\)) There is a bijective class function \(\textrm{Num}:V\rightarrow \mathbb {N}\) such that

$$\begin{aligned} \textrm{Num}\circ \textrm{Set}=\textrm{id}_{\mathbb {N}} \text{ and } \textrm{Set}\circ \textrm{Num}=\textrm{id}_V. \end{aligned}$$

Moreover,

$$\begin{aligned} b\in a\Leftrightarrow & {} \textrm{Bit}( \textrm{Num}(a),\textrm{Num}(b))=1, \\ \textrm{Num}(a)= & {} \sum _{b\in a}2^{\textrm{Num}(b)}.\end{aligned}$$

Proof

This follows directly from Lemma 9.2.\(\square \)
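The identities of the corollary can be spot-checked numerically. This is a sketch under the same modelling assumptions as before (integers for naturals, nested `frozenset` values for sets); `bit`, `set_of`, and `num` are hypothetical names for \(\textrm{Bit}\), \(\textrm{Set}\), and \(\textrm{Num}\).

```python
def bit(n, i):
    """Bit(n, i): the i-th binary digit of n."""
    return (n >> i) & 1


def set_of(n):
    """Set(n) = {Set(i) : Bit(n, i) = 1}."""
    return frozenset(set_of(i) for i in range(n.bit_length()) if bit(n, i) == 1)


def num(a):
    """Num(a) = sum of 2**Num(b) over the elements b of a."""
    return sum(2 ** num(b) for b in a)


def round_trip_ok(bound):
    """Check Num(Set(n)) = n for all n below bound."""
    return all(num(set_of(n)) == n for n in range(bound))


def membership_ok(bound):
    """Check b ∈ a iff Bit(Num(a), Num(b)) = 1 on the sets coded below bound."""
    sets = [set_of(n) for n in range(bound)]
    return all((b in a) == (bit(num(a), num(b)) == 1) for a in sets for b in sets)
```

Such finite checks are of course no substitute for the proof; they only illustrate how the two codings fit together.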

Definition 9.4

Let \(\in ^*\) be the relation on \(\mathbb {N}\) defined by \(i\in ^*n\) if \(\textrm{Bit}(n,i)=1\). For a set-theoretic formula \(\varphi \), define its translation \(\varphi ^*\) inductively as follows: \((x\in y)^*:= x\in ^*y\); \((x=y)^*:= x=y\); \(\,^*\) commutes with the logical particles \(\lnot ,\wedge ,\vee ,\rightarrow \), and \((Qx\,\phi (x))^*:= Qx\in \mathbb {N}\,\phi (x)^*\) for quantifiers Q.

Theorem 9.5

(Ackermann intuitionistically) For any set-theoretic formula \(\varphi (x_1,\ldots ,x_r)\) with all free variables exhibited,

$$\begin{aligned} \textsf{SST}\vdash \varphi (x_1,\ldots ,x_r) \leftrightarrow \varphi ^*(\textrm{Num}(x_1),\ldots ,\textrm{Num}(x_r) ). \end{aligned}$$

Proof

This is a consequence of Corollary 9.3. Formally one proceeds by (meta) induction on the complexity of \(\varphi \). \(\square \)

Note that Theorem 9.5 says that, in the category of theories and interpretations (see [39]), where two interpretations are the same if they are provably the same in the target theory, the Ackermann interpretation is the same as the identity interpretation.

10 Interpreting SST soundly into arithmetic

The interpretation \(^*\) of Definition 9.4 lends itself to an interpretation of \(\textsf{SST}\) into a system of formal arithmetic. We would like to ascertain that Ackermann’s interpretation also leads to an interpretation of \(\textsf{SST}\) into Heyting Arithmetic.

Definition 10.1

For the function \(\textrm{Bit}\) of Definition 9.1 we assumed that it is defined in the set-theoretic language. Here we need to distinguish it from its definition in Heyting Arithmetic, for which we use the notation \(\textrm{Bit}_{\textrm{a}}\). Let \(x\in ^{\mathfrak {a}}y\) be the arithmetic formula \(\textrm{Bit}_{\textrm{a}}(y,x)=1\). For a set-theoretic formula \(\varphi \), define its Ackermann interpretation,Footnote 5\(\varphi ^\mathfrak {a}\), inductively as follows: \((x\in y)^\mathfrak {a}:= x\in ^\mathfrak {a}y\), \((x=y)^\mathfrak {a}:= x=y\), \(^\mathfrak {a}\) commutes with the logical particles \(\lnot ,\wedge ,\vee ,\rightarrow \), as well as the quantifiers, that is, \((Qx\,\phi (x))^\mathfrak {a}:= Qx\,\phi (x)^\mathfrak {a}\) for quantifiers Q.

Theorem 10.2

Whenever \(\textsf{SST}\vdash \phi \), then \(\textsf{HA} \vdash \phi ^\mathfrak {a}\).

Proof

It suffices to check that the \(\mathfrak {a}\)-translations of \(\textsf{SST}\) axioms are deducible in \(\textsf{HA}\).

  1. 1.

    (Extensionality)

    $$\begin{aligned} \forall x\forall y\,[ x = y \leftrightarrow \forall z \ (\textrm{Bit}_\textrm{a}(x, z) = \textrm{Bit}_\textrm{a}(y, z))] \end{aligned}$$

    is a theorem of HA provable in the same way as in Lemma 9.2.

  2. 2.

    (Empty set) \(\forall x\, \textrm{Bit}_\textrm{a}(0, x) = 0\) is an obvious theorem of HA.

  3. 3.

    (y-successor of x)

    $$\begin{aligned} \forall x\forall y\exists z\forall u\,[u\in ^\mathfrak {a}z\leftrightarrow (u\in ^{\mathfrak {a}} x\,\vee \,u=y)] \end{aligned}$$

    is provable in HA since, for given numbers x and y, this works with

    $$\begin{aligned} z={\left\{ \begin{array}{ll} x &{} \text {if } \textrm{Bit}_{\textrm{a}}(x, y) = 1 \\ x + 2^{y} &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$
  4. 4.

    (Adduction) Let \(\psi (x)\) be any formula in the language of HA. The following is a theorem of HA.

    $$\begin{aligned} \psi (0) \,\wedge \, \forall x\forall y\,[ \lnot (y\in ^\mathfrak {a}x)\, \wedge \, \psi (x) \,\wedge \, \psi (y)\rightarrow \psi (x + 2^{y})] \rightarrow \forall x\, \psi (x). \end{aligned}$$

    To see this in HA, one employs induction to demonstrate formally that every number is either 0 or is a sum of pairwise distinct powers of 2.

\(\square \)
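The four arithmetic facts used in this proof can be tested on an initial segment of the naturals. The sketch below is a finite sanity check, not a derivation in HA; the function names `bit`, `mem`, `adjoin`, `split`, and `by_adduction` are hypothetical, with `mem(x, y)` standing for \(x\in ^{\mathfrak {a}}y\).

```python
def bit(n, i):
    """Bit_a(n, i): the i-th binary digit of n."""
    return (n >> i) & 1


def mem(x, y):
    """Arithmetic membership: x is a 'member' of y iff Bit_a(y, x) = 1."""
    return bit(y, x) == 1


def adjoin(x, y):
    """Witness z for the successor axiom: the code of x with y adjoined."""
    return x if mem(y, x) else x + 2 ** y


def split(n):
    """For n > 0, return (x, y) with n = x + 2**y and y not a 'member' of x."""
    y = n.bit_length() - 1  # position of the leading binary digit
    return n - 2 ** y, y


def by_adduction(base, step, n):
    """Evaluate along the Adduction pattern: a value at 0, then at x + 2**y."""
    if n == 0:
        return base
    x, y = split(n)
    return step(by_adduction(base, step, x), by_adduction(base, step, y))
```

With `base = 0` and `step = lambda vx, vy: vx + 1`, the function `by_adduction` counts the 'members' of n, which shows how every positive number is reached from 0 by adjoining distinct powers of 2.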

11 Interpreting HA soundly into SST

In view of Corollary 7.9, we obtain a canonical interpretation of \(\textsf{HA}\) into \(\textsf{SST}\). This together with Theorem 10.2 yields the mutual interpretability of \(\textsf{SST}\) and \(\textsf{HA}\). The aim of this paper, however, is to show that a very tight connection obtains between the two theories, not just mere interpretability in both directions, namely that there is an inverse \({\mathfrak {b}}\) to the interpretation \(\mathfrak {a}\), showing that the theories are definitionally equivalent. The latter notion will be clarified in Definition 12.1.

The usual language of \(\textsf{HA}\) has function symbols for the successor, addition and multiplication functions (and perhaps more primitive recursive functions), but the language of \(\textsf{SST}\) has no function symbols. To interpret the terms of \(\textsf{HA}\) in \(\textsf{SST}\) one would have to unwind these terms or beef up the set-theoretic language by function symbols. Here we will assume that \(\textsf{HA}\) is formulated with relation symbols for the less-than relation < and the graphs of the successor, addition, and multiplication functions, which we denote by \(\textsf{Suc},\textsf{Add}\) and \(\textsf{Mult}\), respectively. The version of \(\textsf{HA}\) with function symbols is then an extension by definitions of its version with relation symbols in the usual sense (cf. [32, p. 60]).

Definition 11.1

(The interpretation of HA into SST) For a formula \(\varphi \) of the language of \(\textsf{HA}\), the formula \(\varphi ^{{\mathfrak {b}}}\) of \(\textsf{SST}\) is obtained as follows.

  1. 1.

    Replace the constant 0 by \(0_o\) (where we now assume that \(\textsf{SST}\)’s language has a constant \(0_o\) for the empty set).

  2. 2.

    Leave \(=\) untouched but replace \(x<y\), \(\textsf{Suc}(x,y)\), \(\textsf{Add}(x,y,z)\), and \(\textsf{Mult}(x,y,z)\) by \(\textrm{Num}(x)<_o\textrm{Num}(y)\), \(\textrm{Num}(x)+_o1_o=\textrm{Num}(y)\), \(\textrm{Num}(x)+_o\textrm{Num}(y)=\textrm{Num}(z)\), and \(\textrm{Num}(x)\cdot _o\textrm{Num}(y)=\textrm{Num}(z)\), respectively, where \(<_o\) is the less-than relation on ordinals, \(1_o\) is the next ordinal after \(0_o\), and \(+_o,\cdot _o\) stand for the obvious functions of ordinal addition and ordinal multiplication, respectively, as formalized in the language of \(\textsf{SST}\).

  3. 3.

    \({\mathfrak {b}}\) commutes with the connectives \(\lnot ,\wedge ,\vee ,\rightarrow \) as well as the quantifiers, that is, \((\lnot \varphi )^{\mathfrak {b}}:=\lnot \varphi ^{\mathfrak {b}}\), \((\varphi \,\diamond \,\psi )^{\mathfrak {b}}:= \varphi ^{\mathfrak {b}}\,\diamond \,\psi ^{\mathfrak {b}}\) for \(\diamond \in \{\wedge ,\vee ,\rightarrow \}\), \((\exists u\varphi )^{\mathfrak {b}}:= \exists u\varphi ^{\mathfrak {b}}\) and \((\forall u\varphi )^{\mathfrak {b}}:= \forall u\varphi ^{\mathfrak {b}}\).

Theorem 11.2

For all sentences \(\phi \) of (the relational version of) \(\textsf{HA}\), if \(\textsf{HA} \vdash \phi \), then \(\textsf{SST} \vdash \phi ^{{\mathfrak {b}}}\).

Proof

Given previous results, this is obvious. For instance, the deducibility of the interpretation of successor induction follows from Theorem 6.11. \(\square \)

12 Definitional equivalence and conservativeness

In view of Theorems 10.2 and 11.2 we know that HA and SST are mutually interpretable in each other. One goal of this paper, however, is to show that the connection between the theories is even tighter, namely that the interpretations \({\mathfrak {a}}\) and \({{\mathfrak {b}}}\) are inverses of each other, which will be captured by the notion of definitional equivalence. We now define the key notions.

Definition 12.1

(Mutual interpretability and definitional equivalence of theories)

  1. 1.

    To simplify matters, we will assume that all languages are purely relational. Let \(T_{1}\) and \(T_{2}\) be theories with languages \(L_1\) and \(L_2\), respectively. A translation f from \(L_1\) to \(L_2\) is given by a formula \(\psi _D(x)\) of \(L_2\) (with sole free variable x) such that \(T_2\vdash \exists x\psi _D(x)\) and a mapping of atomic formulas \(R(x_1,\ldots ,x_r)\) of \(L_1\) to formulas \(R(x_1,\ldots ,x_r)^f\) of \(L_2\) in the same free variables. Moreover, we require that \((x_1=x_2)^f\) be just \(x_1=x_2\). f is then canonically extended to all formulas of \(L_1\) by requiring f to commute with the connectives, i.e., \((\lnot \varphi )^f\) is \(\lnot \varphi ^f\) and \((\varphi _1\diamond \varphi _2)^f\) is \(\varphi _1^f\,\diamond \,\varphi _2^f\) (\(\diamond \in \{\wedge ,\vee ,\rightarrow \}\)), and letting \((\forall x\theta (x))^f\) and \((\exists x\theta (x))^f\) be \(\forall x[\psi _D(x)\rightarrow \theta (x)^f]\) and \(\exists x[\psi _D(x)\,\wedge \, \theta (x)^f]\), respectively. f is said to interpret \(T_1\) in \(T_2\) if, for all sentences \(\theta \) of \(L_1\), \(T_1\vdash \theta \) yields \(T_2\vdash \theta ^f\).

  2. 2.

    Theories \(T_{1}\) and \(T_{2}\) are mutually interpretable if there are interpretations both ways, that is, a translation f interpreting \(T_1\) in \(T_2\) and a translation g interpreting \(T_2\) in \(T_1\).

  3. 3.

    When the conditions listed in (2) are met, f and g are the interpretation functions.

  4. 4.

    Theories \(T_{1}\) and \(T_{2}\) are definitionally equivalent whenever, in addition to the conditions listed in (2),Footnote 6

    1. (a)

      For any sentence \(\phi \) of \(T_{1}\), \((\phi ^f)^g\) is provably equivalent to \(\phi \) in \(T_{1}\), and

    2. (b)

      For any sentence \(\psi \) of \(T_{2}\), \((\psi ^g)^f\) is provably equivalent to \(\psi \) in \(T_{2}\).
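The canonical extension of a translation from atomic formulas to all formulas is a straightforward structural recursion. The sketch below is one possible rendering, with formulas represented as nested tuples; the tags (`"eq"`, `"atom"`, `"not"`, and so on), the function `translate`, and the example maps `ackermann_atoms` and `trivial_domain` are all assumptions made for illustration.

```python
def translate(phi, atom_map, domain):
    """Extend a map on atomic formulas to all formulas, relativizing quantifiers.

    phi      -- formula as a nested tuple
    atom_map -- maps an atomic formula ("atom", name, args) to a target formula
    domain   -- maps a variable name v to the target formula psi_D(v)
    """
    tag = phi[0]
    if tag == "eq":        # equality is translated as itself
        return phi
    if tag == "atom":
        return atom_map(phi)
    if tag == "not":
        return ("not", translate(phi[1], atom_map, domain))
    if tag in ("and", "or", "imp"):
        return (tag,
                translate(phi[1], atom_map, domain),
                translate(phi[2], atom_map, domain))
    if tag == "forall":    # (forall x theta)^f is forall x [psi_D(x) -> theta^f]
        v, body = phi[1], phi[2]
        return ("forall", v, ("imp", domain(v), translate(body, atom_map, domain)))
    if tag == "exists":    # (exists x theta)^f is exists x [psi_D(x) and theta^f]
        v, body = phi[1], phi[2]
        return ("exists", v, ("and", domain(v), translate(body, atom_map, domain)))
    raise ValueError(f"unknown formula tag: {tag!r}")


def ackermann_atoms(atom):
    """Send the membership atom x in y to an atom read as Bit(y, x) = 1."""
    _, name, (x, y) = atom
    assert name == "in"
    return ("atom", "BitIs1", (y, x))


def trivial_domain(v):
    """A domain formula that every object satisfies: v = v."""
    return ("eq", v, v)
```

With a trivial domain formula the relativized quantifiers are equivalent to the unrelativized ones, which matches translations that simply commute with the quantifiers.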

N.B. A crucial improvement of definitional equivalence over mere mutual interpretability is its faithfulness, namely that nontheorems are transformed into nontheorems, too. It is easy to come up with simple examples showing that mutual interpretability does not suffice for definitional equivalence.Footnote 7

Proposition 12.2

Definitional equivalence is transitive.

Proof

Obvious.

\(\square \)

13 Classical counterpart

Kaye and Wong [14], 497, describe the (classical) “folklore” as follows:

The first-order theories of Peano arithmetic and ZF set theory with the axiom of infinity negated are equivalent.

Let \(\textsf{ZF}^-\) be ordinary Zermelo–Fraenkel set theory (without choice), but with the axiom of infinity replaced by its negation. And let PA be classical Peano arithmetic. Of course PA is obtained from HA by adding excluded middle. So the “folklore” is that \(\textsf{ZF}^-\) is equivalent to PA.

Equivalent in what sense? Kaye and Wong point out that \(\textsf{ZF}^-\) is not definitionally equivalent to PA (see also [15]). Moreover, unlike ZF, \(\textsf{ZF}^-\) does not prove induction for membership (what we call \(\in \)-induction in Sect. 6), nor does it prove that every set has a transitive closure. Indeed, \(\textsf{ZF}^-\) does not prove that every set is a subset of a transitive set, a principle sometimes called “transitive containment.” By contrast, all of these are provable in SST (see Theorem 6.11 and Corollary 7.2).

Kaye and Wong [14] call ZF-inf* the theory \(\textsf{ZF}^-\) plus transitive containment. They show that this theory is equivalent to \(\textsf{ZF}^-\) plus a statement that every set has a transitive closure, and also equivalent to \(\textsf{ZF}^-\) plus \(\in \)-induction. It follows from the results in Sect. 6 that SST\(^C\) just is ZFC-inf*. All told, then, we have a simple and elegant axiomatization of the classical theory of hereditarily finite sets.

Moreover, Kaye and Wong [14] show, in effect, that ZF-inf* is definitionally equivalent to PA. Hence, \(\textsf{SST}^C\) and \(\textsf{PA}\) are definitionally equivalent.

We now return to matters intuitionistic.

14 The definitional equivalence theorem

Theorem 14.1

The systems \(\textsf{SST}\) and \(\textsf{HA}\) are definitionally equivalent.

Proof

This is basically the same proof as for [14, Theorem 19], but for the reader’s convenience we present it all the same.

We will show that the interpretations \(\mathfrak {a}: \textsf{SST}\rightarrow \textsf{HA}\) and \({\mathfrak {b}}:\textsf{HA}\rightarrow \textsf{SST}\) are inverse to each other, meaning that

$$\begin{aligned}{} & {} \textsf{HA}\vdash \forall \textbf{x}\,[(\varphi (\textbf{x}\,)^{\mathfrak {b}})^\mathfrak {a}\leftrightarrow \varphi (\textbf{x}\,)] \end{aligned}$$
(5)
$$\begin{aligned}{} & {} \textsf{SST}\vdash \forall \textbf{x}\,[(\psi (\textbf{x}\,)^\mathfrak {a})^{\mathfrak {b}}\leftrightarrow \psi (\textbf{x}\,)] \end{aligned}$$
(6)

hold for all formulas \(\varphi (\textbf{x}\,)\) of \(\textsf{HA}\) and all formulas \(\psi (\textbf{x}\,)\) of \(\textsf{SST}\). Fortunately, since these interpretations do not affect the logical connectives, the quantifiers, and equality, it suffices to check this for atomic formulas not involving \(=\).

To show (5), we argue in \(\textsf{HA}\). Write \(x<'y \) for \((x< y)^{\mathfrak {a}{\mathfrak {b}}}\), \(0'\) for the \(<'\)-least number, \(1'\) for the \(<'\)-least number after \(0'\), \(x+'y\) for the unique z such that \(\textsf{Add}(x,y,z)^{\mathfrak {a}{\mathfrak {b}}}\), and \(x\cdot 'y\) for the unique u such that \(\textsf{Mult}(x,y,u)^{\mathfrak {a}{\mathfrak {b}}}\). One checks that \(0'=0\), \(1'=1\), and \(\forall x \, \textsf{Suc}(x,x+'1')\). By induction on y, this implies that \(\forall x\forall y\,\textsf{Add}(x,y,x+'y)\). As \(x<y \leftrightarrow \exists z\,\,(z\ne 0\,\wedge \, \textsf{Add}(x,z,y))\) and also \(x<'y \leftrightarrow \exists z\,\,(z\ne 0\,\wedge \, y=x+'z)\), one infers that \(\forall x\forall y\,(x<y\leftrightarrow x<'y)\). In the same vein one shows \(x\cdot '(y+'1')=x\cdot ' y+'x\), so that by induction, \(\forall x\forall y \,\textsf{Mult}(x,y,x\cdot ' y)\), as desired.

For (6) one works in \(\textsf{SST}\). Writing \(x\in 'y\) for \((x\in y)^{{\mathfrak {b}}\mathfrak {a}}\) one shows by \(\in \)-induction that \(\forall x\forall y(x\in y\leftrightarrow x\in 'y)\) from which the desired result follows. \(\square \)

Corollary 14.2

\(\textsf{SST}\) is conservative over \(\textsf{HA}\), with respect to \({\mathfrak {b}}\), in that, for \(\phi \) from the language of \(\textsf{HA}\), when \(\textsf{SST}\vdash \phi ^{\mathfrak {b}}\), \(\textsf{HA}\vdash \phi \). Similarly, \(\textsf{HA}\) is conservative over \(\textsf{SST}\) with respect to \(\mathfrak {a}\). (In the context of interpretations the notion of conservativity is also known as faithfulness.)

Proof

Immediate from the definition of definitional equivalence. \(\square \)

15 Some metamathematics of \(\textsf{SST}\)

Heyting arithmetic is known for having several pleasing meta-mathematical features, among them the disjunction property and the existence property (due to Kleene [16]). Owing to definitional equivalence, they propagate to \(\textsf{SST}\).

Corollary 15.1

  1. (i)

    \(\textsf{SST}\) has the disjunction property, that is, whenever \(\textsf{SST}\vdash \phi \vee \psi \), where \(\phi \) and \(\psi \) are sentences, then \(\textsf{SST}\vdash \phi \) or \(\textsf{SST}\vdash \psi \).

  2. (ii)

    \(\textsf{SST}\) has the existence property, that is, whenever \(\textsf{SST}\vdash \exists x \theta (x)\) for a sentence \(\exists x \theta (x)\), then there exists a formula \(\vartheta (x)\) with at most x free such that

    $$\begin{aligned} \textsf{SST}\vdash \exists ! x\,[\vartheta (x)\,\wedge \,\theta (x)]. \end{aligned}$$

Proof

  1. (i)

    By Theorem 10.2, \(\textsf{SST}\vdash \phi \vee \psi \) yields \(\textsf{HA}\vdash \phi ^\mathfrak {a}\,\vee \,\psi ^\mathfrak {a}\), thus \(\textsf{HA}\vdash \phi ^\mathfrak {a}\) or \(\textsf{HA}\vdash \psi ^\mathfrak {a}\), whence, by Theorem 11.2, \(\textsf{SST}\vdash (\phi ^\mathfrak {a})^{\mathfrak {b}}\) or \(\textsf{SST}\vdash (\psi ^\mathfrak {a})^{\mathfrak {b}}\), and consequently, \(\textsf{SST}\vdash \phi \) or \(\textsf{SST}\vdash \psi \), owing to Theorem 14.1.

  2. (ii)

    Suppose that \(\textsf{SST}\vdash \exists x \theta (x)\). Then \(\textsf{HA}\vdash \exists x\,\theta (x)^\mathfrak {a}\). The existence property for the usual functional version of \(\textsf{HA}\) furnishes a numeral \(\bar{n}\) such that \(\textsf{HA}\vdash \theta (\bar{n})^\mathfrak {a}\). However, for the interpretation \({\mathfrak {b}}\) we used the relational version of \(\textsf{HA}\). So let \(\phi (x)\) be a description of the n-th natural number in the relational language. Then \(\textsf{HA}\vdash \exists !x\,[\phi (x)\wedge \theta (x)^\mathfrak {a}]\) and therefore \(\textsf{SST}\vdash \exists !x\, [\phi (x)^{\mathfrak {b}}\,\wedge \, (\theta (x)^\mathfrak {a})^{\mathfrak {b}}]\), whence \(\textsf{SST}\vdash \exists !x\, [\vartheta (x)\,\wedge \, \theta (x)]\), with \(\vartheta (x)\) being \(\phi (x)^{\mathfrak {b}}\).

\(\square \)

15.1 Church’s theses for sets

\(\textsf{HA}\) is known to be consistent (e.g. via Kleene’s realizability interpretation [16]) with Intuitionistic Church’s Thesis or CT, that is, the statement that every total binary relation on the natural numbers comprises the graph of a natural number function that is Turing computable. CT lends itself to various versions germane to \(\textsf{SST}\).

Definition 15.2

  1. 1.

    Let \(\text {T}(n, m, p)\) be Kleene’s informal arithmetic computation or T-predicate. Informally, \(\textrm{T}(n,m,p)\) holds if and only if p is the complete code of a computation of the Turing machine with index n on input m. Let \(\mathrm {T_{set}}(x,y,z)\) be the corresponding predicate defined within the language of SST via the \({\mathfrak {b}}\)-translation as \((\textrm{T}(x,y,z))^{\mathfrak {b}}\).

  2. 2.

    Let \(\textrm{U}(n, m)\) be Kleene’s upshot predicate. Informally, \(\textrm{U}(n,m)\) holds just in case, whenever n is the code of a complete computation, m is the output of that computation. Let \(\mathrm {U_{set}}(u,v)\) be the predicate defined within the language of SST via the \({\mathfrak {b}}\)-translation as \((\textrm{U}(u,v))^{\mathfrak {b}}\).

  3. 3.

    A total (class) endofunction F on the universe of sets is said to be Turing computable on sets if there exists a set e such that

    $$\begin{aligned} \forall y\,\exists z\,[\mathrm {T_{set}}(e,y,z)\,\wedge \,\mathrm {U_{set}}(z,F(y))]. \end{aligned}$$

    Church’s Thesis for Sets or \(\mathrm {CT_{set}}\) is the claim that every total (class) relation on the universe of SST comprises the graph of a class function F that is Turing computable on sets. In the language of SST, \(\mathrm {CT_{set}}\) is expressed by the following scheme, with \(\phi \) a formula,

    $$\begin{aligned} \forall x \exists y \, \phi (x,y) \rightarrow \exists e \forall x \exists u \exists v\, [\mathrm {T_{set}}(e, x, u) \wedge \mathrm {U_{set}}(u, v) \wedge \phi (x, v)]. \end{aligned}$$

In view of Theorem 14.1 we then have:

Theorem 15.3

\(\textsf{SST}+\mathrm {CT_{set}}\) is a consistent theory. Moreover, \(\textsf{HA}+\textrm{CT}\) and \(\textsf{SST}+\mathrm {CT_{set}}\) are definitionally equivalent theories.

\(\mathrm {CT_{set}}\) embodies a version of Church’s thesis that is based on the idea of using the coding function \(\textrm{Num}\) of Corollary 9.3 to pull Turing computability over \(\mathbb {N}\) back onto the class of all sets, the universe of hereditarily finite sets of SST. This approach can be chided for being too parasitic on Turing computability. Perhaps, in its stead one would like to see a genuinely set-theoretic notion of computability on sets.
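The idea of pulling computability on \(\mathbb {N}\) back to sets along the coding can be illustrated by conjugating a number-theoretic function with \(\textrm{Num}\) and \(\textrm{Set}\). The sketch below does not model Kleene's T-predicate or machine indices at all; it only shows the transport of a function, and the names `bit`, `set_of`, `num`, `lift`, and `adjoin_self` are hypothetical.

```python
def bit(n, i):
    """Bit(n, i): the i-th binary digit of n."""
    return (n >> i) & 1


def set_of(n):
    """Set(n) = {Set(i) : Bit(n, i) = 1}."""
    return frozenset(set_of(i) for i in range(n.bit_length()) if bit(n, i) == 1)


def num(a):
    """Num(a) = sum of 2**Num(b) over the elements b of a."""
    return sum(2 ** num(b) for b in a)


def lift(f):
    """Transport f on naturals to a function on sets: a -> Set(f(Num(a)))."""
    return lambda a: set_of(f(num(a)))


# On codes, n -> n + 2**n adjoins the set coded by n to itself,
# so the lifted function should send a to a ∪ {a}.
adjoin_self = lift(lambda n: n + 2 ** n)
```

Since \(n<2^n\), the n-th binary digit of n is 0, which is why adding \(2^n\) adjoins a genuinely new element in this example.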

Such an alternative approach exists. It is known as set recursion or E-recursion (see, e.g., [26] or [21] or [31, Ch.X]). In it, one has unlimited application of sets to sets, conveyed by the symbol \(\{e\}(x)\), where e and x are sets. Here Kleene’s curly bracket notation is used to signify that e is viewed as an index for a partial (class) function on the universe of sets which takes a set input x to produce a result \(\{e\}(x)\) if x happens to be in the domain of that function. This notion of computability over the universe of sets is based on a few simple starting functions (known as rudimentary functions, such as \(x,y\mapsto x \cup \{ y\}\)) and also has the combinators k and s from combinatory logic built in. Formally, it is introduced via an inductive definition (similar to [37, Ch. 3, 7.2]). It allows one to view the universe of sets as a class-sized partial combinatory algebra. Various notions of realizability can be based on this pca (see [30]). Moreover, it gives rise to a general set-theoretic version of Church’s thesis, dubbed \(\textrm{CTS}\) (see also [36] and [38]).

Definition 15.4

We write \(\{e\}(x)\simeq y\) to indicate that x is in the domain of the function with index e and y is the value of that function applied to x. In the language of set theory, let \(\textrm{CTS}\) be the following scheme, with \(\phi \) any formula:

$$\begin{aligned} \forall x \exists y \, \phi (x,y) \rightarrow \exists e \forall x\exists y\, [\{e\}(x)\simeq y \,\wedge \, \phi (x,y)]. \end{aligned}$$

Indices for set recursive functions can serve as realizers of formulae.

Definition 15.5

We define a relation \(a\Vdash \theta \) between sets a and set-theoretic formulae \(\theta \) by induction on the buildup of \(\theta \). Below, \(a\bullet c\Vdash \theta \) will be an abbreviation for \(\exists x[\{a\}(c)\simeq x\,\wedge \,x\Vdash \theta ]\).

Theorem 15.6

Let \(\phi (u_1,\ldots ,u_r)\) be a formula all of whose free variables are among \(u_1,\ldots ,u_r\). If

$$\begin{aligned} \textsf{SST}+\textrm{CTS}\vdash \phi (u_1,\ldots ,u_r), \end{aligned}$$

then one can effectively construct an index of an E-recursive function g such that

$$\begin{aligned} \textsf{SST}\vdash \forall a_1,\ldots , a_r\,\; g(a_1,\ldots ,a_r)\Vdash \phi (a_1,\ldots ,a_r)\,. \end{aligned}$$

Proof

Similar to [30, Theorem 4.2]. \(\square \)

There is also a version of realizability combined with truth (see [30, 3.1]) that can be used to prove the disjunction and existence properties for \(\textsf{SST}\) directly rather than using definitional equivalence with \(\textsf{HA}\).

An obvious question is how the two set-theoretic versions of Church’s thesis are related to each other. One can show the following.

Theorem 15.7

\(\textsf{SST}+\mathrm {CT_{set}}\) and \(\textsf{SST}+\textrm{CTS}\) are the same theory; more precisely, they prove the same theorems.

Proof

One shows that every set-recursive function induces a Turing-computable function on \(\mathbb {N}\) via the translation \({\mathfrak {b}}\). Conversely, one shows that every Turing-computable function on \(\mathbb {N}\) gives rise to a set-recursive function via \(\mathfrak {a}\). \(\square \)

16 Categoricity of SST in Markovian set theories

Prominent principles of Markov’s constructivism are Church’s thesis (CT) and Markov’s Principle (MP). MCarty showed in [19] that augmentations of standard intuitionistic set theories (e.g., \(\textsf{IZF}\) and \(\textsf{CZF}\)) by CT and MP prove the categoricity of \(\textsf{HA}\), that is, that all models of \(\textsf{HA}\) are isomorphic. Notably, categoricity entails that there are no nonstandard models of arithmetic. This result carries over to models of \(\textsf{SST}\), as will be spelled out below. First, we state MP.

Definition 16.1

Markov’s Principle or MP is the following statement expressed in the language of set theory.

If S is a decidable subset of the natural numbers, and \(\lnot \lnot \,\exists n\mathop {\in }\mathbb {N}\, \, n\mathop {\in }S\), then \(\exists n\mathop {\in }\mathbb {N}\,\, n \in S\).
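Read computationally, MP says that an unbounded search through a decidable set S must terminate once it has been shown impossible that S is empty. A minimal Python sketch of that search follows; `markov_search` and `in_s` (a decision procedure for membership in S) are hypothetical names introduced for illustration.

```python
# Illustrative sketch only. Assumption: in_s is a total decision procedure for S.
from itertools import count
from typing import Callable

def markov_search(in_s: Callable[[int], bool]) -> int:
    """Return the least n with in_s(n) true; runs forever if S is empty."""
    for n in count():
        if in_s(n):
            return n

# Example: S is the set of n whose square exceeds 50.
witness = markov_search(lambda n: n * n > 50)
```

The code itself carries no guarantee of termination. MP is precisely the principle that licenses the step from "the search cannot run forever" to "the search returns a witness".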

We could well have pursued the preceding study of SST and its interpretations entirely within the Intuitionistic Zermelo–Fraenkel set theory IZF or Constructive Zermelo–Fraenkel set theory CZF plus CT and MP. IZF+CT+MP is consistent relative to IZF and CZF+CT+MP is consistent relative to CZF; number realizability for cumulative sets demonstrates this (see [18] and [29], respectively).

As described in Corollary 9.3, the isomorphism articulated by the function Set between natural numbers and hereditarily finite sets shows that models of SST containing only standard natural numbers, i.e., T-sets representing standard natural numbers, must be standard. It is known (see MCarty [19]) that, within the contexts of IZF and CZF, CT implies that there are no nonstandard natural numbers, while CT plus MP imply that all models of HA are isomorphic. Therefore, we have the following theorem.

Theorem 16.2

(i) \(\textsf{IZF}+\textrm{CT}\) and \(\textsf{CZF}+\textrm{CT}\) prove that in models of \(\textsf{SST}\) there are no nonstandard natural numbers.

(ii) \(\textsf{IZF}+\textrm{CT}+\textrm{MP}\) and \(\textsf{CZF}+\textrm{CT}+\textrm{MP}\) prove that \(\textsf{SST}\) is categorical: \(\textsf{SST}\)’s sole model, up to isomorphism, is the universe of hereditarily finite sets. Indeed, there is a single theorem of \(\textsf{SST}\) that is itself categorical, as per the techniques developed in [19].

Remark 16.3

The categoricity results have miniaturised counterparts in terms of interpretability. See for example Appendix C of [17].