1 Introduction

In formal languages, terms are usually treated as those elements of the language which only refer to objects in the domain of discourse. This treatment prevails in particular in proof theory and automated deduction, where usually only functional terms are admitted. In contrast, in natural languages naming expressions are very often used not only to refer to objects but also to convey information about them. In the earlier stages of the development of mathematical logic several formal devices were introduced for this purpose, which are currently rather neglected. These term-forming operators (tfos for short), also called variable-binding term operators (vbtos), include, among others:

  • iota-operator (Peano): \(\imath x\varphi \) - the (only) x such that \(\varphi \);

  • epsilon-operator (Hilbert): \(\epsilon x\varphi \) - a(n) x such that \(\varphi \);

  • abstraction-operator: \(\{x:\varphi \}\) - the set of (all) x satisfying \(\varphi \);

  • counting-operator (Frege): \(\sharp x\varphi \) - the number of x such that \(\varphi \);

  • lambda-operator (Church): \(\lambda x\varphi \) - the property of being \(\varphi \).

It seems that currently only the lambda-operator is treated as an important tool; it has found diverse applications in recursion theory, type theory and proof theory. The abstraction-operator, although commonly used in practice, is rarely treated seriously in the formal development of set theories. The remaining ones are regarded as formal tools of merely historical value. Since the role of complex terms as information-conveying tools is crucial in communication, it is important to fill this gap.

Recently, more attention has been paid to the proof theory of definite descriptions. In particular, cut-free sequent calculi were provided for Fregean [11], Russellian [17] and free description theories [13]. The latter theories were also characterised in terms of tableau systems [18], and a tableau calculus was also used to develop a Russellian theory in a language enriched with the lambda-operator [19]. Some modal logics of definite descriptions were also developed in terms of cut-free sequent calculi [10]; in particular, the logic of Fitting and Mendelsohn [5] was independently formalised as a labelled sequent calculus [28] and as a hybrid system [12]. Alternatively, interesting natural deduction and sequent calculi were proposed for free and intuitionistic logics of definite descriptions characterised in terms of a binary quantifier [21,22,23,24,25].

Since definite descriptions are amenable to proof-theoretic treatment, it is tempting to suspect that equally interesting results can be obtained for other tfos. Perhaps one should start by asking whether a general theory of such operators is possible. In fact, at least two different attempts to develop such a theory have been made. The earlier approach was introduced independently by several authors, including Scott [32], Da Costa [3, 4], Hatcher [7, 8], and Corcoran and Herring [1, 2]. It was formulated semantically and as an axiomatic theory. In what follows it will be called simply S-theory (after Scott). The second approach was introduced by Neil Tennant [33], and then developed in [35] as a general theory of abstraction operators (see also [34, 36]). This T-theory was formulated as a natural deduction system with an adequate semantical characterisation. In what follows we will examine these two approaches and show in Sect. 3 how they can be formulated as well-behaved sequent calculi. Then, in Sect. 4, we consider their specification with respect to the set-abstraction operator. To this end we focus on Quine’s version of set theory NF (New Foundations) [29] (see also [30]), but the proposed systems may be modified to apply to other formulations of set theory as well.

2 Preliminaries

We will be using standard first-order predicate languages with quantifiers \(\forall , \exists \), identity predicate \(=\) and an arbitrary term-forming operator \(\tau \) which makes complex terms from formulae of the language. The definition of a term and a formula is standard, by simultaneous recursion on both categories. In the presented system the only terms are variables and complex terms constructed by means of an arbitrary unary tfo \(\tau \). The complex terms are written as \(\tau x\varphi \), where \(\varphi \) is a formula in the scope of the respective operator.

In accordance with Gentzen’s custom we divide individual variables into bound variables \(VAR = \{x,y,z, \ldots \}\) and free variables (parameters) \(PAR = \{a, b, c, \ldots \}\). This simplifies some technical issues concerning substitution and proof transformations. In the metalanguage \(\varphi , \psi , \chi \) denote any formulae and \(\varGamma , \varDelta , \varPi , \varSigma \) their multisets. Metavariables \(t, t_1, \ldots \) denote arbitrary terms. \(\varphi [t_1/t_2]\) is officially used for the operation of correct substitution of a term \(t_2\) for all occurrences of a term \(t_1\) (a variable or parameter) in \(\varphi \), and similarly \(\varGamma [t_1/t_2]\) for a uniform substitution in all formulae in \(\varGamma \). Occasionally, we will use the simplified notation \(\varphi (t)\) to denote the result of correct substitution.
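The simultaneous recursion on terms and formulae, and the substitution for parameters, can be sketched as a small data structure. This is only an illustration in Python, not part of the calculi discussed here: the class names (`Var`, `Par`, `Tau`, `Atom`, `Not`, `And`, `Forall`) and the function `subst` are hypothetical, only a fragment of the connectives is included, and the substituted term is assumed to contain no free bound variables, so that no capture can occur.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    """Bound variable from VAR = {x, y, z, ...}."""
    name: str


@dataclass(frozen=True)
class Par:
    """Free variable (parameter) from PAR = {a, b, c, ...}."""
    name: str


@dataclass(frozen=True)
class Tau:
    """Complex term 'tau x phi': a term built from a formula."""
    var: Var
    body: "Formula"


@dataclass(frozen=True)
class Atom:
    """Predicate applied to terms; '=' is treated as a binary predicate."""
    pred: str
    args: Tuple["Term", ...]


@dataclass(frozen=True)
class Not:
    sub: "Formula"


@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Forall:
    var: Var
    body: "Formula"


Term = Union[Var, Par, Tau]
Formula = Union[Atom, Not, And, Forall]


def subst(e, old: Par, new):
    """Replace every occurrence of the parameter `old` in `e` by the term `new`.

    Because bound variables and parameters are syntactically separated,
    no renaming of bound variables is needed (assuming `new` is closed
    with respect to bound variables).
    """
    if isinstance(e, Par):
        return new if e == old else e
    if isinstance(e, Var):
        return e
    if isinstance(e, Tau):
        return Tau(e.var, subst(e.body, old, new))
    if isinstance(e, Atom):
        return Atom(e.pred, tuple(subst(t, old, new) for t in e.args))
    if isinstance(e, Not):
        return Not(subst(e.sub, old, new))
    if isinstance(e, And):
        return And(subst(e.left, old, new), subst(e.right, old, new))
    if isinstance(e, Forall):
        return Forall(e.var, subst(e.body, old, new))
    raise TypeError(f"unexpected expression: {e!r}")


# P(a, tau x Q(x, a)) with a replaced by b, also inside the complex term
x, a, b = Var("x"), Par("a"), Par("b")
phi = Atom("P", (a, Tau(x, Atom("Q", (x, a)))))
phi_b = subst(phi, a, b)
```

Note that the substitution descends into the formula inside a complex term, which is exactly where terms and formulae depend on each other.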

First-order logic in general will be abbreviated as FOL or FOLI if identity is primitive. CFOL(I), PFFOL(I), NFFOL(I) denote the classical, positive free and negative free versions. The basic system GC for CFOL consists of the following rules:

figure a

where a is a fresh parameter (eigenvariable), not present in \(\varGamma , \varDelta \) and \(\varphi \).

If instead of \((\forall \!\!\Rightarrow )\) and \((\Rightarrow \!\!\exists )\) we introduce:

figure b

we obtain a pure variant GPC which is adequate for CFOL with variables as the only terms but in general incomplete for extensions with some tfos.

The variant GF for PFFOL can be obtained by changing all quantifier rules into:

figure c

where E is the existence predicate, which is usually defined as \(Et := \exists x(x=t)\). This form of the rules follows from the fact that in free logics terms may designate nonexistent objects whereas quantifiers have existential import. For the pure version GPF we again use a parameter b instead of t in \((\forall \!\!\Rightarrow )^F\) and \((\Rightarrow \!\!\exists )^F\).

Moreover, in negative free logic atomic formulae with non-denoting terms are false, which implies that \(Et \rightarrow t=t\) and \(\varphi (t)\rightarrow Et\), for any atomic formula \(\varphi \). Hence to obtain GNF (or GPNF) for NFFOL we have to add to GF (or GPF) a rule requiring all predicates to be strict, in the sense that they are satisfied only by denoting terms:

(Str) \(\dfrac{Et, \varGamma \!\!\Rightarrow \varDelta }{\varphi (t), \varGamma \!\!\Rightarrow \varDelta }\)         where \(\varphi \) is atomic.

Identity can be characterised in GC (GPC) and GF (GPF) in several ways (see [16]). For our purposes we use the following rules:

figure d

where \(\varphi \) is atomic.

GCI, GPCI, GFI, GPFI will denote the respective calculi with the rules for identity added. In the case of NFFOLI, due to the strictness condition, reflexivity does not hold unconditionally and we must weaken the first rule, using instead:

\((Ref)^N\) \(\frac{{t=t, \varGamma \!\!\Rightarrow \varDelta }}{{Et, \varGamma \!\!\Rightarrow \varDelta }}\)

GNFI, GPNFI will denote the respective calculi for NFFOLI with the rules for identity having \((Ref)^N\).

Proofs are defined in the standard way as finite trees with nodes labelled by sequents. The height of a proof \(\mathcal{D}\) of \(\varGamma \Rightarrow \varDelta \) is defined as the number of nodes of the longest branch in \(\mathcal{D}\). \(\vdash _k \varGamma \Rightarrow \varDelta \) means that \(\varGamma \Rightarrow \varDelta \) has a proof with height at most k. Let us recall that formulae displayed in the schemata are active, whereas the remaining ones are parametric, or form a context. In particular, all active formulae in the premisses are called side formulae, and the one in the conclusion is the principal formula of the respective rule application.

Note that the Cut-elimination theorem holds for all above mentioned calculi (see e.g. [15]) and the full Leibniz’ Law LL: \(t_1=t_2, \varphi [x/t_1]\Rightarrow \varphi [x/t_2]\) (for arbitrary formula \(\varphi \)) is also provable.

3 The General Theory

The S-theory of tfos is expressed by two general principles:

EXT: \(\forall x(\varphi (x)\leftrightarrow \psi (x))\rightarrow \tau x\varphi (x)=\tau x\psi (x)\)

AV: \(\tau x\varphi (x)=\tau y\varphi (y)\)

or, equivalently, by one principle:

EXTAV: \(\forall xy(x=y\rightarrow (\varphi (x)\leftrightarrow \psi (y)))\rightarrow \tau x\varphi (x)=\tau y\psi (y)\)
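The equivalence of EXTAV with the pair EXT, AV can be checked by a short informal argument. The sketch below assumes only LL and the reflexivity and transitivity of identity, as available in CFOLI; the step numbering is for illustration only. From EXT and AV to EXTAV:

$$\begin{aligned} &1.\ \forall xy(x=y\rightarrow (\varphi (x)\leftrightarrow \psi (y))) &&\textrm{assumption}\\ &2.\ \forall x(\varphi (x)\leftrightarrow \psi (x)) &&\textrm{1, instantiating } y \textrm{ with } x\textrm{, reflexivity}\\ &3.\ \tau x\varphi (x)=\tau x\psi (x) &&\textrm{2, EXT}\\ &4.\ \tau x\psi (x)=\tau y\psi (y) &&\textrm{AV}\\ &5.\ \tau x\varphi (x)=\tau y\psi (y) &&\textrm{3, 4, transitivity} \end{aligned}$$

Conversely, AV is the instance of EXTAV with \(\psi := \varphi \), whose antecedent \(\forall xy(x=y\rightarrow (\varphi (x)\leftrightarrow \varphi (y)))\) is provable by LL. For EXT, the assumption \(\forall x(\varphi (x)\leftrightarrow \psi (x))\) together with LL for \(\psi \) yields the antecedent of EXTAV, hence \(\tau x\varphi (x)=\tau y\psi (y)\), and AV with transitivity gives \(\tau x\varphi (x)=\tau x\psi (x)\).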

Such a general theory was first developed on the basis of positive free first-order logic with identity by Scott [32]. However, the remaining authors used classical first-order logic with identity as the basis. In both cases a general completeness theorem was provided, together with several important model-theoretic results which hold for CFOLI (see in particular Da Costa [4]). In what follows, we will pay more attention to the classical case since there, for several kinds of tfos and in particular for descriptions, it is rather difficult to find reasonable theories, in contrast to the situation in free logic (see [26]).

Several objections can be raised against such a theory. In a sense it is too general and too weak; on the other hand, for specific kinds of operators it may be too strong, in particular in the setting of classical logic. Let us illustrate these remarks with some examples. For the \(\imath \)-operator, Rosser [30] is forced to add (in CFOLI) to EXT and AV the following axiom:

$$\begin{aligned} \exists _1x\varphi (x)\rightarrow \forall x(x=\iota x\varphi (x)\leftrightarrow \varphi (x)) \end{aligned}$$

which still gives incomplete logic as noticed by Hailperin [6]. Da Costa [4] adds:

$$\begin{aligned} \exists _1x\varphi (x)\rightarrow \forall x(x=\iota x\varphi (x)\rightarrow \varphi (x))\,\textrm{and} \end{aligned}$$
$$\begin{aligned} \lnot \exists _1x\varphi (x)\rightarrow \iota x\varphi (x)= \iota x(x\ne x) \end{aligned}$$

In fact, the theory of descriptions axiomatised by the addition of these two axioms to EXT and AV is redundant, since the latter principles can be proven with their help. This theory is in fact equivalent to Fregean/Carnapian theory of descriptions (often called the chosen object theory), in particular in the formulation of Kalish and Montague [20]. However, we call an S-theory every theory of arbitrary tfo where EXT and AV hold either as axioms or as derived theses.

On the other hand, for some theories of definite descriptions these two principles are too strong. For example, in the Russellian theory [31, 37] neither principle holds. Instead we have their weaker versions:

wEXT: \(E\imath x\varphi (x) \rightarrow ((\varphi (x)\leftrightarrow \psi (x))\rightarrow \imath x\varphi (x)=\imath x\psi (x))\)

wAV: \(E\imath x\varphi (x)\rightarrow \imath x\varphi (x) = \imath y\varphi (y)\).

In other cases of tfos, like the set-abstraction operator or the counting operator, EXT may be even more disastrous, since for the latter it yields one half of Frege's infamous law V, in fact the half which is sufficient for deriving a contradiction. Similar problems with set-abstraction will be discussed below.

3.1 The Formalisation of S-Theory

To obtain an adequate sequent calculus for S-theory we add to GCI the following two rules:

figure e

where a is a fresh parameter.

Alternatively, we can add just one rule corresponding to EXTAV:

figure f

where both a and b are fresh parameters.

Theorem 1

GCI+\(\{(Ext), (AV)\}\) and GCI+\(\{(ExtAV)\}\) are equivalent to axiomatic formulations of S-theory of tfos.

Proof

It is sufficient to prove respective axioms in GCI+\(\{(Ext), (AV)\}\) or in GCI+\(\{(ExtAV)\}\) and to show that the above rules are derivable in GCI with added axioms EXT, AV or EXTAV. We will show this for the more compact version with (ExtAV) and EXTAV; proofs for the remaining rules and axioms are similar and simpler. Provability of EXTAV:

figure g

where the rightmost leaf is provable and \(\mathcal{D}\) is an analogous proof of \(\forall xy(x=y\rightarrow (\varphi (x)\leftrightarrow \psi (y))), a=b, \psi (b)\Rightarrow \varphi (a)\).

Derivability of (ExtAV):

figure h

where both leaves are premisses and \(\mathcal{D}\) is a proof of \(\forall xy(x=y\rightarrow (\varphi (x)\leftrightarrow \psi (y)))\Rightarrow \tau x\varphi (x)=\tau y\psi (y)\) from the axiom \(\Rightarrow EXTAV\).    \(\square \)

Let us consider the question of cut elimination for either of the two formalisations of S-theory. We can observe that the choice of the rule (2LL) for representation of LL was connected with the shape of (Ext) or (ExtAV). In both calculi identities can appear as the principal formulae of some rule application only in the succedent. This makes it safe for proving cut elimination since identities in antecedents can only appear either as parametric formulae or as formulae introduced by weakening. In both cases if identity is a cut formula under consideration it is eliminable either by induction on the height of cut or directly.

Still there is a problem connected with the application of \((\forall \Rightarrow )\) and \((\Rightarrow \exists )\) to complex terms. If, for example, \(\forall x\varphi \) is a cut formula which was introduced as the principal formula in both premisses of cut, and in the right premiss x was instantiated with \(\tau y\psi \), then the formula \(\varphi [x/\tau y\psi ]\) may have higher complexity than \(\forall x\varphi \) and the induction on the complexity of cut formulae fails. This problem may be overcome either by introducing a more refined measure of the complexity of formulae (see e.g. [11]) or by replacing the basic calculus GCI with its pure version GPCI. Of course, the restriction of all quantifier rules to parameters makes the calculus with complex terms incomplete. However, to avoid the loss of generality we can add to GPCI the rule:

figure i

where a is a fresh parameter.
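To see concretely why the induction on the complexity of cut formulae fails for unrestricted quantifier rules, consider a small instance. We assume here, only for illustration, a complexity measure \(c\) that counts all occurrences of connectives, quantifiers and \(\tau \), including those inside terms; the predicate letters P, Q, R are hypothetical.

$$\begin{aligned} c(\forall x P(x))&= 1\\ t&:= \tau y(Q(y)\wedge \forall z R(y,z)), \quad c(t) = 3\\ c(P(x)[x/t])&= c(P(t)) = 3 > c(\forall x P(x)) \end{aligned}$$

With the quantifier rules restricted to parameters, the instance \(P(b)\) has complexity 0 under this measure, so the side formula is always simpler than the principal formula \(\forall x P(x)\).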

Theorem 2

The calculus GPCI+\(\{(Ext), (AV)\}\) (or GPCI+\(\{(ExtAV)\}\)) with added \((a\Rightarrow )\) is equivalent to GCI+\(\{(Ext), (AV)\}\) (or GCI+\(\{(ExtAV)\}\)).

Proof

It is enough to show that \((a\Rightarrow )\) is derivable in GCI:

figure j

and that unrestricted \((\forall \Rightarrow ), (\Rightarrow \exists )\) are derivable in GPC with \((a\Rightarrow )\):

figure k

where the rightmost sequent being an instance of LL is provable. Similar proof works for \((\forall \Rightarrow )\).    \(\square \)

Let us call GPCI+\(\{(Ext), (AV)\}\) (or GPCI+\(\{(ExtAV)\}\)) with added \((a\Rightarrow )\) simply GS (GS’). Note that for both systems the following lemma holds:

Lemma 1

  1. \(\vdash t_1=t_2, \varphi [x/t_1] \Rightarrow \varphi [x/t_2]\), for any formula \(\varphi \).

  2. If \(\vdash _k \varGamma \Rightarrow \varDelta \), then \(\vdash _k \varGamma [b_1/b_2] \Rightarrow \varDelta [b_1/b_2]\), where k is the height of a proof.

Proof

1. follows by induction on the complexity of \(\varphi \) and is standard for all cases. The proof of 2 is by induction on the height of proofs.    \(\square \)

The first result is Leibniz’ Law (LL) stated in full generality, i.e. covering also complex terms. Since (2LL) yields only LL restricted to atomic formulae, we need its unrestricted form for completeness. The second result is a substitution lemma which is necessary for unifying terms while proving the cut elimination theorem. Note that it is restricted to parameters only but in the case of GS (GS’), which is an extension of GPCI, it is sufficient since only parameters are instantiated for bound variables in all applications of quantifier rules.

Theorem 3

The cut elimination theorem holds for GS and GS’.

Proof

The proof is standard and essentially requires two inductions: on the complexity of cut formula and on the height of the derivations of both premisses of cut. In general we can follow the strategy applied for example in [15]; here we focus only on the crucial points connected with the new rules which could lead to troubles.

Consider the situation where the cut formula in the left premiss is the principal formula of an application of (2LL). It is an atomic formula, possibly an identity. Since in no logical rule can an atomic formula in the antecedent be a principal formula, in the right premiss the corresponding cut formula is either introduced by weakening or is just a parametric formula. In the first case it is directly eliminated, in the second it is eliminated by induction on the height of the proof. The case where the right premiss is axiomatic is also directly eliminable.

The cases where in the left premiss the cut formula is the principal formula of an application of (Ext) or (ExtAV) are treated in a similar way. Finally, rules like (AV) or \((a\Rightarrow )\) have no impact on the elimination of cuts since there are no principal formulae in the conclusion.    \(\square \)

Although we cannot totally avoid the loss of the subformula property in GS and GS’, the introduction of complex terms is separated from the quantifier rules, which is technically more desirable. In fact, from the semantic point of view we do not really need to introduce an arbitrary complex term in the premiss during proof search. The rule is required only for those terms which either already occur in \(\varGamma , \varDelta \), or have in their scope formulae from \(\varGamma , \varDelta \). This can be shown by providing a Hintikka-style completeness proof for this system, which is possible since Henkin-style proofs were provided by the mentioned authors; we omit the details because of space restrictions.

In fact, for the needs of proof-search we could simplify GS (GS’) a little bit. In particular we could use a more convenient one-premiss rule of Negri and von Plato [27] for LL of the form:

figure l

for all cases where at least one of \(t_1, t_2\) is a parameter and \(\varphi (t_1)\) is not an identity with both arguments being complex terms. In fact, the only troublesome cases of LL which could make a clash in the proof of cut elimination are three:

  1. \(b=t, t=t' \Rightarrow b=t'\)

  2. \(t=t', \varphi (t) \Rightarrow \varphi (t')\)

  3. \(t=t', t'=t'' \Rightarrow t=t''\)

where \(t, t'\) are complex terms, and only for these cases a two-premiss rule (2LL) is necessary.

Also note that instead of (Ref) we can use a more restricted version:

figure m

since \(\tau x\varphi (x)=\tau x\varphi (x)\) is derivable by (Ext) or (ExtAV).

3.2 The Formalisation of T-Theory

The theory of abstraction-operators developed by Tennant, which we call here a T-theory of tfos, is generally much stronger than S-theory. But we must emphasize that it is formulated in the setting of much weaker logic, namely NFFOLI (negative free FOLI), where not only quantifier rules are weaker but also the identity is not (unconditionally) reflexive.

Tennant’s theory of tfo is based on the following natural deduction rules:

\((\tau I)\) If \(\varphi (a), Ea\vdash aRt\) and \(aRt\vdash \varphi (a)\) and Et, then \(t=\tau x\varphi (x)\);

\((\tau E1)\) If \(t=\tau x\varphi (x)\) and \(\varphi (b)\) and Eb, then bRt

\((\tau E2)\) If \(t=\tau x\varphi (x)\), then Et

\((\tau E3)\) If \(t=\tau x\varphi (x)\) and bRt, then \(\varphi (b)\)

where a is an eigenvariable, and R is a specific relation involved in the characterisation of \(\tau \). For example, R is \(=\) for the case of \(\imath \), and \(\in \) for set-abstraction operator. The corresponding sequent rules are:

figure n

where a is not in \(\varGamma , \varDelta , \varphi \)

figure o
figure p
figure q

To get a more standard sequent calculus we can apply the rule-generation theorem (see e.g. [14]) and obtain left introduction rules for \(\tau \):

figure r
figure s
figure t

Note that if we transfer these rules to the setting of CFOLI we do not need formulae of the form Et, and the rule \((\tau \Rightarrow 2)\), being specific to negative free logic, is superfluous. As a result we obtain the following three rules:

figure u

where a is not in \(\varGamma , \varDelta , \varphi \)

figure v
figure w

In general, what we obtain with these rules is equivalent to the following principle, often called the Lambert axiom:

LA: \(\forall y(y = \tau x\varphi (x) \leftrightarrow \forall x(\varphi (x)\leftrightarrow xRy))\)

which is also derivable in the setting of NFFOLI. In the setting of CFOLI it is equivalent to the Hintikka axiom:

HA: \(t = \tau x\varphi (x) \leftrightarrow \forall x(\varphi (x)\leftrightarrow xRt)\)

for which we demonstrate syntactically the equivalence with the stated rules. In one direction we have:

figure x

In the second direction:

figure y

Derivability of the specific rules is straightforward. Notice that from HA as additional axioms we obtain:

(a) \(t=\tau x\varphi (x)\Rightarrow \forall x(\varphi (x)\leftrightarrow xRt)\) and

(b) \(\forall x(\varphi (x)\leftrightarrow xRt)\Rightarrow t=\tau x\varphi (x) \).

From the premisses of any variant of \((\tau \Rightarrow )\), applying weakening we deduce:

figure z

which, by cut with (a) yields the conclusion of \((\tau \Rightarrow )\). In a similar way we deduce \(\varGamma \Rightarrow \varDelta , \forall x(\varphi (x)\leftrightarrow xRt)\) from premisses of \((\Rightarrow \tau )\), and by cut with (b) we obtain the conclusion of this rule.

One should note that T-theory is much stronger than S-theory; both central principles EXT and AV are provable (in fact even in the setting of NFFOLI by means of the weaker rules).

figure aa

where the second leaf is directly provable and \(\mathcal{D}\) is an analogous proof of \(\forall x(\varphi (x)\leftrightarrow \psi (x)), \psi [x/a]\Rightarrow aR\tau x\varphi (x) \).

figure ab

Note that \(\varphi [x/a]\) and \(\varphi [y/a]\) are identical since \(\varphi (x)\) and \(\varphi (y)\) are alphabetic variants.

One may even prove the converse of EXT:

figure ac

where \(\mathcal{D}\) is a similar proof of \(\tau x\varphi (x)=\tau x\psi (x), \psi [x/a] \Rightarrow \varphi [x/a] \).

To see how strong HA is in the setting of CFOLI, notice that when t is instantiated with \(\tau x\varphi (x)\) we obtain:

\(\tau x\varphi (x) = \tau x\varphi (x) \leftrightarrow \forall x(\varphi (x)\leftrightarrow xR\tau x\varphi (x))\)

which by (unrestricted) reflexivity of = yields:

\(\forall x(\varphi (x)\leftrightarrow xR\tau x\varphi (x))\).

For several term-forming operators, at least in the setting of CFOLI, this is too strong. For example, if we instantiate this principle with the iota-operator (where R is = ) we run into contradiction:

  1. \(\imath x(Ax\wedge \lnot Ax) = \imath x(Ax\wedge \lnot Ax) \rightarrow \forall x(Ax\wedge \lnot Ax\leftrightarrow x = \imath x(Ax\wedge \lnot Ax))\)

  2. \(\imath x(Ax\wedge \lnot Ax) = \imath x(Ax\wedge \lnot Ax)\)

  3. \(\forall x(Ax\wedge \lnot Ax\leftrightarrow x = \imath x(Ax\wedge \lnot Ax))\) (from 1, 2)

  4. \(A(\imath x(Ax\wedge \lnot Ax))\wedge \lnot A(\imath x(Ax\wedge \lnot Ax))\leftrightarrow \imath x(Ax\wedge \lnot Ax) = \imath x(Ax\wedge \lnot Ax)\) (from 3)

  5. \(A(\imath x(Ax\wedge \lnot Ax))\wedge \lnot A(\imath x(Ax\wedge \lnot Ax))\) (from 4, 2)

Similarly, in the case of the set-abstraction operator (where R is \(\in \)) we obtain just the unrestricted axiom of comprehension, which immediately leads to Russell’s paradox. Hence it is crucial to establish what R is for the specific tfo in order to decide whether Tennant’s rules may be safely added to GCI or GPCI. Therefore, we do not attempt here to state T-theory as a general calculus GT. Instead we will consider in the next section the application of Tennant’s theory to the set-abstraction operator, since even in this context one may introduce restrictions which protect us from trouble.
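For the set-abstraction case the derivation of the paradox is short. The sketch below instantiates the principle \(\forall x(\varphi (x)\leftrightarrow xR\tau x\varphi (x))\) with R taken as \(\in \) and \(\varphi (x)\) as \(x\notin x\); the abbreviation r is introduced here only for readability.

$$\begin{aligned} &\forall x(x\notin x\leftrightarrow x\in \{x: x\notin x\}) &&R := {\in },\ \varphi (x) := x\notin x\\ &r\notin r\leftrightarrow r\in r &&\textrm{instantiating } x \textrm{ with } r := \{x: x\notin x\} \end{aligned}$$

The second line is classically contradictory. Any safe restriction therefore has to block this instance, for example by excluding formulae such as \(x\notin x\) from the scope of the operator.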

4 Application to Set-Abstracts

Several kinds of set theory with the set-abstraction operator as primitive can be rather easily developed on the basis of S- or T-theory as formalised in the preceding section. In fact, both Scott [32] and Tennant [33] applied their theories to set-abstraction operators, but in the context of free logic the unrestricted axiom of comprehension does not lead to Russell’s paradox. However, we work here in the setting of CFOL, so the rules responsible for its derivation must be somehow restricted. For these reasons we decided to examine the possible formalisations of Quine’s NF (New Foundations) as developed in [30], where the comprehension axiom is suitably restricted by means of an outer syntactic side condition which is independent of the structure of the rules. In fact, NF is not a very popular formalisation of set theory due to some peculiarities. However, it also has several advantages which we are not going to discuss here because of space restrictions. In particular, the syntactic simplicity of NF makes it a very convenient theory for proof-theoretic investigations.

Before we focus on sequent calculi for NF, let us start with some general preliminaries concerning an arbitrary formalisation of set theory. It often goes unnoticed that set theory may be developed either in a language where only \(\in \) is a primitive predicate or in a language with = primitive as well, which is the more popular choice. In the latter case we assume that we already have some axioms/rules for = , so the only specific axiom we need for sets is:

ExtAx :  \(\forall xy(\forall z(z\in x\leftrightarrow z\in y)\rightarrow x=y)\)

since the converse is already provable by LL.

If we start with CFOL (only \(\in \) primitive), = may be defined either in the Leibnizian spirit:

\(=^L\): \( t=t' := \forall z(t\in z\leftrightarrow t'\in z)\)

or in the way Quine prefers:

\(=^Q\): \( t=t' := \forall z(z\in t\leftrightarrow z\in t')\)

The first choice leads to the standard characterisation of = and the axiom ExtAx is still required. The second one is different since ExtAx is provable but still we cannot obtain the full characterisation of identity. Therefore we must add a special form of LL as an extensionality axiom:

\(ExtAx'\): \(\forall xyz(x=y \rightarrow (x\in z\rightarrow y\in z))\)

and this is the way Quine proceeded with the development of NF. The second axiom is the axiom of abstraction:

ABS: \( \forall x(x\in \{y:\varphi (y)\} \leftrightarrow \varphi [y/x])\)

where \(\varphi \) is stratified. Assuming that the only predicate is \(\in \), this condition may be defined roughly as follows: it is possible to define a mapping from the variables of \(\varphi \) into integers in such a way that each atom takes the form \(i\in i+1\), i.e. the integer assigned to the right argument is greater by one than the integer assigned to the left argument. In case we admit =, the mapping should yield \(i=i\), i.e. both arguments of an identity receive the same integer. In what follows we will admit both kinds of formulae as atomic, briefly called \(\in \)-atoms and =-atoms.
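The stratification condition is easy to test mechanically. The following Python sketch is only an illustration under simplifying assumptions: a formula is represented just by the list of its atoms, each atom is a triple `(kind, left, right)` with `kind` equal to `"in"` or `"eq"`, and both arguments are variables (set-abstracts as arguments are not handled). The function name `stratify` and this representation are hypothetical. The function propagates integer levels through the atoms and reports a clash if some variable would need two different levels.

```python
from collections import defaultdict, deque


def stratify(atoms):
    """Return a level assignment {variable: int} or None if none exists.

    An atom ("in", x, y) requires level(y) == level(x) + 1;
    an atom ("eq", x, y) requires level(y) == level(x).
    """
    graph = defaultdict(list)
    for kind, left, right in atoms:
        if kind == "in":
            step = 1
        elif kind == "eq":
            step = 0
        else:
            raise ValueError(f"unknown atom kind: {kind!r}")
        # Store the constraint in both directions.
        graph[left].append((right, step))
        graph[right].append((left, -step))

    level = {}
    for start in list(graph):
        if start in level:
            continue
        # Each connected group of variables gets an arbitrary base level 0.
        level[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, step in graph[u]:
                wanted = level[u] + step
                if v not in level:
                    level[v] = wanted
                    queue.append(v)
                elif level[v] != wanted:
                    return None  # two incompatible levels for v
    return level


print(stratify([("in", "x", "y")]))   # {'x': 0, 'y': 1}
print(stratify([("in", "x", "x")]))   # None
```

The second call corresponds to the Russell condition: the atom \(x\in x\) would require the level of x to be greater by one than itself, so no stratification exists.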

We will consider two approaches to the construction of a cut-free sequent calculus for NF. Although the rules (Ext), (AV) will not be primitive but derivable in both, the first one, following the Quinean formulation closely, is closer to the general GS, whereas the second starts with Tennant’s rules suitably restricted.

4.1 The S-Approach to NF

There is little point in taking the instances of (Ext) and (AV) as primitive rules, since this would not save us from adding most of the specific rules for set-abstraction operators and =. So it is better to follow the original Quinean axiomatisation of NF quite closely. One difference concerns the treatment of identity, since we take it as a primitive predicate characterised by rules. However, we do not take the identity rules of GPCI as primitive but rather provide new rules based on \(=^Q\). Hence we take GPC as the basis and add:

figure ad
figure ae

These rules correspond to \(=^Q\). Moreover, we add two rules corresponding to the axiom ABS:

figure af

with \(\varphi \) stratified.

We omit easy proofs of the equivalence of stated rules with respective axioms: ABS and the object language counterpart of \(=^Q\). Proofs of these axioms, as well as derivability of our rules in GPC enriched with axiomatic sequents expressing ABS and \(=^Q\) are straightforward and similar to proofs from Theorem 1. Instead we will show that although we have neither (Ext) nor (AV) as primitive rules they are derivable in such a system for stratified \(\varphi \).

Lemma 2

Derivability of (Ext) and (AV)

Proof

figure ag

The proof of (AV) or alternatively, of (ExtAV) is similar.    \(\square \)

But the rules \((\Rightarrow =)\) and \((=\Rightarrow )\) are not sufficient for obtaining the complete characterisation of identity in NF. In particular they are not sufficient for the case corresponding to the specific instance of LL expressed by the axiom \(ExtAx'\). Note that in general we must be able to prove:

  1. \(t=t', t''=t' \Rightarrow t=t''\)

  2. \(t=t', t''\in t \Rightarrow t''\in t'\)

  3. \(t=t', t\in t'' \Rightarrow t'\in t''\)

Case 1 poses no problem; it is derivable by \((\Rightarrow =), (=\Rightarrow )\), like other properties of =, including reflexivity and symmetry. Case 2 would be provable by \((=\Rightarrow )\) if we were allowed to use any term \(t''\) instead of b. So this case is problematic and requires a reformulation of the rules, which in general destroys the subformula property and may be troublesome in proving the cut elimination theorem. Case 3 corresponds exactly to \(ExtAx'\) and requires a separate rule, which possibly covers case 2 as well. To avoid trouble we might follow the general solution introduced for GS and use the rule (2LL) as a two-premiss right-sided rule, but this does not work since \((Abs \Rightarrow )\) introduces an \(\in \)-atom as a principal formula in the antecedent. As a result, while proving cut elimination we cannot reduce the following cut instance:

figure ah

It seems that in the presence of \((Abs\Rightarrow )\) and \((\Rightarrow Abs)\) the only solution is to add a 3-premiss version of LL:

figure ai

where \(\varphi (t)\) and \(\varphi (t')\) are either \(t''\in t\) and \(t''\in t'\) or \(t\in t''\) and \(t'\in t''\).

Summing up we obtain a system GSNF which adds to GPC the following rules: \((=\Rightarrow ), (\Rightarrow =), (Abs\Rightarrow ), (\Rightarrow Abs)\) and (3LL) ((Ref) is derivable).

Theorem 4

GSNF is an adequate formalisation of NF.

Moreover, the cut elimination theorem can be proved for GSNF in the same fashion as in [13], where a similar solution was provided for sequent calculi for free description theories. Note, however, that the situation with the subformula property is even worse than in GS (GS’) due to the presence of (3LL). Is it possible to obtain a better formalisation of NF by means of Tennant’s rules?

4.2 The T-Approach to NF

If we want to apply the approach of Tennant to NF, we have = as a primitive predicate which is not only present in the language but already characterised by specific rules, so we start with GPCI and add the following Tennant-style rules:

figure aj
figure ak
figure al

where a is not in \(\varGamma , \varDelta , \varphi \), t is any term and \(\varphi \) is stratified.

Note that (Ext) and (AV) are derivable, which follows from the proofs of EXT and AV presented in Sect. 3.2. As we noticed there, the axiom ABS is also provable, so we do not need the special rules \((Abs\Rightarrow ), (\Rightarrow Abs)\) either. We do not even need to care about the axiom ExtAx since it is provable:

figure am

It seems that the T-approach to NF is better than the S-approach since it is more economical. However, if we think about cut elimination we must consider carefully the problem of primitive rules for identity. Although we first stated that we add the special Tennant-style rules to GPCI, and we used (2LL) in the above proof, it seems that we cannot keep (2LL), since in general we face the same problem with cut elimination as in the case of the S-system illustrated in Subsect. 4.1. To prove the cut elimination theorem we must again either replace (2LL) with (3LL) in general, or follow the strategy introduced in [17] and separate the rules for LL dealing with special cases of atomic formulae. One possibility is to keep:

figure an

for \(\varphi \) being an \(\in \)-atom and restrict (3LL) to =-atoms only:

figure ao

This way we obtain a system GTNF which adds to GPC the rules: \((:\Rightarrow ), (\Rightarrow :), (2LL'), (3LL'), (Ref)\). \((2LL')\) deals only with \(\in \)-atoms and all properties of identity are derivable by (Ref) and (3LL).

Theorem 5

GTNF is an adequate formalisation of NF.

The cut elimination theorem is provable for GTNF as well. Unfortunately, the situation with the subformula property is similar to that in the system GSNF from the preceding subsection. However, some simplifications are possible: the applications of \((3LL')\) can be reduced if at least two of \(t, t', t''\) are parameters. Consider the cases with at most one complex term t:

  1. \(a=b, a=c \Rightarrow b=c\)

  2. \(t=b, t=c \Rightarrow b=c\)

  3. \(a=t, a=b \Rightarrow t=b\)

  4. \(a=b, a=t \Rightarrow b=t\)
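All four sequents share the Euclidean pattern: from x = y and x = z infer y = z, with the complex term in different positions. Read semantically, this pattern together with reflexivity already forces symmetry and transitivity. The following Python sketch is a finite sanity check of that fact, not a proof and not a derivation in the calculus; the function name is hypothetical, and the domain size is an arbitrary choice. It enumerates all binary relations on a small domain and inspects those that are reflexive and Euclidean.

```python
from itertools import product


def equivalence_from_ref_and_euclid(n=3):
    """Check every reflexive, Euclidean relation on range(n).

    Returns (ok, checked): ok is True if all such relations are
    symmetric and transitive; checked counts how many were found.
    """
    dom = range(n)
    pairs = [(x, y) for x in dom for y in dom]
    checked = 0
    for bits in product([False, True], repeat=len(pairs)):
        r = {p for p, b in zip(pairs, bits) if b}
        reflexive = all((x, x) in r for x in dom)
        # Euclidean: x = y and x = z give y = z.
        euclid = all((y, z) in r
                     for (x, y) in r for (x2, z) in r if x2 == x)
        if not (reflexive and euclid):
            continue
        checked += 1
        symmetric = all((y, x) in r for (x, y) in r)
        transitive = all((x, z) in r
                         for (x, y) in r for (y2, z) in r if y2 == y)
        if not (symmetric and transitive):
            return False, checked
    return True, checked


print(equivalence_from_ref_and_euclid(3))  # (True, 5)
```

The five relations found for a three-element domain are exactly its equivalence relations. Symmetry comes from taking z to be x in the Euclidean pattern, and transitivity then follows by applying the pattern after one use of symmetry.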

\((2LL')\) may be modified to cover identities from cases 1 and 2:

figure ap

for \(\varphi (t')\) being an \(\in \)-atom or an =-atom of the form \(b=c\) (the third term in the premisses may be complex or a parameter). For cases 3 and 4 we may add the rules:

figure aq

or

figure ar

Either of them will do the task. For example, if we take (E) we have a direct proof of 4 and the following proof of 3:

figure as

As a result, we have to keep \((3LL')\) only for the cases where at least two of \(t, t', t''\) are complex terms, at the price of adding (Tr) or (E). Let us call the system modified in this way GTNF’.
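The resulting division of labour in GTNF’ can be summarised by a small dispatch sketch. It assumes, as the four cases listed above suggest, that an identity instance has the shape \(t=t', t=t'' \Rightarrow t'=t''\); the function name and the string labels are hypothetical and serve only as a reading aid.

```python
def rule_for_identity_instance(t_complex, t1_complex, t2_complex):
    """Name the rule covering  t = t', t = t'' => t' = t''  in GTNF'.

    Each argument says whether the corresponding term (t, t', t'')
    is a complex term (True) or a parameter (False).
    """
    flags = (t_complex, t1_complex, t2_complex)
    if sum(flags) >= 2:
        # Two or three complex terms: (3LL') is still needed.
        return "(3LL')"
    if t1_complex or t2_complex:
        # Cases 3 and 4: the complex term occurs in the succedent.
        return "(Tr) or (E)"
    # Cases 1 and 2: the succedent identity b = c has parameters only.
    return "modified (2LL')"


# The four cases with at most one complex term, in the order listed.
for case, flags in enumerate(
        [(False, False, False), (True, False, False),
         (False, True, False), (False, False, True)], start=1):
    print(case, rule_for_identity_instance(*flags))
```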

5 Conclusion

We have provided a proof theoretic treatment of the general theory of tfos introduced independently by several authors (S-theory), and proposed a modification of a different approach (T-theory) in a way which allows us to compare their relative strength. Moreover, we examined the ways in which both approaches may be extended to cover Quine’s set theory NF. All obtained sequent systems satisfy the cut elimination theorem, but do not satisfy the subformula property. Hence, in the case of the systems for NF, we cannot obtain a syntactical consistency proof on the basis of the cut elimination theorem, because of rules like (3LL). Still, these systems, in particular the system GTNF described in the last subsection, allow us to keep a stricter control over the construction of proofs.

The natural next step of this research is connected with the application of, possibly modified, systems GS, GS’, or (suitably restricted) rules of Tennant’s approach, to other kinds of term-forming operators, and careful examination of their specific features.

Finally, it is also interesting to investigate whether the obtained systems allow us to prove other desirable properties in a constructive way. One such important property is the interpolation theorem. Since it was proved semantically for the general S-theory in [4], it is an important task to find a constructive proof as well. However, the method of split-sequents due to Maehara, which is usually applied in the setting of sequent calculi, fails for the presented systems since it does not work with rules like \((a\Rightarrow )\). The problem is connected with the fact that the complex term occurring in the active formula in the premiss may contain some predicates which do not occur in the rest of the respective division of a split-sequent but occur in the interpolant (and of course in the other division of a split-sequent). In this case the interpolant of the premiss fails to be an interpolant of the conclusion, where the active formula is deleted. Only a weaker form of interpolation can be proved, in which we require that interpolants have only parameters (but not predicates) common to both divisions of the split-sequent. It is an open problem whether such difficulties can be overcome.
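The vocabulary condition behind this failure can be illustrated with a toy model. The sketch below is a deliberate simplification and not the Maehara construction itself: each formula is represented only by the set of predicate symbols it mentions, the logical conditions on interpolants are ignored, and the names `shared_predicates` and `admissible`, as well as the predicates P and Q, are hypothetical. It shows how a candidate interpolant that is admissible for the premiss stops being admissible once the active formula, the only carrier of P in its division, is deleted.

```python
def shared_predicates(left, right):
    """Predicates occurring in both divisions of a split-sequent.

    left, right: lists of sets of predicate symbols, one set per formula.
    """
    return set().union(*left) & set().union(*right)


def admissible(interpolant, left, right):
    """True if the interpolant uses only predicates common to both divisions."""
    return set(interpolant) <= shared_predicates(left, right)


# Toy data: the active formula contains a complex term mentioning P.
active = {"P"}
rest_of_left = [{"Q"}]
right = [{"P", "Q"}]
interpolant = {"P", "Q"}

# Premiss: the active formula is still present in the left division.
ok_in_premiss = admissible(interpolant, [active] + rest_of_left, right)

# Conclusion: the active formula has been deleted from the left division.
ok_in_conclusion = admissible(interpolant, rest_of_left, right)

print(ok_in_premiss, ok_in_conclusion)  # True False
```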