Interpreting the compositional truth predicate in models of arithmetic

We present a construction of a truth class (an interpretation of a compositional truth predicate) in an arbitrary countable recursively saturated model of first-order arithmetic. The construction is fully classical in that it employs nothing more than the classical techniques of formal proof theory.


Introduction
The goal of this paper is to sketch a fully classical construction of truth classes in models of first-order arithmetic. The initial introductory remarks describe the background and the motivation for this endeavor.
It is well known that every nonstandard model of arithmetic contains nonstandard arithmetical formulas. In other words, for an arbitrary nonstandard model M there will be elements s of M such that M ⊨ 's is an arithmetical formula' (all such objects will be called here 'formulas in the sense of the model' or 'M-formulas') even though in the real world s is not a formula at all. 1 In this situation it is natural to ask whether a semantics for formulas in the sense of the model can be developed. First attempts in this direction were made by Robinson [19] and Krajewski [16], with the notion of a satisfaction class playing the key role. Namely, a satisfaction class in M is characterized as a subset of the model which can be treated as a reasonable interpretation of the satisfaction predicate: roughly, it is a set of pairs (ϕ, v) which satisfies the usual Tarski-style compositional clauses for satisfaction (the assumption here is that ϕ is an M-formula and v is a variable assignment for ϕ). One of the most remarkable results in the theory of satisfaction classes is that a (non-inductive) satisfaction class can be constructed in an arbitrary countable recursively saturated model of Peano arithmetic. 2 Since every model of arithmetic has an elementarily equivalent countable recursively saturated model, it immediately follows that the compositional axioms of satisfaction are proof-theoretically conservative over first-order arithmetic. 3 Later it transpired that the conservativity of compositional satisfaction (or truth) is of interest not just to mathematical logicians but also to philosophers. In particular, in recent philosophical debates on the so-called 'deflationism about truth', conservativity has been explicitly postulated as a desirable trait of axiomatic truth theories. 4 Independently of the outcome of these philosophical discussions, the upshot is that proofs of the conservativity results became important and interesting for a wide and diverse community of researchers investigating the properties of semantic notions.
However, the original proof of the theorem, which uses the so-called 'technique of approximations', was difficult for many readers to follow. In the author's experience, the machinery of approximations developed by Kotlarski, Krajewski and Lachlan (KKL) in their paper [14] remains one of the main stumbling blocks in the wider dissemination of this important result. Accordingly, the question has been asked whether the result can be proved by purely classical methods. One successful attempt in this direction has recently been made by Enayat and Visser [6]. In their paper, they showed how to construct a satisfaction class using classical techniques of formal semantics, that is, compactness and the theorem on unions of elementary chains. 5 In the present paper I propose to prove the theorem by the classical techniques of formal proof theory, namely, by cut elimination. Coupled with Enayat and Visser's construction, this makes the fascinating field of satisfaction classes accessible to students and logicians whose primary interest is either model theory or proof theory.

Basic notions
Axiomatic characterizations of semantic notions (that is, of truth or satisfaction) typically assume some base theory of syntax in the background. In the original construction of KKL the role of the base theory is played by first-order Peano arithmetic (PA) formulated in a purely relational language, with all the usual function symbols replaced by predicates. In contrast, here the language of first-order arithmetic (from now on denoted as L_Ar) will be assumed to contain function and constant symbols, namely '+', '×', '0' and 'S' for addition, multiplication, zero and the successor operation. As for the base arithmetical theory, we weaken the assumption of KKL by choosing IΔ₀ + exp for this role, that is, the theory obtained from PA by restricting the schema of induction to Δ₀ formulas and by adding an axiom which states that the exponential function is total. Throughout the paper we assume that all the coding and formalization of syntax is carried out in IΔ₀ + exp. 6 The expressions Var, Tm, Tm^c, Fm_{L_Ar} and Sent_{L_Ar} will be used here in a double role. Firstly, they will be treated as referring (respectively) to the sets of variables, terms, constant terms, formulas and sentences of L_Ar. In addition, they will also be used as shorthands for arithmetical predicates expressing the relevant sets in the background arithmetical theory IΔ₀ + exp. 7 For a model M, we write Sent_{L_Ar}(M) for the set of all objects a such that a ∈ M and M ⊨ Sent_{L_Ar}(a).
The perspective adopted in this paper is that of truth, not satisfaction. Accordingly, we will consider the language L_T obtained from L_Ar by adding the unary truth predicate 'T(x)' (instead of a binary satisfaction predicate). Sent_{L_T} is the set of sentences of L_T.
For the sake of avoiding cumbersome formulations, we will often eschew the notation for syntactic operations (dots, square corners) in the scope of the truth predicate. Thus, for example, instead of '∃x ∈ Sent_{L_Ar} T(¬̇x)', where the dotted negation sign denotes the syntactic operation of negating a sentence ('there is an arithmetical sentence such that the syntactic operation of preceding it with the negation symbol produces a true result'), we write simply: '∃x ∈ Sent_{L_Ar} T(¬x)'.
We now introduce the basic axiomatic theory of truth, denoted as CT⁻.

Definition 1 Let Ax be a set of axioms of a theory Th in L_Ar containing IΔ₀ + exp. Then CT⁻(Ax) is defined as the theory in the language L_T axiomatized by Ax together with the following truth axioms:
The acronym 'CT' stands for 'compositional truth'. The truth axioms of CT⁻ follow the familiar pattern of Tarski's inductive truth definition. A theory Th axiomatized by Ax is called the base theory of CT⁻(Ax). The assumption that Th contains IΔ₀ + exp ensures that Th can play the role of a theory of syntax, strong enough to formalize syntactic operations. The superscript in CT⁻ indicates that if Th is schematically axiomatized, we are not allowed to substitute formulas of L_T in the schemas. Thus, for example, if Th is axiomatized by means of some schema of induction, then in CT⁻(Ax) there will be no induction available for formulas containing the truth predicate. With an axiomatization of Th being fixed, we will [...]
Let us emphasize that the quantifier axioms of CT⁻ employ numerals. A numeral is an arithmetical constant term of the form 'S . . . S(0)'; in other words, numerals are expressions obtained by preceding the symbol '0' with arbitrarily many successor symbols. 9 Accordingly, the intended meaning of the existential quantifier axiom is that '∃vϕ(v)' is true iff the result of substituting some numeral for v in ϕ(v) is true (similarly for the general quantifier truth axiom). 10 On the other hand, the first axiom of CT⁻ states the truth condition for arbitrary atomic sentences, not just for identities between numerals. We adopt the axiom in this form because it is simply stronger than the corresponding version for numerals, hence we will obtain a more general conservativity result. Note that this motivation is not applicable in the case of the quantifier axioms, since the strength of their term and numeral versions cannot be so easily compared in the context of a theory with no extended induction.
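For orientation, the truth axioms of Definition 1 can be sketched in the standard Tarskian form. This is a reconstruction under stated assumptions, not a quotation: the text only confirms that the atomic axiom covers arbitrary constant terms and that the quantifier axioms use numerals. The choice of connectives (¬, ∧, ∨), the symbol $\mathrm{val}(t)$ for the value of a constant term, and $\dot{x}$ for the numeral naming $x$ are assumptions of this sketch.

```latex
% Assumed notation: val(t) = value of a constant term, \dot{x} = numeral of x.
\begin{align*}
&\forall s, t \in \mathrm{Tm}^{c}\ \bigl( T(s = t) \leftrightarrow \mathrm{val}(s) = \mathrm{val}(t) \bigr) \\
&\forall \varphi \in \mathrm{Sent}_{L_{Ar}}\ \bigl( T(\neg\varphi) \leftrightarrow \neg T(\varphi) \bigr) \\
&\forall \varphi, \psi \in \mathrm{Sent}_{L_{Ar}}\ \bigl( T(\varphi \wedge \psi) \leftrightarrow T(\varphi) \wedge T(\psi) \bigr) \\
&\forall \varphi, \psi \in \mathrm{Sent}_{L_{Ar}}\ \bigl( T(\varphi \vee \psi) \leftrightarrow T(\varphi) \vee T(\psi) \bigr) \\
&\forall v \in \mathrm{Var}\ \forall \varphi(v) \in \mathrm{Fm}_{L_{Ar}}\ \bigl( T(\exists v\, \varphi(v)) \leftrightarrow \exists x\, T(\varphi(\dot{x})) \bigr) \\
&\forall v \in \mathrm{Var}\ \forall \varphi(v) \in \mathrm{Fm}_{L_{Ar}}\ \bigl( T(\forall v\, \varphi(v)) \leftrightarrow \forall x\, T(\varphi(\dot{x})) \bigr)
\end{align*}
```

In the two quantifier clauses $\varphi(v)$ is meant to range over formulas with at most the variable $v$ free, so that $\varphi(\dot{x})$ is a sentence.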
One of the key results in the area of axiomatic truth theories is the conservativity theorem, stating that the truth axioms of CT⁻ are conservative over mathematical theories which contain a sufficient amount of the theory of syntax. Conservativity is a direct corollary of the KKL theorem, which was originally formulated as an expandability result concerning countable recursively saturated models of theories containing Peano arithmetic.
Two definitions below introduce the notion of a recursive type and the concept of a recursively saturated model.

Definition 2 Let Z be a set of formulas with one free variable x and with parameters a_1, . . . , a_n from a model M. We say that:
(a) Z is realized in M iff there is an s ∈ M such that every formula in Z is satisfied in M under a valuation assigning s to x.
(b) Z is a type of M iff every finite subset of Z is realized in M.
(c) Z is a recursive type of M iff apart from being a type of M, Z is also recursive.
8 The choice of axiomatization is not always innocent. Thus, for example, let Ax be a set of axioms of PA. Is CT⁻(Ax) + 'All elements of Ax are true' a conservative extension of PA? As observed by Łełyk [18], the answer to this question depends on the choice of our axiomatization of Peano arithmetic.
9 Our base theory IΔ₀ + exp proves that every number has a numeral which names it.
Definition 3 M is recursively saturated iff every recursive type of M is realized in M.
The classical KKL theorem states that every countable, recursively saturated model of a relational version of PA 11 carries a truth class. It immediately follows that the compositional truth axioms are syntactically conservative over the relational version of Peano arithmetic. Later, in an unpublished paper, Enayat and Visser [5] proved conservativity of the compositional truth axioms over weaker arithmetical theories formulated in the relational language. In [3] it is demonstrated that Enayat and Visser's conservativity argument can be reconstructed for arithmetical theories containing IΣ₁ formulated in the language with function symbols. In this paper we work in the language with function symbols and obtain the following strengthened version of the KKL theorem.

Truth and satisfaction
In our presentation we take the notion of truth, and not of satisfaction, as basic. 12 In this context, let us emphasize that in a non-inductive setting the choice of the basic semantic notion is not entirely innocent. In general, proofs of results about non-inductive satisfaction classes do not automatically deliver corresponding results about truth classes, nor the other way round. In order to appreciate the differences between truth and satisfaction, let us properly introduce the basic theory of satisfaction, denoted as CS⁻. This time we extend the arithmetical language with a new binary predicate S(x, y) (the satisfaction predicate). In what follows 'v ∈ Asn(x)' is an arithmetical formula which reads 'v is an assignment for a formula x' (roughly, 'v ∈ Asn(x)' states that v is a finite function which assigns numbers to the variables which are free in x). The expression 'x = val(t, v)' reads 'x is the value of the term t under the assignment v'.

Definition 5 Let Ax be a set of axioms of an arithmetical theory Th. Then CS⁻(Ax) is defined as the theory in the language L_S axiomatized by Ax together with the following satisfaction axioms:
11 In the relational arithmetical language instead of saying, for example, 'x + y = z' we would use the ternary predicate 'A(x, y, z)' with the intuitive reading 'z is the result of adding y to x'. Peano arithmetic can be formulated as a theory in this language.
12 In the discussions of the KKL theorem and related results both truth and satisfaction have been used in the role of the basic semantic concept. Thus, in [6,11,16] satisfaction is the primary notion, with a satisfaction class defined as an interpretation of the satisfaction predicate of (something resembling) the theory CS⁻ defined below. On the other hand, in [7,14,17] the discussion is carried out in terms of truth, not satisfaction. Somewhat misleadingly, both Kotlarski et al. [14] and Engström [7] use the expression 'satisfaction class' when referring to what we call truth classes.
As in the case of CT⁻, the axioms of CS⁻ contain only arithmetical substitutions of the axiom schemata of Ax (i.e., no such substitution by a formula containing 'S' is an axiom of CS⁻). A satisfaction class in a model M is a subset S of the model such that (M, S) ⊨ CS⁻.
Below we formulate a general schematic characterization of the KKL theorem for C S − . Concrete formulations require specifying the language and choosing a base theory of syntax.

Theorem 6 Let L_B be the language of a base theory of syntax B. Let Th be a theory in L_B extending B. For every countable, recursively saturated model M of Th there is a set S ⊆ M such that (M, S) ⊨ CS⁻(Th).
An example of a concrete version can be found in [11], where the theorem is proved for B being Peano arithmetic formulated in the language with the usual function symbols.
What we want to emphasize is that it is not at all obvious how to derive Theorem 4 from a corresponding concrete version of Theorem 6. The derivation would be trivial if CS⁻ permitted us to define the truth predicate of CT⁻, that is, if for some τ(x) ∈ L_S, CS⁻ proved every formula obtained by replacing 'T' with τ in some axiom of CT⁻. However, it seems implausible that CS⁻ can do that. 13 In any case, all the usual methods of defining truth from satisfaction (truth as satisfaction under all assignments, under some assignment or under the empty assignment) fail to deliver the truth predicate of CT⁻ in the context of our non-inductive satisfaction theory. 14 In view of this, the transition from satisfaction to truth is sometimes made by postulating stronger properties of a satisfaction class. A prominent example of this strategy can be found in [6], where results are obtained about satisfaction classes satisfying not just the usual compositional axioms of CS⁻, but also the so-called 'extensionality condition', guaranteeing the possibility of defining the truth predicate of CT⁻ in the theory of satisfaction. 15 The approach adopted in this paper is different in that we deal directly with truth, without a detour via satisfaction.
Finally, let us mention in passing that the transition in the opposite direction, that is, from results about truth to results about satisfaction, can also be problematic and might depend on the choice of the quantifier truth axioms. In the version of CT⁻ adopted here, with the quantifier axioms employing numerals, the satisfaction predicate of CS⁻ can indeed be defined. 16 However, it is not at all obvious how to define the satisfaction predicate if we switch to the version in which the truth axioms for quantified sentences employ constant terms instead of numerals.

From consistent M-logic to a truth class
From now on, we will work with a fixed countable and recursively saturated model M of IΔ₀ + exp. Following the original strategy of KKL, our first step is the development of a proof system called 'M-logic' (ML for short). Intuitively, ML is a system which permits us to process arbitrary sentences in the sense of M, including the nonstandard ones. The system is described externally (not in the model) in the form of a sequent calculus. 17 We will use '⇒' for the sequent arrow, with expressions of the form 'Γ ⇒ Δ' referring to sequents. We shall always assume that both Γ and Δ are externally finite sequences of M-sentences. Note that, unlike in Gentzen's original system, we do not admit formulas with free variables in the sequents. This deficiency will be compensated for by the presence of infinitary rules of inference in ML.
The definition of M-logic is framed after Gentzen's original system LK (see [8]). All the initial sequents have the form ϕ ⇒ ϕ, for an arbitrary ϕ ∈ Sent_{L_Ar}(M). The following rules of ML are copied directly from Gentzen's system:
• Contraction, left and right (C-left and C-right):
• ¬-left and ¬-right:
• ∧-left and ∧-right (for arbitrary sentences θ and χ such that one of them is ϕ):
• ∨-left and ∨-right (for arbitrary sentences θ and χ such that one of them is ϕ):
In addition, M-logic has the following rules of inference:
• ∃-right and ∀-left:
Occasionally we will employ Gentzen's terminology, classifying rules as 'structural' or 'logical'. A logical rule always introduces a logical symbol (for example, the M-rules are logical, since they introduce quantifiers); all the other rules will be referred to as 'structural'. 18 For a logical rule, the phrase 'principal formula' denotes the formula in the lower sequent containing the newly introduced logical symbol (thus, for example, the implication ϕ → ψ appearing as the first formula on the left in the lower sequent of the rule →-left is the principal formula of this rule). For a logical rule, the phrase 'active formula' refers to the formula(s) in the upper sequent(s) used in the derivation of the principal formula (thus, for example, in the rule M-right every formula of the form ϕ(a) appearing as the last element in the succedent of one of the upper sequents is to be classified as active).
Proofs in ML are (possibly infinite) trees of finite height, where the height of a proof is defined (as usual) as the length of the maximal path. By definition, trees whose paths have no finite maximal length do not qualify as proofs in ML. Note that this is a minor difference between our construction and the one of KKL, who work without such a finiteness restriction. One consequence of this difference is that KKL need to employ the assumption that M is recursively saturated in the consistency proof of their version of M-logic. Indeed, for recursively saturated models the distinction between the two systems of M-logic (with and without the finite height restriction) does not correspond to any real difference: it transpires that if M is recursively saturated, then sentences provable in M-logic with no restriction on the height of proofs will always have proofs of finite height. 19 The effect of restricting the heights of proofs is that we will not need the recursive saturation assumption in our consistency proof (the assumption will be employed only to guarantee the transition from the consistent M-logic to the truth class, see Lemma 7 and its proof). Another effect is the overall simplification of the proof, the point being that in the construction of a truth class we simply do not need to analyze the relation between the two versions of M-logic. 20 Observe that in ML, the infinitary rules M-left and M-right replace the original rules ∃-left and ∀-right of Gentzen. 21 It should also be emphasized that in all the quantifier rules of ML we employ numerals. Thus, for example, in order to apply ∃-right, we need a sentence ϕ(a) where a is a numeral. In contrast, in Gentzen's original system the rule ∃-right would permit us to derive Γ ⇒ Δ, ∃xϕ(x) from Γ ⇒ Δ, ϕ(t) for an arbitrary term t, not necessarily a numeral. The effect of this modification of Gentzen's system is that the truth class which we construct can contain term pathologies.
Thus, in a model (M, T) of CT⁻ which we eventually obtain there can exist a nonstandard formula ϕ(x) such that for some term t, ϕ(t) belongs to T (so that, loosely speaking, the model thinks that ϕ(t) is true), while the sentence ¬∃xϕ(x) also belongs to T. In this way we obtain a disconcerting effect: the model thinks that ¬∃xϕ(x) is true even though it considers as true some term instantiation of ϕ(x). 22 (This will happen if all the numerical instantiations of ϕ(x) are seen as false by the model, that is, if for all numerals a, the sentence ¬ϕ(a) belongs to T.) However, (M, T) can still be a model of CT⁻, since the quantifier axioms of CT⁻ only employ numerals.
18 The reader should keep in mind this special meaning of 'logical'. In particular, the phrase 'logical rule' does not mean 'a rule which is valid by logic alone', as it could be argued that, e.g., the M-rules are not valid by logic alone, while various structural rules, including cut, are valid by logic alone.
19 See Lemma 8 of Kotlarski et al. [14]. The same observation is formulated as Lemma 3.1 in [20], where the full proof is given.
20 As for the terminology, the proof system described in this paper could be called 'finite height M-logic', in contrast to the original M-logic with no finiteness restriction. However, for brevity I will simply use the name 'M-logic' when referring to our finite height system.
21 Proof systems with similar infinitary rules have already been studied in the literature in the context of cut elimination. See, for example, [22].
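The quantifier rules discussed above can be sketched as follows. This is a reconstruction from the verbal description only (numerals in the finitary rules, one upper sequent per element of M in the infinitary rules, the principal formula placed first on the left or last on the right); the notation $\underline{a}$ for the numeral of $a$ is an assumption of the sketch.

```latex
% Requires amsmath. \underline{a} denotes the numeral of a (assumed notation).
\[
\exists\text{-right:}\quad
\frac{\Gamma \Rightarrow \Delta,\ \varphi(\underline{a})}
     {\Gamma \Rightarrow \Delta,\ \exists x\, \varphi(x)}
\qquad
\forall\text{-left:}\quad
\frac{\varphi(\underline{a}),\ \Gamma \Rightarrow \Delta}
     {\forall x\, \varphi(x),\ \Gamma \Rightarrow \Delta}
\]
\[
M\text{-left:}\quad
\frac{\{\ \varphi(\underline{a}),\ \Gamma \Rightarrow \Delta \ :\ a \in M\ \}}
     {\exists x\, \varphi(x),\ \Gamma \Rightarrow \Delta}
\qquad
M\text{-right:}\quad
\frac{\{\ \Gamma \Rightarrow \Delta,\ \varphi(\underline{a}) \ :\ a \in M\ \}}
     {\Gamma \Rightarrow \Delta,\ \forall x\, \varphi(x)}
\]
```

Read this way, the term pathology is visible in the rules themselves: a sentence ϕ(t) with t a non-numeral term never serves as an active formula of ∃-right, so nothing forces ∃xϕ(x) into T when only ϕ(t) is there.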
The expression 'M-logic' is a bit of a misnomer, since it is clearly not a system of pure logic for sentences in the sense of M. The extralogical intrusions are not just the infinitary rules (the M-rules given above). In addition, thanks to the (Tr-lit) rule, the system contains the means permitting it to recognize the truth of literals, thus going beyond pure logic also in this respect. We write 'ML ⊢ ϕ' as an abbreviation of 'ML ⊢ ⇒ ϕ' (in other words, 'ML ⊢ ϕ' means that M-logic proves the sequent whose antecedent is empty and whose succedent contains just ϕ). Now, if ϕ is a true literal (that is, if ϕ is of the form 't = s' and M ⊨ val(t) = val(s), or ϕ is of the form 't ≠ s' and M ⊨ val(t) ≠ val(s)), then ML ⊢ ϕ, since the sequent ⇒ ϕ can be derived from the initial sequent ϕ ⇒ ϕ by (Tr-lit).
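The notion of a true literal can be illustrated for standard constant terms with a small Python sketch. Everything here is hypothetical scaffolding (the classes `Zero`, `Succ`, `Add`, `Mul`, `Literal` and the functions `numeral`, `val`, `is_true_literal` are names introduced for illustration); in the paper's setting the evaluation is carried out inside the model M and also applies to nonstandard terms, which this sketch cannot represent.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Zero:
    pass


@dataclass(frozen=True)
class Succ:
    arg: "Term"


@dataclass(frozen=True)
class Add:
    left: "Term"
    right: "Term"


@dataclass(frozen=True)
class Mul:
    left: "Term"
    right: "Term"


Term = Union[Zero, Succ, Add, Mul]


def numeral(n: int) -> Term:
    """Build the numeral S...S(0) with n successor symbols."""
    t: Term = Zero()
    for _ in range(n):
        t = Succ(t)
    return t


def val(t: Term) -> int:
    """Value of a (standard) constant term."""
    if isinstance(t, Zero):
        return 0
    if isinstance(t, Succ):
        return val(t.arg) + 1
    if isinstance(t, Add):
        return val(t.left) + val(t.right)
    if isinstance(t, Mul):
        return val(t.left) * val(t.right)
    raise TypeError(f"not a term: {t!r}")


@dataclass(frozen=True)
class Literal:
    left: Term
    right: Term
    negated: bool = False  # False: 't = s'; True: 't ≠ s'


def is_true_literal(lit: Literal) -> bool:
    """A literal is true iff the (in)equality of the term values holds."""
    equal = val(lit.left) == val(lit.right)
    return equal != lit.negated
```

For instance, `Literal(Add(numeral(2), numeral(2)), numeral(4))` counts as a true literal although its left-hand side is not a numeral, which mirrors the fact that the atomic clause is stated for arbitrary constant terms.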
The lemma below establishes a connection between M-logic and truth classes. We will use the notation 'ML ⊢_n S' for expressing that the sequent S has a proof in M-logic of height at most n (in short, S is n-provable). For the proof of the lemma, we first introduce the family of unary predicates 'Pr_n(S)' expressing this relation in the arithmetical language; in other words, 'Pr_n(S)' will be an arithmetical formula expressing that S is n-provable. Observe that for each rule R of M-logic, the relation 'S can be obtained by R from n-provable sequents' can always be expressed by an arithmetical formula, provided that n-provability is arithmetically expressible. Thus, for example, 'S can be obtained by M-left from n-provable sequents' can be written down as:
In view of this, we introduce the following definition.
By external induction on natural numbers it can be demonstrated that:
We can now turn to the proof of Lemma 7.
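One way to spell out the predicates Pr_n is sketched below. This is a reconstruction under assumptions rather than the paper's own display: it assumes that initial sequents are exactly the 0-provable ones, that M-left has the shape described in the text (principal formula first in the antecedent), and that $\dot{a}$ denotes the numeral of $a$.

```latex
% Assumed shape of the definition; R ranges over the finitely many rules of ML.
\begin{align*}
\mathrm{Pr}_0(S) \ &:\equiv\ \exists \varphi \in \mathrm{Sent}_{L_{Ar}}\ \ S = (\varphi \Rightarrow \varphi) \\
\mathrm{Pr}_{n+1}(S) \ &:\equiv\ \mathrm{Pr}_n(S) \ \vee\ \bigvee_{R}\
  \text{`$S$ can be obtained by $R$ from sequents satisfying $\mathrm{Pr}_n$'} \\
\text{clause for $M$-left:}\quad
  &\exists \Gamma, \Delta, \varphi, v\ \bigl[\, S = (\exists v\, \varphi,\ \Gamma \Rightarrow \Delta)
  \ \wedge\ \forall a\ \mathrm{Pr}_n\bigl(\varphi(\dot{a}),\ \Gamma \Rightarrow \Delta\bigr) \bigr] \\
\text{expected link:}\quad
  &\text{for every natural number $n$ and every sequent $S$:}\quad
  ML \vdash_n S \iff M \models \mathrm{Pr}_n(S)
\end{align*}
```

The disjunction over R is finite and n is a standard natural number, so each Pr_n is a single arithmetical formula, which is what the later appeal to recursive saturation relies on.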

Proof of Lemma 7
Let ϕ_0, ϕ_1, . . . be an enumeration of the set of M-sentences (this is the only place where the countability assumption is used).
We define T_0 = ∅ and
T_{n+1} = T_n ∪ {ϕ_n} if ML ⊬ (T_n → ¬ϕ_n) and ϕ_n is not existential,
T_{n+1} = T_n ∪ {ϕ_n, ψ(a)} if ML ⊬ (T_n → ¬ϕ_n) and ϕ_n = ∃xψ(x), for an a ∈ M such that ML ⊬ (T_n → ¬ψ(a)),
T_{n+1} = T_n ∪ {¬ϕ_n} otherwise.
The above definition strongly resembles the one typically employed in the proof of Lindenbaum's lemma. The expression 'T_n' on the right side of the definition (as in 'ML ⊬ (T_n → ¬ϕ_n)') stands for the conjunction of all the sentences added on previous levels of the construction.
The difference with Lindenbaum's construction is that here existential statements are always added to T_n together with the witnessing formulas. In view of this, we need to verify that whenever ML ⊬ (T_n → ¬∃xψ(x)), there will exist an a ∈ M such that ML ⊬ (T_n → ¬ψ(a)). There is no analogous step in the proof of Lindenbaum's lemma; this is also the only place in the whole proof where recursive saturation of the model is important.
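For comparison, the plain Lindenbaum-style step (without witnesses) can be sketched in a toy propositional setting. The sketch makes a loud simplification: brute-force satisfiability stands in for the condition that M-logic does not prove T_n → ¬ϕ_n, which is legitimate for classical propositional logic but is only an analogy for ML. The formula encoding and the names `atoms`, `holds`, `consistent` and `lindenbaum` are hypothetical.

```python
from itertools import product

# Formulas are nested tuples:
#   ("var", "p"), ("not", f), ("and", f, g), ("or", f, g)


def atoms(f):
    """Set of propositional variables occurring in f."""
    if f[0] == "var":
        return {f[1]}
    return set().union(*(atoms(g) for g in f[1:]))


def holds(f, v):
    """Truth value of f under the valuation v (a dict from names to bools)."""
    tag = f[0]
    if tag == "var":
        return v[f[1]]
    if tag == "not":
        return not holds(f[1], v)
    if tag == "and":
        return holds(f[1], v) and holds(f[2], v)
    if tag == "or":
        return holds(f[1], v) or holds(f[2], v)
    raise ValueError(f"unknown connective: {tag}")


def consistent(sentences):
    """Brute-force check that some valuation satisfies all the sentences."""
    names = sorted(set().union(*(atoms(f) for f in sentences)))
    return any(
        all(holds(f, dict(zip(names, bits))) for f in sentences)
        for bits in product([False, True], repeat=len(names))
    )


def lindenbaum(enumeration):
    """Add each sentence if that keeps the theory consistent, else its negation."""
    theory = []
    for phi in enumeration:
        if consistent(theory + [phi]):
            theory.append(phi)
        else:
            theory.append(("not", phi))
    return theory
```

The construction in the text adds one more branch to this loop: when the sentence under consideration is existential and can be added, a numerical witness ψ(a) is added at the same stage, and finding such a witness is where recursive saturation enters.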
Thus, assume that ML ⊬ (T_n → ¬∃xψ(x)). Define: p(x) = {¬Pr_k(T_n → ¬ψ(x)) : k ∈ ω}. We observe that p(x) is a type. Otherwise there is a natural number k such that M ⊨ ∀a Pr_k(T_n → ¬ψ(a)). 23 Hence for all a, ML ⊢_k T_n → ¬ψ(a). But then by the M-rule and cut, ML ⊢ T_n → ¬∃xψ(x), which is a contradiction.
Since p(x) is a type, and it is clearly recursive, by recursive saturation there is an a ∈ M which realizes it and we have: for every k, M ⊨ ¬Pr_k(T_n → ¬ψ(a)). Hence the sentence T_n → ¬ψ(a) is not provable in M-logic, as required. Now, define T as ⋃_{n∈ω} T_n. The proof of Lemma 7 is completed by demonstrating that (M, T) ⊨ CT⁻ provided that M-logic is consistent.
The set T is clearly complete (for every M-sentence ψ, either ψ or the negation of ψ belongs to T) and it contains a numerical example for every existential statement which belongs to T. In addition, since by assumption M-logic is consistent, there is no ψ such that both ψ and ¬ψ belong to T. In this setting, checking that all the axioms of CT⁻ are true in (M, T) is fairly easy and we consider just one example, namely, the axiom for the existential quantifier. In other words, we verify that (M, T) ⊨ T(∃vϕ(v)) ≡ ∃xT(ϕ(ẋ)). Observe that in the proof of both implications the assumption of the consistency of M-logic will be used.

Assume that (M, T) ⊨ T(∃vϕ(v)). Let '∃vϕ(v)' be ϕ_n (that is, let it be the nth sentence in our enumeration of M-sentences).
Then on the level n + 1 of the construction, ϕ_n must have been added to T_n together with the witnessing statement ϕ(a) for some numeral a (otherwise the negation of ϕ_n was added, but this is impossible since it would make T inconsistent). Therefore (M, T) ⊨ ∃xT(ϕ(ẋ)).
For the opposite implication, assume that (M, T) ⊨ ∃xT(ϕ(ẋ)); then for some a ∈ M, ϕ(a) ∈ T. Assuming for the indirect proof that ∃vϕ(v) ∉ T, pick a natural number n such that both ϕ(a) and ¬∃vϕ(v) belong to T_n. But then all the sets T_k for k ≥ n are inconsistent in M-logic, meaning that for every k ≥ n, ML ⊢ T_k → ψ for every M-sentence ψ. In effect T would have to contain a pair of contradictory statements, which is impossible.

Consistency of M-logic
At this stage all that is missing for the proof of Theorem 4 is the argument for the consistency of M-logic. In [14] the consistency of M-logic is proved by the technique of approximations; a detailed argument along the same lines is also presented in [7]. Roughly, the authors define a new language which contains fresh predicate symbols corresponding to formulas in the sense of M (formulas of this new language are said to 'approximate' those of L(M)); they introduce also a new logic for this language called 'template logic' in Engström's dissertation. The rest of the proof proceeds then by proving the soundness of template logic and establishing the link between M-logic and template logic. 24 We will not employ this formal machinery; we opt instead for a syntactic proof with cut elimination as the main tool. 25 Let us start by observing that cut elimination is indeed sufficient for our goal.

Proposition 10 If every sequent provable in M-logic has a cut-free proof, then M-logic is consistent.
Proof If M-logic is inconsistent, then it proves 0 = 1. By cut elimination, take a cut-free proof P of 0 = 1. It is easy to observe that every sentence in P has to be either atomic or negated atomic (the reason is that without cut, (Tr-lit) is the only rule that permits us to eliminate sentences in the proof, and (Tr-lit) can eliminate literals only). For an arbitrary occurrence S of a sequent in P, let the level of S in P be defined as the length of the maximal path generated by S in P. 26 Let Tr_0(x) be the arithmetical truth predicate for atomic sentences and their negations. By external induction on the level of occurrences of sequents in P, it can be demonstrated that for every S in P, if all sentences in the antecedent of S are Tr_0, then some sentence in the succedent of S is Tr_0. 27 This trivially holds for level 0 (that is, for the initial sequents). In the inductive part, observe that any S of level n + 1 must have been obtained in P from sequents of lower level by weakening, contraction, exchange, (Tr-lit) or by the rules for negation applied to atomic sentences; this is so because P is cut-free and the application of any other rule would introduce a superfluous logical symbol into the conclusion. In effect, very weak resources are enough to verify that if all sentences in the antecedent of S are Tr_0 then some element of the succedent of S is Tr_0 (in particular, the argument does not require that the model M satisfies any arithmetical theory stronger than IΔ₀ + exp).
It immediately follows that M ⊨ Tr_0(0 = 1), which is impossible.
24 See [7], Sections 3.4.3 and 3.4.4. See also p. 33 of Engström's dissertation, where he writes: 'Proving consistency of a logic can be done, mainly, in two different ways; the proof theoretic way, by a cut-elimination theorem, or the model theoretic way, by a soundness theorem. We will use the model theoretic approach and prove a soundness theorem'.
25 Cut elimination is also the technique employed by Leigh [17] in the proof of the conservativity theorem for CT⁻. However, in Leigh's paper the perspective is different, namely, it is CT⁻ itself that gets reformulated as a sequent calculus. Leigh then obtains a special version of the cut elimination lemma for his truth system, from which the conservativity result follows. Our starting point is different, as we work with M-logic (not CT⁻) formulated as a sequent calculus, which permits us to give a simpler proof of the conservativity result. Admittedly, Leigh's proof gives in addition a formalized version of the conservativity theorem (see the second part of his Theorem 1.1). I am not sure at the moment whether the formalized conservativity result can be obtained by the methods presented in this paper.
The next lemma states that cut can be eliminated in all proofs in M L.

Lemma 11 Let M be an arbitrary model of IΔ₀ + exp. For every sequent S, if S is provable in ML, then S has a cut-free proof in ML.
It immediately follows that M-logic is consistent for every model M ⊨ IΔ₀ + exp. 28 The aim of the remaining part of the paper is to lead the proof of Lemma 11 to the point at which it can be completed simply by repeating Gentzen's original argument for cut elimination. It should be emphasized that we are not there yet. Our setting is that of possibly nonstandard sentences (sentences in the sense of M) and this generates an obstacle which first has to be removed.
In order to see the obstacle, let us recap the classical argument. The aim is to show that the system with the following mix rule (which is a generalized version of cut) admits mix elimination: from Γ ⇒ Δ and Π ⇒ Λ infer Γ, Π* ⇒ Δ*, Λ, where Δ and Π contain ϕ (the mix formula); Δ* and Π* differ from Δ and Π only in that they do not contain any occurrence of ϕ. Since mix and cut produce equivalent proof systems, mix elimination gives us the desired result.
26 Thus, occurrences of sequents which are initial in P have level 0 and the maximal level of an occurrence of a sequent in P is not larger than the height of P. Note that the proof can contain different occurrences of one and the same sequent; in such a case the levels of these occurrences can also be different.
27 Note that we are using here the resources of the model M. Strictly speaking, we demonstrate that: for every S in P, if for every ψ in the antecedent of S, M ⊨ Tr_0(ψ), then for some sentence ψ in the succedent of S, M ⊨ Tr_0(ψ).
28 The reader should keep in mind our finiteness restriction, that is, proofs in M-logic are by definition trees of finite height. This explains the apparent discrepancy between Lemma 11 and the result of KKL, which states that a countable model M of Peano arithmetic is recursively saturated if and only if M-logic is consistent (see the remarks on p. 286 after the proof of Lemma 2 in [14]; cf. also Theorem 3.15 of Smith [20]). The point is that both KKL and Smith work with a version of M-logic with no finiteness restriction on the height of proofs.
In the next stage it is demonstrated that mix can be eliminated from any proof which contains only a single application of the mix rule in the last step. This is done by double induction on the degree of proofs (main induction) and on the rank of proofs (subinduction). For proofs with mix used only in the last step, we define:
• The left rank of the proof is the largest number of consecutive sequents in a path starting with the left-hand upper sequent of the mix and such that every sequent in the path contains the mix formula in the succedent.
• The right rank of the proof is the largest number of consecutive sequents in a path starting with the right-hand upper sequent of the mix and such that every sequent in the path contains the mix formula in the antecedent.
• The rank of the proof is the sum of the left rank and the right rank of the proof.
• The degree of the proof is the syntactic complexity of the mix formula.
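The rank bookkeeping can be made concrete on finite proof trees. The following Python sketch is an illustration only: the `Node` class, its field names and the toy proof are assumptions introduced here, sentences are represented by plain strings, and only finitely branching trees are handled (proofs in M-logic may also use infinitary rules).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One sequent occurrence in a proof tree (hypothetical encoding)."""
    antecedent: List[str]
    succedent: List[str]
    premises: List["Node"] = field(default_factory=list)  # upper sequents


def side_rank(node: Node, mix_formula: str, side: str) -> int:
    """Largest number of consecutive sequents on a path that starts at `node`,
    goes upwards, and contains `mix_formula` on the given side throughout."""
    formulas = node.succedent if side == "succedent" else node.antecedent
    if mix_formula not in formulas:
        return 0
    return 1 + max(
        (side_rank(p, mix_formula, side) for p in node.premises), default=0
    )


def rank(left_upper: Node, right_upper: Node, mix_formula: str) -> int:
    """Rank of a proof ending with a mix: left rank plus right rank."""
    return (side_rank(left_upper, mix_formula, "succedent")
            + side_rank(right_upper, mix_formula, "antecedent"))


# Toy example: the mix formula "A" occurs in the succedent of two
# consecutive sequents on the left and in one antecedent on the right.
left = Node(["B"], ["A", "C"], [Node(["B"], ["A"])])
right = Node(["A"], ["D"], [Node([], ["D"])])
print(rank(left, right, "A"))  # left rank 2 + right rank 1
```

Since both upper sequents of a mix contain the mix formula, each side contributes at least 1, so the rank of such a proof is at least 2.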
After this is done, it follows that mix can be eliminated from an arbitrary proof (not just from proofs which contain only a single application of the mix rule in the last step). Namely, given an arbitrary proof P, we can eliminate all the applications of mix stage by stage, by considering subproofs of P which contain mix only in the last step.
When applying this strategy to the case of M-logic, one immediate difference is that in the final part of the reasoning (the one in which mix is eliminated from an arbitrary proof) we have to make sure that the height of the mix-free proof remains finite. 29 As for the earlier parts, there is no problem in our setting with induction on the rank of proofs, since both the left and the right rank of the proof in ML will always be a (standard) natural number, restricted by the height of the proof. However, the induction on the degree of proofs in ML is quite problematic. Since the mix formula might be a non-standard element of the model M, its syntactic complexity might be a non-standard number. Arguing externally by induction on non-standard numbers is clearly an invalid move and this is the main obstacle complicating the situation.
Our remedy is to replace the general notion of a degree with a notion relativized to a proof. Assume that we are given a proof P with mix applied only in the last step, that eliminates the (possibly non-standard) mix formula ϕ. The guiding intuition to be formalized below is that in the mix-elimination proof the syntactic shape of ϕ matters only comparatively. For example, ϕ might have the form ¬ψ. The intuition is that this will matter only provided that ψ itself (without negation) appears in P; otherwise in the context of a mix-elimination proof ϕ might just as well be treated as a formula of syntactic complexity 0, even if it is non-standard. The underlying reason is that in a mix-elimination proof the notion of a degree of a mix formula ϕ is used only in analysing the case of ϕ being obtained in the proof by a logical rule (thus, in our example, ϕ would be obtained by one of the rules for negation, which means that ψ itself must appear in the proof).
Our objective is to make these ideas precise. In what follows the word 'sequence' should always be interpreted externally; in other words, sequences are finite or infinite objects in the real world, not necessarily elements of M. The length of a finite sequence $a = (a_0, \ldots, a_k)$ is the number of its elements, that is, lh(a) = k + 1. For an infinite sequence a we define lh(a) as ω.
• Let $\varphi \in \mathrm{Sent}_{L_{Ar}}(M)$. We say that s is a ◁-sequence for ϕ iff $s_0 = \varphi$ and for every k such that k + 1 < lh(s), $s_{k+1} \triangleleft s_k$.
The notion of a degree can now be defined in the following way.

Definition 13
Let P be an arbitrary proof in ML with mix used only in the last step. Let ϕ be the mix formula in P. We define:
• d(ϕ, P) (the degree of ϕ in P) = sup{lh(s) : s is a ◁-sequence for ϕ such that for every k < lh(s), $s_k \in P$}.
• d(P) (the degree of P) is defined as d(ϕ, P).
The expression '$s_k \in P$' means that the sentence occupying the k-th place in the sequence s appears in some sequent in the proof P. In effect, given a proof P with a mix formula ϕ, its degree is identified with the maximal length of a ◁-sequence generated by ϕ and containing just the sentences used anywhere in P (not necessarily on a single path in P). The actual length of this sequence is left open by the definition; in particular, it is not decided whether the relevant sequence is finite or not. Nevertheless, we are going to demonstrate that proofs with mix used only in the last step always have finite degrees. 30

In order to prove this finiteness property, we introduce first the function str(x) ('the structure of a formula x'). Let the letter p be a new symbol, which will be treated as a propositional variable. Intuitively, given an arithmetical formula ϕ, the function str produces a formula of the language with the new symbol which is exactly like ϕ, except that the letter 'p' is substituted for all occurrences of atomic formulas in ϕ. 31 The function str can be defined by the following arithmetical condition.

30 This property does not hold for proofs with arbitrarily many applications of the mix rule, hence the restriction. A counterexample can be produced along the following lines. Take an infinite ◁-sequence $\varphi, \varphi_0, \varphi_1, \ldots$ and consider a proof with mix for ϕ in the last step in which some sequent is earlier obtained by an infinitary rule. The idea is that then you can add all the $\varphi_i$-s to this proof on distinct branches of the infinitary rule (say, introducing them by weakening and then eliminating them by mix).

31 Thus, for example, str(∃x(x + x = 0 ∨ ∀y¬x × y = 0)) is '∃x(p ∨ ∀y¬p)'.

Definition 14
str(x) = y iff there are sequences s and s′ and a number k + 1 such that:
1. lh(s) = k + 1 and lh(s′) = k + 1,
2. s is a syntactic construction of x (hence $s_k = x$), 32
3. $s'_k = y$,
4. for every l < k + 1:
• if $s_l$ has the form t = s, then $s'_l$ is p,
• if for some j < l, $s_l$ has the form $\neg s_j$, then $s'_l$ has the form $\neg s'_j$,
• if for some i, j < l, $s_l$ has the form $s_i \circ s_j$, then $s'_l$ has the form $s'_i \circ s'_j$,
• if for some j < l, $s_l$ has the form $Qv\,s_j$, then $s'_l$ has the form $Qv\,s'_j$.
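For standard formulas the effect of str can be illustrated by a short recursive program. The Python sketch below is only an illustration: the tuple encoding of formulas and the helper names `structure` and `show` are assumptions made here, and the recursion on the build-up of a formula replaces the pair of construction sequences used in the arithmetical definition.

```python
# Formulas as nested tuples (a hypothetical encoding chosen for this sketch):
#   ("atom", text)             an identity between terms, e.g. "x + x = 0"
#   ("not", sub)               negation
#   ("bin", op, left, right)   binary connective, op is e.g. "∨" or "∧"
#   ("quant", q, var, sub)     quantifier, q is "∀" or "∃"


def structure(phi):
    """Replace every atomic formula in `phi` by the propositional letter p."""
    tag = phi[0]
    if tag == "atom":
        return ("atom", "p")
    if tag == "not":
        return ("not", structure(phi[1]))
    if tag == "bin":
        return ("bin", phi[1], structure(phi[2]), structure(phi[3]))
    if tag == "quant":
        return ("quant", phi[1], phi[2], structure(phi[3]))
    raise ValueError(f"unknown formula tag: {tag!r}")


def show(phi):
    """Render a formula as a string."""
    tag = phi[0]
    if tag == "atom":
        return phi[1]
    if tag == "not":
        return "¬" + show(phi[1])
    if tag == "bin":
        return "(" + show(phi[2]) + " " + phi[1] + " " + show(phi[3]) + ")"
    if tag == "quant":
        return phi[1] + phi[2] + show(phi[3])
    raise ValueError(f"unknown formula tag: {tag!r}")


# The example formula ∃x(x + x = 0 ∨ ∀y¬x × y = 0).
example = ("quant", "∃", "x",
           ("bin", "∨",
            ("atom", "x + x = 0"),
            ("quant", "∀", "y", ("not", ("atom", "x × y = 0")))))
print(show(structure(example)))  # ∃x(p ∨ ∀y¬p)
```

The sketch keeps quantifiers, variables and connectives untouched and forgets only the atomic formulas, which is the feature of str that matters in what follows.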
Let us extend the arithmetical language by the new unary function symbol $2^x$ for the x-th power of two. Let $\Delta_0(\exp)$ be the class of bounded formulas in the extended language, with new terms functioning as possible bounds (so that, for example, the quantifier $\forall x < 2^y$ counts as bounded). Define $I\Delta_0(\exp)$ as the theory extending Robinson's arithmetic with appropriate axioms for exponentiation together with the induction schema for all $\Delta_0(\exp)$ formulas. 33 Since it is known that $I\Delta_0(\exp)$ and $I\Delta_0 + \exp$ are equivalent theories, 34 the observations to be made below in terms of $I\Delta_0(\exp)$ apply automatically to $I\Delta_0 + \exp$ as well.
Note that given the assumption of binary coding, the expression 'str(x) = y' can be written down as a $\Delta_0(\exp)$ formula. Let us denote by |x| the length of the binary expansion of x. The quantifier for the lengths of sequences s and s′ from Definition 14 can be bounded by |ϕ| · |ϕ|; the codes of these sequences will be bounded by $2^{|\varphi| \cdot |\varphi|}$. In effect, the totality of str can be proved already in $I\Delta_0 + \exp$.
Abbreviate str(ϕ) = str(ψ) as ϕ ∼ ψ. Let compl(ϕ) be the number of connectives and quantifiers in ϕ; we assume that 'compl(x) = y' is a $\Delta_0(\exp)$ formula. We now observe that a formula which appears at some stage in a ◁-sequence will always have a larger syntactic complexity than formulas appearing later in this sequence; moreover, sameness of structure implies the same number of logical symbols. This is the content of the proposition below.

32 In other words, s is a sequence with x as the last element, such that every element of s is either an atomic formula (identity between terms) or it is a complex formula formed from some earlier elements of s by attaching a quantifier or adding a sentential connective. We assume that the condition of being a syntactic construction of x is expressed here by an arithmetical formula.

33 For a precise definition, see [9, p. 37].

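Proposition 15 (a) Let ϕ be an arbitrary (possibly nonstandard) arithmetical sentence and let s be an arbitrary ◁-sequence generated by ϕ. Then for every ψ ∈ s, if ψ is not the first element of s, then $M \models compl(\psi) < compl(\varphi)$. (b) For arbitrary M-sentences ϕ and ψ, if $M \models \varphi \sim \psi$, then $M \models compl(\varphi) = compl(\psi)$.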
Proof For the proof of Proposition 15(a), given a ◁-sequence s generated by ϕ, denote by $s_n$ (for an arbitrary number n smaller than the length of s) the n-th element of s. 35 By external induction we demonstrate that:

In the proof of Proposition 15(b) we work with the assumption that $M \models \varphi \sim \psi$. The conclusion that $M \models compl(\varphi) = compl(\psi)$ follows then directly from the fact that for an arbitrary χ, the number of connectives and quantifiers in χ is the same as the number of connectives and quantifiers in str(χ), i.e., $M \models \forall \chi\; compl(\chi) = compl(str(\chi))$. This can be established in $I\Delta_0(\exp)$ by $\Delta_0(\exp)$ induction on the length of sequences a, b, c, d such that:

The key property of the equivalence relation ∼ is encapsulated in the following corollary to Proposition 15.

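Corollary 16 Let $Z \subseteq \mathrm{Sent}_{L_{Ar}}(M)$. For every s, if s is a ◁-sequence with elements from Z, then $lh(s) \le |\{[\varphi]_\sim : \varphi \in Z\}|$; that is, the length of s is not greater than the number of ∼-equivalence classes of elements of Z.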
Proof Let s be a ◁-sequence with elements from Z. It is enough to observe that for no two different ϕ, ψ ∈ s do we have ϕ ∼ ψ (in effect, no two different elements of s belong to the same equivalence class of the ∼ relation, hence the length of s is not greater than the number of equivalence classes). Fix two different ϕ and ψ ∈ s and assume (without loss of generality) that ϕ precedes ψ in s. Let s′ be the sequence obtained from s by removing all sentences preceding ϕ. Then s′ is a ◁-sequence generated by ϕ. By Proposition 15(a) we obtain compl(ψ) < compl(ϕ), and hence by Proposition 15(b) it is not the case that ϕ ∼ ψ.
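The counting argument can be checked on a small standard example. The Python sketch below rests on assumptions introduced here: formulas are nested tuples, each element of the sequence is taken to be a direct subsentence of the preceding one, and quantifiers are left out to keep the example short. The names `structure`, `compl` and `direct_subsentences` are illustrative.

```python
def structure(phi):
    """str for the quantifier-free fragment: replace atoms by the letter p."""
    tag = phi[0]
    if tag == "atom":
        return ("atom", "p")
    if tag == "not":
        return ("not", structure(phi[1]))
    return ("bin", phi[1], structure(phi[2]), structure(phi[3]))


def compl(phi):
    """Number of connectives in phi."""
    tag = phi[0]
    if tag == "atom":
        return 0
    if tag == "not":
        return 1 + compl(phi[1])
    return 1 + compl(phi[2]) + compl(phi[3])


def direct_subsentences(phi):
    """Immediate subsentences of phi (assumed reading of the relation)."""
    tag = phi[0]
    if tag == "atom":
        return []
    if tag == "not":
        return [phi[1]]
    return [phi[2], phi[3]]


A = ("atom", "0 = 0")
B = ("atom", "1 = 1")
neg_b = ("not", B)
conj = ("bin", "∧", A, neg_b)
phi = ("not", conj)

chain = [phi, conj, neg_b, B]   # each element is a direct subsentence of the previous one
Z = set(chain) | {A}            # the ambient set of sentences

complexities = [compl(x) for x in chain]
classes_of_Z = {structure(x) for x in Z}

print(complexities)                   # strictly decreasing along the chain
print(len(chain), len(classes_of_Z))  # chain length versus number of classes
```

In this example A and B fall into the same class, so Z has five elements but only four classes, and the chain of length four meets the bound exactly.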
We are now ready to demonstrate that proofs with mix used only in the last step always have finite degrees.

Lemma 17 Let P be an arbitrary proof in ML with mix used only in the last step. Then d(P) is a natural number (in other words, it is never ω).
Proof Fix a proof P in ML which contains mix only in the last step. Let Z be the set of all sentences which appear in P. We demonstrate that $\{[\varphi]_\sim : \varphi \in Z\}$ is finite, which by Corollary 16 guarantees the conclusion of Lemma 17.
For an arbitrary sequent S in P, let l(S) (the level of S) be the length of the path leading from S to the end sequent of P. We denote by $S_i$ the set of all sequents in P whose level is not greater than i. Let $\mathrm{Sent}_i$ be defined as the set of all sentences which appear in some element of $S_i$. Let k be the height of P. The task is to show that for every i ≤ k, the set $\{[\varphi]_\sim : \varphi \in \mathrm{Sent}_i\}$ is finite. This will end the proof, since $\mathrm{Sent}_k = Z$.
We proceed by induction. For i = 0 the conclusion is trivial, as $\mathrm{Sent}_0$ itself is finite ($\mathrm{Sent}_0$ is the set of sentences which appear in the end sequent of P). Now, assuming that $\{[\varphi]_\sim : \varphi \in \mathrm{Sent}_i\}$ is finite, we claim that $\{[\varphi]_\sim : \varphi \in \mathrm{Sent}_{i+1}\}$ is also finite.
Observe that $\mathrm{Sent}_{i+1}$ can contain more sentences than $\mathrm{Sent}_i$ only provided that some sequent of the level i is obtained in P either by mix or by (Tr-lit) or by a logical rule (other structural rules do not generate any new sentences on level i + 1). If some sequent of the level i is obtained in P by mix, then i = 0, since by assumption P contains mix only in the last step. Then the conclusion easily follows, since in such a case $\mathrm{Sent}_{i+1}$ is finite (it is simply $\mathrm{Sent}_0$ together with the mix sentence). In effect, we can now assume that i ≠ 0, with no sequent of the level i being obtained in P by mix.
Assume that the set $\mathrm{Sent}_{i+1} \setminus \mathrm{Sent}_i$ of new sentences at the level i + 1 is not empty. Then each element of this set is either eliminated by (Tr-lit) in the next stage of the proof or it is an active formula of a logical inference with the principal formula belonging to $\mathrm{Sent}_i$. Let us note first that all the elements of $\mathrm{Sent}_{i+1} \setminus \mathrm{Sent}_i$ which are eliminated by (Tr-lit) fall into just two possible ∼-classes (one for atomic sentences and one for their negations). This leaves us with the second case, that of sentences new on the level i + 1 which are obtained by logical rules. By the inductive assumption, all the principal formulas of logical rules belong to finitely many ∼-classes. We will demonstrate that in such a case all active formulas also belong to finitely many ∼-classes, which will finish the proof.
Let $[\varphi]_\sim$ be a ∼-class on $\mathrm{Sent}_i$. We claim that there are at most two ∼-classes x and y such that for every ψ ∼ ϕ, if ψ is the principal formula of a logical inference rule which produces in P the sequent S belonging to $S_i$, then every active formula of this inference rule belongs either to x or to y. Let us analyse cases.
• Case 1: ϕ = ¬θ. Define x as $[\theta]_\sim$. Take an arbitrary ψ ∼ ϕ such that ψ is the principal formula of a logical inference rule which produces the sequent S. Then ψ has the form ¬γ and the active formula must be γ. Since ¬γ ∼ ¬θ, we have γ ∼ θ, therefore γ ∈ x.
• Case 2: ϕ = χ ∘ θ. Define x as $[\chi]_\sim$ and y as $[\theta]_\sim$. Take an arbitrary ψ ∼ ϕ such that ψ is the principal formula of a logical inference rule which produces the sequent S. Then ψ has the form χ′ ∘ θ′ and the active formula must be χ′ or θ′. Since χ′ ∼ χ and θ′ ∼ θ, we have: all active formulas belong either to x or to y.
• Case 3: ϕ = Qvθ. Define x as $[\theta]_\sim$. Take an arbitrary ψ ∼ ϕ such that ψ is the principal formula of a logical inference rule which produces the sequent S. Then ψ has the form Qvχ and the active formulas must be of the form χ(a). Since χ ∼ θ and for every a, χ(a) ∼ χ, 36 we have: all active formulas belong to x.
In the proof of the cut elimination lemma one more property of degrees of proofs will be important. In what follows we use the notation Sent(P) for the set of all sentences which appear in a proof P.

Proposition 18
Let ϕ and ψ be M-sentences such that $\psi \triangleleft \varphi$. Let P and P′ be proofs in M-logic such that:
• Both P and P′ contain mix only in the last step.
• The mix formula in P is ϕ.
• The mix formula in P′ is ψ.
• Sent(P′) ⊆ Sent(P).

Then d(P′) < d(P).
Proof Since the degree of a proof has been defined as the maximal length of a ◁-sequence of sentences belonging to the proof generated by the mix formula, let $(\psi, \theta_0, \ldots, \theta_k)$ be a maximal such sequence for P′. Since Sent(P′) ⊆ Sent(P), it is easy to observe that the sequence $(\varphi, \psi, \theta_0, \ldots, \theta_k)$ is a ◁-sequence of sentences belonging to P generated by ϕ. Hence, d(P′) < d(P).
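Spelled out as a chain of (in)equalities, the last step of the proof reads as follows. Here P′ denotes the proof whose mix formula is ψ, and the computation assumes, as in the proof, that the maximal sequence $(\psi, \theta_0, \ldots, \theta_k)$ witnesses d(P′) and that every sentence of P′ also appears in P.

```latex
\[
d(P) \;\ge\; \mathrm{lh}(\varphi, \psi, \theta_0, \ldots, \theta_k)
     \;=\; \mathrm{lh}(\psi, \theta_0, \ldots, \theta_k) + 1
     \;=\; d(P') + 1
     \;>\; d(P').
\]
```

The strict inequality at the end uses the finiteness of d(P′) guaranteed by Lemma 17.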
In effect, Definition 13, Lemma 17 and Proposition 18 give us the notion of a degree such that degrees of proofs (with mix used only in the last step) are always standard natural numbers. Hence, induction on the degree of such proofs can be applied and the way to proving the cut elimination theorem for ML is now open. I will not present the whole cut elimination proof, since it is mostly a repetition of Gentzen's reasoning. Instead, I will mostly restrict myself to discussing the cases of the new rules (the ones not present in Gentzen's original system).
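For standard sentences the relativized degree can be computed directly. The Python sketch below is an illustration under assumptions made here: formulas are nested tuples, a proof is represented only by the set of sentences appearing in it, and the sequences counted are chains of direct subsentences. The names `degree` and `direct_subsentences` are hypothetical.

```python
def direct_subsentences(phi):
    """Immediate subsentences of a quantifier-free formula given as a tuple."""
    tag = phi[0]
    if tag == "atom":
        return []
    if tag == "not":
        return [phi[1]]
    return [phi[2], phi[3]]  # ("bin", op, left, right)


def degree(phi, sentences):
    """Length of the longest chain of direct subsentences that starts
    with phi and uses only sentences from `sentences`."""
    best = 1  # the one-element sequence (phi)
    for psi in direct_subsentences(phi):
        if psi in sentences:
            best = max(best, 1 + degree(psi, sentences))
    return best


A = ("atom", "0 = 0")
B = ("atom", "1 = 1")
neg_b = ("not", B)
conj = ("bin", "∧", A, neg_b)
phi = ("not", conj)

# Sentences of a proof P with mix formula phi; B itself does not occur in P,
# so the sentence ¬B contributes to the degree like an atomic sentence.
sent_P = {phi, conj, A, neg_b}
# Sentences of a proof P' with mix formula conj, where Sent(P') ⊆ Sent(P).
sent_P_prime = {conj, A}

d_P = degree(phi, sent_P)
d_P_prime = degree(conj, sent_P_prime)
print(d_P, d_P_prime)
```

In the example the degree of P is 3 although the mix formula contains three connectives and two atoms below them, and the degree drops when passing to P′, as Proposition 18 requires.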

Proof of Lemma 11 (Outline and chosen cases).
It is demonstrated that: (1) mix can be eliminated from any proof which contains only a single application of the mix rule in the last step, (2) given a proof P with mix only in the last step, the new mix-free proof will employ only sentences used in P, (3) the height of the new mix-free proof P′ is determined by the height of the initial proof P. Let us assume (main induction) that mix can be eliminated in this way in every proof of degree < n. Let us also assume (subinduction) that mix can be eliminated in this way in every proof of degree n but with rank < k. Our task is to show that mix can be eliminated in this way in proofs of degree n and rank k.
The proof starts with the case of k = 2 (the lowest possible rank) and proceeds by analysing subcases. Here we analyse only two subcases, with the first one corresponding to a rule of ML absent in LK. 37 Namely, let us assume for a start that the mix formula ∀xϕ(x) 38 is obtained in P by a logical rule in both the succedent of the left-hand upper sequent of the mix and in the antecedent of the right-hand upper sequent of the mix. Then the last stage of the proof runs as follows: (mix elimination introduces no new sentences, cf. Proposition 18), as this guarantees that in both cases the degrees of our mixes fall within the main inductive hypothesis.
When k > 2, we have in addition the case of (Tr-lit) to analyse. Thus, in the last stage (Tr-lit) could be used to obtain the right-hand upper sequent of the mix. In effect, the last stage of the proof might run as follows:

If ϕ is the mix formula, we can omit (Tr-lit) and use the mix rule instead:

If the mix formula is not ϕ, we can build the following proof:

\[
\dfrac{\dfrac{\dfrac{\Gamma \Rightarrow \Delta \qquad \varphi, \Pi \Rightarrow \Lambda}{\Gamma, \varphi, \Pi^{*} \Rightarrow \Delta^{*}, \Lambda}\ \text{mix}}{\varphi, \Gamma, \Pi^{*} \Rightarrow \Delta^{*}, \Lambda}\ \text{some exchanges}}{\Gamma, \Pi^{*} \Rightarrow \Delta^{*}, \Lambda}\ \text{Tr-lit}
\]

Since in both cases the new proof has lower rank than k (we moved the mix up the derivation), the inductive hypothesis applies and the mix rule is eliminable. The case of (Tr-lit) being used to obtain the left-hand upper sequent of the mix is very similar.
With the proof of Lemma 11, the construction of a truth class in a recursively saturated model of $I\Delta_0 + \exp$ has been completed.
We leave as open several natural questions concerning the possibility of applying the present technique in conservativity proofs of extensions of CT − (that is, of CT − with new axioms added). In our framework, adding new axioms to CT − would enforce a corresponding modification in M-logic, where new rules or new initial sequents will also have to be added. In general, the question then would be whether the addition of these new elements preserves the property of cut eliminability, possibly in some restricted form but still permitting to derive the consistency of the extended version of M-logic. 41