## 1 Introduction

The rules of inference of a logical system define an inductive class of formal derivations. The most natural way to prove properties for the class is by induction on the construction of derivations, i.e., by induction on the last rule applied. It is often a crucial component in such proofs to show that the property in question is maintained under the composition of two derivations, even if this aspect is regularly ignored and the composability of derivations taken for granted. Results that show composition to maintain properties of derivations were called Hilfssätze in work of Gentzen that remained unpublished in its time. His original proof of the consistency of arithmetic of 1935 contained a Hilfssatz by which the ‘reducibility of sequents’ is maintained under composition. After he changed this proof into one that used transfinite induction, all traces of the Hilfssatz disappeared (see von Plato 2015 for details).

A formal implementation of the Hilfssatz methodology requires that composition be made into an explicit rule that is added to the logical rules of a calculus. The following results are shown as illustrations of the use of such an explicit composition rule: (1) A proof of normalization by a Hilfssatz for intuitionistic natural deduction. (2) A proof of strong normalization by bar induction.

## 2 Notation for Natural Derivations

The rules of natural deduction are production rules by which the class of formal derivations is defined inductively. Whenever there is such a definition, the most natural way to prove properties of the corresponding class is by induction on the last rule applied. This is so also in proof theory; a proof of normalization for intuitionistic natural deduction is given as a first example.

For a uniform treatment, we use natural deduction with general elimination rules and the related notion of normal derivability in which the condition is that the major premisses of elimination rules have to be assumptions. The modified rules are, with the multiplicity $$n,m \geqslant 0$$ of closed formulas indicated by exponents as in $$A^n, B^m$$ (Table 1).

The normalizability result to be presented can be worked out also for the standard rules that can be seen as special cases of the general ones (Table 2).

It will be convenient in this situation to leave out the degenerate derivations of the minor premisses, to have exactly the Gentzenian rules.

In the standard tree notation for natural derivations, as above, the composition of two derivations can be indicated schematically, as in:

Composition has the condition that the eigenvariables and discharge labels of the two derivations be distinct; if they are not, they can be changed.

No trace is left of the composition in the rightmost derivation. As the calculus is defined by its logical rules, composition in natural deduction is usually left implicit. To represent the composition of two derivations formally and to reason about its properties in a convenient form, we write the logical rules and the additional rule of composition in sequent calculus style, with the open assumptions of each formula D in a derivation written out as a multiset $$\varGamma$$ in a sequent $$\varGamma {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}D$$.

More formally, we define a root-first translation into sequent calculus style. If the last rule is &I, we have:

$$\vee I$$ is similar, and $$\mathord {\supset }I$$ is:

The translation continues from the premisses until assumptions are reached. The logical rules of the calculus NLI are obtained by translating the rest of the logical rules into sequent notation. The nomenclature NLI was used in some early manuscripts of Gentzen to denote a “natural-logistic intuitionistic calculus” (Table 3).

The calculus is completed by adding initial sequents of the form $$A{{\mathrm{\mathbin {\varvec{\rightarrow }}}}}A$$, with A an arbitrary formula, and the zero-premiss rule $$\bot E$$ by which $$\bot {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}C$$ can begin a derivation branch.

We say that the closing of an assumption formula in E-rules and in rule $$\mathord {\supset }I$$ is vacuous if $$n=0$$ or $$m=0$$. Similarly, the closing of an assumption is multiple if $$n>1$$ or $$m>1$$. With $$n=1$$ or $$m=1$$, the closing of an assumption is simple. Vacuous and multiple closing of assumptions is seen in:

The former case corresponds to the situation in sequent calculus in which a formula active in a logical rule stems from a step of weakening, the latter to a situation in which it stems from a step of contraction, as shown in von Plato (2001).

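The composition of two derivations is an essential step in the normalization of derivations. It can now be written quite generally in the form:

$$\frac{\varGamma {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}D \qquad D,\varDelta {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}C}{\varGamma ,\varDelta {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}C}\,{{\mathrm{\textit{Comp}}}}$$

Iterated compositions appear as so many successive instances of rule Comp.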

In a permutative conversion, the height of derivation of a major premiss derived by $$\vee E$$ or $$\exists E$$, i.e., number of successive steps of inference, is diminished. The effect of the general rules is that such conversions work for all derived major premisses of elimination rules:

### Definition 1

A derivation in natural deduction with general elimination rules is normal if all major premisses of E-rules are assumptions.
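Definition 1 can be checked mechanically by walking through a derivation tree. The following is a minimal sketch under assumptions that are not in the text: the class `Deriv`, the rule names in `E_RULES`, and the convention that the major premiss is listed first are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Deriv:
    """One node of a derivation tree (hypothetical encoding)."""
    rule: str                              # "assume", "&I", "&E", "->I", ...
    premisses: Tuple["Deriv", ...] = ()    # immediate subderivations


# Assumed names for the elimination rules that have a major premiss.
E_RULES = {"&E", "vE", "->E", "forallE", "existsE"}


def is_normal(d: Deriv) -> bool:
    """True if every major premiss of an E-rule in d is an assumption."""
    if d.rule in E_RULES:
        major = d.premisses[0]             # convention: major premiss comes first
        if major.rule != "assume":
            return False
    return all(is_normal(p) for p in d.premisses)


# Two small examples: an &E whose major premiss is an assumption,
# and an &E whose major premiss was derived by &I.
leaf = Deriv("assume")
normal_example = Deriv("&E", (leaf, leaf))
detour_example = Deriv("&E", (Deriv("&I", (leaf, leaf)), leaf))
```

In this encoding, `is_normal(normal_example)` holds and `is_normal(detour_example)` does not, since the major premiss of the last rule is the conclusion of an I-rule.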

As a first step towards normalization, we need to show that derivations in natural deduction can be composed:

### Lemma 1

(Closure of derivations with respect to composition) If given derivations of the sequents $$\varGamma {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}D$$ and $$D,\varDelta {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}C$$ in NLI are composed by rule Comp to conclude the sequent $$\varGamma ,\varDelta {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}C$$, the instance of Comp can be eliminated.

### Proof

We show by induction on the height of derivation of the right premiss of Comp that it can be eliminated.

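1. Base case. The second premiss of Comp is an initial sequent, as in:

$$\frac{\varGamma {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}D \qquad D{{\mathrm{\mathbin {\varvec{\rightarrow }}}}}D}{\varGamma {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}D}\,{{\mathrm{\textit{Comp}}}}$$

The conclusion of Comp is identical to its first premiss, so that Comp can be deleted.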

If the second premiss is of the form $$\bot {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}D$$, the first premiss is $$\varGamma {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}\bot$$. It has not been derived by a right rule, so that Comp can be permuted up in the first premiss. In the end, a topsequent $$\varGamma ' {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}\bot$$ is found as the left premiss of Comp, by which $$\bot$$ is in $$\varGamma '$$, so that the conclusion of Comp is an initial sequent.

2. Inductive case with the second premiss of Comp derived by an I-rule. There are two subcases, a one-premiss rule and a two-premiss rule. In the former case, Comp is permuted up to the premiss, with a lesser height of derivation as a result. In the latter case, we use the notation (D) to indicate a possible occurrence of D in a premiss:

Rule Comp is permuted to any premiss that has an occurrence of D, say the first one, with the result:

3. Inductive case with the second premiss of Comp derived by an E-rule, as in:

As in case 2, Comp is permuted up, to whichever premiss has an occurrence of the composition formula D, with a lesser height of derivation as a result. The other cases of E-rules are entirely similar.   QED.
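In tree notation, the elimination of Comp just proved amounts to pushing the left derivation up through the right one until the assumption D is reached. The sketch below illustrates the base case and the permutation steps only. It rests on assumptions not made in the text: the class `Deriv`, the use of a label to identify the assumption being composed on, and the treatment of all leaves with that label at once (which corresponds to iterated compositions). The case of $$\bot E$$ is left out.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Deriv:
    """One node of a derivation tree (hypothetical encoding)."""
    rule: str                              # "assume" for an initial sequent
    formula: str                           # conclusion of this node
    premisses: Tuple["Deriv", ...] = ()
    label: str = ""                        # used only when rule == "assume"


def compose(left: Deriv, right: Deriv, label: str) -> Deriv:
    """Eliminate a composition of `left` into `right` on the assumption `label`."""
    if right.rule == "assume":
        # Base case: the right premiss is an initial sequent.  If it is the
        # composition formula, Comp is deleted and the left derivation remains.
        return left if right.label == label else right
    # Inductive case: Comp is permuted up to the premisses; a premiss without
    # the labelled assumption comes back unchanged.
    new_premisses = tuple(compose(left, p, label) for p in right.premisses)
    return Deriv(right.rule, right.formula, new_premisses, right.label)


# Example: a derivation of A&B is composed into a derivation of (A&B)vC
# from the assumption A&B labelled "w".
left_example = Deriv("&I", "A&B", (Deriv("assume", "A", label="u"),
                                   Deriv("assume", "B", label="v")))
right_example = Deriv("vI", "(A&B)vC", (Deriv("assume", "A&B", label="w"),))
composed = compose(left_example, right_example, "w")
```

The recursion follows the height of the right derivation, which is the induction measure used in the proof of Lemma 1.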

In the case of a multiple discharge, a detour conversion will lead to several compositions, with a multiplication of the contexts as in the example:

The conversion is into:

Such multiplication does not affect the normalization process. Note well that normalization depends on the admissibility of composition, which has to be proved before normalization.

## 3 Normalization by Hilfssatz

In normalization, derived major premisses of E-rules are converted step by step into assumptions. There are two situations, depending on whether the major premiss was derived by an E-rule or an I-rule:

### Definition 2

(Normalizability) A derivation in NLI is normalizable if there is a sequence of conversions that transform it into normal form.

The idea of our proof of the normalization theorem is to show by induction on the last rule applied in a derivation that logical rules maintain normalizability.

The cut elimination theorem is often called Gentzen’s Hauptsatz, main theorem. He used the word Hilfssatz, auxiliary theorem or lemma, for an analogous result by which composition of derivable sequents maintains the reducibility of sequents, a property defined in his original proof of the consistency of arithmetic (Gentzen 1935 [2, p. 106]). Henceforth any result in proof theory in which it is shown that a property of sequents or derivations is maintained under composition shall be called a Hilfssatz. Normalizability will be the first such property to be proved.

### Theorem 1

(Normalizability for intuitionistic natural deduction) Derivations in $$\mathbf {NLI}$$ convert to normal form.

### Proof

Consider the last rule applied. The base case is an assumption that is a normal derivation. In the inductive case, if an I-rule is applied to premisses the derivations of which are normalizable, the result is a normalizable derivation. The same holds if a normal instance of an E-rule is applied. The remaining case is that a non-normal instance of an E-rule is applied. The major premiss of the rule is then derived either by another E-rule or an I-rule, so we have two main cases with subcases according to the specific rule in each. Derivations are so transformed that normalizability can be concluded either because the last rule instance resolves into possible non-normalities with shorter conversion formulas, or because the height of derivation of its premisses is diminished.

1. E-rules: Let the rule be &E followed by another instance of &E, as in:

By the inductive hypothesis, the derivations of the premisses of the last rule are normalizable. The second instance of &E is permuted above the first:

The height of derivation of the major premiss of the last rule instance in the upper derivation has diminished by 1, so the subderivation down to that rule instance is normalizable. The height of the major premiss of the other rule instance has remained intact and therefore normalizability follows.

All other cases of permutative convertibility go through in the same way.

2. I-rules: The second situation of convertibility is that the major premiss has been derived by an I-rule, and there are five cases:

2.1. Detour convertibility on &: Let us assume for the time being that $$n=m=1$$. The detour conversion is given by:

The result is not a derivation in NLI. We proved in Lemma 1 that Comp is eliminable. The next step is to show that Comp maintains normalizability. This will be done in the Hilfssatz to be proved separately. By the Hilfssatz, the conclusion of the upper Comp is normalizable, and again by the Hilfssatz, also the conclusion of the lower Comp. If $$n>1$$ or $$m> 1$$, Comp is applied repeatedly, the admissibility of an uppermost Comp giving the admissibility of the following ones. If $$n=0$$, the instance of Comp with the left premiss $$\varGamma {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}A$$ falls out of the derivation, and similarly with $$m=0$$. If $$n=m=0$$, the right premiss of rule &E before conversion is $$\varTheta {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}C$$, and it is taken in place of the original conclusion $$\varGamma ,\varDelta ,\varTheta {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}C$$. This situation is called a ‘simplification convertibility’ in Prawitz (1965). In all cases, the result of conversion is uniquely defined.

2.2. Detour convertibility on $$\vee$$. There are two cases, as in:  As in 2.1, assume for the time being that $$n=m=1$$. The detour conversion is given by: The multiplicities are treated as in 2.1, except for the case of $$m=0$$ or $$n=0$$. Then the given derivation has a simplification convertibility, say when $$m=n=0$$: There is a conversion, but it is not uniquely defined: Either one of the original minor premisses of $$\vee E$$ can be taken. Similarly, if say $$n>0$$ and $$m=0$$, either a composition with composition formula A can be made, or a simplification conversion.

2.3. Detour convertibility on $$\mathord {\supset }I$$: In the conversion, multiple discharge of assumptions is again resolved into iterated compositions, so we may assume $$n=m=1$$ and have the conversion: If $$m=0$$, there is a simplification convertibility with the uniquely defined result $$\varTheta {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}C$$.

2.4. Detour convertibility on $$\forall$$: As before, assume for the time being that $$n=1$$. The eigenvariable y in the derivation of $$\varGamma {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}A(y)$$ is replaced by the term t and the detour conversion given by: The multiplicities are treated as before.

2.5. Detour convertibility on $$\exists$$: As before, assume for the time being that $$n=1$$. The eigenvariable y in the derivation of $$A(y),\varDelta {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}C$$ is replaced by the term t, and the detour conversion is: Multiplicities are treated as before.   QED.

It remains to give a proof of the Hilfssatz:

### Hilfssatz 1

(Closure of normalizability under composition) If the premisses of rule Comp are normalizable, also the conclusion is.

### Proof

The proof is by induction on the length of the composition formula D with a subinduction on the sum of the heights of derivation of the two premisses.
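The induction measure can be spelled out as a lexicographically ordered pair. In the following sketch, $$|D|$$ for the length of the composition formula and $$h$$ for the height of derivation are notations assumed here, not taken from the text; $$d'$$ and $$d''$$ stand for the derivations of the two premisses of Comp:

$$\mu = \bigl(|D|,\; h(d')+h(d'')\bigr), \qquad (a,b) < (a',b') \iff a<a' \ \text{or}\ (a=a' \ \text{and}\ b<b').$$

A permutation of Comp keeps the first component fixed and lowers the second, whereas a detour conversion leads to instances of Comp with a smaller first component, whatever happens to the second.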

1. $$D\equiv P$$. An atomic formula P is never principal in the right premiss, so that Comp can be permuted up with a lesser sum of heights of derivation as a result. There are two cases, a one-premiss rule and a two-premiss rule. For the latter, we use again the notation (P) to indicate a possible occurrence of P in a premiss:

Rule Comp is permuted to the premiss that has an occurrence of P, say the first one, with the result:

In the end, the second premiss of Comp is an initial sequent, as in:

$$\frac{\varGamma {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}P \qquad P{{\mathrm{\mathbin {\varvec{\rightarrow }}}}}P}{\varGamma {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}P}\,{{\mathrm{\textit{Comp}}}}$$

The conclusion of Comp is identical to its first premiss, so that Comp can be deleted.

2. $$D\equiv \bot$$. Because $$\bot$$ is never principal in the left premiss, Comp is permuted up as in the proof of admissibility of composition.

3. $$D\equiv A \mathbin {\&} B$$. If A & B is not principal in the right premiss, Comp can be permuted as in 1.

If A & B is principal, there has to be a normal rule instance in the right premiss, as in:

Comp is permuted up to the first premiss:

Comp is now deleted and a generally non-normal instance of rule &E created. If the major premiss is concluded by an E-rule, a permutative conversion is done and no instance of Comp created. If the last rule is &I, a detour convertibility with the conversion formula A & B is created. A detour conversion will lead to new instances of Comp, but on strictly shorter formulas.

The other cases of composition formulas are treated in a similar way.   QED.

Lemma 1, closure of derivations with respect to composition, merely shows that a derivation in natural deduction can be got from two composable derivations. The Hilfssatz adds the property of preservation of normalizability. It is even important to give the details for the composition of derivations as in the proof of Lemma 1, for the algorithm of normalization depends crucially on the steps needed for the admissibility of composition. Even so, one searches in vain for more than a mere indication of this proof in the logical literature.

## 4 Strong Normalization by Bar Induction

Derivations are denoted by $$d_0, d_1, d_2,\dots$$, and let N(d) express that d is a normal derivation, i.e., that all major premisses of E-rules are initial sequents. This property can be decided by an inspection of the derivation. The choice sequences in normalization are defined as follows:

### Definition 3

(Conversion choice sequence for a derivation) Given a derivation d, a conversion choice sequence for d is a succession of conversions on d with the restriction that whenever d has a permutative convertibility, it has to be chosen.

The restriction is in fact not necessary, but it will make the proof go through smoothly. It is not met if disjunction and existence are left out of the language and the standard elimination rules used, so there is sense in calling the result of this Section a strong normalization theorem.

We shall indicate by $${{\mathrm{\textit{PF}}}}(d)$$ that a derivation d is free of permutative conversions.

The notation $$\overline{\alpha }_n(d)\equiv d_n$$ stands for the derivation that is obtained from a given derivation d after n steps of conversion $$\overline{\alpha }_n$$. The notation $$\alpha _1(\overline{\alpha }_n(d))\equiv \alpha _1(d_n)$$ stands for the result of a one-step continuation of the sequence of conversions $$\overline{\alpha }_n$$.

### Definition 4

(Normalizing and strongly normalizing derivations)

i. A derivation d is normalizing whenever $$\exists \alpha \exists x N(\overline{\alpha }_x(d))$$.

ii. A derivation d is strongly normalizing whenever $$\forall \alpha \exists x N(\overline{\alpha }_x(d))$$.

We write $${{\mathrm{\textit{WN}}}}(d)$$ for the former and $${{\mathrm{\textit{SN}}}}(d)$$ for the latter.
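The difference between the two quantifier patterns of Definition 4 can be illustrated on an abstract conversion relation. The sketch below is not about derivations as such: it assumes a hypothetical function `steps` that returns the finitely many one-step conversions of an object, takes the normal objects to be exactly those with no conversion, and works only when finitely many objects are reachable.

```python
from typing import Callable, FrozenSet, Hashable, List

Steps = Callable[[Hashable], List[Hashable]]


def weakly_normalizing(d: Hashable, steps: Steps,
                       path: FrozenSet[Hashable] = frozenset()) -> bool:
    """Some conversion sequence from d reaches a normal object (WN)."""
    if d in path:
        return False               # going around a loop cannot help
    successors = steps(d)
    if not successors:
        return True                # no conversion applies: d is normal
    path = path | {d}
    return any(weakly_normalizing(s, steps, path) for s in successors)


def strongly_normalizing(d: Hashable, steps: Steps,
                         path: FrozenSet[Hashable] = frozenset()) -> bool:
    """Every conversion sequence from d reaches a normal object (SN)."""
    if d in path:
        return False               # a loop gives an infinite conversion sequence
    path = path | {d}
    return all(strongly_normalizing(s, steps, path) for s in steps(d))


# Toy conversion relation: "a" converts to "b" or "c", "b" is normal,
# and "c" converts only to itself.
TOY = {"a": ["b", "c"], "b": [], "c": ["c"]}


def toy_steps(d: Hashable) -> List[Hashable]:
    return TOY[d]
```

In the toy relation, `"a"` is weakly but not strongly normalizing: the choice of `"b"` ends in a normal object, the choice of `"c"` never does.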

We shall use the standard formulation of bar induction in the proof of strong normalization, with the two predicates $${{\mathrm{\textit{PF}}}}(d)$$ and $${{\mathrm{\textit{SN}}}}(d)$$. It has to be established that: (1) The base case predicate $${{\mathrm{\textit{PF}}}}(d)$$ is decidable. (2) Every conversion choice sequence of a given derivation d has an initial segment such that a permutation-free derivation is obtained. (3) Permutation-free derivations are strongly normalizing. (4) If every one-step continuation of conversions of a derivation d is strongly normalizing, also d is strongly normalizing.
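Written out in the notation of Definition 4, the four conditions and the conclusion they yield for a given derivation $$d_0$$ read roughly as follows. This is a transcription of the preceding list offered as a reading aid, with d ranging over the derivations obtainable from $$d_0$$ by conversions; it is not a quotation of a particular formal system of bar induction:

$$\begin{aligned} &(1)\quad \forall d\,\bigl({{\mathrm{\textit{PF}}}}(d) \vee \lnot {{\mathrm{\textit{PF}}}}(d)\bigr)\\ &(2)\quad \forall \alpha \exists x\, {{\mathrm{\textit{PF}}}}(\overline{\alpha }_x(d_0))\\ &(3)\quad \forall d\,\bigl({{\mathrm{\textit{PF}}}}(d) \supset {{\mathrm{\textit{SN}}}}(d)\bigr)\\ &(4)\quad \forall d\,\bigl(\forall \alpha _1\, {{\mathrm{\textit{SN}}}}(\alpha _1(d)) \supset {{\mathrm{\textit{SN}}}}(d)\bigr)\\ &\text{Conclusion:}\quad {{\mathrm{\textit{SN}}}}(d_0) \end{aligned}$$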

### Theorem 2

(Strong normalization for intuitionistic natural deduction) Derivations in $$\mathbf {NLI}$$ are strongly normalizing.

### Proof

We show in turn that the four conditions of bar induction are satisfied by the predicates $${{\mathrm{\textit{PF}}}}(d)$$ and $${{\mathrm{\textit{SN}}}}(d)$$. Let $$d_0$$ be the given derivation that we assume to be non-normal.

1. Decidability: $${{\mathrm{\textit{PF}}}}(d)$$ is decidable as noted above.

2. Termination of permutative conversions: Let a derivation d have permutative convertibilities. As seen in the proof of normalization, each such conversion diminishes the height of derivation of the major premiss in question by 1 and leaves the other heights unaltered. Therefore permutative conversions terminate in a bounded number n of steps in a derivation $$d_n$$ such that $${{\mathrm{\textit{PF}}}}(d_n)$$.

3. If $${{\mathrm{\textit{PF}}}}(d)$$, then $${{\mathrm{\textit{SN}}}}(d)$$: The proof is by induction on the last rule in d and we can assume d not to be normal and the derivations of the premisses to be strongly normalizing. By $${{\mathrm{\textit{PF}}}}(d)$$, all non-normalities are detour convertibilities. Any conversion chosen resolves into compositions, and a Hilfssatz needs to be proved by which composition of derivations maintains strong normalizability. This is done below.

4. If $$\forall \alpha _1\, {{\mathrm{\textit{SN}}}}(\alpha _1(d_n))$$, then $${{\mathrm{\textit{SN}}}}(d_n)$$: Each one-step continuation of the conversion of $$d_n$$ is by assumption strongly normalizing, therefore the derivation $$d_n$$ is by definition strongly normalizing.

By 1–4, $${{\mathrm{\textit{SN}}}}(d_0)$$.    QED.

It remains to add a proof of the Hilfssatz used in condition 3:

### Hilfssatz 2

(Closure of strong normalizability under composition) Given strongly normalizing derivations of $$\varGamma {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}D$$ and $$D,\varDelta {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}C$$, their composition into a derivation of $$\varGamma ,\varDelta {{\mathrm{\mathbin {\varvec{\rightarrow }}}}}C$$ is strongly normalizing.

### Proof

As before, the proof is by induction on the length of the composition formula D, with a subinduction on the sum of heights of derivation of the premisses of rule Comp, and goes through virtually identically to the proof of Hilfssatz 1.    QED.

## 5 Concluding Remarks and Further Applications

Looking at the single detour conversion schemes in the proof of Theorem 1, we notice that simplification convertibility with disjunction in case 2.2 leaves two possible results of conversion. For the rest of detour conversions, the local transformations produce unique converted derivations, and that property is sufficient for the overall result: Bar induction is a principle by which such local control of a suitably chosen property is turned into global structure, one could put it.

There is at each stage of strong normalization a finite number of non-normalities from which to choose the conversion to be made. Therefore strong normalization is a consequence of the variety of bar induction known as the fan theorem. The consistency of arithmetic was originally proved by bar induction by Gentzen and soon replaced by a proof through transfinite induction (see von Plato 2015, and Siders and von Plato (2015) for an explicit formulation of Gentzen’s bar induction). As with Gentzen’s proof, also the present proof could be carried through by the use of transfinite ordinals. What the least ordinal needed is, is at present not known, but because the fan theorem suffices for the result, Gentzen’s $$\varepsilon _0$$ gives a strict upper bound.

The proofs of normalization and strong normalization through Hilfssätze should work without problems for classical natural deduction with the rule of indirect proof and the same definition of normality as above, as in von Plato and Siders (2012).

The proofs can obviously be worked through also for standard natural deduction, along the lines of my paper (von Plato 2011).

Two more applications of explicit composition can be noted here:

1. The interpretation of arbitrary cuts in natural deduction: A comparison of natural deduction in sequent calculus style with sequent calculus proper shows that a non-normal instance of an E-rule corresponds exactly to the case of a cut in which the right premiss of cut has been derived by a corresponding left rule. In the translation from sequent derivations with cuts to natural deduction, such cuts turn into non-normalities. The rest of the cuts are translated as explicit delayed compositions. What corresponds to cut elimination is seen from the admissibility of composition in natural deduction: An uppermost instance of Comp is permuted up until it either reaches an assumption and vanishes or hits a normal instance of an E-rule and gets turned into a non-normality. After the delayed compositions have been eliminated, there remain the proper non-normalities and these can be eliminated in any order whatsoever. When in the normal derivation the major premisses are left unwritten, a sequent derivation is obtained. The overall procedure gives strong cut elimination in precisely the same sense in which there is strong normalization in natural deduction. Details are found in Sect. 13.4 of von Plato (2013).

2. Normalization and strong normalization of $$\lambda$$-terms: Any proof of normalization and strong normalization can be turned into a corresponding proof for typed $$\lambda$$-terms. The term structure is particularly transparent with general elimination rules, for the selector terms have now, with implication elimination as an example, the following structure (von Plato 2001 [5, p. 566]):

A selector term is normal if its first argument is a variable. In particular, with the above “generalized application”, as it is called in von Plato (2001), the nested “tower” of applications met with the standard application function does not occur for normal terms. Permutative conversions reduce a suitably defined notion of depth of selector terms, and detour conversions reduce to substitutions. A Hilfssatz is used to prove that strong normalizability of $$\lambda$$-terms is maintained under such substitution.
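The normality condition on selector terms can be sketched for the implicational fragment as follows. The constructor names `Var`, `Lam` and `GApp` and the argument order are hypothetical choices made for illustration and do not reproduce the notation of von Plato (2001); `GApp` is assumed to take a term for the major premiss, a term for the minor premiss, and a bound variable with a body for the remaining minor premiss.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    var: str              # bound variable
    body: "Term"


@dataclass(frozen=True)
class GApp:
    """Generalized application (hypothetical encoding)."""
    fun: "Term"           # first argument: term for the major premiss
    arg: "Term"           # term for the minor premiss
    var: str              # variable bound in `body`
    body: "Term"          # term for the conclusion


Term = Union[Var, Lam, GApp]


def is_normal(t: Term) -> bool:
    """True if the first argument of every generalized application is a variable."""
    if isinstance(t, Var):
        return True
    if isinstance(t, Lam):
        return is_normal(t.body)
    return isinstance(t.fun, Var) and is_normal(t.arg) and is_normal(t.body)


# A normal term, and one whose first argument is an abstraction,
# which corresponds to a detour convertibility.
normal_term = GApp(Var("f"), Var("a"), "x", Var("x"))
detour_term = GApp(Lam("y", Var("y")), Var("a"), "x", Var("x"))
```

In a normal term of this shape the first argument is always a variable, so applications can nest only in the other arguments.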