Distilling the Requirements of Gödel’s Incompleteness Theorems with a Proof Assistant

We present an abstract development of Gödel’s incompleteness theorems, performed with the help of the Isabelle/HOL proof assistant. We analyze sufficient conditions for the applicability of our theorems to a partially specified logic. In addition to the usual benefits of generality, our abstract perspective enables a comparison between alternative approaches from the literature. These include Rosser’s variation of the first theorem, Jeroslow’s variation of the second theorem, and the Świerczkowski–Paulson semantics-based approach. As part of the validation of our framework, we upgrade Paulson’s Isabelle proof to produce a mechanization of the second theorem that does not assume soundness in the standard model, and in fact does not rely on any notion of model or semantic interpretation.


Introduction
Gödel's incompleteness theorems [14,17] are landmark results in mathematical logic. Both theorems refer to consistent logical theories that satisfy some assumptions, notably that of "containing enough arithmetic." The first incompleteness theorem (IT 1) says that there are sentences that the theory cannot decide, i.e., neither prove nor disprove; the second theorem (IT 2) says that the theory cannot prove an internal formulation of its own consistency. It is generally accepted that IT 1 and IT 2 have a wide scope (and wider for IT 1 than for IT 2), covering many logics and logical theories. However, when it comes to rigorous presentation, these results are typically proved for particular, albeit paradigmatic cases, such as theories of arithmetic or hereditarily finite (HF) sets, within classical first-order logic (FOL); and even in these cases the constructions and proofs tend to be sketchy and incomplete. Hence, the theorems' scope remains largely unexplored on a rigorous, formal basis.
The emergence of powerful proof assistants (also known as interactive theorem provers) has been slowly changing the rules of the game and, we argue, the expectation. Using proof assistants, we can reliably keep track of all the constructions and their properties. Proof automation (sometimes achieved through the cooperation between proof assistants and automatic theorem provers [25,41]) makes complete, entirely rigorous proofs feasible. And indeed, researchers have successfully met the challenge of mechanizing IT 1 [21,36,40,53] and recently IT 2 [40]. Besides reassurance, these verification tours de force have brought superior technical insight into the theorems. But they have taken place within the same solitary confinement of scope as the informal proofs.
This article takes steps towards a fully formal exploration of the incompleteness theorems and their wide scope, by a detailed analysis of their assumptions. We use Isabelle/HOL [34,35] to establish general conditions under which the theorems apply to a partially specified logic. Our formalization is publicly available in the Archive of Formal Proofs [44-48], but is not necessary for following this article, which is self-contained and does not employ Isabelle jargon (except for the dedicated Appendix A).
After discussing related work (Sect. 2) and guiding principles (Sect. 3), we describe our formal development. The abstract part of this development starts by setting the stage of a partially specified logical system, some partially specified arithmetic components, and representability (Sect. 4), proving the diagonalization lemmas (Sect. 5), and proving several flavors of the end results in this setting: IT 1 (Sect. 6) and IT 2 (Sect. 7). Some of the abstract results (summarized in Sect. 8) are instantiated to concrete first-order logic theories (Sect. 9). We also discuss proof-engineering aspects (Appendix A) and include an index of our abstract assumptions (Appendix B).
We start with a notion of logic whose terms and formulas are kept abstract (Sect. 4.1). In particular, substitution and free variables are not defined, but axiomatized by some general properties. Provability is also axiomatized (Sect. 4.2). We distinguish between a basic provability relation, capturing minimal theories that are sufficiently expressive for representing concepts via Gödel encodings (e.g., Peano arithmetic or weaker theories), and a (plain) provability relation, capturing consistent or ω-consistent extensions of the minimal theories. Thus, basic provability is subsumed by provability. Yet, provability will be represented internally and reasoned about within basic provability.
On top of this logic substratum, we consider an arithmetic substratum, consisting of a set of closed terms called numerals and an order-like relation (Sect. 4.3). Our framework also incorporates encodings of formulas and proofs into numerals, the representability of various functions and relations as formulas (Sect. 4.4), the Hilbert-Bernays-Löb derivability conditions (Sect. 4.5), and standard models (Sect. 4.6).
Overall, our assumptions capture the notion of "containing enough arithmetic" in a general and flexible way. It is general because only a few assumptions are made about the exact nature of formulas and numerals. It is flexible because different versions of the incompleteness theorems consider their own "amount of arithmetic" that makes it "enough," as proper subsets of these assumptions. Indeed, our formalization of the results (the diagonalization lemma in Sect. 5, IT 1 in Sect. 6, and IT 2 in Sect. 7) proceeds in an austere-buffet style: Every result picks just enough infrastructure needed for it to hold, ranging from diagonalization, which requires very little, to Rosser's version of IT 1, which is quite demanding. This approach caters for a sharp comparison between different formulations of the theorems, highlighting their tradeoffs: Gödel's original formulation of IT 1 versus Rosser's improvement (Sect. 6.3), proof-theoretic versus semantic versions of IT 1 (Sect. 6.4), and Gödel's original formulation of IT 2 versus Jeroslow's improvement (Sect. 7.3).
Abstractness is our development's main strength, but also a potential weakness: Are our hypotheses reasonable? Are they consistent? These questions particularly concern our axiomatization of free variables and substitution, a notoriously error-prone area. As a (partial) remedy, we instantiate part of our framework to Paulson's semantics-based IT 1 and IT 2 for hereditarily finite (HF) set theory [40], also upgrading Paulson's IT 2 to a more general and standard formulation: for consistent (not necessarily sound) theories (Sect. 9).
This article extends our CADE 2019 conference paper [43] with a significantly more fine-grained and self-contained presentation of the results, which includes lemmas and detailed proof sketches. Given the existence of formal proofs in Isabelle, one may question the usefulness of paper proof sketches; however, we believe these are important for reaching out to a wider audience, perhaps interested in following the reasoning behind our fine-grained results discovered with the help of Isabelle, but not willing to read and understand Isabelle scripts. Compared to the conference paper, the results are also established in a more general setting, where we distinguish between basic provability and provability (as explained above). This generalization had been left as future work in the conference paper.

Related Work
There is a vast amount of literature on the incompleteness theorems and their extensions and ramifications. We only discuss works that are strongly related to the ideas and techniques we tackle in this article.
Gödel initially gave a proof of IT 1 and the rough proof idea of IT 2 [17]. Hilbert and Bernays gave a first detailed proof of IT 2 [22]. Subsequently, a large amount of work was dedicated to the (re)formulation, proof, and analysis of these results [5,50,56,57]. The now canonical line of reasoning goes through the derivability conditions devised by Hilbert and Bernays [22] and simplified by Löb [32]. These conditions have inspired a new branch of modal logic called provability logic [5,58]. Jeroslow has proved that, contrary to prior belief, one of these conditions is redundant when proving IT 2 [24].
Smullyan [59], Kreisel [28] and Jeroslow [24] were among the first to study abstract conditions on logics under which the incompleteness theorems apply. Feferman [13] gives an essential incompleteness account of IT 2 applicable to extensions of Peano arithmetic in classical FOL. Buldt [7] surveys the state of the art on IT 1 up to 2014 with a focus on the theorem's scope, also sketching the applicability to non-standard logics. Our abstract approach, based on generic syntax, provability and truth predicates, resembles the style of institution-independent model theory [12,18] and our previous work on abstract completeness [4] and completeness of ordered resolution [51]. On distinguishing between two notions of provability, one stronger than the other, we take inspiration from Smorynski's account [57]. Dimensions of generality that our formalized work does not (yet) explore include quantifier-free logics [24] and arithmetical hierarchy refinements [27]. Our syntax axiomatization is inspired by algebraic theories of the λ-calculi syntax [15,16,42].
In the realm of mechanical proofs, the earliest substantial development was due to Sieg [55], who used a prover based on TEM (Theory of Elementary Meta-Mathematics) to formalize parts of the proofs of both IT 1 and IT 2. Full mechanical proofs of IT 1 were subsequently achieved by Shankar [52,53] in the Boyer-Moore prover, O'Connor in Coq [36], and Harrison in HOL Light [21]. Harrison also proved abstract versions of IT 1 and IT 2 in a simple LCF-style prover implemented in OCaml [19]. IT 2 has only been fully proved recently, by Paulson in Isabelle/HOL [39,40] (who also proved IT 1).
All these mechanizations target theories over a fixed language in classical FOL: variants of the language of arithmetic (Harrison and O'Connor) and variants of the language of set theory (Sieg, Shankar, and Paulson, with Shankar also allowing the convenience of new symbols to be defined from the existing ones). The targeted theories are usually (finite) extensions of given standard FOL theories, so the results state the (finitary) essential incompleteness of these theories. Sieg considers the theory Z *, Shankar finite extensions of the theory Z2, and Paulson finite extensions of HF set theory. Z *, Z2 and HF set theory are all variations of Zermelo-Fraenkel set theory without the axiom of infinity, and have the same expressive power as Peano arithmetic [10,61]. O'Connor targets self-representable extensions of the theory NN [23, §7.1], a modification of Robinson arithmetic obtained by replacing the dichotomy axiom (stating that any element is either 0 or a successor) with three axioms regulating the behavior of an additional binary relation symbol for strict order. Harrison targets Robinson arithmetic, and additionally proves a variant of IT 1 for an abstract class of theories in the FOL language of Robinson arithmetic. On their way to IT 1, Shankar and O'Connor prove representability of all partial, respectively primitive recursive functions, which are important standalone results. We will revisit some of these mechanized concrete results in Sect. 9, with the hindsight of our abstract framework.
Outside the realm of holistic interactive proof development, there have been efforts to fully automate parts of the proofs of Gödel's and related theorems [8,49,54].

Formal Design Principles
Our long-term goal is a framework that makes it easy to instantiate the incompleteness theorems and related results to different logics. This is a daunting task, especially for IT 2, where a lot of seemingly logic-specific technicalities are required to even formulate the theorem. The challenge is to push as much as possible of the technical constructions and lemmas to a largely logic-independent layer.
To this end, we strive to make minimal assumptions with regard to structure and properties when inferring the results; we will call this the Economy principle. For example, we do not define, but axiomatize syntax in terms of a minimalistic infrastructure. We assume a generic single-point substitution, then define simultaneous substitution and infer its properties. This is laborious, but worthwhile: Any logic that provides a single-point substitution satisfying our assumptions gets the simultaneous substitution for free.
As another instance of Economy, when faced with two different ways of formulating a theorem's conclusion we prefer the one that is stronger under fewer assumptions. (And dually, we prefer weakness for a theorem's assumptions.) For example, we discuss two variants of consistency: (1) "does not prove false" or (2) "there exists no formula such that itself and its negation are provable" (Sect. 7.3). While the statements are equivalent at the meta-level, their representations as object-logic formulas are not necessarily equivalent; in fact, (1) implies (2) under mild assumptions but not vice versa. So in our abstract theorems we prefer (1). Indeed, even if (2) implies (1) in all reasonable instances, why postpone until instantiation time any fact that we can show abstractly?
Applying the Economy principle not only stocks up generality for instantiations, but also accurately outlines tradeoffs: How much does it cost (in terms of other added assumptions) to improve the conclusion, or to weaken an assumption of a theorem? For example, an Economy-based proof of Rosser's variant of IT 1 reveals how much arithmetic we must factor in for weakening the ω-consistency assumption into consistency.

Abstract Assumptions
Roughly, the incompleteness theorems are considered to hold for logical theories that (1) contain enough arithmetic and (2) can themselves be arithmetized. Our goal is to give a general formulation of these favorable conditions. To this end, we identify some logic and arithmetic substrata consisting of structure and axioms that express the containment of (various degrees of) arithmetic more abstractly and flexibly than relative interpretations [63]. We also identify abstract notions of encodings and representability that have just what it takes for a working arithmetization.

The Logical Substratum: Syntax
We start with some unspecified sets of variables (Var, ranged over by x, y, z), numerals (Num, ranged over by m, n), terms (Term, ranged over by s, t) and formulas (Fmla, ranged over by ϕ, ψ, χ). We assume that variables and numerals are particular terms, i.e., Var ⊆ Term and Num ⊆ Term, and that Var is infinite. Free-variables and substitution operators, FVars and _[_/_], are assumed for both terms and formulas. We think of FVars(t) as the (finite) set of free variables of the term t, and similarly for formulas. We think of s[t/x] as the term obtained from s by the (capture-avoiding) substitution of t for the free occurrences of variable x; and similarly for ϕ[t/x], where ϕ is a formula.
In FOL, terms introduce no bindings, so any occurring variable is free. FOL terms fall under our framework, and so do terms with bindings as in λ-calculi and higher-order logic (HOL). To achieve this degree of inclusiveness while also being able to prove interesting results, we work under some well-behavedness assumptions about FVars and _[_/_]:
(1) Free-variables and substitution act on variable terms as expected: FVars(x) = {x}; x[s/x] = s; and y[s/x] = y if x ≠ y.
(2) Substitution on terms is vacuous outside the free variables: x ∉ FVars(t) implies t[s/x] = t; and similarly for substitution on formulas.
In addition, for the operators on formulas we assume the following:
(3) Free-variables distribute over substitution: […]
(5) Substitution is compositional (under some freshness assumptions): […]
Of the above assumptions, (1) only applies to, and only makes sense for, substitution on terms. By contrast, we assume (2) for both terms and formulas. The last group, (3)-(5), would make sense for terms too, but is only assumed for formulas; this is in line with our Economy principle, since our results will not need these assumptions for terms.
In these assumptions, just like in the rest of this paper, "=" denotes the usual equality of two mathematical entities (formally represented by the Isabelle/HOL equality), and not some more abstract equality. This means that our assumptions do not hold for "raw" formulas, but for formulas quotiented to alpha-equivalence, i.e., equivalence classes modulo alpha (of the kind provided, e.g., by using de Bruijn indices or the Nominal Isabelle package [64]); likewise, if the terms have bindings, they would need to be quotiented to alpha-equivalence to satisfy our assumptions.
The incompleteness theorems rely heavily on simultaneous substitution, whose properties are tricky to formalize; for example, Paulson's formalization article dedicates them ample space [40, 6.2]. To address this problem once and for all generically, we define simultaneous substitution, written ϕ[t 1/x 1, ..., t n/x n], from the single-point substitution, ϕ[t/x]. Accordingly, we derive the properties of simultaneous substitution from the single-point substitution axioms. For example, we prove that […] for some fresh y 1, ..., y n, the choice of which we must show to be immaterial. This definition's complexity is reflected in the proofs of its properties. But again, this one-time effort benefits any "customer" logic: In exchange for a well-behaved single-point substitution, it gets back a well-behaved simultaneous substitution.
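To make the role of the fresh variables concrete, here is a minimal Python sketch, not taken from the Isabelle development, of one way to obtain a simultaneous substitution from a single-point one. It assumes a toy language of binder-free first-order terms (the paper's construction is for abstract formulas), and all names (Var, Fn, fvars, subst, fresh, subst_all) are hypothetical. The idea sketched is to first rename every x i to a fresh y i and only then replace every y i by t i.

```python
# Toy first-order terms without binders: variables and function applications.
def Var(name):
    return ("var", name)

def Fn(name, *args):
    return ("fn", name, args)

def fvars(t):
    """Variables occurring in t (all of them are free, since there are no binders)."""
    if t[0] == "var":
        return {t[1]}
    return set().union(*(fvars(a) for a in t[2]))

def subst(t, s, x):
    """Single-point substitution t[s/x]."""
    if t[0] == "var":
        return s if t[1] == x else t
    return ("fn", t[1], tuple(subst(a, s, x) for a in t[2]))

def fresh(avoid, n):
    """Return n distinct variable names that do not occur in `avoid`."""
    out, i = [], 0
    while len(out) < n:
        name = f"_y{i}"
        if name not in avoid:
            out.append(name)
        i += 1
    return out

def subst_all(t, pairs):
    """Simultaneous substitution t[s1/x1, ..., sn/xn].

    `pairs` is a list [(s1, x1), ..., (sn, xn)] with pairwise distinct xi.
    Assumption of this sketch: it is built only from `subst` and fresh names.
    """
    avoid = fvars(t) | {x for _, x in pairs}
    for s, _ in pairs:
        avoid |= fvars(s)
    ys = fresh(avoid, len(pairs))
    for (_, x), y in zip(pairs, ys):   # phase 1: rename each xi to a fresh yi
        t = subst(t, Var(y), x)
    for (s, _), y in zip(pairs, ys):   # phase 2: replace each yi by si
        t = subst(t, s, y)
    return t

t = Fn("f", Var("x"), Var("y"))
swapped = subst_all(t, [(Var("y"), "x"), (Var("x"), "y")])  # f(y, x)
naive = subst(subst(t, Var("y"), "x"), Var("x"), "y")       # f(x, x)
```

Swapping x and y shows why chaining single-point substitutions directly is not enough: the naive chain collapses both arguments to x, whereas the two-phase version keeps them apart.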
We call a term with no free variables closed and a formula with no free variables a sentence. Sen denotes the set of sentences. We let v 1, v 2, ... be fixed mutually distinct variables. Fmla k denotes the set of formulas whose set of free variables is exactly {v 1, ..., v k}. In particular, Fmla 0 = Sen.
In addition to free variables and substitution, our theorems will require formulas to be equipped with some of the following: term equality (≡), Boolean connectives (⊥, ⊤, →, ¬, ∧, ∨), universal and existential quantifiers (∀, ∃). When we need negation, we define it taking ¬ ϕ to be ϕ → ⊥. On the other hand, even in the presence of negation, we do not assume that ∨ and ∃ are definable from ∧ and ∀ or vice versa. This is because, in line with the Economy principle, we will not assume classical logic except in results that need it. For the rest, we will only assume intuitionistic logic, where these operators are not inter-definable.
The above are not assumed to be constructors (syntax builders), but unspecified operators on terms and formulas, e.g., ≡ : Term → Term → Fmla, ⊥ ∈ Fmla, ∀ : Var × Fmla → Fmla. This caters for logics that do not have them as primitives. For example, HOL defines all connectives and quantifiers from λ-abstraction and either equality or implication.
Free variables and substitution are assumed to be well-behaved w.r.t. these operators, e.g., FVars(∀x. ϕ) = FVars(ϕ) − {x}. Finally, numerals are assumed to be closed terms. Thanks to our substitution axioms, this implies that substitution on numerals is vacuous.

Logical Substratum: Provability
We fix two unary relations on formulas, ⊢ ⊆ Fmla and ⊢b ⊆ Fmla, called provability and basic provability, respectively. We write ⊢ ϕ instead of ϕ ∈ ⊢, and say the formula ϕ is provable; similarly, we write ⊢b ϕ instead of ϕ ∈ ⊢b, and say the formula ϕ is basic-provable. Henceforth, we will assume that on sentences basic provability is included in provability: For all ϕ ∈ Sen, ⊢b ϕ implies ⊢ ϕ.
Typical instances of these relations will be as follows:
- for ⊢b, provability in some minimal theory, e.g., Robinson arithmetic or HF set theory
- for ⊢, provability in some recursive extension of such a minimal theory
As we will see, ⊢b will be assumed to be sufficiently expressive to reason about ⊢, and sometimes also sound w.r.t. the standard model. Whenever certain formula connectives or quantifiers are needed in our results, we will assume that ⊢b and ⊢ are closed under the usual (Hilbert-style) intuitionistic FOL axioms and rules with respect to these connectives and quantifiers. Stronger systems, such as those of classical logic, also satisfy these assumptions.
Consistency of ⊢, denoted Con ⊢, is defined as the impossibility to prove false, namely ⊬ ⊥. Another central concept is ω-consistency; we carefully choose a formulation that works intuitionistically, with conclusion reminiscent of Gödel's negative translation [11]: […] Assuming classical deduction in ⊢, this is equivalent to the standard formulation: For all ϕ ∈ Fmla 1, it is not the case that ⊢ ϕ(n) for all n ∈ Num and ⊢ ¬ (∀x. ϕ(x)).
Occasionally, we will consider not only provability but also explicit proofs. We fix a set Proof of (entities we call) proofs, ranged over by p, q, and a binary relation ⊩ between proofs p and sentences ϕ, written p ⊩ ϕ and read "p is a proof of ϕ." We assume ⊢ and ⊩ to be related as expected, in that provability is the same as the existence of a proof:
Rel ⊩: For all ϕ ∈ Sen, ⊢ ϕ iff there exists p ∈ Proof such that p ⊩ ϕ.

Convention 1
In all shown results we will implicitly assume: (1) the generic syntax (free variable and substitution) axioms, (2) at least → and ⊥ plus whatever connectives and quantifiers appear in the statement, (3) the inclusion of ⊢b into ⊢ and (4) the closedness of ⊢b and ⊢ under intuitionistic deduction rules for the assumed connectives and quantifiers. Other assumptions (e.g., order-like relation axioms, classical logic deduction, standard models, etc.) will be indicated explicitly. The appendix contains an index with the explicit assumptions.
In our proof sketches, arguing "by logic" will mean invoking closedness of ⊢b or ⊢ under intuitionistic deduction rules; "by classical logic" will explicitly indicate a step that assumes closedness under classical deduction rules.
We will label local facts in proofs for later reference by parenthesized Arabic or Roman numbers, such as (1), (2), (i), (ii). The first occurrence of a parenthesized number will label a fact by preceding it, as in "we obtain (ii) ϕ ∧ ψ", while later occurrences will mean we refer to it, as in "from (ii) we obtain ...".

Arithmetic Substratum
On one occasion, we will assume an order-like binary relation modeled by a formula ≺ ∈ Fmla 2. We write t 1 ≺ t 2 instead of ≺(t 1, t 2) and ∀x ≺ n. ϕ instead of ∀x. x ≺ n → ϕ. It turns out that at our level of abstraction it does not matter whether ≺ is a strict or a nonstrict order. Indeed, we only require the following two properties, where x ∈ M denotes ⋁ m∈M x ≡ m and expresses the disjunction of a finite set of formulas:
Ord 1: For all ϕ ∈ Fmla 1 and n ∈ Num, if ⊢b ϕ(m) for all m ∈ Num, then ⊢ ∀x ≺ n. ϕ(x).
Ord 2: For all n ∈ Num, there exists a finite set M ⊆ Num such that ⊢b ∀x. x ∈ M ∨ n ≺ x.
Ord 1 states that if a property ϕ is basic-provable for all numerals, then its universal quantification bounded by any given numeral n is provable. Having in mind the arithmetic interpretation of numerals, it would also make sense to assume a stronger version of Ord 1, replacing "if ⊢b ϕ(m) for all m ∈ Num" by the weaker hypothesis "if ⊢b ϕ(m) for all m ∈ Num such that m ≺ n". But this stronger version will not be needed. Also, note that we formulate Ord 1 in the weakest possible way w.r.t. the choice of provability relations: with a hypothesis about ⊢b and a conclusion about ⊢. Ord 2 states that, for any numeral n, any element x in the domain of discourse is provably either greater than n or equal to one of a finite set M of numerals.
If we instantiate our syntax to that of first-order arithmetic and take ⊢b to be intuitionistic Robinson arithmetic (and ⊢ any larger relation), then both Ord 1 and Ord 2 hold when taking ≺ as either < or ≤. In the presence of a numeral-restricted form of anti-symmetry of the relation (which would include < but exclude ≤), the second condition is stronger: […]
Proof Let ϕ ∈ Fmla 1 and n ∈ Num. Assume ⊢b ϕ(m), in particular, ⊢ ϕ(m), for all m ∈ Num and let M be as in Ord 2. We must prove ⊢ ∀x ≺ n. ϕ(x). Working inside the formal proof system ⊢, we fix x such that x ≺ n. Thanks to the antisymmetry assumption, we obtain ¬ n ≺ x, which implies by Ord 2 that x equals some m ∈ M; this means that ϕ(x) holds, as desired.
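As a semantic sanity check only, the following Python sketch looks for the finite set M of Ord 2 in the standard natural numbers, for both the strict and the nonstrict reading of ≺. It tests truth over a bounded range rather than provability in Robinson arithmetic, so it offers evidence for the shape of M, not a proof of Ord 2; the names witness_set and ord2_holds and the bound of 200 are assumptions of the sketch.

```python
def witness_set(n, strict=True):
    """Candidate finite set M for Ord 2 at numeral n.

    strict=True reads the order as <, so M = {0, ..., n};
    strict=False reads it as <=, so M = {0, ..., n - 1}.
    """
    return set(range(n + 1)) if strict else set(range(n))

def ord2_holds(n, strict=True, bound=200):
    """Check 'x in M or n is below x' for every natural number x < bound."""
    M = witness_set(n, strict)
    if strict:
        below = lambda x: n < x
    else:
        below = lambda x: n <= x
    return all(x in M or below(x) for x in range(bound))

# Every tested numeral has a suitable finite M under either reading of the order.
all_ok = all(ord2_holds(n, s) for n in range(20) for s in (True, False))
```

The two readings differ only in whether n itself must belong to M, which matches the remark that the choice between a strict and a nonstrict order is immaterial at this level of abstraction.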

Encodings and Representability
Central in the incompleteness theorems are functions that encode formulas and proofs as numerals, ⟨_⟩ : Fmla → Num and ⟨_⟩ : Proof → Num. For our abstract results, the encodings are not required to be injective or surjective. Various concepts will be assumed to be representable (via these encodings) inside our object logic, via the basic provability relation ⊢b. We will consistently employ ⊢b, and not ⊢, to represent concepts. On the other hand, ⊢ and its associated proof-of relation ⊩ will be among the concepts we will want to represent.
Let A 1, ..., A m be sets, and let, for each of them, ⟨_⟩ : A i → Num be an "encoding" function to numerals. Then, an m-ary relation R ⊆ A 1 × ... × A m is said to be represented by a formula R⃝ ∈ Fmla m if the following hold for all (a 1, ..., […]
When the formula by which a relation/function P is represented or term-represented is irrelevant, we call P representable or term-representable. The terms "representability" and "weak representability" are fairly standard [50]. We refer to Raatikainen [50, §2.2] and Smith [56, §5.6] for an account of different terminologies used in the literature for (variations of) these concepts. In contrast, "term-representability" is a notion that we have introduced ourselves (and so is "cleanness", defined below).
It is immediate that, assuming ⊢b consistent, if a relation R is represented by a formula then it is also weakly represented by that formula. Moreover, if we assume deductive injectivity of the encoding, i.e., ⊢b ⟨a 1⟩ ≡ ⟨a 2⟩ implies a 1 = a 2 for all a 1, a 2 ∈ A, then the following holds: If a function f is represented by a formula, then its graph Gr(f) is represented (as a relation) by the same formula; in particular, representability of f implies representability of Gr(f). The converse, i.e., representability of Gr(f) implying representability of f (this time by a modified formula), also holds under some assumptions, essentially saying that there is an order-like relation on A that is represented by a formula ≺ as in Sect. 4.3. We do not elaborate on these aspects since they are not used in our end results. Smith works them out in detail in his monograph [56, §16]; he does it for the particular case of Robinson arithmetic, but in such a way that the more general assumptions under which the results hold can be discerned from his proofs. (Smith uses the following terminology: A relation or a function being "captured" means it is represented, and a function being "weakly captured" means its graph is represented as a relation.)
We will also need an enhancement of relation representability: Given i < m, we call the representation of an m-ary relation R by R⃝ i-clean if ⊢b ¬ R⃝(n 1, ..., n m) for all numerals n 1, ..., n m such that n i (the i'th numeral among them) is outside the image of ⟨_⟩ (i.e., there is no a ∈ A i with n i = ⟨a⟩). Cleanness would be trivially satisfied if the encodings were surjective. However, surjectivity is not a reasonable assumption. For example, most of the numeric encodings used in the literature are injective but not surjective.
The key property of cleanness is that it makes a representation behave well with respect to universal quantification of negative statements. We illustrate this for the binary case: […] and this representation is 1-clean. Then the following are equivalent: […] for all a ∈ A. This, in turn, is equivalent to (2) by 1-cleanness, which lets us exclude numerals outside ⟨_⟩'s image.
We let S : Fmla 1 → Sen be the self-substitution function, which sends any ϕ ∈ Fmla 1 to ϕ(⟨ϕ⟩), i.e., to the sentence obtained from ϕ by substituting the encoding of ϕ for the unique variable of ϕ. An alternative to the above "hard" version of S is the following "soft" version, which sends any ϕ ∈ Fmla 1 to ∃v 1. v 1 ≡ ⟨ϕ⟩ ∧ ϕ, where v 1 is the single free variable of ϕ. The soft version yields provably equivalent formulas and has the advantage that it is easier to represent inside the logic, since it does not require formalizing the complexities of capture-avoiding substitution. All our results involving S have been proved for both versions.
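The two versions of S, and an encoding that is injective but not surjective, can be illustrated with a small Python sketch. It is a toy model under loud assumptions: formulas are plain strings whose only variable is the token v1, numerals are written as decimal integers, and substitution is textual replacement (adequate here only because numerals contain no variables or binders). The names encode, decode, hard_S and soft_S are hypothetical and do not come from the formalization.

```python
def encode(phi: str) -> int:
    """Toy encoding of a formula as a natural number.

    The leading byte 0x01 keeps the map injective even for strings that
    start with NUL bytes.
    """
    return int.from_bytes(b"\x01" + phi.encode("utf-8"), "big")

def decode(n: int):
    """Inverse of `encode` on its image; None if n (>= 0) encodes no formula."""
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    if not raw or raw[0] != 1:
        return None
    try:
        return raw[1:].decode("utf-8")
    except UnicodeDecodeError:
        return None

def hard_S(phi: str) -> str:
    """'Hard' self-substitution: replace the variable v1 by the numeral of phi's code."""
    return phi.replace("v1", str(encode(phi)))

def soft_S(phi: str) -> str:
    """'Soft' self-substitution: bind v1 existentially and equate it with the code."""
    return f"∃v1. v1 ≡ {encode(phi)} ∧ ({phi})"

phi = "P(v1)"
hard = hard_S(phi)   # P(<code of phi>), a sentence with no occurrence of v1
soft = soft_S(phi)   # leaves phi itself untouched, so no substitution is needed
```

In this toy setting decode(0) returns None, so 0 lies outside the image of the encoding; this is the kind of numeral that cleanness is meant to exclude.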
We will consider the properties Repr ¬, Repr S, and Repr ⊩, stating the representability of the functions ¬ and S, and of the relation ⊩. In addition, Clean ⊩ will state that the considered representation of ⊩ is 1-clean, i.e., it is clean on the proof component. For the representing formulas of the above relations and functions we will use their circled names, ¬⃝, ⊩⃝, etc.; for example, Repr ⊩ means that (1) p ⊩ ϕ implies ⊢b ⊩⃝(⟨p⟩, ⟨ϕ⟩) and (2) p ⊮ ϕ implies ⊢b ¬ ⊩⃝(⟨p⟩, ⟨ϕ⟩) for all p ∈ Proof and ϕ ∈ Sen.

Derivability Conditions
For several relations R, we will assume representability by formulas R⃝. However, the case of the provability relation ⊢ is special. It will have an associated formula ⊢⃝ ∈ Fmla 1, but we will assume for it conditions weaker than representability, and also additional conditions. The following are known as the Hilbert-Bernays-Löb derivability conditions [22,32]:
HBL 1: For all ϕ ∈ Sen, ⊢ ϕ implies ⊢b ⊢⃝⟨ϕ⟩.
HBL 2: For all ϕ, ψ ∈ Sen, ⊢b ⊢⃝⟨ϕ⟩ ∧ ⊢⃝⟨ϕ → ψ⟩ → ⊢⃝⟨ψ⟩.
HBL 3: For all ϕ ∈ Sen, ⊢b ⊢⃝⟨ϕ⟩ → ⊢⃝⟨⊢⃝⟨ϕ⟩⟩.
Above and elsewhere, we omit parentheses when instantiating one-variable formulas with encodings of formulas to lighten notation, e.g., writing ⊢⃝⟨ϕ⟩ instead of ⊢⃝(⟨ϕ⟩). HBL 1 states that, if a sentence is provable, then its encoding is basic-provable inside the representation. We would obtain a weaker version of HBL 1 if we replaced ⊢b with ⊢ in the conclusion, namely asking that ⊢ ϕ implies ⊢ ⊢⃝⟨ϕ⟩. HBL 3 is, roughly speaking, a formulation of this weaker version of HBL 1 "one level up," inside the proof system ⊢b. Finally, note that the provability relation is closed under modus ponens, in that ⊢ ϕ and ⊢ ϕ → ψ imply ⊢ ψ for all ϕ, ψ ∈ Sen. Thus, HBL 2 roughly states the same property inside the proof system. In short, the derivability conditions state that the representation of provability acts partly similarly to the provability relation. (The above internalizations are "rough" in that they use meta-level quantification instead of object-level quantification; we will come back to this in Sect. 7.2, in the context of IT 2 where these conditions are being used.)
We will also be interested in the converse of HBL 1:
HBL ⇐ 1: For all ϕ ∈ Sen, ⊢b ⊢⃝⟨ϕ⟩ implies ⊢ ϕ.
The weak representability of ⊢ (as defined in Sect. 4.4) is the conjunction of HBL 1 and HBL ⇐ 1. Moreover, ⊩'s representability implies HBL 1 for ⊢⃝(x) defined to be ∃y. ⊩⃝(y, x):
Lemma 4 Rel ⊩ and Repr ⊩ imply HBL 1.
Proof Assume ⊢ ϕ. Then there exists p ∈ Proof such that p ⊩ ϕ. By Repr ⊩, we have ⊢b ⊩⃝(⟨p⟩, ⟨ϕ⟩), hence ⊢b ∃y. ⊩⃝(y, ⟨ϕ⟩), as desired. (Note that we did not need the whole Repr ⊩; one implication in the representability condition of ⊩ would have sufficed.)
Convention 5 Whenever we assume explicit proofs and representability of proof-of, the formula ⊢⃝ will be defined from ⊩⃝ as shown above.

Standard Models
We fix a unary relation ⊨ ⊆ Sen, representing truth of a sentence in the standard model. We write ⊨ ϕ instead of ϕ ∈ ⊨, and read it as "ϕ is true." We consider the assumptions:
LCQ ⊨: Logical connectives and quantifiers handle truth as expected: (1) ⊭ ⊥; (2) for all ϕ, ψ ∈ Sen, ⊨ ϕ and ⊨ […] refers to ⊢b, not ⊢. Not having to assume that ⊢ is sound will allow us to capture, for example, consistent or ω-consistent extensions of Robinson arithmetic that are not sound in the standard natural numbers model.
LCQ ⊨ (1-4) form a partial description of the connectives' and quantifiers' behavior w.r.t. truth, corresponding to elimination rules for ⊥, → and ∃ and the introduction rule for ∀. This partial description suffices for our results. Note that LCQ ⊨ (4) is a strong form of existential elimination, saying that (the interpretations of) numerals are a complete set of witnesses for existential formulas valid in the standard model; in particular, this holds for the case when the standard model is built of numerals only. LCQ ⊨ (5) states that the standard model decides every sentence. TIP ⊨ is a form of completeness: It states that ⊢ can prove whatever the standard model "agrees" can be proved by ⊢.
The above axiomatization of standard models will be used to obtain semantic versions of IT 1. At the heart of these results there will be the connection between the representability of ⊩ and HBL ⇐ 1 in the presence of standard models. Recall that, by Convention 5, whenever we assume ⊩ representable, we also assume that ⊢'s representation ⊢⃝ is naturally defined from ⊩'s representation ⊩⃝ (matching the definition of ⊢ from ⊩). This is crucial for IT 2, where the internal definitions must faithfully capture the external ones [1], but not for IT 1, where we only care about producing, no matter how, an undecided (and true) sentence. In fact, for recursively enumerable extensions of Robinson arithmetic and related FOL theories, it is possible to produce an artificial provability formula that enjoys better properties than the above natural choice: While the latter satisfies HBL 1 but not necessarily HBL ⇐ 1, the former is guaranteed to satisfy both HBL 1 and HBL ⇐ 1 (i.e., to weakly represent provability). This is why, for example, in his abstract account, Buldt takes the liberty to assume not only HBL 1 but also HBL ⇐ 1 in his most general formulation of IT 1 [7, Theorem 3.1]. We will not attempt to model such "artificial" versions of ⊢⃝ in our framework, but will focus on the "natural" one, which works for both IT 1 and IT 2.
On his way to formalizing IT 2 for extensions of HF set theory (and thus having in mind the "natural" ), after proving HBL 1 Paulson notes [40, p. 21]: "The reverse implication [namely HBL ⇐ 1 ], despite its usefulness, is not always proved."However, for the "natural" , HBL ⇐ -With OCon , we obtain n ∈ Num such that ¬ (n, ϕ ), in particular b ¬ (n, ϕ ).
-With Clean , we obtain p ∈ Proof such that n = p .Hence b ¬ ( p , ϕ ).
( -With TIP | , we obtain ϕ, as desired. ( -Now the proof of ϕ proceeds just like at point (1): using Rel , Repr and Clean .
Lemma 6 shows that, in the presence of standard models with reasonable properties and the soundness of b , clean representability of the proof-of relation implies HBL ⇐ 1 ; and recall from Lemma 4 that it also implies HBL 1 .Interestingly, a converse of these implications also holds.To state it, we initially assume there is no "outer" notion of proof (i.e., no set Proof and no relation ), but only an "inner" one, given by a formula Pf ∈ Fmla 2 such that: Rel Pf is the inner version of Rel : It expresses that, inside the representation, proofs and provability are connected as expected.Compl Pf and Compl ¬Pf state that provability is complete on Pf statements about formula encodings, as well as on their negations; in traditional settings, this is true thanks to Pf being a Δ 1 -formula.Now the converse result states that, thanks to standard models, HBL 1 and HBL ⇐ 1 , we can define an outer notion of proof that is represented by the inner notion Pf: 1 .Take Proof = Num and define by n ϕ iff b Pf(n, ϕ ).Then Rel , Repr and Clean hold, with being represented by Pf (i.e., being Pf).
Proof To show Rel in this context (that is, for these particular definitions of Proof and relation ), we must show the equivalence between (i) ϕ and (ii) the existence of n ∈ Num such that b Pf(n, ϕ ). First assume (i).
-With Rel Pf , we obtain b ϕ .
-With HBL ⇐ 1 , we obtain (i), as desired.
Showing half of Repr in this context is trivial, as it amounts to showing that b Pf(n, ϕ ) implies b Pf(n, ϕ ). For the other half, assume b Pf(n, ϕ ).
Finally Clean is trivial in this context, since the encoding of proofs is the identity.
The property TIP | will be pivotal in the proofs of our semantic versions of IT 1 .As Lemma 6(3) shows, TIP | follows from the soundness of b , reasonable properties of | (namely LCQ | (1,2,4)), and the Rel , Repr , Clean trio; and the last trio follows by Lemma 7 from the other assumptions if we assume an additional reasonable property of | (namely LCQ | (5)), together with Rel Pf , Compl Pf , Compl ¬Pf , and the weak representability of (i.e., HBL 1 and HBL ⇐ 1 ).One disadvantage of this indirect route for obtaining TIP | is the need to have both Compl Pf and Compl ¬Pf -which are very tedious to prove for concrete logics, especially Compl ¬Pf .However, it turns out that we can directly prove TIP | from a subset of the above assumptions, not including Compl ¬Pf : -With (i) and LCQ | (2), we obtain | ∃x.Pf(x, ϕ ).
-By logic, from this we obtain b ∃x.Pf(x, ϕ ).
-With Rel Pf , by logic we obtain b ϕ , as desired.
Note that point (1) of the above lemma states that basic provability is complete for sentences of the form ϕ .For Robinson arithmetic and related theories, this follows from the completeness of provability for Σ 1 -sentences (Σ 1 -completeness).

Diagonalization
The formula diagonalization technique (due to Gödel and Carnap [9]) yields "self-referential" sentences.All we need for it to work is (logic plus) the representability of substitution.
-From the fact that S is represented by S we obtain (provably in the formal system b ) that ϕ is the unique y for which S ( χ , y) holds.
-By logic, this implies b (∃y.S ( χ , y) ∧ ψ(y)) ↔ ψ ϕ .
-By the definition of χ, the above means exactly (1).
A similar argument works for soft self-substitution.
A sentence ϕ ∈ Sen is called: Above, the formula RosserTwist(x, y) is ∀x .x ≺ x → ∀y .¬ (y, y ) → ¬ (x , y ).Here, y represents the negation of y.If negation were represented not by a formula but by a unary function symbol ¬ , RosserTwist(x, y) would be written ∀x .
Since b is included in , any basic Gödel or Rosser sentence is in particular a Gödel or Rosser sentence, respectively.It will turn out that basic Gödel sentences will be needed for the model-theoretic versions of IT 1 , whereas (not necessarily basic) Gödel or Rosser sentences will suffice for the proof-theoretic versions.
Proposition 10 Assuming Repr S , there exist basic Gödel and basic Rosser sentences.
Proof Follows immediately from Proposition 9, taking ψ(x) to be ¬ (x) and ¬ (∃y .(y, x) ∧ RosserTwist(y, x)), respectively.Thus, any (basic) Gödel sentence is (basic-)provably equivalent to the negation of its own provability; in Gödel's words, it "says about itself that it is not provable" [17].A Rosser sentence ϕ asserts its own unprovability in a weaker fashion: Rather than saying "I am not provable" (i.e., "it is not the case that there exists a proof p of me"), it says "it is not the case that there exists a proof p of me such that all smaller q are not proofs of ¬ ϕ." Here, "smaller" refers to the order that the encoding of proofs as numerals imposes.

First Incompleteness Theorem
After the last sections' preparations, we are now ready to discuss different versions of the incompleteness theorems, based on alternative assumptions. This section deals with IT 1 , and the next one with IT 2 .
For a consistent or ω-consistent theory that is sufficiently expressive (in particular able to express concepts about itself, such as formulas and provability), IT 1 identifies sentences that are neither provable nor disprovable, and are also true in the standard model -these are usually the Gödel and Rosser sentences discussed in the previous section.

Informal Account and Roadmap
Before embarking on the formal analysis of IT 1 , it is worth recalling informally the line of reasoning behind some of its variants. (More details can be found, e.g., in Boolos's [5] and Smith's [56] monographs.) Gödel's original formulation referred to a system called P, a form of simple type theory enriched with the Dedekind-Peano axioms for natural numbers. However, it was soon recognized that the argument works for much weaker systems, notably Robinson arithmetic and a fortiori Peano arithmetic, as well as for any (ω-)consistent recursively axiomatizable FOL theories that extend these.
When reading the informal (but quite detailed) recollection that follows, the reader should feel free to think of any of the above systems as target systems-so the term "provable" will refer to provability in one of these systems.To simplify the discussion, we will assume the availability of classical logic reasoning, but the later formal analysis will refine this by singling out the results that only need intuitionistic logic.Moreover, here we will not distinguish between provability and basic provability, but leave this too for our later formal analysis.Enclosing a statement in double quotes will mean that we refer to its internalization as a sentence in the language of the considered system; for example, the provability of "n is not a proof of R" can be written using our formal notations as ¬ (n, R ).
(1) Let us first consider a purely proof-theoretic IT 1 , which ignores the notion of truth and focuses on undecidability.
(1.1.1) That G is unprovable is argued straightforwardly: The provability of G on the one hand, by HBL 1 , would imply that its provability is provable, and on the other hand, by virtue of G being a Gödel sentence, would imply that its unprovability is provable, thus contradicting consistency.
(1.1.2) That ¬ G is unprovable needs a more subtle argument, which delves into actual proofs and their representation: The provability of ¬ G would imply, by consistency, the unprovability of G, i.e., the nonexistence of any proof of G, i.e., by proof representability, the provability of "n is not a proof of G" for all n, i.e., by ω-consistency, the unprovability of "there exists a proof of G", i.e., unprovability of "G is provable", i.e., by virtue of G being a Gödel sentence, the unprovability of ¬ G.
(1.2) Rosser's variant removes the need for ω-consistency in Gödel's argument for ¬ G. This is done by using Rosser sentences R instead of Gödel sentences G. (Recall from Sect. 5 that Rosser sentences assert about themselves something weaker than their unprovability, namely the nonexistence of any proof of them such that Rosser's twist holds, i.e., there is no smaller proof of their negation.)

(1.2.1) Arguing that ¬ R is unprovable goes the same as in Gödel's case until the point of establishing the provability of "n is not a proof of R" for all n, while additionally recording a proof p of ¬ R (from the assumption that ¬ R is provable), which by proof representability brings the provability of " p is a proof of ¬ R". So, taking m = p , we have the provability of "m is a proof of ¬ R" for a fixed m, and also of "n is not a proof of R" for all n. Using a bit of Robinson arithmetic, this gives us the provability of "there exists no x such that x is a proof of R and Rosser's twist holds for x." Hence, by virtue of R being a Rosser sentence, we obtain the provability of R, which, given our initial assumption that ¬ R is provable, contradicts consistency.
(1.2.2) On the other hand, due to the aforementioned weaker "self-assertion" in Rosser sentences, Rosser's argument for the unprovability of R is not as immediate as in Gödel's case, but itself needs to delve into proofs. First, proceeding in the same way as for ¬ R, we obtain a dual of the situation from there: the provability of "m is a proof of R" for a fixed m, and also of "n is not a proof of ¬ R" for all n. Again using a bit of Robinson arithmetic (a different bit than before!), we obtain the provability of "Rosser's twist holds for m", hence the provability of "there exists x such that x is a proof of R and Rosser's twist holds for x", hence, by virtue of R being a Rosser sentence, the provability of ¬ R, which, given our initial assumption that R is provable, contradicts consistency.
(2) Now we move to the argument for why the given undecided sentence is also true in the standard model.In what follows, truth and falsity will implicitly refer to the standard model.
(2.1) For a Gödel sentence G, we know that G is not provable, hence there is no proof of G, hence, by proof representability, it is provable that "n is not a proof of G" for all n. In particular, since deduction is sound w.r.t. truth, it is true that "n is not a proof of G" for all n, i.e., that "for all x, x is not a proof of G", i.e., that "G is not provable". Hence, by virtue of G being a Gödel sentence and deduction being sound, we obtain that G is true.
(2.2) The truth of a Rosser sentence R follows by the same argument as above, noting that we only used that a Gödel sentence is implied by the statement of its own unprovability, which is also true for Rosser sentences.
(3) As we will show later during the formal discussion, if stated carefully the above arguments do not need the full power of classical logic, but intuitionistic logic suffices. On the other hand, if we assume classical logic (i.e., double negation) and additional properties mentioned below, more direct arguments can be given for some of IT 1 's components; more precisely, the arguments for the unprovability of ¬ G and the truth of G no longer need to delve into proofs, but can stay at the level of provability. Below we only discuss the case of Gödel sentences; Rosser sentences can be treated in exactly the same way.
(3.1) To argue that ¬ G is unprovable, we assume that it is provable. Hence, by virtue of G being a Gödel sentence and making essential use of classical logic, we obtain the provability of "G is provable". At this point, we invoke the converse of HBL 1 (i.e., the provability of any ϕ follows from the provability of ϕ's provability) to obtain the provability of G, which together with our assumption contradicts consistency.
(3.2) To argue that G is true, we assume otherwise and try to reach a contradiction (thus making essential use of classical negation). Since G is false, ¬ G must be true, hence by soundness and by virtue of G being a Gödel sentence, "not not G is provable" must be true, hence "G is provable" must be true. At this point, we invoke that the truth of provability implies provability, a property that we called TIP | in Sect. 4.6, to reach the desired conclusion, namely that G is provable. In turn, TIP | can be inferred from Σ 1 -completeness (which states that, for all Σ 1 sentences, in particular for those asserting provability, their truth implies their provability) and the converse of HBL 1 .
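The two classical arguments (3.1) and (3.2) can be replayed in a small Lean 4 sketch. Everything in it is an illustrative assumption rather than the paper's Isabelle development: a single relation `Prov` stands in for both provability relations, `Tr` stands for truth in the standard model, `box a` stands for the sentence "a is provable", and the logical infrastructure is passed in as hypotheses (`contra` for contraposition, `dne` for double-negation elimination, `decides` for the model deciding every sentence, `negG_to_box` for the step that combines soundness with the Gödel fixpoint). Consistency is stated in the form "no sentence is provable together with its negation".

```lean
-- (3.1): with classical logic and the converse of HBL1, ¬G is unprovable.
-- All names are hypothetical; the logic is supplied as hypotheses.
theorem neg_goedel_unprovable_classical
    {Sen : Type} (Prov : Sen → Prop)
    (imp : Sen → Sen → Sen) (neg box : Sen → Sen)
    (mp      : ∀ {a b : Sen}, Prov (imp a b) → Prov a → Prov b)
    (contra  : ∀ {a b : Sen}, Prov (imp a b) → Prov (imp (neg b) (neg a)))
    (dne     : ∀ {a : Sen}, Prov (neg (neg a)) → Prov a)   -- classical step
    (con     : ∀ a, Prov a → Prov (neg a) → False)         -- consistency
    (hbl1rev : ∀ a, Prov (box a) → Prov a)                 -- converse of HBL1
    (G : Sen)
    (fixb    : Prov (imp (neg (box G)) G))                 -- Gödel fixpoint, one direction
    : ¬ Prov (neg G) := by
  intro h
  -- from ¬G obtain ¬¬"G is provable" by contraposition of the fixpoint
  have h1 : Prov (neg (neg (box G))) := mp (contra fixb) h
  -- classical logic strips the double negation
  have h2 : Prov (box G) := dne h1
  -- the converse of HBL1 yields G, contradicting consistency
  exact con G (hbl1rev G h2) h

-- (3.2): G is true, arguing by contradiction at the meta level.
theorem goedel_true_classical
    {Sen : Type} (Prov Tr : Sen → Prop) (neg box : Sen → Sen)
    (decides     : ∀ a, ¬ Tr a → Tr (neg a))   -- the model decides every sentence
    (G : Sen)
    (unprov      : ¬ Prov G)                   -- the unprovability of G, taken as given
    (negG_to_box : Tr (neg G) → Tr (box G))    -- soundness plus the Gödel fixpoint
    (tip         : Tr (box G) → Prov G)        -- truth of provability implies provability
    : Tr G :=
  Classical.byContradiction fun h =>
    unprov (tip (negG_to_box (decides G h)))
```

Packaging the fixpoint and soundness steps as hypotheses keeps the sketch short; it shows which ingredients each argument consumes, not how those ingredients are established for a concrete logic.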
This concludes our informal recollection, which offers a roadmap for our formal and more abstract development that follows: Sects. 6.2-6.5 tackle the above points (1.1), (1.2), (2) and (3), respectively. We distill the exact assumptions needed in these arguments. This forms a basis for generalizing them to a large variety of logical systems, and also reveals some interesting properties required from the logic and arithmetic infrastructures and from the encodings that are not clearly visible in the concrete setting. In particular, we identify the purely intuitionistic line of reasoning that suffices for (1) and (2), the amount of arithmetic needed in (1.2), the tradeoffs between (1.1) and (1.2), and, in Sect. 6.6, the limits in combining provability with basic provability to widen these arguments' scope.

Gödel's Proof-Theoretic Version
We start with an analysis of Gödel's original argument for the undecidability of Gödel sentences, which requires consistency for one half and ω-consistency for the other half.
Proposition 11 Assume Con and HBL 1 . Then G for all Gödel sentences G.
Proof Let G be a Gödel sentence. To prove G, we assume (1) G and aim to reach a contradiction.
-From (1) and G being a Gödel sentence, we obtain ¬ G .
-From (1) and HBL 1 , we obtain b G , hence G .
-The last two facts contradict Con .
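The three steps of this proof fit into a few lines of Lean 4. The sketch below is an abstraction made for illustration only: `Prov` and `ProvB` are hypothetical names for provability and basic provability, `box a` is a hypothetical name for the sentence expressing that `a` is provable, and consistency is assumed in the form "no sentence is provable together with its negation" rather than as the unprovability of ⊥.

```lean
-- Unprovability of a Gödel sentence from consistency and HBL1.
-- All identifiers are illustrative assumptions, not the paper's Isabelle names.
theorem goedel_unprovable
    {Sen : Type} (Prov ProvB : Sen → Prop) (neg box : Sen → Sen)
    (incl : ∀ a, ProvB a → Prov a)               -- basic provability is included in provability
    (con  : ∀ a, Prov a → Prov (neg a) → False)  -- consistency
    (hbl1 : ∀ a, Prov a → ProvB (box a))         -- HBL1
    (G : Sen)
    (fixf : Prov G → Prov (neg (box G)))         -- consequence of G being a Gödel sentence
    : ¬ Prov G := by
  intro h
  -- HBL1 gives basic provability of "G is provable", hence provability;
  -- the fixpoint gives provability of its negation; the two clash.
  exact con (box G) (incl (box G) (hbl1 G h)) (fixf h)
```

Only one direction of the Gödel fixpoint is used, and only in its rule form (`fixf`), which is all that this half of the theorem needs.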
For showing that the Gödel sentences are not disprovable, a standard route is to assume explicit proofs, strengthen the consistency assumption to ω-consistency, and strengthen HBL 1 to representability of the proof-of relation.While the line of reasoning in the above proof is mostly well-known, it contains two subtle points about which the literature is not explicit (due to the usual focus on classical first-order arithmetic and particular choices of encodings).
First, we must assume the representation of the proof-of relation to be 1-clean, i.e., clean with respect to the proof component.Indeed, the argument crucially relies on converting the statement " p G for all p ∈ Proof" into " b ¬ (n, G ) for all n ∈ Num," which is only possible for 1-clean encodings.This assumption is needed in many of our results.By contrast, cleanness is never required with respect to the sentence component of proof-of or for the provability relation (which only involves sentence encodings).In short, cleanness is only needed for proofs, not for sentences.
Second, to reach the desired contradiction for our intuitionistic proof system , from " ¬ (n, G ) for all n ∈ Num" it is not sufficient to employ standard ω-consistency, which would only give us ∃x.(x, G ), i.e., G ; the last together with G ↔ ¬ G would be insufficient for obtaining ¬ G. However, our stronger version of ω-consistency, OCon , does the job.
-From OCon , we obtain Con .
-Applying Lemma 4 to Rel and Repr , we obtain HBL 1 .
-Applying Proposition 11 to the last two facts, we obtain G, as desired.
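The role of the stronger ω-consistency in the other half of the argument (the unprovability of ¬ G) can be illustrated in the same abstract style. The Lean 4 sketch below rests on assumptions that are not taken from the paper's formal text: `pf n` is a hypothetical name for the sentence "n is a proof of G", `exPf` for "there exists a proof of G", and the strengthened ω-consistency is read as: if the negation of `pf n` is provable for every numeral `n`, then the double negation of `exPf` is not provable. A single relation `Prov` is used, and the combination of Rel, Repr and Clean is packaged into the hypothesis `clean`.

```lean
-- Unprovability of ¬G from a strengthened form of ω-consistency.
-- All identifiers and the exact shape of `ocon` are illustrative assumptions.
theorem neg_goedel_unprovable_omega
    {Sen Num : Type} (Prov : Sen → Prop) (neg : Sen → Sen)
    (pf   : Num → Sen)   -- pf n reads "n is a proof of G"
    (exPf : Sen)         -- reads "there exists a proof of G"
    (G : Sen)
    (con   : ∀ a, Prov a → Prov (neg a) → False)
    (clean : ¬ Prov G → ∀ n, Prov (neg (pf n)))                   -- via Rel, Repr, Clean
    (ocon  : (∀ n, Prov (neg (pf n))) → ¬ Prov (neg (neg exPf)))  -- strengthened ω-consistency
    (fixg  : Prov (neg G) → Prov (neg (neg exPf)))                -- from the Gödel fixpoint
    : ¬ Prov (neg G) := by
  intro h
  -- by consistency, G itself is unprovable
  have hG : ¬ Prov G := fun hg => con G hg h
  -- every numeral is provably not a proof of G, so ocon forbids ¬¬exPf,
  -- which the fixpoint would nevertheless derive from ¬G
  exact ocon (clean hG) (fixg h)
```

The doubly negated conclusion in `ocon` is the point: intuitionistically, ¬ G only yields the double negation of "G is provable", so an ω-consistency principle concluding the unprovability of `exPf` itself would not close the argument.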

Rosser's Version
Rosser's contribution to IT 1 was an ingenious trick for weakening the ω-consistency assumption into plain consistency-as such, it is usually seen as a strict improvement over Gödel's version.While this is true for the concrete case of FOL theories extending arithmetic, from an abstract perspective the situation is more nuanced: The improvement is achieved at the cost of asking more from the logic.Our framework makes this tradeoff clearly visible.The idea is to use Rosser sentences instead of Gödel sentences to "repair" the ω-consistency assumption of Theorem 13 (inherited from Proposition 12).
Proof To prove ¬ R, we assume (1) ¬ R and aim to reach a contradiction.
-With Rel , we obtain p ¬ R for some p ∈ Proof.
-With Rel , we obtain q R for all q ∈ Proof.
-By Ord 2 , we obtain a finite M ⊆ Num such that (4) ∀x.
The proof is performed in the intuitionistic proof system of , but we describe it informally: We fix x, assume (x, R ) ∧ RosserTwist(x, R ), and aim to reach a contradiction. We perform a case distinction according to (4):
-If x equals some m ∈ M, then (m, R ), which together with (3) leads to a contradiction.
-If p ≺ x, then from RosserTwist(x, R ) and ¬ ( R , ¬ R ) (which holds thanks to Repr ¬ and b being included in ), we obtain ¬ ( p , ¬ R ), which together with (2) leads to a contradiction.
-This concludes (our informal description of) the -formal proof of (5).
-Thanks to R being a Rosser formula, we obtain R.
-Together with (1), this contradicts Con .
Thus, ω-consistency (assumption OCon ) has been weakened to consistency (assumption Con ), but in exchange we needed to additionally assume a special formula ≺ satisfying Ord 2 . This represents a quite strong commitment to the arithmetical ordering.
Even worse, this fix on the assumptions needed to show the unprovability of the negated formula (¬ R) complicates the proof of the unprovability of the direct formula (R), which was trivial in Gödel's version (Proposition 11).Now we again need a cleanly representable proof-of relation, representable negation, and well-behavedness of the order-like relation ≺: Proposition 15 Assume Con , Ord 1 , Rel , Repr Repr ¬ and Clean .Then R for all Rosser sentences R.

Proof
To prove R, we assume (1) R and aim to reach a contradiction.
-With Rel , we obtain p R for some p ∈ Proof.
-With Rel , we obtain q ¬ R for all q ∈ Proof.
-With Repr , Clean and Lemma 3, we obtain b ¬ (n, ¬ R ) for all n ∈ Num.
-The following reasoning is performed in the (intuitionistic) proof system of , but we describe it informally.
-By Repr ¬ and the fact that b is included in , the only -From RosserTwist( p , R ) and (2), we obtain ∃x.(x, R )∧RosserTwist(x, R ).
-The last two facts contradict consistency.
(2) R and ¬ R for all Rosser sentences R.
(2) : R follows by applying Proposition 15 to the assumptions, and ¬ R follows by applying Proposition 14 to the assumptions.
Highlighted in the statements of Theorems 16 and 13 is the assumption tradeoff between the two versions of IT 1 : Rosser's weakening of ω-consistency into consistency is paid by additionally assuming representability of negation and an order-like relation satisfying Ord 1 and Ord 2 .Certainly, negation representability is not a big price, since for concrete logics this tends to be a lemma that is anyway needed when proving HBL 1 .On the other hand, the ordering assumptions seem to be a significant generality gap in favor of Gödel's version.

Semantic Versions
A semantic version of IT 1 is one that establishes not only the unprovability of Gödel or Rosser sentences and of their negations, but also the truth of these sentences. To capture this abstractly, we leverage our concept of truth from Sect. 4.6, denoted | . The next variant of the semantic IT 1 does not directly assume the existence of proofs and their representations, but "recovers" them using HBL ⇐ 1 as prescribed in Lemma 7:
Proof The same as Theorem 18's proof, but using Theorem 19 rather than Theorem 17.

Theorem 17 (Semantic
The assumption tradeoff between Theorems 17 and 18 on the one hand and Theorems 19 and 20 on the other hand is the same as that between their proof-theoretic counterparts (discussed in Sect.Thus, if b = and some reasonable properties hold for | , then ω-consistency comes for free.Hence, in this case Gödel's versions, Theorems 17 and 18, are strictly more general than Rosser's versions, Theorems 19 and 20 (if we ignore the difference in the way Gödel and Rosser sentences are actually defined).This further illustrates the idea that Rosser's trick is not always an improvement.

Classical Logic Versions
The results so far do not require going beyond intuitionistic logic. But if we commit to classical logic for (i.e., assume ¬¬ ϕ → ϕ) and also assume HBL ⇐ 1 , there is a well-known more direct argument for showing that Gödel sentences are not disprovable, which immediately proves IT 1 . (This is documented, for example, as Theorem 3.1 in Buldt's monograph [7].) However, in our generalized setting with two provability relations, this argument does not go through unless we strengthen HBL ⇐ 1 (which currently refers to b ) to refer to :
HBL ⇐ 1, : ϕ implies ϕ for all ϕ ∈ Sen.
(2) G and ¬ G for all Gödel sentences G. Point (2) of the above theorem refers to Gödel sentences (defined using ).Note that weakening the statement to refer to basic Gödel sentences (defined using b ) would not help with relaxing the assumption HBL ⇐ 1, to HBL ⇐ 1 ; the former would still be needed to finish the proof.Of course, HBL ⇐ So from the lemma we infer TIP | , as desired.
We used Gödel, not Rosser sentences in our classical semantic versions of IT 1 . Unlike for the (intuitionistic) semantic versions in Sect. 6.4, here a Rosser-style improvement would serve no purpose, since we already assume to be consistent, not ω-consistent.

Benefits of the Two-Relation Take on Provability
Our framework distinguishes between basic provability ( b ) and provability ( ). This seems to be a rational design choice when aiming high in terms of generality for the incompleteness theorems. For example, this choice has been made explicitly by Smorynski [57] and more implicitly by Feferman [13] in their general accounts. Let us analyze what the choice's benefits are for IT 1 in the context of our development. The main questions are of course whether the scope of these theorems stands to gain from the two-relation approach, as opposed to working with only one relation; and, if so, by how much.
In some cases, the gain is undeniable: The semantic Theorems 17-20 of Sect. 6.4 gain significant generality by assuming soundness for b only, and merely consistency or ω-consistency for . This covers the case of Gödel or Rosser sentences being true for unsound theories as well. And of course the above theorems are based on the proof-theoretic theorems in Sects. 6.2 and 6.3, which means that the latter's two-relation formulations are also needed.
At the other extreme, in one case, namely the classical-logic-based Theorem 22, there is no gain.Indeed, say we ignore b and modify all this theorem's assumptions to replace for all occurrences of b -which is the same as assuming b = .Then we would lose no generality, because the modified assumptions would be the same or weaker than the original assumptions.In conclusion, Theorem 22 stays equally general if we identify b and .
The other cases, namely the classical-semantic Theorems 23 and 24, are somewhere in between these two extremes: Their two-relation formulation is more general than a one-relation formulation, but the gain from this is doubtful. As in Theorems 17-20, they allow an unsound as an extension of a sound b . On the other hand, their assumptions HBL 1 and HBL ⇐ 1, (inherited from Theorem 22) force to coincide with b on all sentences of the form ϕ ; and it is not clear if one can find interesting classes of unsound relations that satisfy this constraint (for standard choices of b ).

Second Incompleteness Theorem
For a consistent theory that is sufficiently expressive, IT 2 states that this theory cannot prove (the internal formalization of) its own consistency, which in our notations will be written as ¬ ⊥ .Here, "sufficient expressiveness" refers to something similar to the case of IT 1 , namely the theory's ability to express concepts about itself such as formulas and provability, but is a stronger requirement than for IT 1 : For IT 2 , the theory needs to be expressive enough to formalize and prove part of IT 1 .This includes Peano arithmetic and stronger theories but excludes Robinson arithmetic.
IT 2 is of course a perfectly mathematical theorem, just like IT 1 .However, the informal paraphrasing of IT 2 's conclusion, taking ¬ ⊥ to mean that the theory cannot prove its own consistency, relies on an extra-mathematical assumption of an intensional nature [1] [13, §1]: that adequately expresses the provability relation .The mathematical property of (weakly) representing is only an extensional approximation of this assumption.By contrast, IT 1 only needs as an auxiliary concept used in its proof; the adequate expression of is irrelevant there, and it is only (weak) representability that matters.When discussing variants of IT 2 , we will always work under the adequate expression assumption.

Informal Account and Roadmap
Similarly to the case of IT 1 , we start with an informal account of the argument behind IT 2 , where again we use double quotes for sentences that internalize certain statements in the language of the considered system.
(1) Gödel realized that IT 2 follows by internally formalizing the positive half of his (proof-theoretic) IT 1 , henceforth referred to as IT 0.5 . It states the unprovability of a Gödel sentence G, covered by Sect. 6.1's point (1.1) and Proposition 11. This leads to the provability of "the theory is consistent implies that G is not provable". Moreover, by virtue of G being a Gödel sentence, IT 0.5 itself implies the unprovability of "G is not provable". From the above together with consistency, we obtain the unprovability of "the theory is consistent".
The three derivability conditions HBL 1−3 recalled in Sect. 4.5 were perfected by Löb [32] based on previous work by Hilbert and Bernays [22] to make the above informal argument fully rigorous without referring to internal formalization details (although such details do need to be worked out to prove the conditions). The way these conditions work together to achieve this goal will be discussed in Sect. 7.2. For now, we should just note that the unqualified requirement of internally formalizing IT 0.5 is in itself not sufficient. The internalized concepts must exhibit certain similarities to the original concepts from one level up; and this is what the derivability conditions express. For example, the above informal argument had a silent shift from the provability of "the theory is consistent implies that G is not provable" (with the whole statement inside quotes) to the provability of "the theory is consistent" implies "G is not provable" (where the implication operator is outside the quotes, i.e., is positioned one level up), which is where HBL 2 comes to help.
(2) An alternative line of reasoning due to Jeroslow [24] is often cited [50,56,57] as a simplification of the canonical route to prove IT 2 : Whereas traditionally IT 2 requires all three derivability conditions, Jeroslow's version does not make use of HBL 2 .
Jeroslow's approach relies on pseudo-terms.These are formulas that satisfy existence and uniqueness properties on one of their free variables, say, x, meaning that x denotes a uniquely identified item depending on any items denoted by the other free variables; in short, pseudo-terms can essentially be treated like terms.In the informal discussion that follows, the reader is free to think of actual terms instead of pseudo-terms.
Jeroslow proved an alternative diagonalization lemma, producing pseudo-term fixpoints instead of formula fixpoints. In particular, one obtains a pseudo-term τ that is provably equal to the encoding of the sentence "non-τ is provable". If we let ϕ denote the latter sentence, we obtain that ϕ is provably equivalent to "¬ ϕ is provable". Let us call any sentence satisfying this fixpoint property a Jeroslow sentence. Such a sentence states about itself something stronger-sounding than a Gödel (or Rosser) sentence: not that it is merely not provable, but that even its negation is provable. (We write "stronger-sounding" rather than "stronger" because it would be actually stronger only assuming the provability of consistency.)
Now, the argument for IT 2 goes as follows. Assume that "the theory is consistent" is provable. Because a Jeroslow sentence ϕ asserts the provability of something, (a slightly stronger form of) HBL 3 applies, so ϕ provably implies "ϕ is provable". On the other hand, by virtue of being a Jeroslow sentence, ϕ also provably implies "¬ ϕ is provable". So ϕ provably implies "the theory is inconsistent", which together with our assumption gives the provability of ¬ ϕ. With HBL 1 , we obtain the provability of "¬ ϕ is provable", i.e., by virtue of ϕ being a Jeroslow sentence, the provability of ϕ. So both ϕ and its negation are provable, which contradicts consistency.
The above argument invokes HBL 1 and HBL 3 but not HBL 2 . It is specific to Jeroslow sentences and cannot be achieved with Gödel or Rosser sentences. The argument has several loose ends, which will be addressed in our formal discussion. In light of that, it will become clear that the ¬ in "¬ ϕ" and the "non" in "non-τ " are different, but related operators: The former is formula negation (applied to ϕ), while the latter is substitution (with τ ) in a pseudo-term that represents the operator on numerals corresponding to ¬ via formula encoding. This concludes our informal discussion. Next, we engage in formal accounts of the above arguments: point (1) in Sect. 7.2 and point (2) in Sect. 7.3.
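The skeleton of the informal Jeroslow-style argument can be written down as a Lean 4 sketch. It is deliberately coarse and all of its names are assumptions made for illustration: `J` is a Jeroslow sentence, `conS` is whatever sentence internalizes consistency, and the hypothesis `refute` packages the step "J provably implies that the theory is inconsistent, so provable consistency gives the provability of ¬ J", which is exactly where the loose ends mentioned above are hidden. No counterpart of HBL 2 appears among the hypotheses.

```lean
-- Skeleton of the Jeroslow-style argument: provable consistency is refuted
-- using HBL1 and an HBL3-like property of J, without HBL2.
-- All identifiers are illustrative assumptions.
theorem jeroslow_sketch
    {Sen : Type} (Prov : Sen → Prop)
    (imp : Sen → Sen → Sen) (neg box : Sen → Sen)
    (mp   : ∀ {a b : Sen}, Prov (imp a b) → Prov a → Prov b)
    (con  : ∀ a, Prov a → Prov (neg a) → False)   -- consistency
    (hbl1 : ∀ a, Prov a → Prov (box a))           -- HBL1
    (J conS : Sen)
    (fixf : Prov (imp J (box (neg J))))           -- J implies "¬J is provable"
    (fixb : Prov (imp (box (neg J)) J))           -- and conversely
    (hbl3 : Prov (imp J (box J)))                 -- HBL3-like property for J
    (refute : Prov conS → Prov (imp J (box J)) →
              Prov (imp J (box (neg J))) → Prov (neg J))
    : ¬ Prov conS := by
  intro hc
  -- provable consistency refutes J
  have hn : Prov (neg J) := refute hc hbl3 fixf
  -- HBL1 and the fixpoint turn the refutation into a proof of J
  have hp : Prov J := mp fixb (hbl1 (neg J) hn)
  exact con J hp hn
```

Keeping `refute` as an explicit hypothesis makes visible that the conclusion is only as meaningful as the chosen internalization `conS` of consistency.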

Standard Version
Let us slightly rephrase the statement and proof of IT 0.5 (Proposition 11) in a way that will make it convenient to highlight its internal formalization within the proof of IT 2 : Proposition 11 (rephrased).Assume HBL 1 .Let G be a Gödel sentence.Then Con implies G.
G and G implies ⊥.
Step 4. From the last three facts, G implies ⊥.
The standard proof of IT 2 uses all three derivability conditions in key places in order to internalize the above proof of IT 0.
Invoking IT 0.5 : -From Con and HBL 1 , by Proposition 11 we obtain G.
The above proof of IT 2 starts with an internalization of aspects of the IT 0.5 's proof.It does not literally formalize the end-to-end proof, but instead proceeds by plugging in the derivability conditions, which can be thought of as pre-formalized reasoning patterns.
-Step 2 is internalized using HBL 3 , which asserts the provability of some instances of HBL 1 , replacing object-level quantification with meta-level quantification. To see this, note that a full formalization of HBL 1 would be a sentence of the form ∀x. Sen (x) ∧ (x) → ( inst ( , _ (x))), where Sen , _ , inst and formalize membership in the set of sentences Sen, the encoding operator _ , the formula-instantiation (i.e., substitution of a term for the first variable, v 1 ) operator, and the inner representation of provability (one further level inside), respectively. By instantiating the ∀-quantified x with ϕ for any ϕ ∈ Sen, we obtain sentences that can be equivalently written in a more palatable form, ϕ → ϕ , which are exactly the sentences whose provability is asserted by HBL 3 .
-Similarly, Step 3 is internalized using HBL 2 , which asserts the provability of some instances of the modus ponens rule, again replacing object-level quantification with meta-level quantification-whereas a full formalization of HBL 1 would be a sentence of the form ∀x, y.
-The internalization of Step 1 is more interesting: To formalize the fact that G implies ¬ G , one takes advantage of the availability of the stronger and "more formal" property G → ¬ G , which is pushed inside the proof system via HBL 1 , and then its implication is lifted one level up using HBL 2 .
-Steps 4 and 5 are internalized by mapping meta-implication and meta-negation to the implication and negation operators, → and ¬, using the latter's deductive properties.
In summary, the derivability conditions and other ad hoc procedures are judiciously used to prove an internalized version of IT 0.5 , while avoiding the need to fully formalize the proof inside the system. (On the other hand, proving the derivability conditions does require a substantial internal formalization effort in the first place.) Theorem 25's proof is concluded according to the plan sketched in Sect. 7.1: by combining the formalized and the original IT 0.5 to obtain the unprovability of consistency.
Finally, let us scrutinize IT 2 with respect to the benefit of the two-relation take on provability (as was done for IT 1 in Sect. 6.6). We see that for IT 2 there is no benefit from using two relations. The same reason as the one discussed for Theorem 22 applies: Replacing b with does not decrease generality. Thus, when discussing IT 2 , we can assume b = without loss of generality. Note also that, even if we used a formula b corresponding to b , no meaningful two-relation strengthening of IT 2 would be in sight; in particular, the consistency of the basic theory b could well be provable in the extended theory .

Convention 26
For the rest of Sect. 7, we will assume ⊢ b = ⊢ and no longer refer to ⊢ b .

Jeroslow's Version
Next we study Jeroslow's approach to IT 2 [24]. To analyze its features and pitfalls, we need to recall some notions and notations employed by Jeroslow.
A pseudo-term is a formula ϕ ∈ Fmla m+1 expressing a provably functional relation via "exists unique": ∀x 1 , . . ., x m . ∃!y. ϕ(x 1 , . . ., x m , y). Note that we have already seen examples of pseudo-terms: Sect. 4.4's formulas f representing functions f . We let PTerm, ranged over by σ, τ , be the set of pseudo-terms. While pseudo-terms are particular formulas, they will be treated as an extension of the notion of term. Indeed, a term t having free variables v 1 , . . ., v m can be regarded as the pseudo-term v m+1 ≡ t.
Pseudo-terms can be composed freely with terms and other pseudo-terms in a term-like fashion, and also substituted in formulas, as indicated in the following notation.
Above, y is chosen to be distinct from the other occurring variables. It is possible to introduce multi-input extensions of this notation, but we will not need them. The notation smoothly integrates pseudo-terms with terms, as shown in the following example properties:
Example 28 (1) If σ ≡ t (employing point (1) of the notation) and ϕ(σ ) (employing point (3)) then ϕ(t), where ϕ(t) is the usual instance of ϕ with t.
Jeroslow fixes an abstract class of "computable" m-ary functions, F m ⊆ Num m → Num, for all arities m ∈ N, on which he considers the following assumptions:
-Repr F : Every f ∈ F m is represented by some pseudo-term f ∈ PTerm m under the identity encoding Num → Num.
-CapN: Some N ∈ F 1 correctly captures negation: N ϕ = ¬ ϕ for all ϕ ∈ Sen.
-CapSS: Some ssub : Fmla 1 → F 1 correctly captures substituted self-substitution: ssub(ψ) f = ψ( f f ) for all ψ ∈ Fmla 1 and f ∈ F 1 .
Note that, in CapSS, we take advantage of the introduced notation for pseudo-terms: If we spell out Notation 27(2), the highlighted text denotes ∃y. f ( f , y) ∧ ψ(y). Moreover, employing Notation 27(1), the statement of Repr F for some f ∈ F 1 and n ∈ Num would be written as f (n) ≡ f (n); and combining CapN with the instance of Repr F for N, we obtain a fact that, using the same notation, can be written as N ϕ ≡ ¬ ϕ .
When our logical theory is a recursive extension of Robinson arithmetic and Num = N, F m could be any sufficiently rich set of m-ary computable functions, ranging from the primitive recursive functions to all total μ-recursive functions. Then, every f ∈ F m would indeed be represented by a formula f . Moreover, assuming a computable and injective encoding of formulas, _ : Fmla 1 → N, we can take N : N → N to be the following computable function: Given input n, it checks if n has the form ϕ ; if so, it returns ¬ ϕ ; if not, it returns any value (e.g., 0). And ssub(ψ) can be defined similarly, obtaining the desired property for every ϕ ∈ Fmla 2 , not necessarily of the form f . In short, Jeroslow's assumptions cover arithmetic (but also potentially many other systems).
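The description of the function N above can be made concrete with a small sketch. It assumes a toy setting in which formulas are plain strings and every non-empty string counts as a formula; the names `encode`, `decode` and `neg_code` are hypothetical and only illustrate the "decode, negate, re-encode, else return 0" recipe, not any encoding used in the formalization.

```python
# Toy sketch: formulas are plain strings and every non-empty string counts as a
# formula. encode/decode are hypothetical stand-ins for a Goedel encoding.

def encode(formula: str) -> int:
    """Injectively encode a formula string as a positive integer."""
    # The leading 0x01 byte marks valid codes and protects leading zero bytes.
    return int.from_bytes(b"\x01" + formula.encode("utf-8"), "big")


def decode(n: int):
    """Return the formula whose code is n, or None if n is not a code."""
    if n <= 0:
        return None
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    if raw[:1] != b"\x01":
        return None
    try:
        text = raw[1:].decode("utf-8")
    except UnicodeDecodeError:
        return None
    return text or None  # the empty string is not treated as a formula


def neg_code(n: int) -> int:
    """Send the code of phi to the code of its negation, and anything else to 0."""
    phi = decode(n)
    return 0 if phi is None else encode("¬(" + phi + ")")
```

For instance, `neg_code(encode("p"))` equals `encode("¬(p)")`, while inputs that are not codes of formulas are mapped to 0, matching the "any value" clause in the text.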
Lemma 29 can be used to produce Gödel and Rosser sentences, which can be used as in Sect. 6, leading to variants of IT 1 .
However, as discussed in Sect. 7.1, Jeroslow's main innovation affects IT 2 : It removes from the assumptions the second derivability condition, HBL 2 .
As with Rosser's trick, we analyze this innovation's tradeoffs from an abstract perspective. A first tradeoff is in the employment of a stronger version of the third condition, SHBL 3 , holding for all closed pseudo-terms and not only those that encode sentences.
Another tradeoff is in the way consistency is expressed in the logic. Jeroslow does not conclude ¬ ⊥ , but something more elaborate, namely jcon. While the formula ¬ ⊥ internalizes the statement ⊬ ⊥, jcon internalizes the equivalent statement "for all ϕ, it is not the case that ⊢ ϕ and ⊢ ¬ ϕ." But are the internalizations themselves equivalent, i.e., is it the case that ¬ ⊥ iff jcon? This surely holds for many concrete logics, but it is only one direction that we can infer logic-independently, under mild assumptions:
Proof Assume jcon.

It seems impossible to infer the other direction without knowing what the provability formula looks like more concretely. Therefore, ¬ ⊥ , the original IT 2 's conclusion, is abstractly stronger than, hence preferable to jcon. In short, Jeroslow somewhat weakens the theorem's conclusion.
Let us now look at (a slight rephrasing of) Jeroslow's proof:
Proof of Theorem 30. We assume (1) jcon and aim to reach a contradiction.
The above proof has a subtle gap, which makes Theorem 30 incorrect under its stated assumptions. The problem lies in the highlighted description of the formula ϕ. Strictly speaking (i.e., rigorously employing our Notation 27), the correct form of fact (2) is not ϕ ←→ ( N ϕ ) but ϕ ←→ ( N ) ϕ , and the correct ϕ is not ( N (τ )) but ( N )(τ ). So let us write ϕ for the correct version, ( N )(τ ), and ϕ′ for ( N (τ )). Notice the difference: ϕ is obtained by first instantiating the provability formula with N and then instantiating the remaining formula with τ , whereas ϕ′ is obtained by first instantiating N with τ and then instantiating the provability formula with the result. Both sentences occur in the proof: ϕ comes from Lemma 29, while ϕ′ comes from SHBL 3 . For most purposes in logic, the difference is minor, since (as we note in Example 28(2)) ϕ and ϕ′ are provably equivalent. However, as we discuss below, shifting between ϕ and ϕ′ must be done with care, since the proof uses them under the encoding _ .
A first attempt to fill this gap would be to require ϕ = ϕ′ , or at least ϕ ≡ ϕ′ (both taken under the encoding). The latter would be true under the assumption that the encodings of provably equivalent sentences are provably equal. But assuming this is unreasonable: Usually sentence equivalence is undecidable, so no computable encoding can achieve that. A more feasible solution comes from noting that the proof does not need ϕ ≡ ϕ′ , but could work with the weaker property ϕ′ → ϕ (an implication between the corresponding provability statements). The latter would be true under the following assumption:
Since the → in WHBL 2 can be replaced with ←→ without changing the meaning, WHBL 2 can be read as: encodings of provably equivalent sentences are provably equiprovable. Also, WHBL 2 is a weakening of ϕ → ψ implies ϕ → ψ for all ϕ, ψ ∈ Sen which, in the presence of HBL 1 , is seen to be a weak form of HBL 2 . This motivates the name "WHBL 2 ". We are led to the following solution:
Correction 1 Theorem 30 becomes correct if we add WHBL 2 as an assumption.
-By SHBL 3 applied to N (τ ), we obtain
In summary, one solution to filling the gap in Jeroslow's approach, which aimed at removing HBL 2 , was to (re)introduce a weaker version of HBL 2 , namely WHBL 2 .
An alternative solution is to replace representation by pseudo-terms with actual term-representation (defined in Sect. 4.4). To this end, we amend SHBL 3 to quantify over all closed terms t instead of all closed pseudo-terms τ ; moreover, also factoring in the observation that Jeroslow's proof does not need F n for all n but F 1 suffices, we change Repr F into: under the identity encoding Num → Num, by some f taken from a set Ops ⊆ (Term → Term) for which an encoding as numerals _ : Ops → Num is given, and such that FVars(g(t)) = FVars(t) and (g(t))[s/x] = g(t[s/x]) for all g ∈ Ops, s, t ∈ Term and x ∈ Var.
(In concrete logics, the elements of Ops can be constructors or derived operators on terms.)
Correction 2 Theorem 30 becomes correct if we work with terms rather than pseudo-terms and amend SHBL 3 and Repr F as indicated above.
Proof Indeed, all the proofs of CapSS, Lemma 29 and Theorem 30 work if we switch from pseudo-terms to terms.
In summary, our second solution requires the following amendment to Jeroslow's approach: For representing computable functions, we must have available not just pseudo-terms, but actual terms. This usually means that the logic has built-in Skolem symbols and axioms.
Finally, let us see what it takes to alleviate the second tradeoff: from jcon to the more desirable ¬ ⊥ . We consider the following condition:
HBL 4 has a similar flavor to HBL 2 , but refers to conjunction rather than implication: It states that conjunction introduction holds inside the proof system.
Theorem 32 If we modify Theorem 30 by applying Correction 1 (i.e., adding assumption WHBL 2 ) and adding assumption HBL 4 , then its conclusion can be upgraded to ¬ ⊥ .

Proof
The only time when jcon is used in the proof is via its specific instance ¬ ( ϕ ∧ ( N ϕ )), which by Repr F and CapN would follow from (1) ¬ ( ϕ ∧ ¬ ϕ ).So it suffices to show that the last follows from ¬ ⊥ , WHBL 2 and HBL 4 : -From HBL 4 , we obtain ⊥ , we obtain (1), as desired.
Note that a version of Theorem 32 relying on Correction 2 rather than Correction 1 would be weaker than Theorem 32, since WHBL 2 is necessary in the proof even if we work with terms instead of pseudo-terms.
In summary, Theorem 32 highlights the following assumption tradeoff in Jeroslow's approach, provided the same strong conclusion as in the standard IT 2 is desired: the removal of HBL 2 against the addition of WHBL 2 and HBL 4 (and the slight strengthening of HBL 3 into SHBL 3 ).Whether this is a good tradeoff will of course depend on the logic's specificity, in particular, on its primitive rules of inference.
Jeroslow presented his approach for an abstract logical theory over a FOL language, which is not necessarily a FOL theory, so it found a natural fit in our generic framework. Jeroslow's account is extremely sketchy and notationally ambiguous. In spite of this account having become part of the IT 2 folklore, very few subsequent authors present it rigorously, and none at its original level of generality. Smith's monograph gives a rigorous account for arithmetic [56, §33], silently performing Correction 2, but failing to detect the need for SHBL 3 instead of HBL 3 (which Jeroslow had noticed). A mechanical proof assistant is of invaluable help with detecting such nuances and pitfalls.
We conclude with an anecdote involving our Isabelle formalization and Jeroslow's notations. Given the relative simplicity of Lemma 29, we were not too surprised that Isabelle's Sledgehammer [41] was able to prove it automatically. But Sledgehammer went further. It reported having used the equality-reflexivity rule for in the proof. And it had found a term (not a pseudo-term) t for which it had proved not just t ≡ ψ(t) , but actual equality, t = ψ(t) ; in particular, the term was a numeral. All this was too good to be true. It took us some time to realize why that happened: Due to one of Jeroslow's notations, who wrote f instead of f (thus identifying a function with its representing pseudo-term), we had at first misstated CapSS, writing ψ( f f ) instead of ψ( f f ) ; the former is still a valid expression, since f is a function between numerals which are particular terms. Embarrassingly, it took us even longer to realize why this variation discovered by chance was not an improvement of Jeroslow's diagonalization lemma: because the assumption CapSS becomes unreasonable. Indeed, no concrete computable function would then be able to act like the intended ssub(ψ): Given an input n, (1) decode it into a unique formula ϕ such that n = ϕ , (2) decode ϕ into a unique function f such that ϕ = f and (3) proceed to apply f as part of producing ψ( f f ) . The second step requires an injective and computable encoding of computable functions into formulas, which is impossible.

Summary of the Abstract Results
Using our generic infrastructure (Sect. 4), we have formally proved Gödel-style and Rosser-style diagonalization lemmas (Sect. 5) and several abstract incompleteness results.
They include several versions of IT 1 :
-Gödel's original IT 1 (Theorem 13) and an IT 1 based on classical logic (Theorem 22) required the formalization of some well-known arguments without change.
They also include two versions of IT 2 :
-The standard IT 2 based on the three derivability conditions (Theorem 25) again only required formalizing a well-known argument.
-The alternative, Jeroslow-style IT 2 (Theorem 30 with its two corrections, and Theorem 32) involved a detailed analysis and correction of an existing abstract result.

Concrete Instances
All the results presented so far operate abstractly, under certain assumptions, starting with a logic as generic as possible and adding structure and hypotheses as needed, while exploring conditions that enable different formulations of the results with various tradeoffs; concrete encodings and recursiveness are below the abstraction level of these results. By contrast, some of the previous mechanization projects, namely those by Shankar [52,53], O'Connor [36], Harrison [21] and Paulson [40], focused on the impressive goal of "getting all the work done." They fully proved the incompleteness theorems in particular settings, which involved defining the concrete Gödel encodings. These two types of developments are complementary, and they both contribute to formally taming the complex ramifications of the incompleteness theorems. This section will discuss concrete instances of the abstract results. We start by listing our mechanized instances (Sect. 9.1), and explain how they have been based on Paulson's prior Isabelle development (Sect. 9.2). When instantiating our abstract assumptions to Paulson's setting, not only did we recover his results, but we were also able to upgrade them. This did require modifying some concrete proofs, but even when doing that we relied on top-down insight from the abstract results; in fact, as we are about to discuss, insight has traveled bottom-up as well. We also revisit major developments in other provers (Sect. 9.3), and finally briefly sketch a wider array of possible instances (Sect. 9.4).

Our Mechanized Instances
We first validate the assumptions about our abstract logic and arithmetic:
Proposition 33 (1) Any FOL theory that extends Robinson arithmetic or HF set theory satisfies all the axioms in our logical and arithmetic substrata (in Sects. 4.1, 4.2 and 4.3).
(2) If, in addition, the theory is sound, then, together with its corresponding standard model, it also satisfies all our model-theoretic axioms (in Sect. 4.6).
In particular, point (2) shows that our abstract framework for standard models applies equally well to N and the datatype of HF sets. In the latter case, Num becomes the entire set of closed terms, so that numerals can denote arbitrary HF sets. This illustrates the versatility of our abstract concept of numeral.
We instantiate two of our main theorems in three ways:
Theorem 34 Let T be a FOL theory that extends HF set theory with a finite set of axioms, and let ⊢ b and ⊢ be the same relation, namely provability from T .
(1) If T is sound in the standard HF set model, then the hypotheses of Theorems 24 and 25 are satisfied, i.e., IT 1 (classical semantic version) and IT 2 hold for T .
(2) If T is consistent, then the hypotheses of Theorem 25 are satisfied, i.e., IT 2 holds for T .

Connection to Paulson's Results
The above instances are heavily based on the lemmas proved by Paulson in his Isabelle/HOL formalization of IT 1 (covering both the proof-theoretic and the semantic aspect) and IT 2 [39,40]. Paulson formalized quite faithfully Świerczkowski's detailed account [61], but he also strengthened and slightly corrected it. Świerczkowski's work applies to HF set theory [61,62], a classical FOL theory axiomatizing hereditarily finite sets by means of an induction principle stating that the universe is comprised of such sets only. Paulson extended Świerczkowski's incompleteness to essential incompleteness with respect to any finite sound extension of HF set theory within the same FOL language. Our Theorem 34's point (1) is a restatement of Paulson's formalized results: theorems Goedel_I and Goedel_II in [40]. By contrast, point (2) is an upgrade of Paulson's Goedel_II, applicable to any finite consistent, though possibly unsound theory. This stronger version is a more standard form of IT 2 , free from any model-theoretic dependencies. Paulson proved both HBL 1 and HBL ⇐ 1 taking advantage of soundness, so to achieve the upgrade we had to discard HBL ⇐ 1 and re-prove HBL 1 by replacing any semantic arguments with proofs within the HF calculus. We also removed all invocations of the Σ 1 -completeness lemma, which happened to depend on soundness due to Paulson's choice of Σ 1 -sentence definition.
This instantiation process has offered us important feedback into the abstract results. A formal development such as ours is (largely) immune to reasoning errors, but not to missing out on useful pieces of generality. We experienced this firsthand with our assumptions about substitution. An a priori natural choice was to assume representability of the numeral substitution Sb : Fmla 1 × Num → Sen (defined as Sb(ϕ, n) = ϕ(n)), part of which means (1) ⊢ b Sb (⟨ϕ⟩, n, ⟨Sb(ϕ, n)⟩). Instead, Paulson had proved (2) ⊢ b Sb (⟨ϕ⟩, ⟨n⟩, ⟨Sb(ϕ, n)⟩). Unlike (1), Paulson's (2) applies the term encoding function ⟨_⟩ : Term → Num to numerals as well (which are particular terms); and since his ⟨_⟩ function is injective, it is far from the case that ⟨n⟩ = n for all numerals n. Paulson's version makes more sense than ours when building the results bottom-up: Representability should not discriminate numerals, but filter them through the encodings like other terms. However, top-down our version also made sense: It yielded the incompleteness theorems under reasonable assumptions, which do hold, by the way, for HF set theory, even though in a bottom-up development one is unlikely to prove them. We resolved this discrepancy through a common denominator: the representability of self-substitution S : Fmla 1 → Sen (Sect. 4.4), which made our results more general.
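The distinction between numeral substitution, encoded numerals and self-substitution can be illustrated with a minimal sketch. It assumes a toy setting where formulas and terms are strings, `v1` is the only free variable, and numerals are decimal strings; `encode`, `numeral`, `sb` and `self_sub` are hypothetical names and are unrelated to Paulson's actual definitions.

```python
# Toy sketch: formulas and terms are strings, "v1" is the only free variable,
# and encode is a hypothetical injective encoding (not Paulson's).

def encode(syntax: str) -> int:
    """Injectively encode a piece of syntax (formula or term) as a number."""
    return int.from_bytes(b"\x01" + syntax.encode("utf-8"), "big")


def numeral(n: int) -> str:
    """The term that denotes the number n (here: its decimal notation)."""
    return str(n)


def sb(phi: str, n: int) -> str:
    """Numeral substitution Sb(phi, n): replace v1 by the numeral for n."""
    return phi.replace("v1", numeral(n))


def self_sub(phi: str) -> str:
    """Self-substitution S(phi) = Sb(phi, code of phi)."""
    return sb(phi, encode(phi))
```

In this toy model the code of the numeral for 7 differs from 7 itself, which mirrors the remark that an injective term encoding does not fix numerals; `self_sub` needs only the code of the formula, so it does not depend on how numerals themselves are encoded.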
Paulson's formalization has also inspired our abstract treatment of standard models (Sect. 4.6). Since Paulson proved HBL ⇐ 1 and used classical logic, an obvious "port of entry" of his IT 1 into our framework is Theorem 22, taking both ⊢ b and ⊢ to be Paulson's provability relation (which is classical provability in a finite extension of HF set theory). But this theorem tells us nothing about the Gödel sentences' truth. Delving deeper into Paulson's development, we noted that, following Świerczkowski, he (unconventionally) completely avoided Repr , and did not even define . This raised the question of whether HBL ⇐ 1 and Repr are somehow interchangeable in the presence of standard models (on which Paulson relies heavily); and we found that they indeed are, under mild assumptions about truth (as we discuss in Sect. 4.6). This analysis has led to variants of our semantic IT 1 , Theorems 18 and 20, which incidentally do not need classical logic. Although our Theorem 18 seemed like an excellent candidate to instantiate to Paulson's semantic IT 1 , its instantiation turned out to be difficult. All its assumptions were easy to fulfill based on what Paulson had already proved, except for Compl ¬Pf . Indeed, whereas Paulson proved that his proof-of relation is a Σ 1 -formula (which implies Compl Pf by Σ 1 -completeness), he did not prove the same for its negation (which would imply Compl ¬Pf ). Instead, we recovered Paulson's IT 1 as an instance of our Theorem 24 (which requires classical logic).
There are two further improvements that we could perform to Paulson's formalization, leveraging our abstract results: (1) replacing the soundness assumption from Paulson's IT 1 with consistency, and (2) removing all traces of classical reasoning in the object logic to port Paulson's IT 1 and IT 2 to intuitionistic logic. For the first improvement, we must prove the aforementioned missing link between Paulson's IT 1 and our Theorem 18, namely showing that Compl ¬Pf holds in Paulson's setting; we are confident that this is true (any reasonable proof-of relation is a Δ 1 -formula, implying that its negation is a Σ 1 -formula), but the proof will be very laborious. The second improvement will have a large formal overlap with the first: To remove the uses of the unrestricted Excluded Middle axiom, we must prove that instances of this axiom hold intuitionistically for several formulas expressing decidable predicates, including many predicates that participate in the definition of Paulson's Pf, as well as Pf itself; and, in the presence of Compl Pf , we have that Compl ¬Pf is equivalent to Excluded Middle holding for Pf(n, ϕ ).

Connection to Results Mechanized in Other Provers
Shankar's 1986 development. In pioneering work [52,53], Shankar proved formally the proof-theoretic version of IT 1 for any finite extension of the FOL theory Z2 [10], i.e., he proved Z2's finitary essential incompleteness. Z2 is a variation of HF set theory, the difference between the two being that the latter postulates an induction principle for all the HF sets, whereas the former singles out the natural numbers as those transitive HF sets that are totally ordered by membership and postulates induction for numbers only. The underlying object logic considered by Shankar was classical FOL enriched with definitions by the Skolemization of any proved "exists unique" sentences. He worked in Thm, an early version of the Boyer-Moore prover that eventually evolved into Nqthm [6] and then ACL2 [26]. This prover's logic, i.e., the meta-logic of Shankar's development, is a quantifier-free FOL enriched with induction and recursion principles for reasoning about total functions expressed in pure Lisp. This is significantly less expressive than HOL, and in fact close to primitive recursive arithmetic (PRA). Formally proving IT 1 within the constraints of this minimalistic meta-logic was an impressive achievement even by today's standards.
Shankar's development follows a similar structure to Cohen's high-level informal presentation [10, §9] (which Shankar cites). He proved that all partial recursive functions are representable in Z2, a result we will refer to as RR. Besides being a central result in itself, RR is a convenient tool for proving Gödel's theorems. Some proof developments for IT 1 , including the Świerczkowski-Paulson one, do not prove RR in its generality, but prove the representability of needed functions only. On the other hand, the RR route is usually the one preferred in textbooks due to its elegance and generality. As Shankar observed, textbook proofs of IT 1 via RR often step from the meta-logic (where the usual informal mathematical discourse takes place) into a meta-meta-logic: The formula- and proof-manipulating functions needed for IT 1 are defined (as usual) as meta-level functions, then a meta-meta-level argument is made that they are recursive, in order to conclude that they are representable. In a mechanization, however, such an argument must stay in the meta-logic. Shankar achieves this by formalizing a pure Lisp interpreter that is able to evaluate any recursive function when taking its description as input. His formulation of RR refers to this interpreter, stating that the interpreter's partial-function behavior (in relational form) is representable in Z2. Each function needed in the proof of IT 1 is proved to be representable by first showing it to be equivalent to its interpreted version. Special care is required to have these definitions and proofs work in the meta-logic, where all functions must terminate; to that end, the interpreter takes an additional numeric argument representing the maximum allowed size of the computation. Using notations close to the ones in this paper and bypassing the indirection through the interpreter, Shankar's proof of IT 1 can be summarized as follows. He defined a partial function THM : Sen → {0, 1} that, upon an input ϕ, enumerates all the possible proofs and:
-terminates and returns 1 if a proof of ϕ is found;
-terminates and returns 0 if a proof of ¬ ϕ is found.
In particular, THM loops (i.e., is undefined) if neither ϕ nor ¬ ϕ is provable. Also, if both ϕ and ¬ ϕ are provable (meaning the considered extension of Z2 is inconsistent), then the output of THM depends on whose proof comes first in the enumeration. But regardless of that, it holds that THM(ϕ) = 1 implies ⊢ ϕ, and THM(ϕ) = 0 implies ⊢ ¬ ϕ.
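The behavior of THM, together with the extra numeric bound that keeps every meta-level function terminating, can be sketched as follows. This is an illustrative toy, not Shankar's Lisp definition: sentences are strings, `proofs` is a hypothetical enumeration that yields the conclusion of each proof in order, and `fuel` plays the role of the bound on the computation.

```python
from typing import Iterable, Optional


def thm(phi: str, proofs: Iterable[str], fuel: int) -> Optional[int]:
    """Scan at most `fuel` enumerated proofs, in order.

    Returns 1 if a proof of phi is met first, 0 if a proof of its negation
    is met first, and None if neither is found within the bound.
    """
    neg = "¬" + phi
    for step, conclusion in enumerate(proofs):
        if step >= fuel:
            return None  # out of fuel: stands in for "undefined"
        if conclusion == phi:
            return 1
        if conclusion == neg:
            return 0
    return None  # finite enumeration exhausted without deciding phi
```

With the enumeration `["a", "¬b", "b"]`, the call `thm("b", ..., 10)` returns 0 because the proof of the negation comes first, which is the inconsistent case mentioned above, while an undecided sentence yields `None` for every choice of `fuel`.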
Let ψ ∈ Fmla 1 be the formula that represents the unary relation {ϕ ∈ Sen | THM(ϕ ϕ ) = 1}; this is obtained by (i) invoking RR to produce a formula χ ∈ Fmla 2 that represents the graph of the partial function THM • S (where S is the self-substitution operator), and (ii) substituting 1 for χ's second variable. Let CS be the Cohen-Shankar sentence ¬ ψ ¬ ψ . Now, assume that ⊢ CS or ⊢ ¬ CS, meaning that THM(CS) terminates and returns 1 or 0. We have two cases, both of which contradict consistency:
-If THM(CS) = 1 (i.e., (THM • S)(¬ ψ) = 1), then we have
The above proof, which is similar to Cohen's proof sketch, does not make explicit reference to HBL 1 , although this is of course a consequence of RR via the representability of the "proof of" relation. In fact, the proof makes use of the representability of THM • S, which is a variation of the representability of ⊢ (for particular sentences of the form ϕ ϕ ) featuring a positive version of the Rosser twist discussed in Sect. 5, but at the meta-level: The considered relation is not just provability, but provability by a proof p such that there is no proof q of the formula's negation occurring earlier in the enumeration.
The above argument is based on diagonalization, though at the meta-level, not at the object level as in Proposition 9. As Shankar remarked, the sentence CS says "my negation is provable by a proof that comes in the enumeration before any proof of me". This is true in the context of the above argument by contradiction, namely under the assumption that CS is decided (either provable or disprovable). Indeed, from the definitions of ψ and THM, we see that CS says "it is not the case that a proof of CS comes before a proof of ¬ CS", which, given the assumption, is equivalent to the above.
Let us refer to such sentences CS as Cohen-Shankar sentences (without claiming historical accuracy about the ideas behind them, which seem to go back at least as far as Smullyan
substitution is primitive recursive. On the other hand, O'Connor's formalized representability result is stronger than Shankar's on the theory expressiveness dimension, since it is proved for the minimalistic theory NN. O'Connor proved a version of IT 1 that would classically read as follows: For any consistent self-representable extension of NN, there exists a sentence ϕ such that neither ϕ nor ¬ ϕ is provable. Due to the intuitionistic meta-logic, O'Connor preferred the intuitionistically stronger (and classically equivalent) formulation: For any self-representable extension of NN, there exists a sentence ϕ such that, if ϕ or ¬ ϕ are provable, then that extension proves everything (i.e., is inconsistent). Another consequence of the intuitionistic meta-logic is the need for an additional assumption: that the given extension's set of axioms is decidable, i.e., its (meta-level) membership predicate satisfies Excluded Middle.
The above existentially quantified ϕ is witnessed by a Rosser sentence constructed via diagonalization, so the result essentially falls under Propositions 9, 10 and Theorem 16, where both ⊢ b and ⊢ are taken to be deduction in a self-representable extension of NN. (Note that all the FOL theories of interest for IT 1 can already be represented in NN, not only in an extension of NN; and the corresponding (slightly weaker) version of O'Connor's result assuming NN-representability instead of self-representability is obtained by taking ⊢ b to be deduction in NN and ⊢ to be deduction in the considered extension.) Since here the FOL infrastructure is fixed, self-representability is equivalent to representability of the "proof of" relation (which O'Connor proved), hence it implies HBL 1 (which he did not mention explicitly but inlined in his proof). Incidentally, O'Connor's formalization improves on Hodel's account, who unnecessarily added an axiom to NN for coping with Rosser's trick [37, §6.4].
O'Connor's self-representability assumption in IT 1 is more general than the standard recursive axiomatizability assumption. In informal accounts of essential incompleteness including Hodel's, this more general result is usually inlined in the proof and only the end result is stated, which assumes not self-representability but recursive axiomatizability; an exception is the account of Feferman, who assumes a generalized form of self-representability (namely representability in a sub-theory) in his statements of IT 0.5 and IT 2 (Theorems 5.3 and 5.6 in [13]). In a formal account, such more general results are valuable for easier reusability across different instances.
O'Connor did not prove that all recursively axiomatizable extensions of NN are self-representable (which would have followed from RR). However, he used his PR together with a proof that the axioms of Peano arithmetic are primitive recursive to instantiate IT 1 to Peano arithmetic. He also proved the consistency of this theory (by showing that the natural numbers form a model, via a semantic interpretation function wrapped up in a negative translation to ensure classical validity within the intuitionistic meta-logic). Thus, he obtained the theory's unconditional incompleteness.
Harrison's 2009-2010 development. Harrison [21] proved formally versions of IT 1 for theories in the language of Robinson arithmetic with ≤ and < included as primitive predicate symbols. In what follows, we will refer to this language as LA, and by "Robinson arithmetic" we will mean the definitional extension of Robinson arithmetic as a theory in LA (with added axioms that define ≤ and <). Harrison worked in HOL Light [20], a proof assistant belonging to the HOL family together with Isabelle/HOL and HOL4.
In his development towards IT 1 , Harrison followed a semantic approach, based on ideas that go back to Gödel's introduction of his original paper [17]. The approach was promoted by Smullyan [60] for its simplicity and elegance, and Harrison himself further elaborated and improved on it in his textbook [19, §7]. The focus is no longer on the concept of a relation's representability (for a given theory), but on that of a relation's definability in the standard model (for a given language). In our notations, definability is obtained by replacing ⊢ b with ⊨ in either the representability or the weak representability condition. (Harrison formalized an equivalent definition of definability using valuations in the model.) The advantage of definability over representability is that the former is typically much easier to prove for concrete relations, without having to work inside a formal proof system.
LA is sufficient to achieve the definability (in the standard model of natural numbers) of the relevant syntactic concepts. These include (soft) self-substitution, which gives a semantic version of diagonalization: Proposition 9 with ⊢ b replaced by ⊨ . In turn, this leads to the semantic version of Tarski's theorem on the undefinability of truth, which concludes the non-existence of a one-variable formula T such that ⊨ ϕ ←→ T ϕ for all ϕ. And after showing that provability in Robinson arithmetic is definable, one obtains that provability is distinct from truth; in particular, for sound theories this implies the incompleteness of provability, a first version of the proof-theoretic IT 1 . In fact, Harrison proved something more general: If a theory T in LA is definable (in that its set of axioms is definable), then its set of provable sentences is definable, hence different from the set of true sentences. This leads to a form of essential incompleteness: Any sound definable theory in LA, in particular, any extension of Robinson arithmetic with a sound definable set of axioms, is incomplete.
Harrison also pursued an alternative semantic route to IT 1, which does not go through Tarski's theorem, but instead: (1) assumes (for starters) the soundness of the theory, (2) obtains a semantic version of Gödel sentences G using the semantic diagonal lemma, and (3) performs (what can be regarded as) a modification of Gödel's original argument (the proofs of Propositions 11 and 12), appealing to soundness whenever needed for shifting from provability to truth. The advantage of this last line of reasoning is that it can be sharpened: Noting that soundness is only needed for G, ¬G and ⊥, and using the fact that G is a Π1-sentence (making ¬G a Σ1-sentence) if the theory is Σ1-definable (i.e., definable by a Σ1-formula), Harrison obtained the following stronger, symmetric version of proof-theoretic IT 1: If a theory in LA is Σ1-definable, then (i) if it is also Π1-sound then ⊬ G and (ii) if it is also Σ1-sound then ⊬ ¬G (where ⊢ denotes deduction from this theory, and X-soundness or X-completeness means soundness or completeness for all X-sentences). And from representability and the semantic Gödel-sentence property, under the assumptions of (i), it follows that ⊨ G. So he obtained both the proof-theoretic and the semantic component of IT 1.
In the above statement of IT 1, the Π1-soundness assumption can be replaced by consistency plus Σ1-completeness, since the latter two imply the former. Finally, using the Σ1-completeness for Robinson arithmetic (and hence for any extension), Harrison formalized an essential incompleteness generalization and strengthening of the original Gödel-style IT 1: For any consistent Σ1-definable extension of Robinson arithmetic, we have ⊬ G and ⊨ G; and if the extension is also Σ1-sound, then ⊬ ¬G. In the presence of Σ1-completeness, the Σ1-soundness property (also called 1-consistency) is weaker than the ω-consistency property used originally by Gödel, which we assume in our Proposition 12 and Theorem 13.
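The claim that consistency plus Σ1-completeness imply Π1-soundness can be unfolded as below. This is a sketch of the usual argument; it assumes that the negation of a Π1-sentence is logically equivalent to a Σ1-sentence (here called ψ, a name introduced only for this illustration) and that truth in the standard model is two-valued.

```latex
% Sketch only: \psi is an illustrative name for a \Sigma_1 equivalent of \neg\varphi.
\begin{align*}
&\text{Let } \varphi \text{ be a } \Pi_1\text{-sentence with } \vdash \varphi,
  \text{ and suppose, for contradiction, } \not\models \varphi. \\
&\text{Then } \models \neg\varphi, \text{ hence } \models \psi
  \text{ for a } \Sigma_1\text{-sentence } \psi
  \text{ logically equivalent to } \neg\varphi. \\
&\text{By } \Sigma_1\text{-completeness, } \vdash \psi,
  \text{ and therefore } \vdash \neg\varphi. \\
&\text{Together with } \vdash \varphi \text{ this contradicts consistency, so } \models \varphi.
\end{align*}
```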
Currently, refinements of IT 1 based on arithmetical hierarchy considerations are below the level of abstraction of our general framework. On the other hand, the high-level aspects of the Smullyan-Harrison semantic line of reasoning could be incorporated in this framework, which has infrastructure for both provability and truth. Our Archive of Formal Proofs entry [44] already contains proof-theoretic and semantic versions of Tarski's theorem.

Other Potential Instances
Many other logics and logical theories satisfy our theorems' assumptions. We do not require the logic to be reducible to a single syntactic category of formulas, Fmla, a single pair of judgments, ⊢_b and ⊢, etc.; but only that such (well-behaved) formulas, provability relations, etc. are identifiable as part of that logic, e.g., localized to a given type and/or relativised by a given predicate. This allows our framework to capture most variants of higher-order logic and type theory (including the variant underlying Isabelle/HOL itself [29,30]), and also, we believe, many of the logics surveyed by Buldt [7], including non-classical and fuzzy. But enabling "mass instantiation" that is both formal and painless requires more progress on the agenda we started here: recognizing reusable construction and proof patterns and formalizing them as abstract results.

… extensions of HF set theory. This merely required us to discharge the abstract assumptions of Theorems 24 and 25 by instantiating them with results from Paulson's formalization, a simple exercise spanning 400 lines (not counting the 12 300 lines of Paulson's formalization). Formalizing the strengthened Theorem 34(2) [46] was significantly more difficult, because we could not simply reuse Paulson's formalization. Instead, we had to replace all of Paulson's semantic arguments with proofs within the HF calculus. In terms of proof engineering, we started by copying Paulson's formalization (12 300 lines) and by removing from it every argument and definition that referred to standard models, which saved about 5000 lines. After that, we reintroduced the arguments needed for Gödel's second incompleteness theorem and proved them within the HF calculus. The new proofs span about as much as we had removed, such that overall we obtain the stronger result in 12 800 lines.
Our formalization relies heavily on locales [2], Isabelle's mechanism for maintaining contexts with parameters and assumptions. The two abstract AFP entries [44,48] declare 65 interdependent locales. These locales allow us to flexibly select just the needed assumptions for each theorem's variant. On the downside, complex locale hierarchies like ours tend to cause the formalizers to write seemingly redundant boilerplate code. In particular, every locale that extends another locale has to repeat the parameters (but fortunately not the assumptions) of the extended locale to ensure that the correct type variables are used in the new locale.
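The parameter repetition described above can be illustrated with a minimal sketch. The locale and constant names (Syntax_Sketch, Deduct_Sketch, fmla, neg, prv) are hypothetical and chosen only for this illustration; they are not taken from the AFP entries.

```isabelle
theory Locale_Sketch
  imports Main
begin

(* Hypothetical base locale: an explicit set as the universe of formulas,
   closed under a negation operator. *)
locale Syntax_Sketch =
  fixes fmla :: "'fmla set"
    and neg :: "'fmla => 'fmla"
  assumes neg_closed: "phi : fmla ==> neg phi : fmla"

(* Extending locale: the parameters fmla and neg are repeated in the "for"
   clause so that both locales share the type variable 'fmla. The assumption
   neg_closed is inherited and does not have to be restated. *)
locale Deduct_Sketch = Syntax_Sketch fmla neg
  for fmla :: "'fmla set"
    and neg :: "'fmla => 'fmla"
  +
  fixes prv :: "'fmla => bool"
  assumes prv_fmla: "prv phi ==> phi : fmla"

end
```

With only two parameters the repetition is harmless; in a hierarchy of dozens of locales, each with many syntactic operators, the repeated "for" clauses account for much of the boilerplate.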
In our locales, we fix explicit sets as universes of variables, numerals, terms, and formulas. Thus, any quantification over these entities must be expressed as bounded quantification over the fixed sets. This complicates the reasoning inside the locales, because every step that uses a theorem with bounded quantification must discharge these additional universe-membership assumptions. We have even developed an ad hoc collection of specialized Eisbach proof methods [33] to deal with such assumptions. A natural alternative that would avoid these complications is to use types as universes. We opted for the set-based formulation instead of the type-based one, because set-based results can be instantiated more flexibly. For example, numerals are a subset of terms in Paulson's HF set theory and we instantiate our locales' universe of numerals with this subset. A type-based formulation would require introducing a separate type for numerals and lifting all results involving numerals to this type. Another alternative, the types-to-sets approach [31], combines the strengths of type-based and set-based theorems, at the expense of extending the logic, which we wanted to avoid.
The abstract parts of our formalization use declarative Isar proofs. This makes the proofs readable and ensures that they closely resemble the pen-and-paper arguments presented in this paper. In fact, the information flow went in the opposite direction: the pen-and-paper arguments constitute a (sometimes compressed) transcript of the formal Isar proofs. The concrete parts use a mixture of declarative and procedural (apply-style) proofs. Proofs in the HF calculus, in particular, tend to follow the procedural style.
All our concrete theorems use Nominal Isabelle [64] to represent formulas with binders. This, however, is due to the fact that in all cases we took Paulson's formalization, which uses Nominal, as a blueprint. Our abstract development does not prescribe the usage of Nominal; it can equally well accommodate de Bruijn indices, locally nameless terms, or other representations that equate alpha-equivalent terms.

Proposition 12
Assume OCon, Rel, Repr, Clean. Then ⊬ ¬G for all Gödel sentences G.
Proof Let G be a Gödel sentence. To prove ⊬ ¬G, we assume (1) ⊢ ¬G and aim to reach a contradiction.
- From OCon, we obtain Con.
- With (1), we obtain ⊬ G.
- With Rel, we obtain p G for all p ∈ Proof.
- With Repr and Clean, by Lemma 3 we obtain ⊢_b ¬ (n, G ) for all n ∈ Num.
- Since ⊢_b is included in ⊢, we obtain ⊢ ¬ (n, G ) for all n ∈ Num.
- With OCon, we obtain ⊬ ¬ ¬ ∃x. (x, G ), i.e., ⊬ ¬ ¬ G .
- With G being a Gödel sentence, we obtain ⊬ ¬G, which contradicts (1).
IT 1 ) If we enrich the assumptions of Theorem 13 with LCQ ⊨ (2,3) and Sound ⊢_b ⊨, then its conclusions can be enriched with the following: (3) ⊨ G for all basic Gödel sentences G.
Proof We know from Theorem 13 that ⊬ G, and ⊬ ¬G. It remains to show ⊨ G.
- From ⊬ G and Rel, we obtain that p G for all p ∈ Proof.
- With Repr and Clean, by Lemma 3 we obtain ⊢_b ¬ (n, G ) for all n ∈ Num.
- With Sound ⊢_b ⊨, we obtain ⊨ ¬ (n, G ) for all n ∈ Num.
- With LCQ ⊨ (3), we obtain (i) ⊨ ∀x. ¬ (x, G ).
- By logic we obtain ⊢_b (∀x. ¬ (x, G )) → ¬ G . (Recall Convention 5.)
- With the definition of basic Gödel sentence, by logic we obtain ⊢_b (∀x. ¬ (x, G )) → G.
- With Sound ⊢_b ⊨, we obtain ⊨ (∀x. ¬ (x, G )) → G.
- With LCQ ⊨ (2) and (i), we obtain ⊨ G, as desired.

Proof (1): Immediate from Repr S by Proposition 10.
(2): Let G be a Gödel sentence.
- From Con and HBL 1, by Proposition 11 we obtain ⊬ G.
- So we are left to prove ⊬ ¬G. To this end, we assume (i) ⊢ ¬G and aim to reach a contradiction.
- Since G is a Gödel sentence, by logic we obtain ⊢ ¬¬ G .
- By classical logic, from this we obtain ⊢ G .
- With HBL ⇐ 1, , we obtain ⊢ G.
- With (i), this contradicts Con.

1 and HBL ⇐ 1 , coincide in the important case when ⊢_b = ⊢. Two semantic versions are possible for classical IT 1. The first one additionally assumes some reasonable properties of ⊨, soundness for ⊢_b, and TIP ⊨:

Theorem 23 (Classical Semantic IT 1) If we enrich the assumptions of Theorem 22 with LCQ ⊨ (1,2,5), Sound ⊢_b ⊨, TIP ⊨, then its conclusions can be enriched with the following: (3) ⊨ G for all basic Gödel sentences G.
Proof We know from Theorem 22 that (i) ⊬ G, and ⊬ ¬G. It remains to show ⊨ G. To this end, we assume (ii) ⊭ G and try to reach a contradiction.
- From (ii), by LCQ ⊨ (5) we obtain (iii) ⊨ ¬G.
- From the basic Gödel sentence definition we obtain ⊢_b ¬G → ¬ ¬ G .
- With Sound ⊢_b ⊨, we obtain ⊨ ¬G → ¬ ¬ G .
- With (iii), by LCQ ⊨ (2) we obtain ⊨ ¬ ¬ G .
- With LCQ ⊨ (1,2), we obtain ⊭ ¬ G .
- With LCQ ⊨ (5), we obtain ⊨ G .
- With TIP ⊨, we obtain ⊢ G, which contradicts (i).

The second one replaces TIP ⊨ with some assumptions that, in the presence of the others, ensure TIP ⊨, and hence is strictly less general than the first one:

Theorem 24 (Classical Semantic IT 1, second version) The conclusions of Theorem 23 still hold if we replace TIP ⊨ with the assumptions Rel Pf, Compl Pf and LCQ ⊨ (4).
Proof It suffices to show that TIP ⊨ follows from its replacements and the other assumptions. We do this using Lemma 8(2). To apply this lemma, we need:
- Rel Pf, Compl Pf, LCQ ⊨ (4), which are assumed above;
- LCQ ⊨ (2), Sound ⊢_b ⊨ and HBL ⇐ 1, which are assumptions of Theorem 23.
n} and x_i ∈ FVars(ϕ)}. The technicalities are delicate: To avoid undesired variable replacements, ϕ[s_1/x_1, ..., s_n/x_n] must be defined as ϕ[y_1/x_1] ... [y_n/x_n][s_1/y_1] ... [s_n/y_n

a_m) R is said to be weakly represented by R if, for all (a_1, ..., a_m) ∈ A_1 × ... × A_m, it holds that (a_1, ..., a_m) ∈ R if and only if ⊢_b R ( a_1, ..., a_m ). Occasionally, we will use the alternative formulation " R (weakly) represents R." Let A be another set with _ : A → Num. An m-ary function f : A_1 × ... × A_m → A is said to be represented by a formula f ∈ Fmla_{m+1} if for all (a_1, ..., a_m

second variant) The conclusions of Theorem 17 remain true if we replace its assumptions Rel, Repr, Clean with the assumptions Rel Pf, Compl Pf, Compl ¬Pf, HBL ⇐ 1, LCQ ⊨ (4,5).
Immediate by Lemma 7 and Theorem 17, noting that, by Lemma 4, HBL 1 (which is needed by Lemma 7) is implied by Rel and Repr.

Similar semantic theorems can be obtained for Rosser-style IT 1:

Theorem 19 (Semantic IT 1 à la Rosser) If we enrich the assumptions of Theorem 16 with LCQ ⊨ (2,3) and Sound ⊢_b ⊨, then its conclusions can be enriched with the following: (3) ⊨ R for all basic Rosser sentences R.
Proof Exactly the same as the proof of Theorem 17, but using Rosser sentences and applying Theorem 16 (rather than using Gödel sentences and applying Theorem 13). Note that the last part of the proof of ⊨ G also works for R, because ⊢_b (¬ R ) → R follows from the definition of Rosser sentence (by logic).

Theorem 20 (Semantic IT 1 à la Rosser, second variant) The conclusions of Theorem 19 remain true if we replace its assumptions Rel, Repr, Clean with the assumptions Rel Pf, Compl Pf, Compl ¬Pf, HBL ⇐ 1, LCQ ⊨ (4,5).