On Representations of Intended Structures in Foundational Theories

Often philosophers, logicians, and mathematicians employ a notion of intended structure when talking about a branch of mathematics. In addition, we know that there are foundational mathematical theories that can find representatives for the objects of informal mathematics. In this paper, we examine how faithfully foundational theories can represent intended structures, and show that this question is closely linked to the decidability of the theory of the intended structure. We argue that this sheds light on the trade-off between expressive power and meta-theoretic properties when comparing first-order and second-order logic.


Introduction
This paper addresses the philosophical question of how well foundational mathematical theories are able to represent mathematical structures. Much of mathematical practice concerns the study of particular structures. Famous examples are the arithmetical structure of the natural numbers (N, +, ×, 0, 1, <), the ordered field of the reals (R, +, ×, 0, 1, <), and the field of complex numbers (C, +, ×, 0, 1). The study of these structures is conducted mainly informally, in the manner of reasoning we see in mathematical journals.
This fact concerning mathematical practice is coupled with the existence of foundational theories. There are various such theories, first and foremost set theory ZFC, but also category theory, and more recently homotopy type theory. There are many features we might want a foundational theory to have, but two (interlinked) desiderata that have emerged are: (1) to provide a generous arena for mathematical discourse, that is, we want to provide proxies for all the objects of informal mathematics, and (2) to yield a shared standard, that is, we want to be able to codify informal proofs in the theory in order to compare them and say when a construction or proof counts as legitimate. Roughly, this means that the foundational theory can encode or formalise all our informal mathematical discourse about the 'usual' objects of mathematics. In this way, if one had sufficient patience and time, one could formalise all theorems of informal mathematics as theorems within one's favourite foundational theory. The starting question of this short paper is: What is the desired relationship between informal and formalised mathematics?
Since this is a very general question, we restrict attention to the informal study of concrete structures, like the natural, real or complex numbers mentioned above; in a philosophical context these are often referred to as intended structures. Now, it is one thing to be able to formalise some piece of informal mathematics any-old-how, and quite another to do so faithfully. We would like the intuitive meaning of the formal statements to be similar to the intuitive meaning of the informal statements.
For motivational purposes let us roughly distinguish two approaches to the foundations of mathematics: the axiomatic and the genetic method (see [10]). The first, chiefly embodied by Hilbert, replaces the intended structure by a set of axioms we argue are (or take to be) true. When done in first-order logic the resulting theory is often incomplete (by Gödel's results). When done in higher-order logics, we lose various pleasant meta-theoretic properties, and so, whilst of philosophical interest, this approach has less practical value.
The genetic approach is via construction. Instead of asserting axioms for the intended structure, one first constructs the structure in question by finding an object coding it in one's foundational theory, and then one asks about its properties. Theories with a high degree of interpretive power are able to translate some mathematical constructions into first-order definitions inside the theory. For example, ZFC can mimic the classical constructions of the natural, real or complex numbers by defining formulas. In a little more detail (we provide a full outline in §2) formalisation of the mathematical study of an intended structure S typically proceeds in two steps. First, S is represented by a sequence of formulas R that identifies an object within the foundational theory (in the case of ZFC, a set or a class). Second, the informal talk about S is translated to formal talk about R. Thus, in every model of ZFC we will find an avatar of the natural, real or complex numbers.
Here, we are concerned with the first step: the choice of R. We will focus mainly on this style of doing mathematics: by first constructing the structures and then examining their properties, and especially their first-order theory. We also restrict attention to first-order theories F as our foundational theories (such as ZFC). Our proposal is to analyse one dimension of the faithfulness or similarity of meaning of a formalisation as dependent upon the similarity of S and what we define by R. We thus arrive at a more specific formulation of our question: what kind of similarity of S and R should we aim for, or at least hope for? A notion of similarity of obvious interest in this context is elementary equivalence. Our main claim then reads as follows (see Theorem 14 for a precise statement):
Main Claim. Let F be a suitable first-order foundational theory. Given a particular analysis of faithfulness in terms of elementary equivalence, an intended structure can be faithfully represented in F if and only if its (first-order) theory is decidable and F knows some decision procedure for it.
For our example structures, this implies that (R, +, ×, 0, 1, <) and (C, +, ×, 0, 1) are faithfully representable, but (N, +, ×, 0, 1, <) is not. On the positive side this shows that foundational theories have an especially good grip on decidable parts of informal mathematics. Our main interest in the claim is, however, on the negative side. Many intended structures have undecidable theories, and so their study cannot be faithfully formalised in our sense. Moreover, the underlying assessment of faithfulness via elementary equivalence seems to be a fairly modest requirement on the representation of a structure, philosophically speaking.
Outline: In §2 we recall some basics about translations between first-order theories, in particular about formalising the study of some intended structure in a foundational theory. We then motivate one way of understanding the idea of such a faithful formalisation, which we shall call absolute representability: an intended structure is absolutely representable when its representatives in models of F are elementarily equivalent to it. In §3 we establish our main claim by proving Theorem 14. In §4 we outline applications of our results to debates concerning first-order and higher-order resources. Finally, in §5 we conclude and present some open questions.

Absolute Representability
In this section we set up some key notions and motivate the formal definition we shall use, namely absolute representability.
Recall, a first-order language consists of a set of relation symbols and function symbols, each having an associated natural number called its arity; we view constant symbols as nullary function symbols. First, we fix a finite language L and a first-order L-structure S_L: this is our informal intended structure. We also fix a consistent computably enumerable first-order theory F: this is our foundational theory. We shall add another assumption on F later when needed. Examples for S_L to keep in mind are (N, +, ×, 0, 1, <), (R, +, ×, 0, 1, <) or (C, +, ×, 0, 1); the example to keep in mind for F is ZFC, which we assume to be consistent.
We employ a standard definition (see e.g. [2, Chapter VIII]) of how our intended structure S_L is represented in F:
Definition 1 A representation R of an L-structure in F is a finite sequence of formulas in the language of F, namely a formula ψ_U(x) such that F proves ∃x ψ_U(x), and for every r-ary relation symbol S ∈ L a formula ψ_S(x_1, ..., x_r) and for every r-ary function symbol f ∈ L a formula ψ_f(x_1, ..., x_r, y) such that F proves:
Such a representation definably singles out an L-structure R(M) in every model M of the foundational theory F as follows. That R(M) is a well-defined L-structure follows from the assumptions on what F proves about R in Definition 1, namely, the universe U is non-empty and ψ_f really defines the graph of some function on U.
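In the usual treatment of interpretations, the conditions that F is asked to prove, and the structure R(M) induced in a model M of F, can be written out as below. This is a sketch of the standard formulation rather than a quotation of Definition 1; the notation U^M for the universe of R(M) and the superscripts S^{R(M)}, f^{R(M)} are introduced here for illustration, and the display assumes the amsmath package.

```latex
% Conditions on the representation R (one line per function symbol f of arity r):
\begin{align*}
  & F \vdash \exists x\, \psi_U(x), \\
  & F \vdash \forall x_1 \dots \forall x_r
      \Bigl( \psi_U(x_1) \wedge \dots \wedge \psi_U(x_r)
      \rightarrow \exists! y\, \bigl( \psi_U(y) \wedge \psi_f(x_1,\dots,x_r,y) \bigr) \Bigr).
\end{align*}
% The L-structure R(M) singled out in a model M of F:
\begin{align*}
  U^{M} &:= \{\, a \in M : M \models \psi_U(a) \,\}, \\
  S^{R(M)} &:= \{\, (a_1,\dots,a_r) \in (U^{M})^r : M \models \psi_S(a_1,\dots,a_r) \,\}, \\
  f^{R(M)}(a_1,\dots,a_r) &:= \text{the unique } b \in U^{M}
      \text{ with } M \models \psi_f(a_1,\dots,a_r,b).
\end{align*}
```

The first condition makes U^M non-empty, and the second guarantees that ψ_f defines the graph of a total function on U^M, which is exactly what well-definedness of R(M) requires.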

Example 3
The usual representation R^ZFC_N of (N, +, ×, 0, 1, <) in ZFC is given by taking for ψ_U(x) the formula x ∈ ω (understood as a formula in the language {∈} of ZFC) that defines the finite von Neumann ordinals; the formula ψ_<(x, y) is x ∈ y; the formulas ψ_+(x_1, x_2, y) and ψ_×(x_1, x_2, y) state the recursive definitions of addition and multiplication; and the formulas ψ_0(y) and ψ_1(y) are y=∅ and y={∅}, respectively. The models R^ZFC_N(M), for M |= ZFC, are called ZFC-standard models of arithmetic in [3]. We refer to this paper and the references therein for some information about these structures.
Given a representation R of our intended structure S_L in our foundational theory F, it is straightforward to translate first-order talk about S_L into F. The following is folklore (see e.g. [2, Section VIII.2.E, Satz 2.2]):
Proof (Sketch) It is straightforward to compute, given a formula ϕ, a logically equivalent term-reduced formula, i.e., a formula whose atomic subformulas are of the form x=y, S(x̄) or f(x̄)=y for variables x, y, tuples of variables x̄, relation symbols S ∈ L, and function symbols f ∈ L. It thus suffices to define R for term-reduced formulas. For atoms define: R(x=y) := x=y and R(S(x̄)) := ψ_S(x̄).
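The translation is used later in the form "M |= R(ϕ) if and only if R(M) |= ϕ" (this is how Lemma 4 is invoked in the proof of Lemma 9). One standard way to obtain a map with this property is relativisation of quantifiers to ψ_U. The clauses below are a sketch under that assumption and not a reconstruction of the authors' exact wording; only the clauses for x=y and for negation are stated explicitly in the text, and the conjunction clause matches the use of R(ϕ_1) ∧ ... ∧ R(ϕ_k) = R(ϕ_1 ∧ ... ∧ ϕ_k) in the concluding section. The display assumes the amsmath package.

```latex
% Translation of term-reduced L-formulas into the language of F:
\begin{align*}
  R(x = y) &:= x = y, &
  R(S(\bar x)) &:= \psi_S(\bar x), &
  R(f(\bar x) = y) &:= \psi_f(\bar x, y), \\
  R(\neg\varphi) &:= \neg R(\varphi), &
  R(\varphi \wedge \chi) &:= R(\varphi) \wedge R(\chi), &
  R(\exists x\, \varphi) &:= \exists x\, \bigl( \psi_U(x) \wedge R(\varphi) \bigr).
\end{align*}
% Property used throughout, for every model M of F and every L-sentence \varphi:
\[
  M \models R(\varphi) \iff R(M) \models \varphi .
\]
```

The equivalence follows by induction on term-reduced formulas, with the quantifier clause ensuring that bound variables range only over the universe of R(M).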

Remark 5
The proof sketch defines R(¬ϕ) = ¬R(ϕ), a property of the map ϕ ↦ R(ϕ) that we are going to use. Slightly more generally we could use only that
Here, by a notion of similarity ∼ we mean an equivalence relation on L-structures. The finer this equivalence relation, the stronger the corresponding notion of representability. Obviously, taking the identity for ∼ results in an empty concept: no structure is identically representable in F. Taking isomorphism for ∼ means asking whether our intended structure S_L is isomorphically representable in F, i.e., whether there exists a representation R such that R(M) ≅ S_L for all M |= F. This suggestion for ∼ is naive because it is a quick consequence of the Compactness Theorem that:
Proposition 6 Only finite L-structures are isomorphically representable in F.
Hence isomorphic representability is far too strong a notion (at least as far as first-order logic is concerned). Philosophers and logicians often analyse a spectrum of similarity notions far coarser than isomorphism. We examine the prospects of choosing elementary equivalence: recall, two L-structures A, B are elementarily equivalent if they satisfy the same first-order L-sentences, i.e., Th(A) = Th(B), or equivalently, Th(A) ⊆ Th(B). Here, Th(A) denotes the first-order theory of A, i.e., the set of first-order sentences true in A.
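The corresponding notion of representability reads as follows:
Definition 7 A representation R absolutely represents S_L in F if R(M) and S_L are elementarily equivalent for every model M of F. We call S_L absolutely representable in F if some representation absolutely represents it.
Example 8 The representation R^ZFC_N from Example 3 does not absolutely represent (N, +, ×, 0, 1, <) in ZFC. Indeed, let Con(ZFC) be an arithmetical sentence expressing the consistency of ZFC. Then Con(ZFC) is true in (N, +, ×, 0, 1, <) but fails in some ZFC-standard models of arithmetic by Gödel's Second Incompleteness Theorem.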
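The argument about Con(ZFC) can be laid out step by step. In the sketch below, R abbreviates R^ZFC_N and ℕ abbreviates (N, +, ×, 0, 1, <); the chain uses the translation property M |= R(ϕ) iff R(M) |= ϕ together with R(¬ϕ) = ¬R(ϕ), and it assumes, as the text does, that ZFC is consistent. The display assumes the amsmath and amssymb packages.

```latex
\begin{align*}
  & \mathrm{ZFC} \nvdash R(\mathrm{Con}(\mathrm{ZFC}))
    && \text{(G\"odel's Second Incompleteness Theorem)} \\
  \Longrightarrow\ & \text{there is } M \models \mathrm{ZFC}
      \text{ with } M \models \neg R(\mathrm{Con}(\mathrm{ZFC}))
    && \text{(Completeness Theorem)} \\
  \Longrightarrow\ & R(M) \models \neg\mathrm{Con}(\mathrm{ZFC})
    && \text{(translation property, } R(\neg\varphi) = \neg R(\varphi)\text{)} \\
  \Longrightarrow\ & \mathrm{Th}(R(M)) \neq \mathrm{Th}(\mathbb{N})
    && \text{(since } \mathbb{N} \models \mathrm{Con}(\mathrm{ZFC})\text{)}
\end{align*}
```

So at least one ZFC-standard model of arithmetic disagrees with the intended structure on a first-order sentence, which is all that is needed to rule out elementary equivalence across all models of ZFC.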
Given the naturality of elementary equivalence, the question which structures are absolutely representable in F deserves our mathematical curiosity. The above example hints at serious limitations, and we shall exactly delineate them in the next section where we establish our main claim from the Introduction. For now, we mention some reasons to find the notion philosophically interesting.
First: Clearly one goal of the informal mathematical investigation of the intended structure S L is to find out what is true in S L , and first-order truth is undoubtedly an important part of it. It thus seems that an absolute representation is a clear desideratum for the foundational theory. It states that first-order truth in the intended structure does not vary with different assumptions on the model of the foundational theory we are living in.
Second: Absolute representation ensures a certain level of stability in the informal mathematical investigation of S_L with respect to changes in the foundational theory. Thereby it provides comfort to the working mathematician who is not willing to restrict their investigations to F alone. For example, if we (consistently) expand F by adding more axioms, absolute representability of the structure S_L in F ensures that we do not change F's beliefs about what holds in S_L by doing so (see footnote 8).
Third: Absolute representability provides a reasonable way of balancing two applications of logic, namely what have been called the descriptive and deductive roles of logic in the foundations of mathematics (see [4]). In the descriptive mode, we try to describe structures up to some level of equivalence (often isomorphism) using a logical theory. In the deductive mode, we use logic to analyse the particular kinds of inference patterns that appear in the relevant part of mathematics. Often these two applications of logic are argued to be in tension since there is a trade-off between descriptive power and deductive completeness. Specifically, all of finiteness, natural number, real number, and various infinite well-orderings evade characterisation in first-order logic. On the other hand, logics with greater than first-order resources at their disposal are able to characterise some of these notions at the expense of pleasing meta-theoretic properties, namely compactness and Löwenheim-Skolem by Lindström's theorem [6] and specifically completeness with respect to a finitary proof system. There is thus a trade-off between descriptive power and the smoothness of transition between validity and proof (we will discuss this further in Section 4). This tension has been formulated, for example, by Tennant in [16], who shows that the ability to determine models categorically is incompatible with a weak completeness requirement (see footnote 9). One response to this predicament has been the pursuit of 'optimization projects' (cf. [1,4,11]), that is, programmes that seek to include the least amount of higher-order vocabulary in a theory necessary to ensure categoricity whilst keeping things relatively deductively well-behaved. The study of absolute representability presents an alternative approach to optimisation along a different dimension: rather than trying small increases of higher-order resources to get categoricity whilst retaining some pleasant meta-theoretic properties, we stay in first-order logic (and thereby automatically obtain these properties) and weaken the categoricity requirement to soundness and completeness regarding first-order truth.

Absolute Representability and Decidability
In this section we establish our main claim from the Introduction. We need the following lemma:

Lemma 9 Let R be a representation of an L-structure in F. Then R absolutely represents S_L in F if and only if
Th(S_L) = {ϕ | F ⊢ R(ϕ)}, (2)
where ϕ ranges over L-sentences.
Footnote 8: It is important that we compare first-order truth of the intended structure S_L and its formal counterparts R(M) in the informal meta-language. An analogous notion inside F = ZFC would state that Th(R(M)) as defined in M (which, assuming the universe of R(M) is a set, can be done) does not vary with M. This now includes non-standard sentences and ceases to be a property of R: this theory can vary with M even when keeping R(M) fixed; we refer to [3] for precise statements.
Footnote 9: See [16], p. 267.
Proof Assume R absolutely represents S_L in F. To show ⊆ in Eq. 2, let ϕ ∈ Th(S_L). We have to show that F proves R(ϕ): let M be a model of F, so ϕ ∈ Th(R(M)) = Th(S_L) by absolute representation, that is, R(M) |= ϕ, so M |= R(ϕ) by Lemma 4. Since M was arbitrary, F proves R(ϕ) by the Completeness Theorem. To show ⊇ in Eq. 2, let ϕ ∉ Th(S_L). Then ¬ϕ ∈ Th(S_L), so F proves R(¬ϕ) = ¬R(ϕ) by the inclusion just proved. Hence F ⊬ R(ϕ) because F is consistent.
Conversely, assume Eq. 2 and let M |= F. We have to show that R(M) |= Th(S_L); this suffices because Th(S_L) is a complete theory. But, by Lemma 4, R(M) models the right-hand side of Eq. 2.

Proposition 10 If S_L is absolutely representable in F, then Th(S_L) is decidable.
Proof Given as input an L -sentence ϕ compute R(ϕ) and ¬R(ϕ) = R(¬ϕ) and enumerate all consequences of F (which we assumed to be computably enumerable). By Lemma 9, exactly one of R(ϕ) and ¬R(ϕ) is eventually enumerated, and we accept or reject our input accordingly.
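The reason the search halts with the right answer can be made explicit by a case distinction. The sketch below reads Lemma 9 in the form Th(S_L) = {ϕ | F ⊢ R(ϕ)}, which is how its proof uses it, and relies on R(¬ϕ) = ¬R(ϕ) and on the completeness of Th(S_L); the display assumes the amsmath and amssymb packages.

```latex
\[
\begin{cases}
  \varphi \in \mathrm{Th}(S_L)
    \;\Longrightarrow\; F \vdash R(\varphi)
    \ \text{ and }\ F \nvdash \neg R(\varphi)
    & \text{(accept once } R(\varphi) \text{ is enumerated)}, \\[4pt]
  \varphi \notin \mathrm{Th}(S_L)
    \;\Longrightarrow\; \neg\varphi \in \mathrm{Th}(S_L)
    \;\Longrightarrow\; F \vdash \neg R(\varphi)
    \ \text{ and }\ F \nvdash R(\varphi)
    & \text{(reject once } \neg R(\varphi) \text{ is enumerated)}.
\end{cases}
\]
```

In either case exactly one of the two target sentences is a theorem of F, so the enumeration of theorems reaches it after finitely many steps and never reaches the other.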
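In Example 8 we saw that a particular representation R^ZFC_N is not an absolute representation. We can now say more: since the theory of (N, +, ×, 0, 1, <) is undecidable, Proposition 10 implies that no representation of (N, +, ×, 0, 1, <) in F is absolute.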
We now prove a partial converse to the above under an additional assumption on F: Assume there is a representation R^F_N of (N, +, ×, 0, 1, <) in F such that F proves R^F_N(Q), where Q is the conjunction of the finitely many axioms of Robinson arithmetic. For any F worth calling a foundational theory, this surely is less than a minimal requirement (and clearly met by ZFC).
Given this assumption, we note the following direct consequence of Lemma 4.

Lemma 12
Let ϕ be an arithmetical sentence such that Q proves ϕ. Then F proves R^F_N(ϕ).
We now present our notion of what it means for F to "know" a Turing machine deciding Th(S_L). Let ⌜ϕ⌝ denote the Gödel number of an L-sentence ϕ.

Definition 13
Let R be a representation of an L-structure in F and let A be a Turing machine. F pointwise verifies A with respect to R if for every L-sentence ϕ:
We should remark here that this definition is rather weak among various reasonable notions for what it means that F "knows" some Turing machine. Note that A and the inputs ⌜ϕ⌝ are standard (given in the meta-language), and F is asked to provide a proof of correctness "pointwise", i.e., separately for every standard ϕ. Alternative notions could quantify A and/or ϕ inside F. For example, we might require that F proves some sentence expressing "there exists a Turing machine such that for all L-sentences ...", where the witnessing machine might be non-standard but also has to work for non-standard sentences.
We view Theorem 14 below as evidence that our notion of knowing a Turing machine is the right one in our context. As its proof shows, F knows in our sense any Turing machine deciding the theory of an absolutely representable structure, see Corollary 15. Philosophically speaking, we might view this knowability condition as a mere technicality, and regard our result as showing that for all practical purposes absolute representability and decidability are equivalent.
The following establishes our main claim from the Introduction.

Theorem 14 Let R be a representation of an L-structure in F. Then R absolutely represents S_L in F if and only if there exists a Turing machine A deciding Th(S_L) such that F pointwise verifies A wrt R.
Proof For the forward direction, assume R absolutely represents S L in F. By Proposition 10, there is a Turing machine A deciding Th(S L ). We claim that F pointwise verifies A wrt R. Let ϕ be an L -sentence. We show Eq. 4 by distinguishing cases.
For the converse direction, assume A decides Th(S L ) and F pointwise verifies A wrt R. We verify Eq. 2 of Lemma 9.
The proof shows:

Corollary 15
Let R be a representation of an L-structure in F and let A be a Turing machine. If R absolutely represents S_L in F and A decides Th(S_L), then F pointwise verifies A wrt R.

Applications to Foundational Debates
We have seen thus far that an intended structure is absolutely representable in a first-order foundational theory F if and only if its (first-order) theory is decidable and F knows some decision procedure for it. In this section we discuss applications of this observation to the debate between proponents of first-order versus higher-order foundations.
Delineating trade-offs
An important debate in the philosophy of logic and mathematics is whether foundations should be conducted in first-order or higher-order logic (or, if one is more tolerant in outlook, which logic is suited for what purposes). Throughout this paper, we have explicitly restricted our attention to first-order theories, both with respect to the foundational theory F under consideration and the theory of informal mathematics that we are trying to formalise in F. Many authors argue that our foundational theory should contain expressive resources greater than first-order, since many notions cannot be characterised up to isomorphism in first-order logic. As noted in §1, all of finiteness, natural number, real number, and various infinite well-orderings evade characterisation in first-order logic whilst logics with greater than first-order resources are able to characterise more at the expense of pleasing meta-theoretic properties, especially compactness, Löwenheim-Skolem, and completeness with respect to a finitary proof system. Our results inform this trade-off by providing bounds on when a first-order foundational theory can capture truth in an intended structure. Whilst it is clearly true that for an infinite structure S_L, asking for isomorphic representation of S_L is too much, nonetheless our results show that there are precise conditions under which a first-order theory can be omniscient concerning truth in S_L. This shows that for a certain class of infinite structures, even first-order logic can have a good deal of traction (structurally speaking), in contrast to the view that only finite structures can be well-treated in first-order logic. Moreover, we obtain this traction precisely when the theory of S_L is decidable and F knows some decision procedure for it. We now discuss some payoffs of these observations for debates concerning the roles of first-order and higher-order logics.

Technical correlates of epistemic arguments
Our main result provides a technical correlate of a kind of epistemic argument one can find in the literature. Read (in [11]) puts forward the following epistemic objection to the use of higher-order logic (albeit in order to reject it later):
"Without deductive completeness, knowledge of any mathematical treatment - indeed of any theory - would be impossible. Only through completeness can claims be tested and verified. If someone is challenged concerning a claim made as part of a complete theory, production of a proof can settle the matter - where 'proof' is understood as in first-order logic, as a decidable property, so that any putative proof can be checked for correctness. If the theory was not complete, one might claim that a thesis of it was true but not provable. How could such a claim be checked and supported or refuted?" ([11], p. 88)
Of course, as Read notes, the objection is not sound since it appears to motivate decidability as the operative property rather than completeness:
"Even in pure first-order logic, completeness guarantees only semi-decidability - that we can find a proof if there is one. It does not yield a corresponding effective method of refutation. The argument for the epistemological need for completeness in fact suggests that decidability is really the desired property; but that feature, though dreamed of, perhaps, by Leibniz and Wittgenstein, was denied us even for first-order consequence in the aftermath of Gödel's incompleteness theorem." ([11], p. 88)
Our results show that, from the point of view of an appropriate F, there is a technical manifestation of this argument put forward for consideration in [11]. As it turns out (from the point of view of F), we have traction on truth in a structure S_L exactly when F knows a Turing machine witnessing the decidability of the theory of S_L. Thus, if one (controversially) wishes to maintain that the decidability criterion is the important one for epistemic tractability, one obtains a good match between truth in a structure and satisfaction of this epistemic criterion for those structures with decidable theories, and poor epistemic traction on truth when the theory of S_L is undecidable.

Limitations of first-order foundations
On the other side of the coin, we have shown that a foundation which is both first-order and computably enumerable has limits in absolutely representing structures. The meta-theoretic advantages given by compactness, Löwenheim-Skolem, and completeness have their price. Not only will any first-order foundational theory F fail to determine the cardinality of an intended infinite structure S_L, but if the theory of S_L is undecidable, F loses traction on truth in S_L too. Authors such as Tennant (in [16]) show that weak completeness requirements on a theory are incompatible with the categorical representability of a structure in that theory. We have shown further that if we want traction on truth in intended structures with undecidable theories, we will need higher-order resources. This in turn shows that if we want to use some foundational theory to provide a generous arena and shared standard (in Maddy's sense from [7]), then accuracy concerning truth for structures with undecidable theories requires non-first-order resources.

Implications for optimization projects
The previous observations have implications for so-called 'optimization projects'. In §1 we noted that various authors (e.g. [1,4,11]) consider the idea of trying to find the 'mildest' strengthenings of first-order logic in order to be able to capture certain structures up to isomorphism whilst retaining many desirable meta-theoretic properties. We suggested absolute representability as a different approach to optimisation: instead of increasing the strength of the logic to obtain categoricity, we weaken the level of accuracy concerning structural description required and coarsen the similarity relation to elementary equivalence. Our results indicate that this optimisation project can be successful for precisely those structures which have decidable theories, but a different approach is needed for structures with undecidable ones.

Conclusion
Recall, we asked for a representation R such that S L and R(M) are similar for all models M of F. Taking similarity as elementary equivalence, we saw that this requires that Th(S L ) be decidable. We have analysed some implications for studying intended structures via absolute representability. If Th(S L ) is not decidable, however, it is natural to ask for weaker notions of representability.
There are many possibilities and we briefly discuss one of them, namely the one obtained by weakening the equality in Eq. 2 of Lemma 9 to an inclusion: call R a sound representation of S_L in F if
{ϕ | F ⊢ R(ϕ)} ⊆ Th(S_L). (5)
Roughly said, F proves only true first-order sentences about S_L. Clearly, the working mathematician studying S_L would reject any foundational theory not providing such a representation. Sound representations can be characterised as follows: Eq. 5 holds if and only if there exists a model M of F such that R(M) and S_L are elementarily equivalent.
Proof Assume Eq. 5 holds. It suffices to show that the theory F ∪ {R(ϕ) | ϕ ∈ Th(S_L)} is consistent. Indeed, a model M of this theory has the property that R(M) |= ϕ for all ϕ ∈ Th(S_L) by Lemma 4, so R(M) and S_L are elementarily equivalent. The claimed consistency follows from compactness: if the theory above is inconsistent, then there are finitely many ϕ_1, ..., ϕ_k ∈ Th(S_L) such that F refutes R(ϕ_1) ∧ ... ∧ R(ϕ_k) = R(ψ) for ψ := ϕ_1 ∧ ... ∧ ϕ_k (see Remark 5); as F proves ¬R(ψ) = R(¬ψ) and ¬ψ ∉ Th(S_L), this contradicts Eq. 5.
Conversely, if Eq. 5 fails, then there exists ϕ ∉ Th(S_L) such that F proves R(ϕ). By Lemma 4, R(M) |= ϕ for every model M of F while S_L ⊭ ϕ, so R(M) and S_L are not elementarily equivalent for any model M of F.
Thus, asking for a sound representation is asking for a special model M of F, namely one such that R(M) and S L are elementarily equivalent. This makes sense also for other notions of similarity, and in particular for isomorphism. For example, in the case of (N, +, ×, 0, 1, <) the latter asks F to have an ω-model. It is thus philosophically justified to ask F to be more than just consistent, namely to ask for the existence of some model M of F such that R(M) and S L are similar in some or another sense for various intended structures S L . But it is unclear how much one should or can ask for.
Another way to weaken the notion of representability, in order to make it apply to structures with undecidable theories, is to consider only "intended" models M of F, or an expansion thereof formulated using greater than first-order resources. For example, restricting M to transitive standard models of F = ZFC makes R^ZFC_N an isomorphic representation of (N, +, ×, 0, 1, <); on the other hand, (R, +, ×, 0, 1, <) is absolutely but not isomorphically representable. By contrast, if we instead formulate ZFC in quasi-weak second-order logic (where second-order variables are stipulated to range over countable relations), (R, +, ×, 0, 1, <) becomes isomorphically representable. We suggest that various "semantic" extensions of F can, in principle, be philosophically assessed by their capability to represent intended structures with respect to one or another notion of similarity. This project, while worthwhile, is very broad, and so we leave the study of such similarity relations in semantic extensions as an open problem.