1 Introduction

Mathematical structuralism is usually viewed as a theory about mathematical ontology (‘What is mathematics about?’) and about the semantics of mathematical discourse (‘To what kind of objects do mathematical terms refer?’). However, it can also be understood as a claim about the epistemology of mathematics, that is, about the nature of mathematical knowledge. The central epistemological claim made by structuralists is that mathematical knowledge is purely structural in character. It is the claim that warranted mathematical statements express purely structural information about the objects of their subject field. Let us call this the structuralist thesis in mathematics.Footnote 1

How can we qualify the thesis? To do this, it seems reasonable to first address two other questions, namely (i) what is constitutive for mathematical knowledge, and (ii) what does structural information consist of? Regarding the first point, the central method for gaining mathematical knowledge is clearly the method of proof and the practice of proving theorems. Thus, we know that a mathematical statement is true just in case there exists a proof of it. Importantly, in mathematical practice, proofs are usually understood to be informal, to be contrasted with a formal notion of proof or derivability specified relative to a logical system of deduction. Given this, the structuralist thesis in mathematics seems conceptually closely related to the method of proving. We can thus reformulate it as follows: Mathematical theorems express only structural information about the objects of their subject domain.

This claim has been defended by several proponents of a structuralist account of mathematics. Compare, for instance, Hellman for such a view:

On a structuralist view, (...), the mathematician claims knowledge of structural relationships on the basis of proofs from assumptions that are frequently taken as stipulative of the sort of structure(s) one means to be investigating (1989, p. 5).

This description immediately raises the second question mentioned above, namely how the ‘structural relationships’ between objects should be understood in the context of mathematics. When is a piece of information about mathematical objects (such as graphs, rings, or number systems) structural in character? Notice that many statements about such objects clearly do not express structural information, such as the claims that the Petersen graph has a beautiful composition or that natural numbers are abstract objects.

The structuralist thesis therefore calls for a principled distinction between structural and non-structural information about mathematical objects. One prominent idea to characterize structurality in this sense is in terms of the notion of invariance under certain transformations or mappings. Generally speaking, a statement about some mathematical objects is said to have structural content just in case this content is preserved under particular transformations. That is, a piece of information about the objects of a given class is structural in case it meets certain invariance criteria (to be specified relative to this class).Footnote 2

The aim of this paper is to make the structuralist thesis logically precise and to show that it is true for mathematical knowledge. We will do so in terms of a formal reconstruction of the two central components of the thesis, namely (i) the notion of informal proof and (ii) the notion of structural information about mathematical objects. Specifically, we propose to describe each notion in a particular modal logic. Concerning (i), this will be the S4 modal logic of informal provability discussed in Gödel (1933), Shapiro (1985), and Leitgeb (2009). Following Leitgeb and others, we hold that the method of proving statements in mathematical practice usually does not align with the notion of formal provability in some formal system. Moreover, the notion of informal provability, of proofs not relativized to a formal system, should be viewed as a ‘modal expression’. Similarly, concerning (ii), we will give a modal reconstruction of the notion of structural information in terms of a Kripke semantics for mathematical languages by introducing a structurality operator. This operator attaches to a sentence to indicate that the sentence expresses structural information. We show that an S5 modal system adequately describes the logical behaviour of this operator. With the two logics of informal provability and structurality in place, we will finally give a precise modal reconstruction of the structuralist thesis based on a bi-modal framework, and argue that this reconstruction of the structuralist thesis holds.

The paper is organized as follows: Sect. 2 first outlines an account of structural information about mathematical objects in terms of an invariance condition. This account is specified logically in terms of an S5 modal logic of structural information. Section 3 then discusses the S4 modal logic of informal provability and a suitable interpretation of it in terms of a Kripke semantics. In Sect. 4, we discuss a combination of the two modal systems in terms of constructing the product of the two Kripke frames and give a proof of the structuralist thesis in the resulting bi-modal logic. Section 5 then gives a more philosophical discussion of the conceptual connection between informal provability and structurality. Section 6 contains a brief summary of our findings.

2 Structural information: a modal framework

In the context of mathematical structuralism, the predicate ‘\(\dots \) is structural’ is usually attached to properties of, or relations between, mathematical entities, where these entities are (mere) positions in larger mathematical structures. Each natural number, for example, is a position in the natural number structure; each real number is a position in the real number structure. These positions have structural and non-structural properties. For example, being odd is a structural property; being abstract is arguably a non-structural property.Footnote 3 How can one further specify the notion of structural properties? One way to characterize the difference between structural and non-structural properties is in terms of invariance under isomorphism, where structural properties are those that are preserved in each isomorphic system.Footnote 4

More precisely, systems have the form \(\langle D, R_1, \dots , R_i, f_1, \dots , f_j, c_1, \dots , c_k \rangle \), comprising a domain, some primitive relations and operations on the domain, as well as some primitive elements of the domain. A given system may be of a recognizable mathematical type, such as a graph, or a group, or a field. For example, a group is a system \(S = \langle G, \cdot \rangle \) of a specific mathematical type, comprising a set G and a single primitive operation on G.

Systems can be described by formal languages that have a particular signature, which contains symbols for the system’s primitive relations, operations and elements of the domain. A system that interprets the symbols of a particular language \(\mathcal {L}\) is called an \(\mathcal {L}\)-system. For example, \(\mathcal {L}\) may be the first-order language of group theory. The theory of groups contains the group axioms, which are formulated in this language. And what makes an \(\mathcal {L}\)-system, such as S above, a group is the fact that it satisfies the group axioms, where satisfaction of a formula by an \(\mathcal {L}\)-system is defined in the usual way. Two \(\mathcal {L}\)-systems are isomorphic (under the relevant signature) when there is a bijective mapping between the domains of the systems that preserves the primitive relations, operations, and distinguished elements. A property or relation of an entity in the domain of a system is then structural iff it holds of the matching objects in each isomorphic system.Footnote 5

2.1 A modal operator for structural information

This paper offers a different approach to the notion of structurality, giving it an epistemological spin. According to this approach, the informal predicate ‘\(\ldots \) is structural’ is something that attaches to sentences, which we will take to be (or at least to represent) objects of knowledge. As one does with properties and relations, one can distinguish between structural and non-structural sentences. More specifically, we say that a sentence is structural when it provides purely structural information concerning the entities it is about. For example, in the system Z of the integers, the following important result from number theory is a structural sentence.

Euclidean Division. Given integers a and d with \(d \ge 1\), there exist unique integers q and r such that \(a = dq + r\) and \(0 \le r \le d - 1\).

This theorem, formulated as a sentence in the language of number theory, arguably expresses a piece of structural information about any system of integers. How can the structural nature of its semantic content be further specified? Interestingly, the distinction between structural and non-structural sentences can again be made more precise by using the notion of invariance. A sentence, which holds according to a mathematical system of a given type, is structural if and only if it is true according to each isomorphic system. That is, a sentence expresses purely structural information about a mathematical object just in case the information also applies to all isomorphic copies of the object considered.Footnote 6 For example, the Euclidean division theorem holds in the ring of integers, and it is structural because it holds in any system of entities that is isomorphic to the integers.
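As a small illustration of the Euclidean division statement above, the following Python sketch computes the quotient and remainder and checks uniqueness by brute force over a bounded range. The function names `euclidean_division` and `witnesses` and the search bound are assumptions made for this example only; the bounded search is a sanity check, not a proof.

```python
def euclidean_division(a: int, d: int) -> tuple[int, int]:
    """Return (q, r) with a == d*q + r and 0 <= r <= d - 1, for d >= 1."""
    if d < 1:
        raise ValueError("d must be at least 1")
    # For a positive divisor, Python's divmod returns a remainder in [0, d).
    q, r = divmod(a, d)
    return q, r


def witnesses(a: int, d: int, bound: int = 100) -> list[tuple[int, int]]:
    """All pairs (q, r) with |q| <= bound and 0 <= r <= d - 1 such that a == d*q + r."""
    return [
        (q, r)
        for q in range(-bound, bound + 1)
        for r in range(d)
        if a == d * q + r
    ]


print(euclidean_division(17, 5))   # (3, 2), since 17 = 5*3 + 2
print(euclidean_division(-7, 3))   # (-3, 2), since -7 = 3*(-3) + 2
print(witnesses(17, 5))            # exactly one witness in the searched range
```

The same computation would go through in any ring isomorphic to the integers after relabelling the elements, which is the sense in which the statement is said to be structural.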

With this approach in mind, structurality can be seen as an operator, which attaches to a sentence, \(\phi \), resulting in a new sentence that says ‘\(\phi \) is structural’. If a sentence is structural, and it holds according to a system of objects, then we say that it is structurally true according to the system. And what this means is that the sentence holds according to each system that is isomorphic to the original. Thus, the structurality operator can be viewed informally as a qualification of the mode of truth of mathematical statements. The Euclidean division theorem is true of the ring of integers in a particular way, namely it is structurally true in the sense described above.

It is important to emphasize that structurality is not specified as a predicate but as a sentential operator in the present context. Thus, combined with a sentence, the operator yields a new sentence. This choice comes with a deliberate limitation of expressive power and a restriction to the propositional level. As introduced here, we want to understand the informal expression “... is structural” as a specification of the type of information expressed by sentences. The mathematical sentences considered (as in the case of Euclidean division) usually have a more complex logical or quantificational structure. But in the theory of structural information outlined here, they function as atomic propositions. Understood as a sentential operator, structurality is thus similar to the necessity operator of modal logic.Footnote 7 According to the standard modal semantics for necessity, a sentence is necessarily true at a world iff it is true at every accessible possible world. Similarly, on our approach a sentence is structurally true in a system iff it is true in every isomorphic system. On this analogy, modal worlds or states are simply understood as systems or models in the model theoretic sense of the term. The ‘accessible worlds’ of a system, in turn, are just its isomorphic systems. In what follows, we use these similarities to develop a modal semantics for the structurality operator.

2.2 A modal semantics for structural information

Modal semantics allow for the interpretation of languages with modal operators, like the familiar \(\square \) and \(\lozenge \), which are often used to represent some form of necessity and possibility. The semantics are given in terms of Kripke models, which in the propositional case are ordered triples \(\mathcal {M}= \langle \mathcal {W}, \mathcal {R}, \mathcal {I}\rangle \). Here, \(\mathcal {W}\) is a set of points, usually called ‘worlds’ in the context of the alethic modalities. The accessibility relation \(\mathcal {R}\) is a binary relation on points in \(\mathcal {W}\), and \(\mathcal {I}\) is a function that provides assignments for different parts of the language in question.

Each closed formula, \(\phi \), of the language can be assigned a truth value, \(\mathcal {I}(w, \phi )\), at each world w. Truth conditions for non-modal formulas are given in the usual way. For the modal operator \(\square \), we have:

$$\begin{aligned} \mathcal {I}(w, \square \phi ) = 1 \text { iff, for all } w' \in \mathcal {W}\text { such that } w\mathcal {R}w', \mathcal {I}(w', \phi ) = 1 \end{aligned}$$

When \(\square \) represents some form of necessity, the truth conditions indicate that \(\phi \) is necessarily true at a world when \(\phi \) is true at every accessible possible world.
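The truth condition for \(\square \) can be made concrete on a finite model. The following Python sketch evaluates formulas at the worlds of a small Kripke model; the encoding of formulas as nested tuples, the function name `holds`, and the three-world model are all assumptions chosen for illustration.

```python
def holds(model, w, formula):
    """Evaluate a formula at world w of a finite Kripke model (W, R, V).

    Formulas are nested tuples: ("atom", "p"), ("not", f), ("imp", f, g), ("box", f).
    """
    worlds, access, valuation = model
    op = formula[0]
    if op == "atom":
        return formula[1] in valuation[w]
    if op == "not":
        return not holds(model, w, formula[1])
    if op == "imp":
        return (not holds(model, w, formula[1])) or holds(model, w, formula[2])
    if op == "box":
        # Box f is true at w iff f is true at every world accessible from w.
        return all(holds(model, v, formula[1]) for v in worlds if (w, v) in access)
    raise ValueError(f"unknown operator: {op}")


worlds = {"w1", "w2", "w3"}
access = {("w1", "w1"), ("w1", "w2"), ("w2", "w2"), ("w3", "w1"), ("w3", "w3")}
valuation = {"w1": {"p"}, "w2": {"p"}, "w3": set()}
model = (worlds, access, valuation)

box_p = ("box", ("atom", "p"))
print(holds(model, "w1", box_p))  # True: p holds at w1 and w2
print(holds(model, "w3", box_p))  # False: w3 accesses itself, where p fails
```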

In this paper, we deploy the modal semantics to capture a notion of structurality as it applies to sentences, indicating when a sentence expresses purely structural information. The structurality operator, which we represent as S, attaches to sentences of a mathematical language, such that S\(\phi \) says that \(\phi \) is structurally true. Recall that a sentence is structurally true, according to a system of objects, when it is true according to every isomorphic system of objects.

This approach suggests a model theory in which the worlds of the Kripke models (the points in \(\mathcal {W}\)) are mathematical systems. For simplicity we give the details for purely relational systems of the form \(\langle D, P_1, \dots , P_n \rangle \), though this can be extended to include primitive elements of the domain and primitive operations on the domain. The accessibility relation holds between two systems of a particular mathematical type when they are isomorphic. We use triples \(\mathcal {M}_s = \langle \mathcal {W}_s, \mathcal {R}_s, \mathcal {I}_s \rangle \) to refer to Kripke models for this modal logic of structural information. As has been suggested in Schiemer and Wigglesworth (2019), we take the set of worlds in \(\mathcal {M}_s\) to consist of mathematical systems of specified mathematical types. Given two systems of the same type, i.e., two worlds \(w, w' \in \mathcal {W}_s\), we have that \(w\mathcal {R}_sw'\) if and only if there is a bijective function f from the domain D of w to the domain \(D'\) of \(w'\) such that for each P in the relevant signature, with P interpreted in w on the left-hand side and in \(w'\) on the right-hand side, \((\forall x_1, \dots , x_n \in D)[P(x_1, \dots , x_n) \leftrightarrow P(f(x_1), \dots , f(x_n))]\), with \(f(x_1), \dots , f(x_n) \in D'\). The notion of isomorphism generates an equivalence relation between systems, a relation that is reflexive, symmetric, and transitive. A modal logic in which the accessibility relation is reflexive, symmetric, and transitive corresponds to an S5 modal logic. Using the accessibility relation to represent when two systems are isomorphic thus gives us an S5 modal logic of structural information.
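The construction just described can be sketched for small finite systems with a single binary relation. In the Python sketch below, the brute-force search over bijections, the names `isomorphic`, `structurally_true` and `has_endpoint`, and the three example systems are assumptions made for illustration; they are not part of the formal framework itself.

```python
from itertools import permutations


def isomorphic(sys_a, sys_b):
    """Brute-force test: is there a bijection f with P(x, y) iff P'(f(x), f(y))?"""
    (dom_a, rel_a), (dom_b, rel_b) = sys_a, sys_b
    if len(dom_a) != len(dom_b):
        return False
    xs = list(dom_a)
    for image in permutations(dom_b):
        f = dict(zip(xs, image))
        # For a bijection f, preservation in both directions amounts to
        # the image of rel_a under f being exactly rel_b.
        if {(f[x], f[y]) for (x, y) in rel_a} == rel_b:
            return True
    return False


def structurally_true(sentence, w, worlds):
    """S(sentence) at w: the sentence holds at every world isomorphic to w."""
    return all(sentence(v) for v in worlds if isomorphic(w, v))


def has_endpoint(system):
    """Some element bears the relation to no element of the domain."""
    dom, rel = system
    return any(all((x, y) not in rel for y in dom) for x in dom)


path_a = ({1, 2, 3}, {(1, 2), (2, 3)})
path_b = ({"a", "b", "c"}, {("c", "a"), ("a", "b")})
cycle = ({1, 2, 3}, {(1, 2), (2, 3), (3, 1)})
worlds = [path_a, path_b, cycle]

print(isomorphic(path_a, path_b))                      # True
print(isomorphic(path_a, cycle))                       # False
print(structurally_true(has_endpoint, path_a, worlds)) # True
```

On this toy set of worlds the accessibility relation is reflexive, symmetric, and transitive, so each world sees exactly the members of its own isomorphism class, as expected for an S5 frame.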

This framework is intended to be a general framework for the formal representation of pure mathematics, applicable to mathematical systems of various types (graphs, groups, rings, number systems, spaces, etc.).Footnote 8 The worlds in \(\mathcal {W}_s\) comprise mathematical systems of different types, where systems of a particular type may be isomorphic to one another, according to the notion of isomorphism appropriate for that type.Footnote 9 Moreover, one could think of the formal language of structural mathematics as a first-order language of set theory, extended by non-logical constants (of each type and arity) in order to express the different mathematical types.

The approach outlined above is similar to a view known as set-theoretic structuralism (cf. Reck and Price (2000), Hellman and Shapiro (2019), and Reck and Schiemer (2023)). According to the set-theoretic structuralist, the mathematical universe contains only (structured) sets, and mathematical theories should be understood as being about different kinds of sets. On this view, there are no sui generis mathematical objects, like the natural numbers, real numbers, complex numbers, etc. There are only sets. Compare Reck and Schiemer (2023) on this point:

In set-theoretic structuralism (...) the only mathematical objects at play are those that axiomatic set theory, possibly with urelements, allows us to introduce. We do not need to postulate abstract structures in addition (ibid., Sect. 2.2).Footnote 10

Mathematical theories are then interpreted in the universe of sets, where various structured sets satisfy sentences in the languages of those mathematical theories. What makes the position a structuralist one is that the choice of a particular structured set (or set-theoretic system) for the interpretation of the theory is completely arbitrary. Thus, any system satisfying the axioms of the theory in question can be used for the semantic interpretation of its primitive vocabulary.

To illustrate this approach, consider again the statement of Euclidean division given above, which is a result from number theory, and which holds in the system Z of the integers. This system has the algebraic structure of a ring, with the form \(\langle Z, +, \cdot , 0, 1\rangle \), comprising a domain, two distinguished operations (functions) on the domain, and two distinguished elements of the domain. Rings have a corresponding signature, and a notion of isomorphism—ring isomorphism—which is a bijective mapping between two systems that preserves the ring operations.

Like the set-theoretic structuralist, if we were to take our background theory to be a theory of sets (such as Zermelo-Fraenkel set theory with choice), then our background mathematical universe is the cumulative hierarchy of sets V. In V, there are no sets that are the integers. There are many structured sets, or systems, of the form \(\langle Z, +, \cdot , 0, 1\rangle \), all of which have the structure of the integers. Talk about the integers Z can be understood as referring to any one of these systems. When working with results about the integers, such as the Euclidean division result, one can interpret the results as being about any of these systems, because they are all ring-isomorphic to one another under the relevant notion of isomorphism.

As was shown above, we propose to capture these relationships between isomorphic systems in a modal framework. Using the integer example, any system that has the structure of the integers, \(\langle Z, +, \cdot , 0, 1\rangle \), can be understood as a world in the domain \(\mathcal {W}_s\) of a Kripke model. We can call this world Z, because it can play the role of the integers. In this model, Z accesses other worlds in the model, namely those rings that it is isomorphic to, according to the appropriate notion of isomorphism, which in this case is ring isomorphism. That is, it accesses those other worlds that have the structure of the integers. The Euclidean division theorem is a sentence in the language of rings. If we were to choose our modal framework to be built on a theory of sets, similar to the set-theoretic structuralist, the language in question is the standard first-order language of sets extended by the signature of rings.Footnote 11 The sentence holds according to Z, which is effectively the ring of integers. Furthermore, it holds according to every system (i.e., every ring) that is ring isomorphic to Z. That is, the Euclidean division theorem holds at every world that is accessible from Z. It is therefore a structural truth according to Z. In other words, Euclidean division provides us with structural information about the integers.

Understanding mathematical systems as points or worlds in a Kripke model, accessible to one another in virtue of being isomorphic, has consequences for what can be true at these points. Two cases should be distinguished here. The first concerns statements expressed in the formal language of a given mathematical type, that is, in terms of a given mathematical signature. Given two systems of a particular type, if they are isomorphic to one another, then they are elementarily equivalent, making exactly the same sentences of the particular mathematical language true. Thus we have that, for a sentence \(\phi \) in the relevant signature, if \(\mathcal {I}(w, \phi ) = 1\) and \(w\mathcal {R}_sw'\), then \(\mathcal {I}(w', \phi ) = 1\). For example, any statement expressed in the language of rings, if true in a particular ring, is also true in any other isomorphic ring.Footnote 12

The second case concerns statements about the objects of a mathematical type that are formulated in a more general set-theoretic language. Such statements may also express information about the systems in question, for instance, about the set-theoretic composition of the elements in the ring of integers. The properties of numbers expressed here are precisely the kind of non-structural properties first discussed in Benacerraf (1965). Note also that the present modal framework is general enough to semantically evaluate sentences of this form. While such sentences may be locally true, that is, true at a particular point in the Kripke model, they clearly lack the feature of structurality expressed by truth in all admissible points. In other words, the information expressed by them is not structural in character.
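The contrast between the two cases can be illustrated with a toy example in the spirit of Benacerraf's observation. The Python sketch below compares two isomorphic three-element orderings, one built from von Neumann-style sets and one from arbitrary labels. The systems, the explicit bijection `f`, and the two sample sentences are assumptions chosen for this example.

```python
EMPTY = frozenset()
ONE = frozenset({EMPTY})
TWO = frozenset({EMPTY, ONE})

# Two strict orderings on three elements: one set-theoretic, one with plain labels.
system_a = ({EMPTY, ONE, TWO}, {(EMPTY, ONE), (EMPTY, TWO), (ONE, TWO)})
system_b = ({"x", "y", "z"}, {("x", "y"), ("x", "z"), ("y", "z")})
f = {EMPTY: "x", ONE: "y", TWO: "z"}


def is_isomorphism(f, sys_a, sys_b):
    """Check that f is a bijection between the domains that preserves the relation."""
    (dom_a, rel_a), (dom_b, rel_b) = sys_a, sys_b
    bijective = set(f) == dom_a and set(f.values()) == dom_b and len(f) == len(dom_b)
    if not bijective:
        return False
    return {(f[x], f[y]) for (x, y) in rel_a} == rel_b


def has_least_element(system):
    """Expressible in the signature of the ordering alone."""
    dom, rel = system
    return any(all(x == y or (x, y) in rel for y in dom) for x in dom)


def least_element_is_empty_set(system):
    """Refers to the set-theoretic make-up of the elements, not only to the ordering."""
    dom, rel = system
    return any(
        x == frozenset() and all(x == y or (x, y) in rel for y in dom)
        for x in dom
    )


print(is_isomorphism(f, system_a, system_b))  # True
print(has_least_element(system_a), has_least_element(system_b))  # True True
print(least_element_is_empty_set(system_a), least_element_is_empty_set(system_b))  # True False
```

The second sentence is locally true at `system_a` but fails at an isomorphic system, so on the present account it is not a structural truth at `system_a`.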

3 Semantics for informal provability

We turn now to the epistemology of mathematics and, more specifically, to a logical analysis of the notion of mathematical proof. As was stated in the Introduction, the central method for establishing mathematical knowledge is usually identified with the practice of formulating deductive proofs of theorems. Compare, for instance, Hamami (2021) on the central epistemological role of proofs in mathematics:

(...) mathematical practice, as an epistemic practice, enforces a standard or norm of justification according to which proof is the only accepted or legitimate form of justification for mathematical propositions. (ibid., p.2)

The focus in the present section will be on a logical study of the practice of developing informal proofs. Thus, it will be important to distinguish here between the notion of informal provability and the notion of formal provability. Roughly put, formal provability simply means provability in a formal system, a notion that is familiar from the study of proof theory in formal logic. Informal provability, in turn, is closer to the looser conception of proof familiar from mathematical practice.Footnote 13 Urbaniak and Pawlowski (2018) characterize the notion in the following way:

Informal provability is closely related to what mathematicians do when they prove theorems, rather than to formal provability in an axiomatic system (...). A sentence is informally provable if it is provable by any commonly accepted mathematical means (ibid., p. 222).

Compare also Antonutti Marfori (2010) on the crucial distinction between formal(-ized) and informal proofs in mathematics:

As a matter of fact, proofs in ordinary mathematical practice are not instances of formal derivations. They are not commonly formalised either at the time they appear in mathematical journals or afterwards, nor are they even generally presented in a way that makes their formalisations apparent or routine (ibid., p. 266).

It is this notion of non-formalized proofs or informal provability in mathematics that we focus on. However, in this section we will attempt to capture informal provability in a formal setting, which uses tools from modal logic. Provability logics are generally characterized as modal logics describing the logical behaviour of different provability operators. Accordingly, the notion of informal provability can be formally represented by a modal operator, P, which attaches to sentences \(\phi \), so that P\(\phi \) says that \(\phi \) is informally provable.Footnote 14

3.1 A modal operator for informal provability

The informal provability operator, P\(\phi \), is to be distinguished from the structurality operator, S\(\phi \), which says that a sentence \(\phi \) is structurally true. As shown in Sect. 2, S obeys the modal logic \(\textsf {S5}\). Informal provability arguably obeys a weaker modal logic. Several scholars, including Shapiro (1985) and Leitgeb (2009), have argued that informal provability, when expressed as a modal operator, is captured by the modal logic \(\textsf {S4}\), which is given by the axioms and rules of classical logic, plus the following modal axioms and rule:

figure a
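For reference, a standard presentation of \(\textsf {S4}\) adds the following three axiom schemes and one rule to classical logic. The labels K, T, 4 and Nec are the conventional ones and are assumed here; the operator is written \(\square \), to be read as informal provability.

$$\begin{aligned} &\textsf {K}: \ \square (\phi \rightarrow \psi ) \rightarrow (\square \phi \rightarrow \square \psi ) \\ &\textsf {T}: \ \square \phi \rightarrow \phi \\ &\textsf {4}: \ \square \phi \rightarrow \square \square \phi \\ &\textsf {Nec}: \ \text {if } \vdash \phi \text {, then } \vdash \square \phi \end{aligned}$$

Read in terms of provability, T says that whatever is provable is true, and 4 says that whatever is provable is provably provable.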

The idea that \(\textsf {S4}\) captures the logic of informal provability goes back to Gödel (1933), who exploits the connection between provability and intuitionistic logic.Footnote 15 Along these lines, Gödel provides an interpretation of the intuitionistic propositional calculus in the modal logic \(\textsf {S4}\):

$$\begin{aligned} \textsf {IPC} \vdash \phi \rightarrow \textsf {S4} \vdash \tau (\phi ) \end{aligned}$$

The converse implication was established by McKinsey and Tarski (1948), and the mutual implication result was extended to predicate logic by Rasiowa and Sikorski (1953) and independently by Maehara (1954):

$$\begin{aligned} \textsf {IQC} \vdash \phi \leftrightarrow \textsf {QS4} \vdash \tau (\phi ) \end{aligned}$$

That is, \(\phi \) is a theorem of intuitionistic predicate logic (\(\textsf {IQC}\)) if and only if \(\phi \)’s translation, \(\tau (\phi )\), into the relevant modal language, is a theorem of quantified \(\textsf {S4}\) (\(\textsf {QS4}\)). The translation function, \(\tau \), is given recursively:

figure b
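Several equivalent variants of this translation appear in the literature. One common version, given here only as an illustration and under the assumption that atomic formulas are prefixed with \(\square \), has the following clauses:

$$\begin{aligned} \tau (p)&= \square p \quad \text {for atomic } p \\ \tau (\bot )&= \bot \\ \tau (\phi \wedge \psi )&= \tau (\phi ) \wedge \tau (\psi ) \\ \tau (\phi \vee \psi )&= \tau (\phi ) \vee \tau (\psi ) \\ \tau (\phi \rightarrow \psi )&= \square (\tau (\phi ) \rightarrow \tau (\psi )) \\ \tau (\lnot \phi )&= \square \lnot \tau (\phi ) \\ \tau (\forall x\, \phi )&= \square \forall x\, \tau (\phi ) \\ \tau (\exists x\, \phi )&= \exists x\, \tau (\phi ) \end{aligned}$$

For example, on this variant \(\tau (p \rightarrow q) = \square (\square p \rightarrow \square q)\).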

In discussing the modal interpretation of intuitionistic logic, Gödel notes that \(\square \phi \) should be read as ‘\(\phi \) is provable’, but he is clear that this should not correspond to formal provability.

It is to be noted that for the notion ‘provable in a certain formal system S’ not all of the formulas provable in \(\textsf {[S4]}\) hold. For example, \(\square (\square \phi \rightarrow \phi )\) never holds for that notion, that is, it holds for no system S that contains arithmetic. For otherwise, for example, \(\square (0 \ne 0) \rightarrow 0 \ne 0\) and therefore also \(\lnot \square (0 \ne 0)\) would be provable in S, that is, the consistency of S would be provable in S (1933, pp. 301–303).

In other words, taking \(\square \) to represent formal provability would contradict Gödel’s own second incompleteness theorem. These considerations suggest that \(\square \) should be understood to represent informal, rather than formal, provability.

In the following, we will assume that \(\textsf {S4}\) captures the modal logic of informal provability. Given this, we now look at this interpretation of informal provability from a model-theoretic perspective. Unfortunately, neither Gödel nor later writers have offered an explicit semantic characterisation for the logic of informal provability.Footnote 16 Our starting point will be a system developed by Stewart Shapiro (1985) called Epistemic Arithmetic (EA). The language of EA combines the language of (second-order) Peano Arithmetic with a modal necessity operator that satisfies the S4 axioms, and the model theory initially proposed for EA deployed the standard Kripke-style models for epistemic first-order logic. Shapiro originally represented this modal necessity operator with the symbol K. The intention was for \(K\phi \) to represent that \(\phi \) is knowable. If we accept that the standard for mathematical knowledge is informal proof, then to say that a sentence is knowable is to mean that it is informally provable or “provable in principle”.

Shapiro’s EA was extended along these lines by Leon Horsten (1993, 1994, 2000), who proposed the idea of splitting the single provability operator into a metaphysical possibility operator \(\lozenge \) and another sentential operator P, such that \(\lozenge \)P\(\phi \) represents “it is possible that there is an informal proof of \(\phi \)” or perhaps “it is possible that \(\phi \) has been informally proven”. Horsten’s intention was for the possibility in question to be metaphysical possibility, and for this reason he developed a system according to which that modal operator obeyed the axioms of the modal logic S5. Notably, however, the P operator was not treated as a modal operator in itself.

Horsten calls this system Modal Epistemic Arithmetic (MEA), and a few more details about this system will be helpful in what follows.Footnote 17 The language of MEA combines the language of Peano Arithmetic with the operators \(\lozenge \) and P defined above. Horsten’s relational models for MEA are ordered triples \(\mathcal {M}= \langle \mathcal {W}, \mathcal {R}_P, \mathcal {I}\rangle \), where \(\mathcal {W}\) is a set of sets of sentences of \(\mathcal {L}_\textsf {MEA}\) such that:

  (1) The worlds in \(\mathcal {W}\) satisfy:

    (1a) for all \(w \in \mathcal {W}\) and for every sentence \(\phi \in \mathcal {L}_\textsf {MEA}\), if \(\phi \in w\), then P\(\phi \in w\);

    (1b) for all \(w \in \mathcal {W}\) and for every sentence \(\phi \in \mathcal {L}_\textsf {MEA}\), if \(\phi \in w\), then \(\mathcal {I}(w, \phi ) = 1\);

  (2) \(\mathcal {R}_P\) is a reflexive, symmetric, and transitive binary relation on \(\mathcal {W}\) such that for any \(w_i, w_j, w_k \in \mathcal {W}\), if \(w_i\mathcal {R}_Pw_j\) and \(w_i\mathcal {R}_Pw_k\), then there is a \(w_l \in \mathcal {W}\) such that \(w_i\mathcal {R}_Pw_l\) and \(\{\phi \vert w_j \cup w_k \vdash _\textsf {MEA} \phi \} \subseteq w_l\);

  (3) The interpretation function \(\mathcal {I}\) is defined in the usual way. For the non-logical propositional operators, we have:

    (3a) \(\mathcal {I}(w, {{\textbf {P}}}\phi ) = 1\) if and only if \(\phi \in w\).

    (3b) \(\mathcal {I}(w, \lozenge \phi ) = 1\) if and only if there is a world \(w'\) such that \(w\mathcal {R}_Pw'\) and \(\mathcal {I}(w', \phi ) = 1\).Footnote 18

Condition (1) details the fact that Horsten is using syntactic models, according to which worlds are sets of sentences. These are the sentences that are taken to be established at a given point in time, either based on proofs or as the axioms of a given theory. In other words, the worlds or situations in a relational model capture our (or an arbitrary mathematician’s) information state at some point in time.

The first part of condition (2) ensures that the \(\lozenge \) modality satisfies the S5 modal axioms; the second part says that if a world accesses two worlds, then there is a single world that it accesses and that contains every sentence MEA-derivable from the union of the two worlds. Furthermore, it follows from (1b) and (3a) that if \(\mathcal {I}(w, {{\textbf {P}}}\phi ) = 1\) then \(\mathcal {I}(w, \phi ) = 1\): by (3a), \(\mathcal {I}(w, {{\textbf {P}}}\phi ) = 1\) gives \(\phi \in w\), and by (1b) this gives \(\mathcal {I}(w, \phi ) = 1\). Thus, if a sentence is provable in a given world, then it must be true in that world.

Horsten defends the strength of his S5 possibility operator along the following lines.

Consider an arbitrary mathematician who is occupied solely with proving sentences of \(\mathcal {L}_\textsf {MEA}\). A possible world of a model can be thought of as a possible situation in which this mathematician has a particular (possibly empty, possibly infinite) collection of statements of \(\mathcal {L}_\textsf {MEA}\) of which she has a demonstration. No limitations are imposed on the means which our mathematician has at her disposal for proving such sentences. She may in some possible situation have a higher-order demonstration of the ontic sentence of \(\mathcal {L}_\textsf {MEA}\) which expresses the consistency of MEA. She may in some possible situation “intuitively” see that a statement which is independent of PA expresses an irreducible truth of arithmetic (1994, p. 287).

There are two points to note about the semantics that Horsten proposes for the notion of provability. The first is that he presents an “existential” approach, according to which a sentence is provable if there exists a possible situation in which the sentence has been proven. This diverges from other modal approaches to provability, where provability is often captured by a necessity operator.Footnote 19

The second point is that in the possible worlds semantics for modal logic, S5 models partition the space of worlds into equivalence classes, where each world accesses every world in its equivalence class. Taking Horsten’s interpretation of worlds, what he suggests is a “universal” sense of provability, such that if there is a proof of \(\phi \) in any world (in a given equivalence class of worlds), then \(\phi \) is provable from the perspective of any world (in that equivalence class).Footnote 20

3.2 A modal semantics for informal provability

Arguably, what Horsten describes in MEA captures a notion of provability in principle, rather than a notion of informal provability. What is provable in principle may be “universal”, in which case an existential reading of the provability operator, combined with an S5 semantics, may be appropriate. As we saw, according to Horsten, a sentence is provable, as expressed by the complex statement \(\lozenge {\textbf {P}} \phi \), if there exists an epistemic state of the arbitrary mathematician which is accessible from her current state and in which she has a proof of \(\phi \). Thus, provability in principle is understood here in terms of the possibility of a proof in some accessible stage. In contrast, there is a view that perhaps what is informally provable shouldn’t be universal, and is instead relative to what has already been proven in the current world. Or in other words, informal provability is relative to the current information state. Hence, provability in our current epistemic state simply means that one can develop an informal proof of a given statement \(\phi \) such that \(\phi \) turns out to be warranted in any further accessible epistemic state.

This view will be familiar from the intuitionistic approach. Semantics for intuitionistic logic can also be given in terms of Kripke models. The modal semantics for intuitionistic logic try to capture the following idea, which is strikingly similar to Horsten’s characterization, and which van Dalen (2008) calls the heuristic motivation.

Heuristic motivation. Think of an idealized mathematician (in this context traditionally called the creative subject), who extends both his knowledge and his universe of objects in the course of time. At each moment k he has a stock \(\Sigma _k\) of sentences, which he, by some means, has recognised as true ... Since at every moment k the idealized mathematician has various choices for his future activities (he may even stop altogether), the stages of his activity must be thought of as being partially ordered, and not necessarily linearly ordered (van Dalen, 2008, pp. 162–163).

The idea is roughly that at some stage there are certain proven statements that the mathematical community agrees on, and at accessible (possible ‘future’) stages this collection of proven statements will grow. As these stages form a partial order, the accessibility relation between them is reflexive and transitive, which corresponds to the semantics for an \(\textsf {S4}\) modal logic, as defined axiomatically above. An additional monotonicity condition is also imposed in the Kripke semantics for intuitionistic logic, such that if \(\phi \) is true at a stage w, then \(\phi \) is true at every stage accessible from w. For example, and invoking van Dalen’s heuristic motivation, let us take the idealized mathematician to be a number theorist who has just proven the Euclidean division theorem. The monotonicity condition represents the idea that the theorem then holds at the current stage and continues to hold at any accessible later stage. Once it has been proven, there is no possible future stage where it fails to hold.

In what follows, we propose a novel semantics for informal provability that combines aspects of both Horsten’s and the intuitionistic approach.Footnote 21 The relevant language \(\mathcal {L}\) combines the language of Peano Arithmetic with a single propositional operator P, such that P\(\phi \) is intended to mean that \(\phi \) is informally provable. Thus, unlike in Horsten’s account, P is understood here as a proper and primitive modal operator. Models will again be ordered triples \(\mathcal {M}_p = \langle \mathcal {W}_p, \mathcal {R}_p, \mathcal {I}_p \rangle \), where \(\mathcal {W}_p\) is a set of sets of sentences of \(\mathcal {L}\). As with the semantics for MEA, these are syntactic models, where each \(w \in \mathcal {W}_p\) is a set of sentences representing our current information state, i.e., the sentences that have been established. Along these lines we use Horsten’s condition (1). Drawing on the intuitionistic approach for conditions (2) and (3), Horsten’s S5-like accessibility relation is replaced with a weaker S4 relation, and the provability operator P is treated as an S4 necessity operator.

  (1)

    (1a) for all \(w \in \mathcal {W}_p\) and for every sentence \(\phi \in \mathcal {L}\), if \(\phi \in w\), then P\(\phi \in w\);

    (1b) for all \(w \in \mathcal {W}_p\) and for every sentence \(\phi \in \mathcal {L}\), if \(\phi \in w\), then \(\mathcal {I}_p(w, \phi ) = 1\);

  (2) \(\mathcal {R}_p\) is a reflexive and transitive binary relation on \(\mathcal {W}_p\);

  (3) The interpretation function \(\mathcal {I}_p\) is defined in the usual way. For the single non-logical propositional operator P, we have:

    (3a) \(\mathcal {I}_p(w, {{\textbf {P}}}\phi ) = 1\) if and only if for all worlds \(w'\), if \(w\mathcal {R}_pw'\), then \(\mathcal {I}_p(w', \phi ) = 1\).
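To make the truth clause (3a) concrete, the following Python sketch evaluates P on a small hand-built model. It is only an illustration under simplifying assumptions: the three worlds, the atoms `A`, `B`, `C`, and all function names are hypothetical; atomic truth is reduced to set membership; and sentences of the form P\(\phi \) are not stored inside worlds, so condition (1a) is not modelled.

```python
# Toy model: each world is a set of established atomic sentences (assumption).
WORLDS = {
    "w0": frozenset({"A"}),
    "w1": frozenset({"A", "B"}),
    "w2": frozenset({"A", "C"}),
}

# Accessibility relation, chosen to be reflexive and transitive (condition (2)).
R = {
    ("w0", "w0"), ("w1", "w1"), ("w2", "w2"),
    ("w0", "w1"), ("w0", "w2"),
}


def is_reflexive(worlds, rel):
    """Every world accesses itself."""
    return all((w, w) in rel for w in worlds)


def is_transitive(rel):
    """Whenever a R b and b R d, also a R d."""
    return all((a, d) in rel for (a, b) in rel for (c, d) in rel if b == c)


def holds(world, formula):
    """Evaluate a formula at a world.

    A formula is an atom (a string) or a tuple ("P", subformula).
    """
    if isinstance(formula, str):
        # Simplified atomic clause: true iff the atom is a member of the world.
        return formula in WORLDS[world]
    op, sub = formula
    if op == "P":
        # Clause (3a): P-phi holds iff phi holds at every accessible world.
        return all(holds(v, sub) for (u, v) in R if u == world)
    raise ValueError(f"unknown operator: {op}")
```

In this toy model, `holds("w0", ("P", "A"))` is true because `A` belongs to every world accessible from `w0`, whereas `holds("w0", ("P", "B"))` is false because `B` is missing from `w0` itself, which `w0` accesses by reflexivity.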

Putting all of this together, in what follows, we take the modal logic of informal provability to be the modal logic \(\textsf {S4}\). Two further conceptual points about the model-theoretic semantics presented above should be made. The first point concerns the relation between provability and truth. As in Horsten’s system MEA, provability implies or presupposes truth in the current system. Put more formally, if \(\mathcal {I}_p(w, {{\textbf {P}}}\phi ) = 1\) then \(\mathcal {I}_p(w, \phi ) = 1\). The other direction clearly does not hold in general. Thus, it won’t generally be the case that for every sentence true in a particular state there exists a proof of that sentence in that state. As Horsten points out, this might apply even to simple corollaries of established theorems for which no proof has been given in a certain state. Second, unlike in Horsten’s semantics for MEA, the provability of \(\phi \) in a given world does not necessarily imply that \(\phi \) is already contained as an element in that particular world. Given the current understanding of worlds as epistemic states or sets of warranted statements, what the truth of \( {\textbf {P}} \phi \) in a given state implies is that \(\phi \) must be contained in all accessible epistemic states.

Thus, in the present account, the accessibility relation \(\mathcal {R}_p\) is both monotonic and also induces informational expansions of a mathematician’s information states. Monotonicity can be specified more formally as follows: for a sentence \(\phi \), a model \(\mathcal {M}_p = \langle \mathcal {W}_p, \mathcal {R}_p, \mathcal {I}_p \rangle \) and any two states \(w, w' \in \mathcal {W}_p\), if \(\mathcal {I}_p(w, \phi ) = 1\) and \(w \mathcal {R}_p w'\), then \(\mathcal {I}_p(w', \phi ) = 1\). In turn, the required condition on information expansions can be formulated as follows: for a sentence \(\phi \), a model \(\mathcal {M}_p = \langle \mathcal {W}_p, \mathcal {R}_p, \mathcal {I}_p \rangle \) and any two states \(w, w' \in \mathcal {W}_p\), if \(\mathcal {I}_p(w, {\textbf {P}} \phi ) = 1\) and \(w \mathcal {R}_p w'\), then \( \phi \in w'\). Thus, if a sentence has been proven at a certain stage, it becomes a permanent member of the set of accepted or warranted beliefs of an arbitrary mathematician or a mathematical community.
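The monotonicity condition can be checked mechanically on a finite model. The sketch below covers only that first condition, not the condition on information expansions. The stage names and the sentence labels (`division_theorem`, `bezout_identity`) are hypothetical, and truth at a stage is represented simply as a set of sentence labels.

```python
def is_monotonic(true_at, rel):
    """True iff whatever holds at w also holds at every w' with w R w'."""
    return all(true_at[u] <= true_at[v] for (u, v) in rel)


# Reflexive and transitive accessibility between two stages.
R = {("s0", "s0"), ("s1", "s1"), ("s0", "s1")}

# Hypothetical stages of a number theorist's activity: results accumulate.
GROWING = {
    "s0": {"division_theorem"},
    "s1": {"division_theorem", "bezout_identity"},
}

# A model in which a proven result is later lost; monotonicity rules this out.
FORGETFUL = {
    "s0": {"division_theorem"},
    "s1": set(),
}
```

Here `is_monotonic(GROWING, R)` holds, while `is_monotonic(FORGETFUL, R)` fails because the theorem established at `s0` is absent from the accessible stage `s1`.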

Given this, we can think of the relational models for the modal logic of informal provability as tuples \(\mathcal {M}_p = \langle \mathcal {W}_p, \mathcal {R}_p, \mathcal {I}_p \rangle \). We recall from Sect. 2 that models for structural information are also tuples \(\mathcal {M}_s = \langle \mathcal {W}_s, \mathcal {R}_s, \mathcal {I}_s \rangle \). In the next section, we look at techniques for creating a model theory that combines these two logics, with the goal of showing that the structuralist thesis is valid in this combined logic.

4 The structuralist thesis

The previous two sections have described modal logics for structural information and informal provability. In this section, we combine the two logics to see how they relate. Combining logics enables one to define a logical system that allows distinct logical operators to behave in different ways, and to interact with one another. For example, one could combine a modal epistemic logic with a modal deontic logic into a single multi-modal system. This system could then be used to explore how the modal operators for knowledge and obligation interact with one another, and it could then be leveraged to establish connections between these two concepts. The focus here will be on combining the logics of structural information and informal provability. We will demonstrate that combining these logics into a single bi-modal system enables one to show that the structuralist thesis in mathematics holds. We qualified this thesis in the Introduction as a claim about the epistemology of mathematics, namely that statements provable from a given theory express purely structural information about the objects in question. As was shown in the previous two sections, each of the two components connected in this claim, namely structurality and (informal) provability, can be characterized formally in a modal logic with a suitable model theoretic semantics. The idea of combining the resulting two logics thus presents a way to explicate (in a Carnapian sense) the underlying epistemological thesis.

4.1 Combining modal logics

There are several established formal methods and results for combining logics, modal logics in particular.Footnote 22 To illustrate, consider a bi-modal logic that contains modal operators for physical necessity, symbolized by \(\blacksquare \), and metaphysical necessity, symbolized by \(\square \). A natural bridge principle between these two would be that if \(\phi \) is metaphysically necessary, then it is physically necessary:

$$\begin{aligned} \square \phi \rightarrow \blacksquare \phi \end{aligned}$$

One way to try to establish this bridge principle is by looking at the model theory for this bi-modal logic, which would effectively combine the separate logics for the respective modalities. This could be done with what is called the fusion method, which works well when the model theories of the logics, given in terms of Kripke models, have domains of worlds that contain the same kind of entities (see, e.g., Fitting (1969) and Thomason (1984)). In this case, the Kripke models for the combined logic would be ordered tuples \(\mathcal {M}= \langle \mathcal {W}, \mathcal {R}_\blacksquare , \mathcal {R}_\square , \mathcal {I}\rangle \) with two accessibility relations, corresponding to the two necessity operators, and the domain \(\mathcal {W}\) of worlds would contain metaphysically possible worlds. Accordingly, if \(w \mathcal {R}_\square w'\) then \(w'\) is metaphysically possible with respect to w, and if \(w \mathcal {R}_\blacksquare w'\) then \(w'\) is physically possible with respect to w. Without saying more, these box operators are independent of one another. Neither is defined in terms of the other, and there are no logical connections between them. By making connections between the boxes, one makes connections between physical necessity and metaphysical necessity. These connections are often expressed in the form of bridge principles like the one above.

Connections between these accessibility relations correspond to connections between formulas that contain the two modal operators. For example, one could try to establish that if \(w'\) is physically possible with respect to w, then \(w'\) is metaphysically possible with respect to w:

$$\begin{aligned} w\mathcal {R}_\blacksquare w' \rightarrow w\mathcal {R}_\square w' \quad \text { or equivalently}\quad \mathcal {R}_\blacksquare \subseteq \mathcal {R}_\square \end{aligned}$$

If successful, establishing this connection between the accessibility relations of the model theory would entail that the bridge principle \(\square \phi \rightarrow \blacksquare \phi \) holds as a consequence.Footnote 23
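The step from the inclusion of accessibility relations to the bridge principle can be checked by brute force on a small frame. The following Python sketch is an illustration only: the three worlds and both relations are hypothetical, and \(\phi \) is represented by its extension, that is, by the set of worlds at which it is true.

```python
from itertools import chain, combinations

WORLDS = ["a", "b", "c"]

# Hypothetical relations with R_PHYS a subset of R_META: every physically
# accessible world is also metaphysically accessible.
R_META = {(u, v) for u in WORLDS for v in WORLDS}
R_PHYS = {("a", "a"), ("b", "b"), ("c", "c"), ("a", "b")}


def box(rel, world, extension):
    """Box-phi holds at world iff phi holds at every rel-successor of world."""
    return all(v in extension for (u, v) in rel if u == world)


def all_extensions(worlds):
    """Every set of worlds at which phi could be true."""
    items = list(worlds)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return (set(s) for s in subsets)


def bridge_principle_valid(r_strong, r_weak, worlds):
    """Check (box_strong phi -> box_weak phi) at every world, for every phi."""
    return all(
        (not box(r_strong, w, ext)) or box(r_weak, w, ext)
        for ext in all_extensions(worlds)
        for w in worlds
    )
```

On this frame `bridge_principle_valid(R_META, R_PHYS, WORLDS)` holds, in line with the inclusion of the relations. Swapping the two relations makes the check fail, since the converse inclusion does not hold here.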

Returning to the logics of informal provability and structural information, what is a suitable approach to combine them? As we saw, the Kripke models we have developed use domains of worlds that have quite different kinds of entities. In the logic of structurality, these worlds are identified with models or structures in the model-theoretic sense. In contrast, worlds in the logic of provability are syntactic in character, consisting of sets of sentences of a formal language. The fusion method for combining logics is thus less successful in this case. Instead of using fusions, we will therefore use the product method to connect these logics.Footnote 24 Syntactically, the resulting system is expressed in a bi-modal first-order language \(\mathcal {L}_{\textbf{P},\textbf{S}}\) with two primitive operators \(\textbf{P}\) and \(\textbf{S}\). Semantically, these operators function as box operators and sentences containing them are to be specified in terms of suitable truth conditions in the corresponding Kripke models. Given models \(\mathcal {M}_p = \langle \mathcal {W}_p, \mathcal {R}_p, \mathcal {I}_p \rangle \) for informal provability and models \(\mathcal {M}_s = \langle \mathcal {W}_s, \mathcal {R}_s, \mathcal {I}_s \rangle \) for structural information, the product of these two logics is characterized by Kripke structures of the form \(\langle \mathcal {W}_p \times \mathcal {W}_s, \mathcal {R}^\vee _p , \mathcal {R}^\vee _s, \mathcal {I}_p \times \mathcal {I}_s \rangle \). That is, the domain of each model is a set of ordered pairs of worlds, which pair up sets of sentences from \(\mathcal {W}_p\) and classical relational structures from \(\mathcal {W}_s\). Each model has two accessibility relations, \(\mathcal {R}^\vee _p\) and \(\mathcal {R}^\vee _s\), which are defined in terms of \(\mathcal {R}_p\) and \(\mathcal {R}_s\), and which correspond to the two modal operators, P and S, of the combined modal logic. For these accessibility relations, we have that \(\mathcal {R}^\vee _p, \mathcal {R}^\vee _s \subseteq (\mathcal {W}_p \times \mathcal {W}_s) \times (\mathcal {W}_p \times \mathcal {W}_s)\). They are defined as:

  • \(\langle u_1, w \rangle \mathcal {R}^\vee _p \langle u_2, w \rangle \) iff \(u_1\mathcal {R}_pu_2\) and \(w \in \mathcal {W}_s\)

  • \(\langle u, w_1 \rangle \mathcal {R}^\vee _s \langle u, w_2 \rangle \) iff \(w_1\mathcal {R}_sw_2\) and \(u \in \mathcal {W}_p\)

For the interpretation function \(\mathcal {I}= \mathcal {I}_p \times \mathcal {I}_s\), we have:

  • \(\mathcal {I}_p \times \mathcal {I}_s(\phi ) = \mathcal {I}_p(\phi ) \times \mathcal {I}_s(\phi )\)

For the modal operators P and S, the interpretation function of the combined, two-dimensional logic gives the following:

  • \(\mathcal {I}_p \times \mathcal {I}_s(\langle u_1, w \rangle , {{\textbf {P}}}\phi ) = 1\) iff for all worlds \(u_2,\) if \(u_1\mathcal {R}_pu_2\), then \(\mathcal {I}_p \times \mathcal {I}_s(\langle u_2, w \rangle , \phi ) = 1\)

  • \(\mathcal {I}_p \times \mathcal {I}_s(\langle u, w_1 \rangle , {{\textbf {S}}}\phi ) = 1\) iff for all worlds \(w_2,\) if \(w_1\mathcal {R}_sw_2\), then \(\mathcal {I}_p \times \mathcal {I}_s(\langle u, w_2 \rangle , \phi ) = 1\)
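The product construction can be spelled out on a very small example. The sketch below assumes two hypothetical component frames (two information states, two structures) and represents a sentence by the set of product worlds at which it is true; none of these names or choices come from the text above.

```python
from itertools import product

# Hypothetical component frames.
W_P = ["u0", "u1"]                                  # information states
R_P = {("u0", "u0"), ("u1", "u1"), ("u0", "u1")}    # reflexive and transitive
W_S = ["m0", "m1"]                                  # structures
R_S = {(a, b) for a in W_S for b in W_S}            # an equivalence relation

# Domain of the product model: all pairs (information state, structure).
PAIRS = list(product(W_P, W_S))

# Lifted relations: the P-relation changes only the first coordinate,
# the S-relation changes only the second coordinate.
R_P_LIFT = {((u1, w), (u2, w)) for (u1, u2) in R_P for w in W_S}
R_S_LIFT = {((u, w1), (u, w2)) for (w1, w2) in R_S for u in W_P}


def box(rel, pair, extension):
    """Box clause: phi holds at every rel-successor of the given pair."""
    return all(q in extension for (p, q) in rel if p == pair)


# Extension of a hypothetical sentence that is true exactly where the
# structure coordinate is m0.
PHI = {("u0", "m0"), ("u1", "m0")}
```

With these choices, `box(R_P_LIFT, ("u0", "m0"), PHI)` is true, since moving along the first coordinate never leaves `m0`, while `box(R_S_LIFT, ("u0", "m0"), PHI)` is false, since the pair `("u0", "m1")` is accessible along the second coordinate. This shows that the two operators vary independently unless a further connection between the coordinates is imposed.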

The resulting logic is a bimodal logic of the form \(\textsf {S4} \times \textsf {S5}\). In this new framework, the structuralist thesis can be expressed as a schematic conditional sentence of the form:

$$\begin{aligned} {{\textbf {P}}}\phi \rightarrow {{\textbf {S}}}\phi \qquad \text {(ST)} \end{aligned}$$

While it is straightforward to state, this principle represents the core of the philosophical position outlined in this paper. As already stated in the Introduction, we argue that the structuralist thesis can be interpreted as an epistemological thesis about the nature of mathematical knowledge. The thesis claims that mathematical knowledge, as given to us by the method of informal proof, is structural in character. Mathematical truths that have been established provide us with purely structural information relevant to the objects that those truths are about. For instance, the Euclidean division result is a theorem in number theory and hence expresses a purely structural fact about number systems.

4.2 Mathematical structuralism and the structuralist thesis

Before looking at the status of the bridge principle ST as a modal-logical truth, it is worthwhile to situate the modal-logical formulation of the structuralist thesis in the wider context of recent discussions on mathematical structuralism. Two points are noteworthy here. First, one can find in the structuralist literature alternative formulations of the thesis, often focusing on the properties of mathematical objects (see, e.g., Shapiro (1997) and Linnebo & Pettigrew (2014)). We have applied the operator ‘\(\dots \) is structurally true’ to sentences, but we appreciate that one can also apply the predicate to properties, such that objects may have some properties that are structural and some that are not. One can then formulate the structuralist thesis as a claim about the kinds of properties that mathematical objects instantiate. This approach is developed in the work of many who have written about mathematical structuralism. For example, Charles Parsons has articulated the view as follows:

The idea behind the structuralist view of mathematical objects is that such objects have no more of a ‘nature’ than is given by the basic relations of a structure to which they belong (Parsons, 2004, p. 57).

On this approach the core of mathematical structuralism is that mathematical objects are mere positions in structures, having no internal nature and thus instantiating no non-structural properties.Footnote 25

Unfortunately, this straightforward characterization of mathematical structuralism appears to be false, as mathematical objects have non-structural properties, such as being abstract. In fact, Burgess (1999) has argued that this version of the structuralist thesis is inconsistent, as the property of having only structural properties is itself a non-structural property. In order to rescue this characterization of the structuralist thesis, some have proposed restricted versions of the thesis. On this approach, which has had some recent success, one can define in a non-trivial way a subset of properties of mathematical objects and argue that all such properties are structural.Footnote 26

The approach we take here moves the focus from structural properties to structural sentences, thus understanding structuralism as an epistemological thesis, rather than a metaphysical one. Our approach tries to capture the core of mathematical structuralism as a thesis about the character of mathematical knowledge. This is not to deny the metaphysical aspects of mathematical structuralism, but to recognize and explore the idea that structuralism has a distinctive understanding of mathematical epistemology. According to this understanding, mathematical knowledge is purely structural in character.

As was pointed out in Sect. 3, mathematical knowledge is undeniably connected to the notion of deductive proof. Mathematical proofs, constructed as informal proofs, are the primary vehicle through which we attain mathematical knowledge. The theorems that result from informal proofs become sentences that we are warranted to believe and that we can claim to know. Connecting informal proof with structural information, we thus arrive at a more precise version of the epistemological structuralist thesis: if a mathematical sentence is informally provable, then it expresses purely structural information about the objects that it refers to.

The second point to note here concerns the modal character of our approach. As introduced above, the structuralist thesis is expressed as the bi-modal statement (ST). The modal nature of the thesis clearly bears some similarity to a prominent version of eliminative structuralism, namely the modal structuralism developed in Hellman (1989). Roughly put, Hellman proposes in his book a modal-logical reconstruction of the “structural content” of theorems of arithmetic and set theory that is nominalist in character. His approach is based on the translation of these theories into second-order modal logic, in which the two alethic modal operators are treated as primitive logical symbols. In chapter 1 of the book, he gives a “modal-structural interpretation” of Peano arithmetic that consists of two steps: (i) a “categorical” step, and (ii) a “hypothetical” step. The first component is simply the assumption that systems satisfying the Peano axioms possibly exist. In Hellman’s modal framework, this is expressed by the purely logical statement:

$$\begin{aligned} \lozenge \exists X \exists x \exists f (\mathsf {PA_2}(X, x, f)), \end{aligned}$$

where \(\mathsf {PA_2}\) expresses the conjunction of the axioms of Peano arithmetic and ‘X’, ‘x’, and ‘f’ are bound variables substituted for the primitive terms ‘N’ (x is a number), ‘0’ (the numeral for zero), and ‘s’ (y is the successor of x) of the theory. Assuming that the categorical thesis holds, the hypothetical step in Hellman’s reconstruction then consists in the translation of an arithmetical theorem \(\phi \) (expressed in the language of second-order Peano arithmetic) into a purely logical conditional sentence of the form:

$$\begin{aligned} \Box \forall X \forall x \forall f (\mathsf {PA_2}(X, x, f) \rightarrow \phi (X, x, f)). \end{aligned}$$

This statement is taken to express the “modal-structural content” of the arithmetical theorem \(\phi \): necessarily, \(\phi \) is true in any system satisfying the axioms of \(\mathsf {PA_2}\). This understanding shows that while Hellman’s modal reconstruction is syntactic in nature, the underlying structuralist motivation is clearly a semantic one. What the modal translation of \(\phi \) asserts is, in fact, a metatheoretic claim, namely that a certain arithmetical fact holds in any number system satisfying the relevant axioms in question. Thus, the boxed conditional statement can be understood as a reformulation of the metatheoretical claim that

$$\begin{aligned} \Box \forall \mathcal {M} (\mathcal {M} \models \mathsf {PA_2} \Rightarrow \mathcal {M} \models \phi ). \end{aligned}$$

Thus, the structural content expressed by \(\phi \) is clearly model-theoretic in nature.Footnote 27

How does Hellman’s modal eliminative structuralism relate to the modal characterization of the structuralist thesis outlined above? While a closer comparison cannot be given in the present context, it is instructive to note several points of contact between the accounts. The first point to mention is that both approaches aim to qualify the structural content of mathematical sentences in a modal framework. Thus, unlike in recent work on non-eliminative structuralism, the focus here is on the structural information expressed by sentences and not on the structural properties of objects or positions in a structure. Moreover, this content is captured in both contexts in terms of a conditional statement. In Hellman’s theory, this is the hypothetical component of his modal reconstruction, that is, the translation of a statement into a boxed, universally quantified conditional sentence. In our approach, the relevant structuralist principle is (ST), connecting the modal logics of informal provability and of structural information.

This shared focus on conditional statements also points to several differences between the two approaches: Hellman’s modal structuralism is motivated by a semantic version of an if-thenism about mathematical knowledge. This is, roughly put, the view that pure mathematics is the study of consequences from freely chosen axiom systems expressed in a formal language. More specifically, the implication symbol ‘\(\rightarrow \)’ in his modal translation can be viewed as an object-theoretic expression of the metatheoretic consequence relation ‘\(\models \)’. In contrast, the implication symbol in our account expresses a bridge result between two fields in the epistemology of mathematics, namely (informal) provability and the structural nature of mathematical knowledge.

Moreover, unlike in the approach suggested here, the notion of informal provability of a theorem is not mentioned in Hellman’s account. While motivated by a model-theoretic notion of semantic consequence, his if-thenism seems compatible with a notion of the derivability of a statement from certain axioms in a formal system of deduction. A final difference concerns the role of axioms in mathematical reasoning. In Hellman’s account, axiomatic principles such as the axioms of \(\mathsf {PA_2}\) are explicitly mentioned in his reconstruction, namely as conditions in the antecedent of the hypothetical statement described above. In contrast, in our account, axioms are not explicitly mentioned on the syntactic side of the modal reconstruction. Nevertheless, they might be included or assumed in the models (understood as epistemic states or proof situations) of the logic of provability sketched in the previous section. We will address this point in the final section.

Returning to our modal reconstruction, we saw that the epistemological version of the structuralist thesis can be represented in the bi-modal logic that combines the modal logics of informal provability and structural information. It is captured by the bridge principle (ST). Within the bi-modal framework, using the definition of product models given above, we can now show that the structuralist thesis is valid for a particular class of models. Specifically, the models we consider are those where the domain \(\mathcal {W}_p \times \mathcal {W}_s\) is restricted to pairs \(\langle u, w \rangle \), where u is a set of sentences and w is a classical relational structure, and if \(\phi \in u\), then w satisfies \(\phi \). Building on our discussion in Sect. 3 of informal provability, the sentences \(\phi \) can be thought of as those sentences that have already been established at a given point in time. The structuralist thesis (ST): \({{\textbf {P}}}\phi \rightarrow {{\textbf {S}}}\phi \) can be evaluated semantically relative to this restricted class of product frames.

Let \(\phi \) be a modal-free sentence of \(\mathcal {L}_{\textbf{P},\textbf{S}}\). Assume that \({{\textbf {P}}}\phi \) is true in a given state u of a suitable model \(\mathcal {M}\), that is, \(\mathcal {I}(\langle u, w \rangle , {{\textbf {P}}}\phi ) = 1\). It follows from axiom T (of system S4) that \(\mathcal {I}(\langle u, w \rangle , \phi ) = 1\). How should we interpret the fact that \(\mathcal {I}(\langle u, w \rangle , \phi ) = 1\) in the present context? Recall from above that u is understood here as a purely syntactic model, i.e., a set of sentences consisting of axioms and proved statements of a given mathematical field. A natural interpretation is to think of the truth of a statement in terms of a semantic consequence relation. Thus, what the condition above expresses is simply the fact that sentence \(\phi \) follows semantically from the sentences in u, that is, \(u \models \phi \). Moreover, given the restriction of product worlds to tuples \(\langle u, w \rangle \) where each formula \( \psi \in u\) is semantically true in structure w, it follows that our modal-free formula \(\phi \) is also true in w, i.e., \(w \models \phi \). Consequently, by the isomorphism lemma mentioned in Sect. 2, we have that \(\mathcal {I}(\langle u, w \rangle , {{\textbf {S}}}\phi ) = 1\), since for any product world \(\langle u, w' \rangle \), if \(w \mathcal {R}_s w'\), we have \(\mathcal {I}(\langle u, w' \rangle , \phi ) = 1\). Hence, the structuralist thesis (ST) holds in \(\mathcal {M}\).
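The last step of this argument, the appeal to isomorphism invariance, can be illustrated on a finite example. The Python sketch below is a toy under several assumptions that are not part of the account above: structures are small directed graphs, sentences are three named graph conditions, structural accessibility is taken to be relabeling along bijections of a fixed node set, and the truth of a modal-free sentence at a product world is read off from the structure coordinate. The brute-force check stands in for the lemma on this one example and does not prove it.

```python
from itertools import permutations

# Hypothetical sentences of a relational language with one binary relation.
SENTENCES = {
    "symmetric": lambda nodes, edges: all((b, a) in edges for (a, b) in edges),
    "irreflexive": lambda nodes, edges: all(a != b for (a, b) in edges),
    "has_edge": lambda nodes, edges: len(edges) > 0,
}


def satisfies(nodes, edges, name):
    """w satisfies phi: the named condition holds in the graph."""
    return SENTENCES[name](nodes, edges)


def isomorphic_copies(nodes, edges):
    """All relabelings of the graph along bijections of its node set."""
    ordered = sorted(nodes)
    copies = set()
    for perm in permutations(ordered):
        mapping = dict(zip(ordered, perm))
        copies.add(frozenset((mapping[a], mapping[b]) for (a, b) in edges))
    return copies


def structurally_true(nodes, edges, name):
    """S-phi at w: phi holds in every isomorphic copy of w."""
    return all(
        satisfies(nodes, copy, name) for copy in isomorphic_copies(nodes, edges)
    )


# A restricted product world <U, (NODES, EDGES)>: the structure satisfies
# every sentence in the information state U.
NODES = {0, 1, 2}
EDGES = frozenset({(0, 1), (1, 0)})
U = {"symmetric", "irreflexive", "has_edge"}
```

For every sentence in `U`, `satisfies(NODES, EDGES, name)` and `structurally_true(NODES, EDGES, name)` agree, which mirrors the move from \(w \models \phi \) to the truth of \({{\textbf {S}}}\phi \) at the product world.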

5 Interpreting the structuralist thesis

Given the logical reconstructions of structural information and informal provability stated above, we see that the structuralist thesis holds. For any statement expressed in the language of a given mathematical field, it can be shown that if the statement presents a theorem, that is, if it is informally provable, then it presents purely structural information about the objects of that field. As presented above, this thesis is based on an interesting bridging result between the two modal logics considered. However, one could object here that this logical result does not help us to clarify how the two notions of structurality and informal provability are conceptually connected. In particular, our logical reconstruction gives no explanation as to why the structuralist thesis holds, i.e., why provable statements or theorems express purely structural information about mathematical objects. In the remaining part of the paper, we want to address this issue.

5.1 The semantic content of informal proofs

Let us first consider the notion of informal provability and, in particular, the role of mathematical axioms in informal proofs. A proof in mathematics can be described as a demonstration that a certain statement is (necessarily) true. Demonstrations of this form are often conceived of purely linguistically, that is, as syntactic derivations in a formal or semi-formal language. Understood in this sense, it is not surprising that the relation between provability and structurality seems unclear, given that the two concepts appear to belong to different spheres of mathematical thinking. Structurality was described above as an essentially semantic or model-theoretic concept, specified in terms of the notion of isomorphism invariance.

Provability, in turn, is often viewed as a purely syntactic notion, concerning inferential relations between mathematical statements. Nevertheless, as has been stressed in the recent literature on informal provability, this purely syntactic reading of mathematical proofs overemphasizes the analogy with formal proofs in a logical system. Unlike derivations in some formal system, informal proofs in mathematical practice do have ‘semantic and intuitive components’ (see Leitgeb (2009)). Compare Rav on this difference between formal and informal proofs:

Let us fix our terminology to understand by a proof a conceptual proof of customary mathematical discourse, having an irreducible semantic content, and distinguish it from derivation, which is a syntactic object of some formal system (1999, p. 11).

How can one think of the semantic content of an informal proof, and what role do semantic considerations play in the act of demonstrating the necessary truth of a mathematical statement? Leitgeb, in his recent study of informal provability, gives the following insightful characterization of the semantic component of proofs:

We can informally prove propositions about mathematical structures (...) semantically, by extracting propositional information from the complex categorical concepts that determine these structures (Leitgeb, 2009).Footnote 28

The picture suggested here is this: a mathematical demonstration of a statement allows one to ‘extract’ information about a given mathematical structure which is implicitly contained in the concepts specifying this structure. Take, for instance, the case of rings. A ring is an algebraic structure that is determined by certain basic properties, e.g., by the fact that addition is commutative, that addition and multiplication are both associative, and that the distributive laws hold between them. An informal proof of a theorem in ring theory thus extracts propositional information about such structures, notably information which is already implicitly contained in the basic properties or structural facts that specify what a ring is.

Note that this ‘semantic’ account of informal proofs is captured by our modal-logical reconstruction of provability. As we saw, sentences containing a provability operator can be evaluated semantically relative to proof situations or epistemic states of an arbitrary mathematician (or a community of mathematicians). These worlds are syntactic in character, as they are sets of sentences that are warranted as true at a given point in time. Nevertheless, as suggested above, the truth of a formula of the form “\(\phi \) is provable” in a particular world u can be understood in a genuinely semantic way, that is, in terms of a relation between sentences and mathematical structures. Specifically, the truth of \({{\textbf {P}}}\phi \) implies that \(\phi \) is a model-theoretic consequence of u. Sentence \(\phi \) is thus true (in the Tarskian sense) in any structure in which the established sentences in u hold. Moreover, given our understanding of product worlds of the form \(\langle u, w \rangle \), we can say that an informal proof of \(\phi \) allows us to extract propositional information about a particular structure w.
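
Schematically, this semantic reading can be spelled out as follows, where \(\mathcal{M}\) ranges over structures for the mathematical language and \(\psi\) over the sentences in u (this notation is introduced here only for illustration and is not part of the official semantics given earlier):

\[ \text{if } {{\textbf {P}}}\phi \text{ is true at } u, \text{ then for every } \mathcal{M}: \quad \big( \mathcal{M} \models \psi \ \text{ for all } \psi \in u \big) \ \Rightarrow \ \mathcal{M} \models \phi . \]

For example, if u contains the ring axioms and \(\phi \) is the sentence \(\forall a \, (a \cdot 0 = 0)\), then \(\phi \) is a model-theoretic consequence of u in this sense: it holds in every structure satisfying the ring axioms, that is, in every ring.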

5.2 The role of axioms in informal proofs

This brings us to the specific role of mathematical axioms in informal proofs. Here again, it is important to emphasize first a difference from the notion of formal derivability. Formal proofs are derivations in a formal system, derivations based on axioms and inference rules. Inferences in the context of an informal mathematical proof, in contrast, are usually not specified relative to such a system of axioms or rules. Thus, as Leitgeb puts it, mathematical axioms are usually ‘not essential to the “proofhood” of the proofs in which they figure’ (2009, p. 270).Footnote 29 To assess this point, it is instructive to distinguish between two important roles or functions that axioms can play in mathematical practice.Footnote 30 A central function of axioms is that of systematizing the propositions of a theory. As Schlimm (2013) puts it, “[h]ere, the axioms—together with some notion of consequence (...)—determine the sentences (theorems) that can be derived from them” (ibid., p. 48). Understood in this way, axioms clearly play an important inferential role in mathematics: they are used in proofs to derive new results about a given subject matter.

Nevertheless, as philosophers of mathematics have pointed out, axioms also have non-inferential roles in modern mathematics, in particular, as semantic conditions for reasoning about mathematical objects. Moreover, following Schlimm and Leitgeb, we can think of a main function of axioms in mathematical practice as ‘partial definitions’ of general mathematical concepts. The axioms are thus constitutive of mathematical objects in the sense of stating the basic structural properties that these entities must satisfy. For instance, the ring axioms define which properties an algebraic structure must have in order to count as a ring. Compare again Schlimm (2013) on a semantic reading of what he calls the “prescriptive” role of axioms:

If the meanings of the primitives of an axiomatic system depend entirely on the relations that are expressed by the axioms, then the axioms are considered to be prescriptive (or normative), i.e., they determine the subject matter under consideration, and one also speaks of them as ‘implicit definitions’. (...) Considered in this way, a system of axioms generally cannot determine a single interpretation for its terms, but only the structure of the relations that hold between them; understood in this way, axiom systems are also called relational, structural, algebraic, or abstract. In other words, such a system of axioms does not define one single model, but a class of models (ibid., pp. 50–51).

Given the above characterization of informal proofs as the method of ‘extracting propositional information’ implicit in complex concepts, it is clear that the role of axioms thus conceived is to define such concepts of mathematical structures.Footnote 31 Thus, axioms specify the primitive structural facts from which new information can be gained inferentially.Footnote 32

Finally, given the semantic account of informal provability, why should we think of the propositional information extracted from axiomatically defined mathematical concepts as purely structural in character? Note first that the properties or concepts expressed in mathematical axioms are by definition structural in character. The axioms specify the primitive structural facts (or information) about the mathematical objects in question. In a sense, they are constitutive of the structures. Thus, any axiom of a theory is clearly isomorphism invariant in the sense specified above. Second, the method of gaining new information in terms of a proof arguably preserves this primitive structure (specified by the axioms) simply because nothing is assumed in a proof other than logical inference rules and informal set theory. An informal proof extracts information implicit in the primitive facts about a domain of objects. The only resources used in this process are logical and set-theoretical ones. More specifically, mathematical demonstrations usually make informal use of valid deduction rules, for instance, the rules of implication introduction or of negation introduction in indirect proofs. It is in this sense that logic allows one to deduce information about certain mathematical structures.

Moreover, in the proof of a theorem, mathematicians usually make use of the language of informal set theory in order to describe the relevant mathematical structures or their properties. This includes the ubiquitous reference to set-theoretic operations (such as union, intersection, or taking the power set, etc.), to basic set-theoretic notions (such as the cartesian product or the partition of a class induced by an equivalence relation), or to different types of set-theoretically defined mappings between mathematical objects. Both logic and set theory are arguably topic neutral in the sense that no mathematical content or information is added in the process of applying logical rules and set-theoretic concepts in a proof.Footnote 33 Given this topic neutrality, it follows that the propositional information extracted in an informal proof is also structural in the above sense.
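
The isomorphism invariance of axioms can be illustrated with the ring example. As a simplifying assumption made only for this illustration, take an isomorphism between rings \(R\) and \(R'\) to be a bijection \(f: R \rightarrow R'\) with \(f(a + b) = f(a) +' f(b)\), where \(+'\) denotes addition in \(R'\) (the symbols \(f\), \(R'\) and \(+'\) are introduced here and are not part of the framework above). If addition is commutative in \(R\), then for any \(a', b'\) in \(R'\), with \(a' = f(a)\) and \(b' = f(b)\),

\[ a' +' b' = f(a) +' f(b) = f(a + b) = f(b + a) = f(b) +' f(a) = b' +' a' . \]

Commutativity of addition is thus transported along \(f\). The derivation uses only the axiom as it holds in \(R\), the set-theoretic notion of a bijection, and the fact that \(f\) preserves addition; nothing about the particular nature of the elements of \(R\) or \(R'\) enters into it.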

6 Conclusion

We have given a logical analysis of the structuralist thesis in mathematics, namely the claim that mathematical theorems express only structural information about the objects of their particular subject field. We did so by expressing the thesis in a bi-modal framework, specifically, in a product of an S4 modal logic of informal provability and an S5 logic of structurality. Concerning the latter, the guiding idea was to extend a suitable language of mathematics, e.g. the language of first-order set theory, by a modal structure-operator which qualifies the information expressed by a statement as follows: if \(\phi \) is a well-formed statement of the mathematical language, then \({{\textbf {S}}}\phi \) expresses that ‘Structurally, it holds that \(\phi \).’ As was shown, the logical behaviour of this structure operator can be described semantically by Kripke models whose accessibility relation is specified in terms of a general notion of isomorphism between mathematical objects (of a certain type). Thus, a statement of the form \({{\textbf {S}}}\phi \) is true in such a model if the information expressed by \(\phi \) is isomorphism invariant.

A central aim in the paper was to connect this logic of structurality with the modal logic of informal provability first introduced by Gödel and discussed in detail in Leitgeb (2009). In particular, in Sect. 3, we proposed a semantic interpretation of such a logic in terms of Kripke models, where the worlds of these models are syntactic in nature. The idea is to think of the worlds in a Kripke model as epistemic stages of an idealized mathematician at a given point in time. Roughly put, a stage presents the class of theorems in a given mathematical field, and stages accessible from a given stage present possible theoretical extensions of such a class of established results. This semantic interpretation of the logic of informal provability allowed us to combine the two logics and to express the structuralist thesis in a resulting bi-modal framework containing both the provability and the structurality operators. More specifically, it was shown that, based on a suitable restriction of the product worlds, the thesis can be expressed as a bridge result of the form \( {{\textbf {P}}} \phi \rightarrow {{\textbf {S}}}\phi \) in the new logic.

While this modal result does not explain why the structuralist thesis holds, it seems well motivated by considerations concerning the conceptual relation between informal provability and structurality. In particular, following the considerations given by Rav and Leitgeb, we saw that the notion of informal provability is best understood as having a semantic component. Specifically, we proposed that an informal proof is a method of extracting propositional information by logical and set-theoretic means from axiomatically defined concepts. If proofs are understood in this way, the modal reconstruction of the structuralist thesis seems well justified.