1 Introduction

Proof is the gold standard for mathematics. But proofs are also hard to find. Frequently, mathematicians seek relief in evidence that merely supports a mathematical conjecture. Although such evidence is of lesser value than proof, it is not valueless to the working mathematician. Confirmation by instances is an example. According to Collatz’s (2010) conjecture, the repeated application of a simple procedure (viz., divide the input number by two if it is even, or triple it and add one if it is odd) to any positive integer will always reach the number 1 (at which point the process is taken to end). As of 2022, this as yet unproven mathematical conjecture has been computer-checked for all starting integers up to \({2}^{68}\). Clearly, this fact alone does not guarantee truth: several important numerical conjectures proved false for large enough values despite the enumerative evidence initially supporting them.Footnote 1 In practice, however, mathematicians often regard the vastness of the sample of positive instances as non-negligible support for a conjecture.
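To make the procedure concrete, here is a minimal Python sketch of the kind of instance-checking just described. The function names, the step budget, and the bound of 10,000 are illustrative assumptions; the sketch is in no way comparable to the \({2}^{68}\) verification cited above, and, like that verification, it yields only enumerative evidence rather than proof.

```python
def collatz_steps(n, max_steps=10_000):
    """Return how many steps n needs to reach 1, or None if the
    (arbitrary, illustrative) step budget is exhausted first."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        if steps >= max_steps:
            return None
        # Halve even inputs; triple odd inputs and add one.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


def all_reach_one(limit):
    """Check every starting value from 1 to limit inclusive."""
    return all(collatz_steps(n) is not None for n in range(1, limit + 1))


if __name__ == "__main__":
    print(collatz_steps(27))
    print(all_reach_one(10_000))
```

A positive result from `all_reach_one` confirms the conjecture only for the starting values actually checked.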

This paper addresses another common mode of non-deductive reasoning in mathematics: inference from analogy. This is the form of reasoning whereby, from the known similarities between two mathematical domains, one concludes that some claim that holds in the source is also true of the target. Despite its defeasible nature, analogical reasoning plays an important role in the history of mathematics. For instance, one of the greatest results of 20th-century mathematics, due to Deligne (1974), proves a conjecture by André Weil that is a close analogue in algebraic geometry of Riemann’s hypothesis (or, more precisely, an analogue of the hypothesis for varieties over finite fields). According to several experts, Weil’s and Deligne’s results “provide some of the best reasons for believing that the Riemann hypothesis is true” (Edwards, 1974:298)—even better than the enumerative evidence uncovered so far (cf. Deninger, 1994:493).

Notwithstanding the central role that analogies occupy in the mind and practice of working mathematicians, the inferential mechanism whereby results gathered in a familiar mathematical source may support conjectures in an analogous target has received marginal attention in contemporary philosophy of mathematics. A possible explanation is an underlying conviction, acquired through acquaintance with published articles and other finished works in the field, that the standards of evidence in mathematics are solely the deductive ones of proof. If that is the reason behind the observed neglect, however, it stands in need of reconsideration. Because of the professional expectations that accompany them, most finished works in mathematics are unlikely to be reliable indicators of the processes by which new mathematical results are generated. An epistemology of mathematics that considered exclusively those sources and neglected the cognitive and social dimensions of mathematical research would therefore be unlikely to make serious progress with regard to the question of how mathematical knowledge is possible.

For those who reject the purely deductive account of evidential standards in mathematical research as (to adapt a famous simile by Kuhn, 1962:1) “no more likely to fit the enterprise… than an image of a national culture drawn from a tourist brochure”, one of the most difficult philosophical tasks consists in accommodating the inductive (in this specific case, analogical) methodology that emerges from the research practice of mathematicians into a more general epistemological framework. In other words, the aim is to show that much of what expert mathematicians regard as strong reasoning by analogy in their field—as it transpires from how they tend to approach an unsolved problem, select new research questions, or assess the promise of a grant proposal—can be fitted within a more general framework for inductive reasoning in pure mathematics. If such an endeavor should succeed, the claim that mathematicians regularly deal in evidential coinage other than shiny gold proof would turn out to be not only historically well-grounded, based as it is on their testimony and practice, but also philosophically defensible, since the underlying inductive foundations of their method would be fully elucidated.Footnote 2

The two most recent discussions with the philosophical ambition just described, by Corfield (2003, §5.4) and Bartha (2009, §5.7), offer widely divergent suggestions. Drawing from Polya’s (1954b) pioneering work, Corfield (2003:103–5) defends Bayesian confirmation theory (BC) as a promising basis for developing an account of confirmation by analogy—without, however, providing details as to how BC’s tools can be applied. Conversely, Bartha (2009:279) denies that a Bayesian approach is even an option. On his view, an analogical argument in mathematics should not be seen as functioning as ordinary Bayesian evidence—in the ‘incremental’ sense of a conjecture’s probability being greater in light of it than given background knowledge alone; rather, it should be viewed as influencing a mathematician’s prior credences before any further evidence is sought. Bartha proposes to justify this role of analogy in establishing a ‘non-incremental’ form of confirmation for a conjecture (equivalent to regarding it as prima facie plausible) by means of an alleged constraint of symmetry on epistemic rationality.

Following up on these contributions, our aim in this paper is to outline a novel proposal regarding how an analogy with a more familiar domain may contribute to the inductive support of a conjecture in mathematics. On this view—a middle ground in between Corfield’s and Bartha’s—confirmation by analogy should be understood as possessing both incremental and non-incremental aspects. The former, we will argue, has a relatively straightforward Bayesian analysis as an instance of transitive confirmation (cf. Hesse, 1970; Roche & Shogenji, 2013). As for the non-incremental aspect, we believe that a fully formal treatment is not forthcoming. However, we will defend the role of the non-incremental notion in freely yet rationally informing the credences of mathematicians in those circumstances in which no new mathematical evidence is introduced. As we will demonstrate by means of various case-studies, our account captures several aspects of the logic of analogical inference in pure mathematics and therefore constitutes an important addition to an all-embracing, non-revisionary epistemology for this field.Footnote 3

The discussion below will proceed as follows. Section two will present the symmetry-based proposal by Bartha (2009) and illustrate its failures in recognizing the different forms of confirmatory support that analogical arguments can provide. Section three will move on to discuss the main obstacles for proposals such as Polya’s (1954b) and Corfield’s (2003), which purport to couch inductive support by mathematical analogy in Bayesian terms (in parallel to one standard treatment of inductive reasoning in the empirical sciences). In section four, we will offer illustrations of how, assuming that some general issues with the application of BC to mathematics can be set aside, an analogical argument that introduces new mathematical evidence can confirm a conjecture from BC’s standpoint. In section five, we will address cases of inductive support by analogy that arguably escape the Bayesian definition of confirmation, while remaining compatible with the adoption of BC as a doctrine about rational credence updating. Section six will conclude with an overview of our arguments and directions for future research.

2 Not Just ‘Plausibility’

Based on a reconstruction of various episodes in the history of mathematics, Bartha (2009, ch. 5) defends the view that analogies often underwrite the beliefs and expectations of working mathematicians about yet unexplored domains. When it comes to explicating this function, however, Bartha (2009) argues that the standard tools of probabilistic confirmation theories do not apply. Specifically, the inductive support that analogies provide should not be understood in the Bayesian terms of incremental confirmation—as when a piece of evidence increases a hypothesis’ probability. Instead, Bartha gives the sketch of an alternative account, encompassing both the empirical and mathematical sciences, on which analogy’s main scope is to inform expert ‘prior opinions’, i.e., expert credences in conjectures before any further evidence is sought.

His proposal is in two parts. First, Bartha claims that credences in the domain of mathematics should not be understood in terms of probabilities but in terms of “relative conditional betting quotients” (2009:182). Talk of betting quotients serves mainly to escape the problem (which will be discussed in section three) that our uncertainty towards mathematical conjectures is seemingly inexpressible in probabilistic terms because of logical omniscience. In the briefest terms, the problem is that an unfettered application of the axioms of probability apparently requires that a rational agent assign a credence of either zero or one to every decidable mathematical claim.

Second, Bartha proposes that the “[conditional] betting quotients are symmetry based” (2009:182). In more detail, suppose that a mathematician assigns a relatively high conditional betting quotient to H, formally Q (H | E), where E is evidence about a source domain and H is a well-confirmed hypothesis or theorem about that domain. She then finds out that H has an analogue in another mathematical domain, H*, that she had not considered before or such that her conditional betting quotient Q (H* | E*) was negligibly low (where E* stands for a set of properties similar to those in E that the target is known to possess). Under these circumstances, Bartha (2009:182) contends that a symmetry constraint on rationality imposes a revision of her Q (H* | E*) to some “non-negligible” value, i.e., a value closer to Q (H | E). The underlying idea is that there is something asymmetric and incoherent about giving such a high quotient to H, without even giving a shot to the analogous bet in the target, absent decisive evidence against it.

By means of the doctrine that probabilities about mathematical conjectures are betting quotients and that a symmetry constraint on rationality governs them, Bartha recovers a sense in which an analogy with a familiar source can make a difference to a mathematician’s trust in a conjecture about a target without making use of probabilities. In particular, on Bartha’s account an analogy can justify assigning a ‘non-negligible’ value to one’s conditional betting quotients in some hypothesis H* before any new evidence for it is sought, equivalent to making the hypothesis “prima facie plausible” (2009:296). Such a role in confirmation is eminently non-Bayesian in at least one sense: if Bartha is correct, we can have a rational revision of opinions regarding mathematical subjects in spite of the fact that the Bayesian definition of confirmation is not satisfied and that the revision is not mandated by the Bayesian rule of credence updating known as conditionalization (whereby an agent’s posterior probability in a hypothesis must be exactly equal to its prior probability conditional on any new evidence).Footnote 4

In what follows, we will not question Bartha’s arguments for the claim that a symmetry principle constrains conditional betting quotients. The concerns that we will voice below are directed exclusively at the applicability of his framework to concrete examples of analogical reasoning in mathematics. That is to say, assuming for the sake of the argument that a symmetry-based justification can be made to work, is inductive support of mathematical conjectures from analogy best understood along the lines that Bartha indicates? We believe that the answer is negative. Let’s take a look at some of the examples that support our contention.

Example 2.1

Riemann’s Hypothesis

One problem with Bartha’s symmetry-based account is that it has limited scope. It identifies a way for an analogy to contribute distinctly to the plausibility of a conjecture only when the latter is assigned no or very little conditional betting quotient to begin with. However, this neglects cases in which the conjecture that the analogy supports already enjoys ‘non-negligible’ credence (or betting quotient). For instance, it may be thought that Deligne’s (1974) proof of a close analogue of Riemann’s hypothesis in algebraic geometry adds to the credibility of the latter conjecture even though, before Deligne’s work, the hypothesis was already regarded as plausible by the majority of working mathematicians. Insofar as Bartha’s proposal solely addresses those cases in which a mathematical conjecture was assigned no or negligible credence, it does not cover the role of a proof such as Deligne’s in indirectly supporting the Riemann hypothesis.

Example 2.2

Euler Characteristic

Second, the symmetry-based account seems unable to account for the different strengths that analogical arguments have in mathematical practice. The following example from solid geometry will be helpful to illustrate this point. Considering that in plane geometry the number of vertices (\(V\)) and edges (\(E\)) in convex polygons is the same (\(V = E\)), Euler (1758) asked if any similar regularity holds for the elements of all convex polyhedra. The correct answer is that a convex polyhedron’s edges (\(E\)), vertices (\(V\)) and faces (\(F\)) are governed by the following stable relation:

$$V - E + F = 2 \tag{1}$$

Here are two distinct analogical arguments for this conclusion. The first (suggested by Polya, 1954a:43) starts from the algebraic observation that the relation \(V = E\) can be rewritten as:

$$V - E + F = 1 \tag{2}$$

This suggests that the regularity we may be looking for is an alternating sum of the number of zero-dimensional elements (\(V\)), the number of one-dimensional elements (\(E\)), and that of two-dimensional elements (\(F\)). By analogy, and introducing the three-dimensional element of the number of solids (\(S\)), Eq. (2) induces the following generalization to the three-dimensional case (which can be easily verified to hold true for cubes, tetrahedra, and dodecahedra)Footnote 5:

$$V - E + F - S = 1 \tag{3}$$

This reasoning yields the correct conclusion: since \(S = 1\) holds for all polyhedra, (3) entails (1).
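The relations (1) to (3) can be checked mechanically for the solids named above. The following Python sketch does so; the vertex, edge, and face counts are the standard ones for these solids, while the function and variable names are illustrative assumptions. Checking three instances is, of course, only confirmation by instances.

```python
# Standard (V, E, F) counts for the three solids mentioned in the text.
SOLIDS = {
    "tetrahedron": (4, 6, 4),
    "cube": (8, 12, 6),
    "dodecahedron": (20, 30, 12),
}


def euler_sum(v, e, f):
    """Left-hand side of Eq. (1): V - E + F."""
    return v - e + f


def polya_sum(v, e, f, s=1):
    """Left-hand side of Eq. (3): V - E + F - S, with S = 1 solid."""
    return v - e + f - s


def polygon_sum(n):
    """Left-hand side of Eq. (2) for a convex n-gon:
    n vertices, n edges, and a single face."""
    return n - n + 1


if __name__ == "__main__":
    for name, (v, e, f) in SOLIDS.items():
        print(name, euler_sum(v, e, f), polya_sum(v, e, f))
```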

An alternative argument, inspired by Cauchy (1811), runs as follows (cf. Lakatos, 1976:7–9). Imagine smashing a polyhedron, say a cube, until it is flat. The edges do not break, but may be elongated in the process. The result is a two-dimensional figure with the same number of vertices and edges as the cube, but with one face fewer (one can think of it as two squares, one inside the other, with the corresponding vertices connected by an edge; see Fig. 1). If we now draw lines between the unconnected vertices of the smashed cube, we obtain small triangles of different sizes and shapes. We note that the relation between vertices, edges, and faces remains constant, since each new line adds exactly one edge and one face. Further, if we remove one of the triangles obtained by our drawing, there are only two options: either we remove an edge and a face, or we remove a vertex, a face and two edges. Either way, \(V - E + F\) remains constant (that is, it remains equal to 1). By analogous reasoning, one conjectures that for as yet unobserved smashed polyhedra the relation \(V - E + F\) will remain constant. Indeed, we can easily verify that the same is also true for tetrahedra and dodecahedra.Footnote 6

Fig. 1 Triangularization method for a smashed cube
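The bookkeeping behind the Cauchy-style argument can be sketched in Python as follows. The sketch tracks only the counts (V, E, F) and not the geometry, so it does not check that a given sequence of removals is geometrically available; the move labels "A" and "B" and all function names are assumptions introduced for illustration.

```python
def flatten(v, e, f):
    """Smashing the solid flat keeps vertices and edges, loses one face."""
    return v, e, f - 1


def add_diagonal(v, e, f):
    """Drawing a line between two unconnected vertices of a face
    adds one edge and splits one face into two."""
    return v, e + 1, f + 1


MOVES = {
    # Remove a boundary triangle by deleting one edge and one face.
    "A": lambda v, e, f: (v, e - 1, f - 1),
    # Remove a boundary triangle by deleting a vertex, a face, two edges.
    "B": lambda v, e, f: (v - 1, e - 2, f - 1),
}


def characteristic(v, e, f):
    return v - e + f


def run(counts, diagonals, removals):
    """Return the successive (V, E, F) states of the procedure."""
    state = flatten(*counts)
    states = [state]
    for _ in range(diagonals):
        state = add_diagonal(*state)
        states.append(state)
    for move in removals:
        state = MOVES[move](*state)
        states.append(state)
    return states


if __name__ == "__main__":
    # Cube: five quadrilateral faces remain after flattening,
    # each split by one diagonal; nine removals leave one triangle.
    for state in run((8, 12, 6), diagonals=5, removals="ABABABABB"):
        print(state, characteristic(*state))
```

For the cube, every intermediate state has \(V - E + F = 1\), and the procedure ends with a single triangle.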

The problem for Bartha’s account is that it treats the two arguments exactly alike in justificatory potential, when they are not. The Polya-inspired reasoning is arguably of the weak heuristic variety, based on a rather superficial algebraic similarity between (1) and (2). A mathematician who did not know that (1) holds would hardly be persuaded by what appears to be a merely accidental algebraic similarity—one having no clear relation to the geometrical issue at stake.Footnote 7 Conversely, Cauchy’s geometrical reasoning is stronger in comparison.Footnote 8 We cannot but feel that the imaginary operations involved in the reasoning are possible on any smashed polyhedron whatsoever. In each case, it is not clear what could go wrong to falsify (1).Footnote 9 Yet, using Bartha’s approach, no difference can be traced as to their respective capacity for inductive support: in both cases, we start with a high conditional betting quotient Q (H | E), where H is Eq. (2) and E includes all relevant facts about convex polygons; and in both cases, by symmetry, one ought to assign a ‘close enough’ value to the bet Q (H* | E*), where H* is Eq. (1) and E* includes all relevant facts about convex polyhedra (analogous to those in E). In this way, Bartha’s account neglects the difference in strength between the two arguments.Footnote 10

Example 2.3

Area and Volume

If the above example has not been sufficiently convincing, here is an even starker illustration that Bartha’s account proves too much. It is well known that, for a rectangle of sides x and y, the Area can be computed by \(x \cdot y\) and that, for a rectangular box of sides x, y and z, the Volume is given by \(x \cdot y \cdot z\). Here is a strong analogical argument: given that, of all rectangles, the square maximizes Area, it is plausible that, of all rectangular boxes, the cube maximizes Volume. On Bartha’s view, the work of the analogical argument in making the conclusion plausible should be understood as follows. Let H be the theorem that the square maximizes Area and E the set of properties of two-dimensional rectangles that have analogues in the three-dimensional case. Since the conditional betting quotient Q (H | E) is high, by symmetry a rational agent ought to assign a ‘non-negligible’ value to the conditional bet represented by Q (H* | E*), where H* stands for the claim that the cube maximizes Volume and E* for the set of analogues of E’s members.

However, consider rewriting Area by the following (mathematically equivalent) expression:

$$Area^{**}\left( {x,y} \right) = x^{{2 - 1}} \cdot y^{{2 - 1}} - sin^{{2 - 2}} \left( {x \cdot y} \right).$$

Since we know that the square maximizes Area, one could reason that the cube maximizes:

$$Volume^{**}\left( {x, y, z} \right) = x^{3 - 1} \cdot y^{3 - 1} \cdot z^{3 - 1} - sin^{3 - 2} \left( {x \cdot y \cdot z} \right).$$

The analogical inference in this case seems extremely weak. Yet, since Q (H | E) is high (where H is re-written in terms of Area**) by Bartha’s argument one ought to assign a ‘close enough’ value to the bet Q (H** | E*), where H** is the claim that the cube maximizes Volume** and E* is the same set of analogues of E’s members as before. This is highly implausible: it should be at least permissible for a rational agent to assign merely negligible value to the latter bet.Footnote 11

To summarize, we find that Bartha’s account is too weak in one sense and too strong in another. It is too weak because it neglects cases in which an argument from analogy supports a mathematical conjecture that is already assigned non-negligible probability (or betting quotient), as in the case of Deligne’s proof indirectly supporting the Riemann hypothesis. It is also too strong because, as the result of its commitment to a symmetry-based justification, it ends up giving confirmatory power (in the non-incremental ‘plausibility’ sense that Bartha identifies) to analogical arguments in mathematics that should not be understood as having such power at all. The appeal to symmetry thus flattens all analogical arguments in mathematics, preventing us from recognizing those that are evidentially relevant to a conjecture, and even investigating why they are so. For these reasons, we think that a more refined framework is necessary. The next section assesses the prospects for a Bayesian approach to confirmation by mathematical analogy.

3 Bayesianism in Mathematics

In his classic discussion on analogy and induction in mathematics, Polya (1954b) indicates BC as a potential candidate for capturing the role of analogical arguments in making plausible new mathematical conjectures. His proposal (which is discussed approvingly in Corfield, 2003:83) is that confirmation in mathematics comes in different degrees and that the degree of support offered by an analogical argument is proportional to the “hope” (Polya, 1954b:27) that a “common ground” exists between the analogy’s source and target. In other words, the idea is that an analogical argument is stronger when we may regard the similarities figuring in the argument as pointing to some deep but yet unknown connection between the two mathematical domains. Although Polya does not develop this idea in full probabilistic detail, his proposal appears to respond to at least one of the requirements that we found inadequately addressed by Bartha’s account: viz., the idea that different analogical arguments provide different degrees of confirmation to mathematical conjectures. As Corfield (2003) perceptively notes, the appeal to one’s expectation for a common ground promises to offer suitable ground for such a distinction.

Notwithstanding Polya’s and Corfield’s optimism, the prospects for applying the standard tools of BC to the case of confirmation by mathematical analogy are anything but straightforward. Two kinds of obstacle need to be overcome, neither of which is trivial. A general problem is logical omniscience (mentioned above). Then there is the specific problem of couching analogical reasoning in terms of BC’s formal apparatus. Let us consider them in order.

Following Garber (1983), two readings of BC’s aims can be distinguished. On the “thought police model” (1983:101), BC’s aim is to patrol our reasoning as we collect more evidence. On the “learning machine model”, instead, BC’s aim is to describe the intellectual life of an ideal learner. On either model, the coherence requirement—viz., the idea that degrees of belief should obey the axioms of probability—forces one to have credence equal to 1 in any logical truth and credence of 0 in any contradiction. Moreover, the same axioms require that the probability of a deductive conclusion should not be smaller than the probability of the conjunction of the premises. It follows that anyone who (no matter how mathematically skilled) is not logically omniscient is thereby irrational from BC’s standpoint. In other words, BC appears to preclude rationally attributing credences other than zero or one to mathematical conjectures.

There is no agreed-upon way of solving the problem of logical omniscience. One well-known attempt is by Garber (1983). The guiding idea is to preserve logical omniscience for all propositions true in virtue of their truth-functional form (tautologies), while relaxing omniscience for all propositions that follow validly from others (not true in virtue of their truth-functional form). This idea is implemented by treating each statement of the form “A ⊢ B” in a given propositional language L (over which the probability function is defined) as an atomic proposition. In short, one replaces all instances of “A ⊢ B” in L severally with some ‘dummy’ expression τ and applies the probability function over the newly obtained language L*. In this way, the logical implication of A to B becomes invisible to the Bayesian apparatus, permitting an agent to be rationally uncertain as to whether τ holds. As Eells (1990) notes, however, the solution is incomplete, since an agent is still required to assign credence 1 to all tautologies. This is already an implausible demand on rationality since some tautologies are computationally very complex.

A more selective approach, suggested by Gaifman (2004), consists in applying the probability function to only a subset of (the formulas of) the language of interest. The intuition is to limit the application of probability to that part of the language that is easily grasped by a realistic agent (e.g., because it contains relatively short deductions) and to add specific axioms (regarding implication and the notion of “local provability”, which is not transitive), so that logical omniscience can hold solely within the selected subset of the given language. While this proposal overcomes the incompleteness affecting Garber’s proposal, it is not exempt from defects of its own. For instance, as Easwaran (2009) notes, the proposal seems unable to regard as rationally permissible a mathematician’s uncertainty about statements of which she has proposed a proof—one that she perfectly understands in all of its steps. Such uncertainty is bound to be deemed as irrational on Gaifman’s proposal, although it sometimes appears to be rational.

In light of the problems above, we may adopt a more detached perspective and insist, with Franklin (2013, 2020), that confirmatory relations exist and are knowable independently of entailments. Indeed, this is precisely the approach that we are going to assume below. Even though it may seem question-begging to simply state that some mathematically necessary truths have probability less than 1, we may interpret this approach minimally as a request for a waiver. After all, at least some partial proposals exist that attempt to relax the assumption of logical omniscience. Moreover, better attempts may become available with more time and effort.Footnote 12 In the meantime, we know that BC has been fruitfully applied to inductive reasoning in the empirical sciences. Accordingly, we may regard as our central concern that of understanding what a Bayesian approach can tell us about inductive reasoning in the realm of mathematics. This will determine whether it is worth spending time on the more foundational issues—in particular, whether the problem of relaxing logical omniscience deserves our further intellectual efforts.

Even if a waiver for logical omniscience is granted, it must be stressed that specific issues about the application of BC to analogical reasoning remain open. It is, after all, one thing to claim that an analogical argument may contribute to increasing one’s trust in a conjecture from BC’s standpoint; it is another to demonstrate this claim formally by means of BC’s probabilistic apparatus. Presumably, any proposed representation would need to have some sort of broad coverage: it would need to apply to a variety of uses of analogy. In each case, the account would also need to display some sort of diagnostic value: it would need to show that it yields a verdict of confirmation (or lack thereof) solely in virtue of the fact that a strong (respectively, weak) analogical argument is at work. It is not obvious that BC can achieve that. As a matter of fact, neither Polya (1954b) nor Corfield (2003) provide sufficient details about BC’s intended application to judge if their proposals are both sufficiently comprehensive and illuminating.

To summarize, BC’s reliance on degrees of confirmation promises great descriptive fit with the practice of analogical inference in mathematics, in addition, of course, to identifying a potential connection with the epistemology of the empirical sciences. However, the prospects for a Bayesian analysis of analogical inferences in mathematics are anything but obvious. For one thing, there is the problem of logical omniscience. Even setting that aside, there remains the highly non-trivial problem of providing the details of the promised Bayesian account. In the next section, we are going to develop a proposal about representing inductive support from analogy within BC. Even though we believe that such a proposal represents a useful addition to understanding a variety of cases in which analogies play a role in mathematical research, our ultimate recommendation will be to embrace a hybrid framework, one which recognizes both incremental and non-incremental aspects of confirmation by analogy in mathematics. Before getting there, let’s first take a look at how far one can go by adopting the standard tools of BC.

4 Incremental Confirmation by Analogy

Let a reasonable notion of probability as applied to mathematical conjectures be given. In other words, let’s assume that the problem of logical omniscience can be dealt with (it does not matter by which solution). Here is one thing that we can expect BC to be able to model: roughly, the incremental form of confirmation that attaches to a hypothesis about the target as a consequence of the discovery that some analogous result holds in the source. The case from which we shall start is the confirmation of Riemann’s hypothesis due to Deligne’s proof of Weil’s conjecture. As it turns out, a Bayesian representation of such a case is possible in the spirit of Polya’s (1954b) suggestion concerning the ‘hope for a common ground’. This analysis extends the results in Nappo (2022) regarding analogical confirmation in the empirical sciences. Let’s take a look.

4.1 The Riemann Hypothesis

Some historical background about the case-study will be useful. Riemann’s (1859) hypothesis emerged in the context of studying the distribution of prime numbers. Using a procedure called analytic continuation, Riemann generalized to the complex plane a function introduced by Euler in connection with his solution to the so-called ‘Basel problem’. Thus, the Riemann zeta function was defined:

$$\zeta(s) = \sum_{n = 1}^{\infty} \frac{1}{n^{s}} \tag{4}$$

where the sum is over the positive integers and converges for complex \(s\) with real part greater than 1; analytic continuation extends the function to every complex \(s\) different from one. The main result that Riemann established was a relation between the zeros of his zeta function and the distribution of prime numbers. Some of these zeros are called trivial in that their existence is easy to prove (they are exactly the negative even integers). The question is about the non-trivial ones. It is known that they belong to the open strip of the complex plane defined by the set of complex numbers with real part between 0 and 1. The Riemann hypothesis is that all non-trivial zeros lie on the critical line defined by the complex numbers having real part equal to ½.
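As a small numerical illustration of the series in (4), the following Python sketch computes partial sums in the region where the series converges (real part of \(s\) greater than 1) and compares the case \(s = 2\) with Euler’s value \(\pi^{2}/6\) for the Basel problem. The function name and the number of terms are illustrative assumptions, and the sketch does not implement analytic continuation, so it says nothing about the zeros of the function.

```python
import math


def zeta_partial_sum(s, terms):
    """Sum of 1 / n**s for n = 1, ..., terms.

    Meaningful as an approximation of zeta(s) only where the series
    converges, i.e. for real or complex s with real part > 1.
    """
    return sum(1 / n ** s for n in range(1, terms + 1))


if __name__ == "__main__":
    approx = zeta_partial_sum(2, 100_000)
    exact = math.pi ** 2 / 6  # Euler's solution to the Basel problem
    print(approx, exact, abs(approx - exact))
```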

While the Riemann hypothesis resisted several attempts at a proof, another area of mathematics came unexpectedly to its rescue: algebraic geometry. Its principal object, the ‘algebraic variety’, is defined as the set of solutions to a system of polynomial equations. In the case at hand, the coefficients of such polynomials belong to so-called ‘finite fields’, sets whose operations are analogous to those of the real numbers, but which contain only a finite number of elements. Furthering Artin’s (1924) studies in this relatively recent field, André Weil (1949) demonstrated that it is possible to construct a zeta function involving the number of points over the finite extensions of the original field. One of Weil’s four conjectures about this algebraic zeta function mimics the Riemann hypothesis in considering the distribution of the zeros of the integral polynomials linked to the rational form of the algebraic zeta function. Building upon Weil’s work, Deligne (1974) eventually proved the analogue of the Riemann hypothesis for algebraic varieties over finite fields.

Soon after his algebraic formulation, Weil began to ponder the bearing of his conjecture on the original number-theoretical question as formulated by Riemann. He wrote:

The Riemann hypothesis [...] appears to-day in a new light, which shows it to be closely connected with the conjecture of Artin on the L-functions, thus making these two problems two aspects of the same arithmetico-algebraic question. (1950:297)


The same sentiment of a connection between the two conjectures is shared by those several experts in the field who regard Deligne’s proof as “the best reason to believe that [the Riemann hypothesis] is true” (Deninger, 1994:493). Their reasoning is based on an appreciation of the analogy between the domains that Riemann’s and Weil’s works respectively explored. This “sense of the relatedness of mathematical facts” (Corfield, 2003:121) underwrites their allegations that the Riemann hypothesis has been significantly confirmed by Deligne’s proof.

The reasoning used by these mathematical experts can be represented formally as an instance of the phenomenon of transitivity of confirmation. As is well known, the Bayesian notion of confirmation is not generally transitive: sometimes A confirms B and B confirms C, but it is not the case that A confirms C. However, there exist several formal proposals in the philosophical literature (e.g., Hesse, 1970; Roche & Shogenji, 2013) about when transitivity of confirmation is guaranteed to occur. Confirmation of the Riemann hypothesis can be understood, from a Bayesian perspective, as an instance of the case in which some of the weakest conditions for transitive confirmation are met. To illustrate this Bayesian analysis, we define:

R:

The Riemann hypothesis is true.


We then make precise the evidence that, by the lights of several experts, supports R by analogy:

W:

Weil’s conjecture is true.


At this point, it can be shown that Bayesian confirmation of R by W occurs if non-extremal credence (i.e. neither zero nor one) is given to the following bridge claimFootnote 13:

G:

The distribution of the solutions of Weil’s zeta function is robust (i.e., does not change) in the passage from the algebraic-geometric question to the analogous number-theoretic question (or vice versa),


and the following probabilistic conditions between W, G, and R are jointly satisfied:

a):

P (W | G) > P (W | ¬ G)

b):

P (R | G) > P (R)

c):

P (R | G ∧ W) \(\ge\) P (R | G)

d):

P (R | ¬ G ∧ W) \(\ge\) P (R | ¬ G)


In informal terms, a)-b) tell us that W must confirm G and G must confirm R (note that a) is equivalent to P (G | W) > P (G)), whereas c)-d) additionally tell us that W must not disconfirm R conditional on G (the presence of a common ground) or, alternatively, on ¬ G (its absence).

Note that all of a)-d) are plausibly satisfied. Specifically, conditions a) and b) express the plausible idea that discovering that an invariant structure of solutions exists for the algebraic and number-theoretic domains (as expressed by G) would increase the probability of both Riemann’s conjecture and Weil’s conjecture (which are both attempts at capturing that invariant structure) vis-à-vis other live conjectures that do not allow for the existence of such an invariant structure. Conditions c) and d) are weak additional assumptions, formally expressing one’s disposition (which is based on the “sense of relatedness of the mathematical facts” discussed above) to increase one’s trust in the Riemann hypothesis as the result of the discovery of an analogous result in algebraic geometry. Both assumptions are plausible. In particular, d) can be justified on the grounds that W is irrelevant to R if ¬ G is the case; consequently, P (R | ¬ G ∧ W) \(=\) P (R | ¬ G).Footnote 14 A transitivity theorem by Roche and Shogenji (2013; see Appendix) shows that a)-d) entail P (R | W) > P (R), meaning that there is confirmation of R by W in the standard Bayesian sense.
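The entailment can be checked in a few lines. The following is a sketch of one standard route, via the law of total probability; it is offered as an illustration and is not claimed to reproduce the proof given in the Appendix. Expanding P (R | W) over G and ¬ G and applying c) and d) gives:

$$\begin{aligned} P(R \mid W) &= P(R \mid G \wedge W)\,P(G \mid W) + P(R \mid \neg G \wedge W)\,P(\neg G \mid W) \\ &\ge P(R \mid G)\,P(G \mid W) + P(R \mid \neg G)\,P(\neg G \mid W). \end{aligned}$$

Subtracting the corresponding expansion \(P(R) = P(R \mid G)\,P(G) + P(R \mid \neg G)\,P(\neg G)\) yields:

$$P(R \mid W) - P(R) \ge \left[ P(R \mid G) - P(R \mid \neg G) \right]\left[ P(G \mid W) - P(G) \right].$$

Given non-extremal credence in G, condition b) is equivalent to \(P(R \mid G) > P(R \mid \neg G)\), and condition a) is equivalent to \(P(G \mid W) > P(G)\). Both factors on the right are therefore strictly positive, so P (R | W) > P (R).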

In summary, the above model shows that, provided it makes sense to assign non-extremal probabilities to mathematical hypotheses such as R, W and G, it is possible to spell out a precise Bayesian account of confirmation of the Riemann hypothesis from the proof of Weil’s analogue conjecture.Footnote 15 Such a full-fledged Bayesian formalization, a novelty for the existing literature, vindicates Polya’s insight that the degree of confirmation depends upon the ‘hope’ that a common ground exists between the mathematical domains being compared. The proposal above shows that Polya’s insight is correct at least insofar as confirmation depends upon one’s disposition to assign a non-extremal credence to a hypothesis (G) acting as a connecting bridge between source and target. As a way of validating the proposed Bayesian approach to incremental confirmation of mathematical conjectures, let’s now turn to another case-study.

4.2 Taylor Expansion of Functions

Here is another case that can be treated in a Bayesian fashion: that in which the discovery of a similarity between two apparently unrelated mathematical domains confirms a hypothesis about the target. (This is different from the case in 4.1, where the discovery of a result in a source supports a conjecture in an analogous target; see fn. 17). Here is an illustration. By calculating the Taylor series around \(0\) of the function \(\frac{1}{{1 - x^{2} }}\), we note that it is similar to that of \(\frac{1}{{1 + x^{2} }}\):

$$\frac{1}{{1 - x^{2} }} = 1 + x^{2} + x^{4} + x^{6} + \cdots$$
$$\frac{1}{{1 + x^{2} }} = 1 - x^{2} + x^{4} - x^{6} + \cdots$$

An additional aspect of similarity is that the respective series have similar convergence behavior: both converge when \(\left| x \right| < 1\) and diverge when \(\left| x \right| > 1\). This is so despite the fact that the corresponding functions have different shapes in the real plane (in particular, \(\frac{1}{{1 - x^{2} }}\) is not defined for \(x = \pm 1\), whereas \(\frac{1}{{1 + x^{2} }}\) is defined for every real \(x\)). From these similarities, a mathematician might infer that \(\frac{1}{{1 + x^{2} }}\) is related to \(\frac{1}{{1 - x^{2} }}\) by some connection that is simply not ‘visible’ in the real plane.Footnote 16 This (factually correct) reasoning is based on the idea that it would be something of a coincidence if their respective series just happened to have similar expressions and convergence behavior, but were otherwise unrelated.
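The shared convergence behavior can be observed numerically. The following is a minimal sketch, not part of the original argument: the helper name `partial_sum` and the sample points 0.5 and 1.5 are choices made here for illustration. Both series are geometric, with ratio \(x^{2}\) and \(-x^{2}\) respectively, which is why they share the threshold \(\left| x \right| = 1\).

```python
def partial_sum(x, n_terms, sign):
    """Sum of the first n_terms of the geometric series with ratio sign * x**2.

    sign=+1 gives the series of 1/(1 - x^2); sign=-1 gives that of 1/(1 + x^2).
    """
    ratio = sign * x * x
    total, term = 0.0, 1.0
    for _ in range(n_terms):
        total += term
        term *= ratio
    return total


for x in (0.5, 1.5):  # one point inside |x| < 1, one outside
    for sign, name in ((+1, "1/(1 - x^2)"), (-1, "1/(1 + x^2)")):
        exact = 1.0 / (1.0 - sign * x * x)
        sums = [partial_sum(x, n, sign) for n in (5, 10, 20, 40)]
        print(f"x={x}, {name}: function value={exact:.6f}, partial sums={sums}")
```

At x = 0.5 the partial sums of both series settle on the function values; at x = 1.5 they grow without bound in magnitude, even though \(\frac{1}{{1 + x^{2} }}\) itself is perfectly well defined there.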

Using BC, one can represent the effect on a rational agent’s credences of the discovery of the similarities between the two functions. To obtain an adequate Bayesian model, the first step is to clearly define the evidence introduced by the analogical inference. This is arguably that:

O:

The Taylor series of the source and target functions display similar expressions and convergence behavior.

One then precisely defines the conclusion about the target that the argument aims to support:

T:

The function \(\frac{1}{{1 + x^{2} }}\) is related to \(\frac{1}{{1 - x^{2} }}\) by a connection not visible in the real plane.

Finally, one defines an appropriate ‘bridge’ between source and target (more on this below):

J:

Some deeper mathematical fact holds such that the two functions are connected.


By Roche and Shogenji’s (2013) transitivity theorem, when O, T and J are assigned non-extremal credence, then O confirms T if the following conditions obtain:

e):

P (O | J) > P (O | ¬ J)

f):

P (T | J) > P (T)

g):

P (T | J ∧ O) \(\ge\) P (T | J)

h):

P (T | ¬ J ∧ O) \(\ge\) P (T | ¬ J)

The conditions plausibly describe the epistemic situation of the expert mathematician who comes to suspect a connection between the two functions (T) as the result of some newly observed similarity between them (O). In particular, e) and f) plausibly express one’s defeasible recognition that the similarities in the respective series would be somewhat unlikely if there were no ‘common ground’ (existing in some deeper mathematical domain) relating the two functions. Conditions g) and h) ensure that confirmation of the bridge J (the common ground) by the similarities O properly translates into T’s confirmation. Both assumptions are plausible. In particular, h) is arguably justified on grounds that O is evidentially irrelevant to T if J is false; accordingly, P (T | ¬ J ∧ O) = P (T | ¬ J). It follows from e)-h) that P (T | O) > P (T). In informal terms, the model shows that there is confirmation of T by the observed similarities.
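A small numerical model may help to see how e)-h) combine. The sketch below assumes hypothetical credences chosen only for illustration (they are not taken from the paper), builds the joint distribution over J, O and T, checks each condition, and compares P (T | O) with P (T). The function names `joint` and `prob` are likewise introduced here.

```python
from fractions import Fraction as F
from itertools import product


def joint(p_j, p_o_given_j, p_t_given_jo):
    """Joint credence over truth values (j, o, t) of J, O and T."""
    dist = {}
    for j, o, t in product((True, False), repeat=3):
        pj = p_j if j else 1 - p_j
        po = p_o_given_j[j] if o else 1 - p_o_given_j[j]
        pt = p_t_given_jo[(j, o)] if t else 1 - p_t_given_jo[(j, o)]
        dist[(j, o, t)] = pj * po * pt
    return dist


def prob(dist, event, given=lambda w: True):
    """Conditional probability P(event | given) under dist."""
    denom = sum(p for w, p in dist.items() if given(w))
    numer = sum(p for w, p in dist.items() if given(w) and event(w))
    return numer / denom


def J(w):
    return w[0]


def O(w):
    return w[1]


def T(w):
    return w[2]


def not_J(w):
    return not w[0]


def both(a, b):
    return lambda w: a(w) and b(w)


# Hypothetical credences (assumptions for illustration only).
# T depends on O only through J, so g) and h) hold with equality.
dist = joint(
    p_j=F(1, 5),
    p_o_given_j={True: F(9, 10), False: F(3, 10)},
    p_t_given_jo={(True, True): F(4, 5), (True, False): F(4, 5),
                  (False, True): F(1, 10), (False, False): F(1, 10)},
)

conditions = {
    "e": prob(dist, O, J) > prob(dist, O, not_J),
    "f": prob(dist, T, J) > prob(dist, T),
    "g": prob(dist, T, both(J, O)) >= prob(dist, T, J),
    "h": prob(dist, T, both(not_J, O)) >= prob(dist, T, not_J),
}
print(conditions)
print(prob(dist, T, O), ">", prob(dist, T))  # 2/5 > 6/25
```

With these numbers all four conditions hold and the credence in T rises from 6/25 to 2/5 upon learning O. Other assignments satisfying e)-h) would give a different size of increase but the same direction.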

One point worth noting here is that, while formally the conditions are the same as in case 4.1, the content of the evidence and the bridge hypothesis has changed in the new example. This is not an ad hoc move. It is justified by the fact that the cases of confirmation by analogy in 4.1 and 4.2 are importantly different. In the former case, we have two mathematical domains that we already suspect to be connected in some way; the similarities between the two domains constitute the background information in virtue of which, from the discovery of a result in the source, one confirms the corresponding conjecture in the target. In the latter case, we have instead the discovery of a striking similarity between two apparently unrelated domains (the two functions); here the similarities constitute the new evidence in virtue of which the hypothesis of a common ground is confirmed (insofar as it makes the similarities more likely than they would be if there were no common ground).Footnote 17 Both cases exemplify equally valid, though distinct, notions of confirmation by analogy in mathematics, which the account above correctly treats differently.

4.3 The Euler Characteristic

The above discussion can be brought to bear on an important test for the proposed Bayesian analysis: the case-study of the Euler characteristic examined in Example 2.2. As mentioned earlier, the case is interesting because two different analogical arguments can be used to support the conclusion that a regularity exists uniting the elements of a convex polyhedron (expressed by V – E + F – S = 1), analogous to that which holds for regular polygons in plane geometry (i.e., V = E). The two arguments arguably differ with respect to the support that they provide to the conclusion, namely the Euler characteristic. If a Bayesian representation is to be a useful way of representing confirmation by analogy in the realm of mathematics, then it should be able to tell apart the two analogical arguments with respect to their respective capacity to provide inductive support. (We noted earlier that this is a desideratum that Bartha’s (2009) rival account fails to satisfy.) Let’s take a look at how well the present Bayesian analysis fares in this regard.

Based on the distinction drawn in 4.2, we should note immediately that the two analogical arguments differ not just in strength but in kind. The Cauchy-inspired argument resembles case 4.1 in that it aims to support its conclusion by using as background information the geometrical similarities that link all convex polyhedra together (in virtue of which they belong to the same geometrical genus); from these assumed (mostly tacit) similarities, and the fact that the observed polyhedra (viz., cubes, tetrahedra, and dodecahedra) satisfy the Euler characteristic when smashed, one defeasibly infers that the latter property holds true for yet unobserved polyhedra. Conversely, Polya’s algebraic argument is closer in kind to the inference in case 4.2: from the similarity between the expression V – E + F – S = 1 (known to hold for cubes, tetrahedra and dodecahedra) and the corresponding two-dimensional equation V – E + F = 1, one defeasibly infers the existence of a deeper mathematical fact that is capable of explaining the observed similarity.

Based on this preliminary observation, we fix the conclusion that both arguments arrive at:

C:

Unobserved convex polyhedra satisfy equation (1), the ‘Euler characteristic’.


Let us then reconstruct the two arguments in accordance with the framework developed in 4.1 and 4.2. For the Cauchy-inspired reasoning, the new evidence introduced by the argument is that:

E:

Smashed cubes, tetrahedra, and dodecahedra satisfy the property whereby the relation V – E + F is constant even after removal of any inner triangular surface.


The bridge hypothesis is formulated in accordance with a recipe indicated for case 4.1:

B:

The result of the smashing, triangularization and removal operations on convex polyhedra is robust with respect to the type of polyhedron.


As in case 4.1, by ‘robust’ here we mean that the result remains invariant under changes in the property specified in the hypothesis—in this case, a change in the polyhedron’s exact form.

For the Polya-inspired argument, instead, both the evidence and the bridge hypothesis need to be defined differently. The evidence that the argument makes salient is the observation that:

E*:

Cubes, tetrahedra, and dodecahedra satisfy an alternating sum regularity (3) algebraically similar to that which unites the elements of regular polygons (2).


The bridge hypothesis is formulated in accordance with the recipe for case 4.2, as follows:

B*:

Some deeper mathematical fact holds such that (2) is algebraically similar to (3).


The ‘deeper mathematical fact’ in this case may be a general theorem (in some yet unknown mathematical theory) having as a consequence that the alternating sum of the number of elements of an n-dimensional figure that satisfies some precise constraints is always equal to 1.

We are interested in comparing the analogical component of the confirmation that E and E* respectively lend to C. On our account, the conditions for confirmation by analogy are, respectively, i)-l) and m)-p):

$$\begin{array}{ll} i)\;\text{P}(\text{E} \mid \text{B}) > \text{P}(\text{E} \mid \neg \text{B}) & \qquad m)\;\text{P}(\text{E}^* \mid \text{B}^*) > \text{P}(\text{E}^* \mid \neg \text{B}^*) \\ j)\;\text{P}(\text{C} \mid \text{B}) > \text{P}(\text{C}) & \qquad n)\;\text{P}(\text{C} \mid \text{B}^*) > \text{P}(\text{C}) \\ k)\;\text{P}(\text{C} \mid \text{B} \wedge \text{E}) \geq \text{P}(\text{C} \mid \text{B}) & \qquad o)\;\text{P}(\text{C} \mid \text{B}^* \wedge \text{E}^*) \geq \text{P}(\text{C} \mid \text{B}^*) \\ l)\;\text{P}(\text{C} \mid \neg \text{B} \wedge \text{E}) \geq \text{P}(\text{C} \mid \neg \text{B}) & \qquad p)\;\text{P}(\text{C} \mid \neg \text{B}^* \wedge \text{E}^*) \geq \text{P}(\text{C} \mid \neg \text{B}^*) \end{array}$$

As in the previous cases 4.1 and 4.2, we assume a reasonable distribution of the agent’s probabilities, whereby each of E, E*, B, B* and C is assigned non-extremal prior credence. Let’s now consider the content of the probabilistic conditions for each of the two columns above.

The conditions in the first column, i)-l), are all plausibly satisfied. Specifically, i) and j) arguably express one’s recognition that, as the result of the geometrical similarities that link all convex polyhedra together, similar manipulations (smashing, triangularization, etc.) are likely to lead to the same result: in compressing a polyhedron, some initially unconnected vertices will be found on the same surface and will thus be potential vertices for (roughly) triangular figures; the removal of any such triangular surfaces is likely to keep the number of elements constant. Conditions k) and l) are weak additional assumptions, both of which are plausibly satisfied. In particular, l) can be justified on the grounds that E is evidentially irrelevant to C if ¬ B is known. In more informal terms, if we know that the result of similar manipulations is not robust to the smashing and triangularization operations across different polyhedra, then E arguably ceases to be evidence for C. Altogether, acceptance of i)-l) entails the correct verdict that P (C | E) > P (C).

Conversely, it is not obvious that m)-p) are all satisfied. Condition m) is especially shaky. Considering the algebraic similarity between (2) and (3), the expert mathematician (whom we are assuming not to know whether C holds) may well regard it as little or no evidence for C. After all, the similarity with (3) may appear to be the accidental fallout of an algebraic manipulation of (1) into (2), one which has no independent basis in geometry. Accordingly, a mathematician may reject E*’s capacity to confirm (and not merely suggest) that some deep mathematical fact underlies the algebraic similarity between (2) and (3), and therefore that all convex polyhedra (and not merely the observed ones) obey the alternating sum regularity that (3) expresses. Hence, E* will fail to confirm C in anything but a weak enumerative sense. That is to say, even though E* may rationally increase the probability of the generalization C, it would only do so to a much slighter degree than the confirmation achieved by means of the geometrical reasoning. This is exactly as we should expect given the perceived difference in strength between the arguments.Footnote 18
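The difference in strength can be given a rough numerical face. The sketch below is an illustration under stated assumptions, not a reconstruction of any expert's actual credences: all numbers are hypothetical, the function name `boost` is introduced here, and the evidence is assumed to be irrelevant to C once the truth value of the bridge hypothesis is fixed (equality in k), l), o) and p)). The two calls differ only in how much more likely the evidence is if the bridge hypothesis holds than if it does not, which is what i) and m) govern.

```python
from fractions import Fraction as F


def boost(prior_bridge, lik_bridge, lik_no_bridge, c_given_bridge, c_given_no_bridge):
    """Return P(C | evidence) - P(C).

    Assumes the evidence bears on C only via the bridge hypothesis,
    i.e. P(C | bridge, evidence) = P(C | bridge), and likewise for not-bridge.
    """
    p_evidence = lik_bridge * prior_bridge + lik_no_bridge * (1 - prior_bridge)
    post_bridge = lik_bridge * prior_bridge / p_evidence  # Bayes' theorem
    p_c = c_given_bridge * prior_bridge + c_given_no_bridge * (1 - prior_bridge)
    p_c_given_evidence = (c_given_bridge * post_bridge
                          + c_given_no_bridge * (1 - post_bridge))
    return p_c_given_evidence - p_c


# Hypothetical numbers. Geometric argument: E is much more likely under B.
geometric = boost(F(1, 2), F(9, 10), F(3, 10), F(9, 10), F(3, 10))
# Algebraic argument: E* is only slightly more likely under B*.
algebraic = boost(F(1, 2), F(11, 20), F(1, 2), F(9, 10), F(3, 10))

print(geometric, algebraic)  # 3/20 1/70
```

On these assumed values both arguments raise the credence in C, but the increase from the algebraic similarity is roughly a tenth of that from the geometrical reasoning, because condition m) is only barely satisfied.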

To wrap up, a precise proposal about capturing confirmation by mathematical analogy in Bayesian terms has been outlined. Indeed, we have offered recipes for two modes of incremental confirmation by analogy, respectively exemplified in 4.1 and 4.2, showing that both can be viewed as instances of transitive confirmation. As we have seen in the (somewhat artificial, but still instructive) case of the Euler characteristic, the proposed framework possesses a satisfactory degree of diagnostic capacity. Specifically, when fed with the relevant background knowledge (understood in terms of reasonable credence assignments to conjectures), it can help pinpoint the factors that make certain inferences by analogy stronger than others. By identifying those features of the contextual information that can determine whether, and to what extent, confirmation by analogy occurs, the current proposal constitutes an important addition to our understanding of analogical inference in mathematics. The next section will address the question of what the proposal above omits with regard to reasoning from analogy in mathematics.

5 Non-Incremental Confirmation

The previous section has argued that, assuming that the problem of logical omniscience can be solved, we can represent many important instances of confirmation by mathematical analogy by means of BC—in line with one standard way of treating inductive reasoning in the empirical sciences. Even though our sample of case studies is necessarily limited, the illustrations provided in the previous section at least show that the notion of confirmation by analogy does not violate the tenets of BC and can thus be understood as a fully legitimate part of the inductive methodology of mathematics from a Bayesian standpoint. Moreover, as we stressed earlier, the recipes that we have provided for two types of analogical inferences in mathematics, exemplified by cases 4.1 and 4.2, reflect independently plausible ideas about when an analogical argument in mathematics is inductively strong. This fact makes us reasonably confident that the proposed formal framework is appropriately generalizable beyond the few examples considered above.

In this section, our aim is to show that there is one notion of confirmation by analogy that the Bayesian account above is, by its nature, not prepared to capture—what we call the ‘non-incremental’ notion of confirmation by analogy. What we will be primarily concerned with clarifying is how this notion interacts with the incremental form just discussed. For an illustration, let’s consider the geometrical case of the analogical inference from area to volume examined in Example 2.3. From the fact that, of all rectangles, the square maximizes Area, we are led to expect that, of all rectangular boxes, the cube maximizes Volume. The argument can be represented formally by means of the recipe indicated above. As in case 4.1, we have the discovery of a result in a source (viz., that the square maximizes Area) which bears on a conjecture about solid geometry (viz., that the cube maximizes Volume).Footnote 19 On our proposal, confirmation in BC’s sense occurs if an agent’s credence function countenances a ‘bridge’ hypothesis to the effect that the result about squares is robust to the passage to solid geometry.

One aspect of the Bayesian formalization (which we leave as an exercise for the reader, being a simple extension of the recipe in 4.1) must be emphasized. It assumes that a Bayesian agent would find it reasonable to assign non-negligible probability to the ‘bridge’ hypothesis that links squares to cubes from the standpoint of confirmation. Although such an assumption seems eminently reasonable in light of our grasp of geometrical relations, it must be stressed that nothing in the proposed framework dictates this choice. As far as the axioms of probability are concerned, there is no difference between epistemic agents who take it that cubes are the three-dimensional analogue of squares and agents who take it that (say) spheres are the three-dimensional analogue of squares. The latter agents may find it natural to think that the sphere’s surface area is the analogue of a square’s area but not that a cube’s volume is. Presumably, such deviant agents may fail to even entertain the bridge hypothesis that we find natural in this context, viz., that which makes cubes the three-dimensional analogue of squares.Footnote 20

We are, of course, considering an extreme example for illustration. But the issue is much closer to home than one might suppose. Some mathematicians, for instance, categorically reject the analogy between Weil’s theorems for algebraic varieties and Riemann’s hypothesis in number theory (cf. Corfield, 2003:98). Presumably, such experts—a minority among those who can be regarded as fully grasping the mathematical problem—assign little or no credence to the bridge hypothesis that links the proof of Weil’s conjecture for algebraic varieties to the Riemann hypothesis. Accordingly, they may regard Deligne’s proof of Weil’s conjecture as little or no evidence for the truth of the Riemann hypothesis.Footnote 21 Here we have a realistic case in which a difference in the bridge hypotheses that an agent considers overturns the verdict of confirmation.
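The sensitivity of the verdict to the credence given to the bridge hypothesis can be made explicit. The following is a sketch under a simplifying assumption introduced here for illustration, namely that W is irrelevant to R once the truth value of G is fixed (equality in c) and d)). By the law of total probability and Bayes’ theorem:

$$P(R \mid W) - P(R) = \left[ P(R \mid G) - P(R \mid \neg G) \right] \cdot \frac{P(G)\,\left(1 - P(G)\right)\,\left[ P(W \mid G) - P(W \mid \neg G) \right]}{P(W)}$$

With the likelihoods held fixed and \(P(W \mid \neg G) > 0\), the right-hand side shrinks to zero as \(P(G)\) approaches 0. An expert who gives the bridge hypothesis little or no credence will accordingly see little or no increase in the probability of R, even while agreeing with her colleagues about every conditional probability in the expression.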

Reflecting on these cases is useful because they indicate the existence of two distinct notions of confirmation by analogy. On the one hand, there is the ‘incremental’ sense in which the discovery of some new fact about a source, or some additional similarity between a source and a target, may justify additional credence to a conjecture about the target. That is the notion that the Bayesian account of the previous section aims to explicate. On the other hand, there is the ‘non-incremental’ sense in which a conjecture is judged to be plausible in light of an analogy between two mathematical domains.Footnote 22 For instance, as the natural analogue of squares in solid geometry, we find it plausible that cubes will satisfy analogues of various geometrical properties that hold for squares. This is not the sense of confirmation in which we say that a piece of evidence adds to a hypothesis’ probability. What we have instead is an all-things-considered judgment to the effect that a conjecture (what we have been calling a ‘bridge hypothesis’) deserves (some possibly small but) positive prior credence. In this sense, it is non-incremental.Footnote 23

It would be a mistake to suppose that, since the judgments of analogy in question do not contribute to the incremental confirmation of a mathematical conjecture, they cannot make a difference (in some broad sense) to a mathematician’s epistemic life. In many circumstances, mathematicians may fail to note the analogy between two domains of mathematical interest; or, while vaguely recognizing some element of resonance, they may fail to pinpoint exactly what the two domains have in common (cf. Polya, 1954b:111). Under those circumstances, a clearer recognition of the connection between two domains may rationally lead a mathematician to consider ‘bridge’ hypotheses that she had not even considered before. The result, from a Bayesian standpoint, would be a revision of the agent’s credence function to include a new hypothesis. The difference with the incremental form of confirmation by mathematical analogy is that such change in credence would not be dictated by the consideration of any new evidence. Rather, we can say that such change derives from looking at one’s old evidence in a new way: as possibly indicating a previously unnoticed connection between distinct mathematical domains.

A question that may be pressed at this point is whether the recognition of a non-incremental notion of confirmation by analogy is consistent with the adoption of a Bayesian epistemology. On this issue, we have little doubt: there is no tension. Since bare BC is silent about which combinations of priors and likelihoods that preserve coherence with the axioms of probability should be adopted by a rational agent, it is perfectly consistent for a Bayesian to recognize that judgments of analogy may rationally inform an epistemic agent’s priors and, at the same time, to hold that the only rational way for an agent to update her credences on the basis of new evidence is by applying the standard diachronic rule of conditionalization. Consistency with BC remains even if one accepts the plausible doctrine that there is often a fact of the matter as to which judgments of analogy in particular contexts of research actually justify consideration of which bridge hypotheses.Footnote 24 Insofar as judgments of analogy of the kind we are pointing to pertain to the sphere of prior credences, our entitlement to them escapes BC’s primary field of application.

Our claim that, even before new similarities and dissimilarities are considered, judgments of analogy may play a role in determining the ‘reasonable’ probabilities that a mathematician assumes in the course of research points to an intrinsic limit of Bayesian reconstructions of analogy in mathematics. Even though we can appeal to BC’s apparatus to describe the mechanisms whereby a mathematician updates her credences on the basis of new similarities and dissimilarities, questions with regard to which ‘bridge hypotheses’ should be given consideration, and precisely to what extent, are not answered from within that framework. That applies no matter what formal proposal for the so-called “problem of the priors” a Bayesian might come up with: symmetry, maximum entropy, parsimony, etc. (cf. Berger, 1985). The point remains that, even though the judgments of analogy in question are not in principle unanalyzable, it is hard to imagine providing a satisfactory formal account of them, i.e., a theory that successfully reduces the norms governing the appropriateness of certain judgments of analogy in mathematics to the algorithmic application of some general rules of reasoning.Footnote 25

To give a simple illustration of the difficulties, consider the plausibility judgments involved in another geometrical example (cf. Bartha, 2009:110). It is a theorem of plane geometry that the three medians of a triangle intersect in a point, called a ‘centroid’. By ‘median’, we mean the segment that unites the midpoint of a side to the opposite vertex. An interesting conjecture in solid geometry that the theorem induces, by analogy with the two-dimensional case, is the following (Fig. 2): the four medians of a tetrahedron intersect in a point. In this case, by a tetrahedron’s ‘median’ we mean the segment that unites a face’s centroid to the opposite vertex.

Fig. 2 Analogy of triangle and tetrahedron
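The conjecture about tetrahedra can be checked on a concrete instance. The sketch below is an illustration only: the tetrahedron's coordinates are hypothetical, the function names are introduced here, and a check on one figure is of course not a proof. It takes the point three quarters of the way along each median, measured from the vertex, and tests whether the four points coincide with one another and with the average of the four vertices.

```python
from fractions import Fraction as F


def centroid(points):
    """Arithmetic mean of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def along(a, b, t):
    """Point at fraction t of the way from a to b."""
    return tuple(a[i] + t * (b[i] - a[i]) for i in range(3))


def medians_common_point(vertices):
    """Return the shared point if the points 3/4 along each median coincide, else None.

    Each median runs from a vertex to the centroid of the opposite face.
    """
    points = set()
    for k, vertex in enumerate(vertices):
        face = [p for i, p in enumerate(vertices) if i != k]
        points.add(along(vertex, centroid(face), F(3, 4)))
    return points.pop() if len(points) == 1 else None


# Hypothetical tetrahedron with exact rational coordinates.
tetra = [(F(0), F(0), F(0)), (F(4), F(0), F(0)),
         (F(1), F(3), F(0)), (F(2), F(1), F(5))]

print(medians_common_point(tetra) == centroid(tetra))  # True
```

Exact rational arithmetic is used so that the coincidence of the four points is tested without rounding error.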

Although the conjecture about tetrahedra is highly plausible, the strength of the analogical inference depends on geometrical intuitions that are difficult to articulate. Possibly the easiest way to express them consists in borrowing the physical notion of a figure’s ‘center of mass’. The centroid of a triangle can be regarded as the point where the ideal mass of the triangle concentrates. Similarly, the centroid of a tetrahedron is where the mass of the tetrahedron would concentrate (assuming it had any mass and that it were uniformly distributed across its volume). Given that the ‘median’ in both the plane and the solid case is what unites the center of mass of a side or face to the opposite vertex, and given that we know that triangles and tetrahedra are analogous to one another in other ways, it is plausible that results about medians in triangles will have an analogue for tetrahedra. Insofar as making this judgment requires a complex combination of visualization and knowledge of the relevant geometrical properties of triangles and tetrahedra, it is very hard to imagine what combination of algorithmic procedures of reasoning would be able to capture the judgments of the trained mathematician even in such an elementary example.

Examples such as the above make it plausible that, even though a mathematician’s training and practice often predispose her to see certain connections among mathematical domains rather than others, and to be accordingly induced to perform certain inferences rather than others, exactly which connections she sees and which inferences she performs is not determined by a rule (or set thereof). For one, those judgments often rely on mental visualization that is hard to explicate. Moreover, mathematical creativity often consists precisely in the capacity of seeing new connections between mathematical domains when the latter were initially thought to be separate. In Polanyi’s (1958) famous terms, we may consider this a form of ‘tacit knowledge’, a hard-to-articulate inferential know-how that comes with mathematical training. The capacity to see such connections and to separate them into deeper and more superficial ones is thus something more akin to an aesthetic capacity (as when developing a capacity to recognize more and less beautiful pieces of art and to make one’s own beautiful pieces) than to an algorithmic procedure of reasoning. Because of this, we are highly skeptical that any formal account will fare significantly better than BC.

In summary, our claim is that we can capture one notion of confirmation by analogy in Bayesian terms but not another. Even though the judgments of analogy that underlie the non-incremental notion are not universally agreed upon, they are often widely shared—to the point that it becomes plausible to assume that the acquisition of the proper training for mathematical research is measured at least in part by one’s disposition to express certain judgments of analogy in the course of research rather than others. Yet, besides showing that the judgments of analogy of a ‘reasonable’ mathematician are encapsulated in assignments of non-zero credence to certain bridge hypotheses rather than others, it does not seem plausible to expect that a formal framework such as BC will be able to say something more informative on the subject. At the same time, as we stressed, the recognition of a non-incremental notion of confirmation by analogy does not pose trouble for Bayesians insofar as analogy’s role in informing priors is consistent with the adoption of BC as a doctrine about credence updating.

A comparison with Newton’s theory of gravity may be helpful at this point. It is well-known that Newton formulated his law of universal gravitation without advancing hypotheses as to the ultimate cause of the force of gravity. Our hybrid framework for analogical reasoning in mathematics does something similar. We have shown that (assuming that logical omniscience has been appropriately relaxed) it is possible to represent in precise terms the incremental form of confirmation that attaches to a mathematical conjecture as the result of the discovery of an additional fact about the source, or an additional similarity between source and target domains, given a reasonable distribution of priors and likelihoods. But as to the role that analogy plays in supporting that initial distribution, hypotheses non fingimus. Although the judgments of analogy in question may be sufficiently stable across experts and contexts to allow for generalizations to be made about them, we are unlikely to ever reduce a trained mathematician’s capacity to form plausible judgments of analogy to the application of a set of formal rules. Hence, the hybrid account gives us all we may plausibly hope for as far as analytic rigor and precision are concerned.

6 Conclusion

In this paper, we have presented a new framework that systematically captures the common inferential mechanism whereby evidence in a familiar mathematical domain supports conjectures in an analogous target. For this purpose, we have identified two notions of confirmation by analogy that are relevant to the realm of mathematics, an incremental and a non-incremental notion, and discussed how each of their roles is to be understood from the standpoint of BC. One of our major achievements is to have provided the details of a Bayesian account of the incremental notion of confirmation by analogy. The proposed account can respond to questions about precisely when an analogy may contribute to increasing a mathematician’s credence in the truth of a conjecture, as well as precisely what sorts of facts about the background information can make a difference to whether there is (or is not) confirmation by mathematical analogy.

We have defended our novel account as a superior alternative to those proposed by, respectively, Bartha (2009) and Corfield (2003). Considering several case-studies, we have contended that Bartha’s (2009) symmetry-based proposal is too strong in one sense and too weak in another. It is too strong because it makes far too many analogies capable of providing conjectures with a form of ‘prima facie’ plausibility; it is also too weak because it fails to account for the different degrees of support that analogical arguments can provide to mathematical conjectures that already possess some ‘non-negligible’ credence to begin with. With regard to Corfield’s (2003) approach, we have mainly pointed out its failure to specify the details of a Bayesian account of confirmation by mathematical analogy. As we have seen in the previous section, once the details are fully spelled out it becomes clear that there are two distinct forms of confirmation by analogy to account for and that, while a Bayesian account of the incremental form can be provided, a formal approach is unlikely to capture the non-incremental version.

In conclusion, let us indicate an issue that requires further work. While the above discussion has identified a plausible mechanism whereby analogies may support mathematical conjectures, it has systematically refrained from addressing the hard problem of justification: specifically, what makes it rational to adopt analogy judgments as a guide to mathematical investigation. A discussion of the hard problem of justification will be the subject of a separate article.