While H&F want to remain neutral on questions concerning the nature of scientific explanation, it remains an open question whether there are any non-deductive explanatory relations that can be denoted by X in a sound interpretation of (12)–(15). If such explanations do exist, then H&F’s solution covers a wider class of cases than Garber’s solution, which requires a deductive relationship between H and E. In what follows, I show that the role that frequency data plays in the formulation of inductive explanations of old evidence renders H&F’s solution either unsound or under-motivated in a broad class of conceivable use cases.
One variety of non-deductive explanation that H&F believe is compatible with their solution is Hempel’s (1965) model of “inductive-statistical” explanation (H&F 2015, p. 716). Hempel’s basic idea is that an explanans (which includes some hypothesis) can explain an explanandum in an inductive-statistical sense if the truth of the explanans renders the truth of the explanandum sufficiently probable. Thus, we can define the extra-systemic propositions X and Y as follows.
$$\begin{aligned} X := \mathfrak{p}(E|H) \ge p \end{aligned}$$
(18)
$$\begin{aligned} Y := \mathfrak{p}(E|H^{\prime }) \ge p \end{aligned}$$
(19)
Here, p is a threshold probability and \(\mathfrak{p}(\cdot )\) is a counterfactual conditional probability function. This function “represent[s] the degrees of belief of a scientist who has a sound understanding of theoretical principles and their impact on observational data, but who is an imperfect logical reasoner and lacks full empirical knowledge” (Sprenger 2015, pp. 391–392, cf. Earman 1992, p. 134). To get a feel for how this function works, suppose that a scientist knows that some theory H is true, and knows that it assigns some probability to some event E, but, counterfactually, does not know whether E occurred. The value \(\mathfrak{p}(E|H)\) represents the degree of belief that the scientist would have in E in this counterfactual scenario.
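To take a toy illustration (the numbers are chosen purely for the sake of the example): suppose H says that a certain coin is heavily biased towards heads, assigning that outcome a chance of 0.9, and let E be the report that the coin landed heads on a toss whose outcome the scientist has in fact already observed. Bracketing her knowledge of that outcome, her counterfactual degree of belief is
$$\begin{aligned} \mathfrak{p}(E|H) = 0.9, \end{aligned}$$
so that X, as defined in (18), is true for any threshold \(p \le 0.9\).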
We can construct cases in which the explanatory relationship between hypothesis and evidence is inductive-statistical and which could putatively give rise to the POE. For example, suppose that doctors have known for some time that a person x suffers from lung cancer, but have only recently discovered that smoking explains lung cancer in the inductive-statistical sense described above. Upon learning this explanatory proposition, should x’s doctors raise their degree of belief in the hypothesis that x is a smoker? I take it that the answer depends on how the doctors arrived at the realization that smoking explains lung cancer. As Roche and Sober (2013) point out, if the explanation was formulated through the analysis of previously known background data, then the explanatory proposition should not confer any additional credence on the hypothesis that x is a smoker. The hypothesis that x is a smoker entails a population frequency from which x’s cancer can be induced. This entailment, rather than the proposition that smoking explains lung cancer, does the confirmatory work in a Bayesian sense.
To see why, let H be the proposition that patient x is a smoker, let E be the proposition that x has lung cancer, and let B denote the doctors’ background knowledge. The extra-systemic propositions X and Y are defined according to H&F’s schema. It is clear that the following is true:
$$\begin{aligned} P(H|E,B,X)=P(H|E,B) \end{aligned}$$
(20)
If a Bayesian agent had to bet on whether x is a smoker, then the amount that she would be willing to wager would not change depending on whether or not she supposed that X were true. Rather, her credence in x’s being a smoker would be determined solely by the relevant frequency data, which is part of her background knowledge B. In other words, the statistical evidence “screens off” the hypothesis from the extra-systemic proposition (Roche and Sober 2013, p. 660). Thus, cases where frequency data is known prior to the formulation of the explanatory hypothesis are cases in which the conclusion of H&F’s argument is false, and therefore are not interesting use cases for their solution.\(^{2}\) To put the point slightly differently, in cases where new hypotheses that explain old evidence are learned through statistical analysis, the explanatory relationship between the new hypothesis and the old evidence is transparent to the scientist; there is no moment of surprise when the scientist realizes that the new theory explains the old evidence.
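A toy calculation brings out why (20) holds in such cases; the numbers are invented solely for illustration. Suppose the frequency data contained in B fix \(P(E|H,B)=0.15\), \(P(E|\lnot H,B)=0.01\), and a prior \(P(H|B)=0.3\). Bayes’ theorem then gives
$$\begin{aligned} P(H|E,B) = \frac{0.15 \times 0.3}{0.15 \times 0.3 + 0.01 \times 0.7} \approx 0.87. \end{aligned}$$
Because the frequencies that make X true are already part of B, conditioning on X in addition changes nothing, and \(P(H|E,B,X)\) remains \(\approx 0.87\), just as (20) requires.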
So it must be that H&F have cases in mind in which the relevant frequency data is not included in the agent’s background knowledge. For instance, doctors might observe a patient with lung cancer but not know that smokers are more likely than non-smokers to develop lung cancer. When they learn the frequency data that implies this statistical fact, they might increase their degree of belief that x is a smoker; perhaps this is due to the additional supposition that smoking explains lung cancer in some causal sense. In these cases, there are two possibilities with regard to confirmation. On the one hand, it could be that learning the new frequency data does all of the confirmatory work with respect to H. In this case, there is no problem of old evidence, since the confirmation is driven entirely by new evidence, i.e. the frequency data. On the other hand, it might be that the explanatory relationship between the hypothesis and the old evidence increases the probability that H is true over and above any confirmatory power of the frequency data.
In response to this second case of putative confirmation via inductive explanation, I maintain Roche and Sober’s (2013) position that any learned frequency data will screen off the hypothesis from the extra-systemic, explanatory proposition. If we let \(F_{n,k}\) be the proposition that out of n smokers, k develop lung cancer, and let Z be the proposition that the frequency \(F_{n,k}\) is explained by a causal or otherwise explanatory relationship between H and E, then the following equation is still true:
$$\begin{aligned} P(H|E,B,F_{n,k},Z) = P(H|E,B,F_{n,k}) \end{aligned}$$
(21)
Thus, if we separate the learned frequency data from the newly formulated explanatory claim, the explanatory claim has no confirmatory power of its own; any increase in the probability of H is due to the frequency data.\(^{3}\)
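The structure of this claim can be made explicit in odds form; the derivation below uses nothing beyond Bayes’ theorem. We have
$$\begin{aligned} \frac{P(H|E,B,F_{n,k},Z)}{P(\lnot H|E,B,F_{n,k},Z)} = \frac{P(Z|H,E,B,F_{n,k})}{P(Z|\lnot H,E,B,F_{n,k})} \times \frac{P(H|E,B,F_{n,k})}{P(\lnot H|E,B,F_{n,k})}, \end{aligned}$$
so (21) holds just in case the first ratio on the right-hand side equals 1, i.e. just in case Z is no better predicted by H than by \(\lnot H\) once the frequency data \(F_{n,k}\) are already in hand.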
The best test case for H&F’s solution is therefore one in which there is no frequency data linking the hypothesis to the evidence, but the relationship between the hypothesis and the old evidence is still probabilistic in nature. For example, imagine that the first and only time that two materials a and b are stored together, the warehouse containing them unexpectedly catches fire. Later, scientists develop a new hypothesis, which entails that if materials a and b are in close proximity, then fire will break out with probability \(p < 1\). In this example, the warehouse fire, which is old evidence, seems to confirm the new hypothesis solely in virtue of the explanatory relationship between the warehouse fire and the chemical hypothesis. This explanatory relationship is inductive; the hypothesis implies only that the fire was likely, not that it had to occur as a matter of necessity. If we suppose in addition that the new chemical hypothesis’s nearest competitor did not imply that the fire had as high a probability of occurring, then we might think that H is confirmed by the fact that H implies that the fire had a high probability of occurring, independently of any statistical data.
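To see why the case looks promising in Bayesian terms, suppose, with numbers invented purely for illustration, that the new hypothesis gives \(\mathfrak{p}(E|H)=0.8\) while its nearest competitor gives \(\mathfrak{p}(E|H^{\prime })=0.1\). The counterfactual likelihood ratio is then
$$\begin{aligned} \frac{\mathfrak{p}(E|H)}{\mathfrak{p}(E|H^{\prime })} = \frac{0.8}{0.1} = 8, \end{aligned}$$
which favours H over \(H^{\prime }\) without any appeal to frequency data. The question is whether this prima facie case can be turned into a justification of H&F’s axioms.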
In these cases, I argue that H&F’s axioms lack any independent justification. Consider axiom (12). It is true if and only if the following holds:
$$\begin{aligned} P(X|H,E,B,\lnot Y)>P(X|\lnot H,E,B,\lnot Y) \end{aligned}$$
(22)
We might attempt to justify (22) on the grounds that when H and E are both true, it is more likely that H explains E, since H is a true theory of events like E. The problem with this justification is that, as per the terms of the thought experiment, there are no events like E. If there were, then there would be frequency data to do the confirmatory work and to screen off the hypothesis from the explanatory proposition. Thus, while the example appears to be a promising one for H&F’s solution, it ultimately leaves their axioms unjustified.
In response, one could argue that even if materials a and b are stored in close proximity only once, it is not the case that there are no events like E. After all, H is introduced as a more general hypothesis from chemistry, and the materials a and b might have more fundamental chemical properties whose behaviors have been observed previously. If H is a theory of the behaviors of these more fundamental properties, then H is a theory of events like E after all, and (22) is justified. This move is similar to one that Lewis (1994) makes in his discussion of the half-lives of elements. He considers a heavy element, unobtanium, of which only two atoms ever exist, one decaying after \(4.8\, \upmu \)s and the other after \(6.1\, \upmu \)s. He argues that these decay times in themselves do not tell us much about the half-life of unobtanium. What we need to do is consider “general laws of radioactive decay”, laws that are “function[s] of the nuclear structure of the atom in question” (Lewis 1994, p. 478). So even if a chancy event occurs very infrequently, we can subsume the event under more general laws, such that a justification for (22) might go through.
The problem with this kind of response is that when Lewis appeals to more general properties of nuclear structure in order to assign half-lives to atoms of very rare elements, he does so to arrive at a sufficiently large set of frequency data. This data set will then be informative as to the correlations between types of nuclear structure and an element’s half-life. Similarly, if we can use evidence about more fundamental properties of a and b to assign a high counterfactual probability to the event that materials a and b would ignite when stored together, we can do so on the basis of frequency data linking these fundamental properties to events that might be associated with combustion, such as increased temperature, friction, etc. As demonstrated above, this frequency data will screen off the hypothesis H from the explanatory proposition X, rendering (22), and therefore (12), false.
Still, it could be argued that one case that is particularly amenable to H&F’s solution is one in which a scientist possesses some old evidence E, and then learns through the testimony of a second agent that a hypothesis H explains E better than H’s nearest rival \(H^{\prime }\). In such a case, there would be no frequency data to screen off H from X, and so one would be tempted to conclude that H&F’s solution goes through. However, in order for the first agent to accept the testimony of the second agent, the two agents must be epistemic peers to some extent. In this context, I take this to mean that they must both be scientists. Thus, there must ultimately be a scientist at the beginning of a chain of testimony who learned H and discovered that it explained, inductively, some old evidence E, or else there is no problem of old evidence.
In this context, the problem of old evidence is as follows. Why should the scientist at the beginning of the chain of testimony, i.e. the scientist who first realizes that H explains E, increase her degree of belief in H? Given that such a scientist would not be able to rely on testimony to learn this explanatory fact, it stands to reason that, in the case where H inductively explains E, she would learn that H explains E through either frequency data or theoretical analysis, and then the worries discussed above would once again apply. In the context of the Bayesian approach to confirmation via scientific theory testing, we are concerned with modelling the beliefs of this first scientist, and thus instances of updating one’s beliefs via testimony are not applicable use cases for H&F’s solution.
Finally, Branden Fitelson has suggested in correspondence that one could take on board the criticisms offered above and still maintain the novelty of H&F’s solution. The key move would be to change the interpretation of the extra-systemic propositions X and Y in the premises (12)–(15). Under the proposed change, X becomes either ‘H is empirically adequate with respect to E’ or ‘H is predictively accurate with respect to E’. Y then becomes ‘\(H^{\prime }\) is empirically adequate with respect to E’ or ‘\(H^{\prime }\) is predictively accurate with respect to E’, where E is old evidence. With this reinterpretation in mind, suppose that our old evidence is a data set plotting the relationship between two variables w and z. A linear regression model with coefficient \(\beta \) has the greatest degree of fit relative to the data, but no substantive theory has been put forward as to why this is the case. Independently of the data set in question, scientists develop a theory implying that, for all values of the two variables, \(w=\beta z\). If we let E be the proposition that the data set is observed, let H be the proposition that the theory described above is true, let X be the proposition that the hypothesis H implies that a model with coefficient \(\beta \) best fits the data, and let Y be the proposition that a competitor hypothesis \(H^{\prime }\) implies that a model with coefficient \(\beta \) best fits the data, then one might think that H&F’s argument would be sound.
I take it that the argument to this effect would proceed roughly as follows. At first, we just have some data E. After the analysis, we learn that the best empirical model of the data is implied by the hypothesis H. This discovery, which does not involve the acquisition of any new data, should increase our degree of belief in the hypothesis H, regardless of whether or not \(H^{\prime }\) also entails the best model of the data. This seems to hold whether we interpret a model’s goodness of fit as a matter of empirical adequacy (i.e. the model accurately represents the existing data) or as a matter of predictive accuracy (i.e. given the value of one variable, the model does a good job of telling us the value that the other variable should take).
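For concreteness, ‘the model with coefficient \(\beta \) best fits the data’ can be cashed out in least-squares terms; this is merely one natural way of filling in the details, and nothing in the proposal turns on it. Given observed pairs \((w_i, z_i)\), the best-fitting coefficient for a model of the form \(w = b z\) is
$$\begin{aligned} \hat{\beta } = \mathop {\hbox {arg min}}\limits _{b} \sum _{i} (w_i - b z_i)^2, \end{aligned}$$
and the reinterpreted X says that H implies \(\hat{\beta } = \beta \) for the data set described by E.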
In response, I argue that the explanatory relationship appealed to here is deductive, rather than inductive. Recall that H entails that the relationship between w and z is best modelled by the equation \(w=\beta z\). The evidential proposition E describes one body of evidence that is best modelled by the line \(w=\beta z\). However, indefinitely many other bodies of evidence could also be optimally modelled by the equation \(w=\beta z\). So the evidence E confirms H in virtue of the fact that E is a member of this wider set of evidential propositions. The relationship between H and the existence of some data set that is best modelled by the line \(w=\beta z\) is deductive, rather than inductive. So this example does not show that H&F’s solution improves on Garber’s solution with respect to the set of cases that it can successfully accommodate.
It might be said in response that H may not deductively entail that \(w=\beta z\) is the best-fitting model of the data set in question. Instead, H may only assign a high probability to the proposition that \(w=\beta z\) best fits the data. Here, I need only note that if this probability is generated via frequency data, then this frequency data will screen off the hypothesis from the explanatory proposition, as argued above. If the probability is generated purely via a priori analysis, then we are back to the problem that was noted in the warehouse example. That is, if there are truly no other events like E to be observed, then it is unclear why we ought to think that the truth of both H and E renders the proposition ‘H adequately explains E’ more likely.