1 Introduction

One can distinguish two kinds of abductive inferences: those generating new hypotheses and those aiming at determining the best hypothesis from a set of available candidates. Let us call abductive inferences of the former kind creative, and those of the latter kind selective. While most of the philosophical literature on abduction focuses on selective abduction (see, e.g., Lipton 2004; Niiniluoto 1999; Williamson 2016), there is also an increasing interest in creative abduction (cf. Douven 2017).

In contrast to selective abduction and other kinds of inferences (such as deduction and induction), creative abduction is intended as an inference method for generating hypotheses featuring new theoretical concepts on the basis of empirical phenomena. Most philosophers of science are quite sceptical about whether a general approach toward such a logic of scientific inquiry can be fruitful. However, since theoretical concepts are intimately connected to empirical phenomena via dispositions (see, e.g., Carnap 1936, 1937), a restriction of the domain of application of such an approach to empirically correlated dispositions might be promising. Schurz (2008) differentiates between different patterns of abduction and argues for the view that at least one kind of creative abduction can be theoretically justified. In a nutshell, his approach is based on the idea that inferences to theoretical concepts unifying empirical correlations among dispositions can be justified by Reichenbach’s (1956) principle of the common cause.

In this paper we take up Schurz’ (2008) proposal to combine creative abduction and principles of causation. We model cases of successful creative abduction within a Bayes net framework which can, if causally interpreted, be seen as a generalization of Reichenbach’s (1956) ideas (cf. Glymour et al. 1991). Such a move allows us to specify general conditions which have to be satisfied in order to generate hypotheses involving new theoretical concepts and to describe their unificatory power in a more fine-grained way. In addition, it can be used to shed new light on several other issues discussed within philosophy of science. In this paper we will sketch how it allows for handling cases in which we can only measure non-strict (i.e., probabilistic) empirical dependencies among dispositions, and how it paves the way for new applications to other topics within philosophy of science. We consider our analysis of successful instances of creative abduction by means of Bayes net models as another step toward a unified Bayesian philosophy of science in the sense of Sprenger and Hartmann (forthcoming).

The paper is structured as follows: In Section 2 we introduce Schurz’ (2008) approach to creative abduction. We also explain how it allows for unifying strict empirical correlations among dispositions and how it can be justified by Reichenbach’s (1956) principle of the common cause. In Section 3 we then briefly introduce the Bayes net formalism, present our proposal how to model successful cases of creative abduction within this particular framework, and identify necessary conditions for such cases. Next we investigate the unificatory power gained by creative abduction in the Bayesian setting and draw a comparison with the unificatory power creative abduction provides in the strict setting. In Section 4 we sketch possible applications of our analysis to other topics within philosophy of science. In particular, we discuss the generation of use-novel predictions, new possible ways of applying Bayesian confirmation theory, a possible (partial) solution to the problem of underdetermination, and the connection of modeling successful instances of creative abduction Bayesian style to epistemic challenges tackled in the causal inference literature. We conclude in Section 5.

2 Creative abduction, unification, and the principle of the common cause

In this section we present Schurz’ (2008) approach to creative abduction. Following Schurz, we focus on a simple analysis of dispositions as introduced by the early logical empiricists (e.g., Carnap 1936, 1937). According to this analysis, whether an object x has a disposition D depends on whether certain test conditions T lead to a specific reaction R. For an object x to be soluble in water, for example, it is required that x dissolves at some time t if put into water at t:

$$ \forall t(T(x,t)\rightarrow (D(x)\leftrightarrow R(x,t))) $$
(1)

According to the traditional understanding, T and R are empirical concepts, while the dispositional concept D is a not directly observable theoretical concept. Note that Eq. 1 comes close to a partial definition of D on the basis of T and R, except that the dispositional term is not relativized to t. What distinguishes the characterization of a disposition D(x) as provided in Eq. 1 from a purely conventional definition of a disposition with reference to time (e.g., by replacing D(x) with D(x,t) in Eq. 1, where D(x,t) might be interpreted as x is soluble in water at some point in time t) is that Eq. 1 is empirically creative in the sense that it allows for deducing empirical statements which cannot be deduced from our background postulates on statements containing T and R alone. It is a well-known fact that the only non-conservative (or creative) import of Eq. 1 is the following assumption about the uniformity of test-reaction pairs: If at some time t an object x satisfies the test conditions and brings about the corresponding reaction, then x will do so at any time t:

$$ \exists t(T(x,t)\wedge R(x,t))\rightarrow \forall t(T(x,t)\rightarrow R(x,t)) $$
(2)

Equations 1 and 2 are empirically equivalent, where two statements “are empirically equivalent just in case they have the same class of empirical, viz., observational, consequences [and …] the empirical consequences of any statement are those of its logical consequences formulable in an observation language” (Laudan and Leplin 1991, p. 451; cf. also Okasha 1997, p. 251). That the empirical content of Eq. 2 is implied by Eq. 1 is straightforward, since Eq. 2 contains only (logical and) empirical expressions and is a direct consequence of Eq. 1. That all statements containing only (logical and) empirical expressions that are consequences of Eq. 1 can be deduced already from Eq. 2 can be shown by definition-theoretical means (cf. Essler and Trapp 1978).
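The relation between Eqs. 1 and 2 can be illustrated by a brute-force check over a small finite model. The following Python sketch is not a proof and is not part of the cited definition-theoretical argument; it assumes a single object x and three time points (an arbitrary choice), and the function names are hypothetical. It checks that a test-reaction history satisfies Eq. 2 exactly when some truth value of D(x) makes Eq. 1 true.

```python
from itertools import product

TIMES = range(3)  # three time points; an arbitrary small choice


def eq1(D, T, R):
    """Eq. 1 for one object: for all t, T(t) -> (D <-> R(t))."""
    return all((not T[t]) or (D == R[t]) for t in TIMES)


def eq2(T, R):
    """Eq. 2 for one object: (exists t: T(t) and R(t)) -> (for all t: T(t) -> R(t))."""
    some_positive = any(T[t] and R[t] for t in TIMES)
    uniform = all((not T[t]) or R[t] for t in TIMES)
    return (not some_positive) or uniform


def check_empirical_equivalence():
    """True iff, for every T/R history, Eq. 2 holds exactly when
    some truth value of D makes Eq. 1 true."""
    for T in product([False, True], repeat=len(TIMES)):
        for R in product([False, True], repeat=len(TIMES)):
            some_D_works = any(eq1(D, T, R) for D in (False, True))
            if some_D_works != eq2(T, R):
                return False
    return True


print(check_empirical_equivalence())
```

In this finite setting, the observable histories compatible with Eq. 1 for some value of D are exactly those permitted by Eq. 2, which is the sense in which D adds no further empirical content.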

If Eq. 2 has been established on empirical grounds, then introducing a disposition D via Eq. 1 is a theoretical means to explain Eq. 2. However, not much is gained by introducing D since for each regularity among test-reaction pairs a distinct disposition has to be postulated. Things become more interesting once we focus on regularities among several dispositions D1,...,Dn, each characterized by a corresponding test-reaction pair consisting of Ti and Ri (with 1 ≤ i ≤ n). Now assume that we found strict pairwise empirical correlations between all of these dispositions D1,...,Dn, meaning that

$$ D_{i}(x)\leftrightarrow D_{i + 1}(x)\text{ for all } 1\leq i<n. $$
(3)

This amounts to the assumption that the following statement has been empirically established:

$$ \exists t(T_{i}(x,t)\wedge R_{i}(x,t))\rightarrow \forall t(T_{j}(x,t)\rightarrow R_{j}(x,t))\text{ for all \(1\leq i,j\leq n\)} $$
(4)

Let us call each statement of this form a crossed uniformity assumption. Given n test-reaction pairs for n dispositions D1,...,Dn, we get \(n^{2}\) such crossed uniformity assumptions (Schurz 2008, p. 226). It is a logical fact that this is empirically equivalent to introducing one higher-level dispositional concept \(\mathcal {D}\) characterized by n test-reaction pairs:

$$ \forall t(T_{i}(x,t)\rightarrow (\mathcal{D}(x)\leftrightarrow R_{i}(x,t)))\text{ for all \(1\leq i\leq n\)} $$
(5)

Note that introducing the theoretical concept \(\mathcal {D}\) via Eq. 5 reduces the number of law statements from \(n^{2}\) to n. In this sense such a reduction can be understood as unificatory. The abductive inference consists in the introduction of \(\mathcal {D}\) via Eq. 5 on the basis of Eq. 4. It can be illustrated by the following example inspired by Hempel (1965): Assume that at some time the inhabitants of a not too distant possible world realized that some objects have the disposition to attract iron (D1) and that some objects have the disposition to produce electricity when moved along a wire (D2), meaning that they introduced the two theoretical concepts D1 and D2 on the basis of Eq. 2 and in accordance with Eq. 1. Suppose further that both discoveries were made independently of each other, but that people found out later on that the dispositions D1 and D2 are correlated (Eq. 3) via observing that their corresponding test and reaction conditions coincided (Eq. 4). They might then have started to explain this correlation by introducing the higher-level disposition of generating an electromagnetic field \(\mathcal {D}\) via Eq. 5.

Note that creative abduction as discussed above can be interpreted either in a realist or an instrumentalist way. Under the latter interpretation \(\mathcal {D}\) is taken to be nothing over and above a more or less useful theoretical means to unify empirical descriptions of certain phenomena of interest that can—in principle—be replaced by any other concept with equal empirical adequacy and unificatory power. Under the realist interpretation, on the other hand, \(\mathcal {D}\) is assumed to represent a real structure; statements involving \(\mathcal {D}\) are considered to be either true or false. Schurz (2008) made a strong case in favour of a realist interpretation by endorsing Reichenbach’s (1956) common cause principle:

(CCP): If two properties A and B are correlated and neither A causes B nor B causes A, then A and B are effects of a common cause C.

(CCP) demands that every correlation among any pair of properties not standing in direct causal dependence to each other has to be explained by the existence of an independent common cause. In this sense (CCP) provides a way of causally unifying observed regularities. In the case of pairwise empirically correlated dispositions such as D1,...,Dn above, (CCP) supports a realist interpretation of the unifying higher-level disposition \(\mathcal {D}\): The correlation among dispositions D1,...,Dn is explained by postulating a common cause \(\mathcal {D}\).

In the next section we take up the idea of combining creative abduction and principles of causation by modeling cases of successful creative abduction in a Bayes net framework. Though Bayes nets can be causally interpreted, one does not have to subscribe to a realist interpretation when employing this particular framework to model creative abduction. While the realist gets a justification for creative abductive inferences on the basis of a causal interpretation, the instrumentalist can still use the Bayes net framework without a causal interpretation as a tool for justifying abductive inferences in terms of unificatory power. In this paper we prefer to stay neutral on the realist vs. instrumentalist question. As we will show, modeling creative abduction Bayesian style comes with a couple of advantages regardless of the answer to that question.

3 Modeling creative abduction Bayesian style

We start this section by briefly introducing the basics of the Bayes net formalism. Bayes nets allow for modeling and graphically representing the paths over which probabilistic information spreads between variables. A Bayes net consists of a set V of random variables X1,...,Xn, a set E of directed edges (→) connecting some of these variables, and a probability distribution P over V. A triple 〈V,E,P〉 is a Bayes net if and only if it conforms to the Markov factorization (Pearl 2000, p. 16)

$$ P(X_{1},...,X_{n})=\prod\limits_{i = 1}^{n} P(X_{i}|\mathbf{Par}(X_{i})), $$
(6)

where Par(Xi) is the set of Xi’s parents in the Bayes net’s graph G = 〈V,E〉, i.e., the set of all Xj ∈ V for which Xj → Xi holds. Whenever the probability distribution P of a triple 〈V,E,P〉 factors according to Eq. 6, then one can read off certain independencies in P from the graph G = 〈V,E〉. Every Xi ∈ V has, for example, to be independent of every Xj that is not connected to Xi via a path Xi→...→Xj conditional on Par(Xi). In the causal interpretation, the arrows (→) of a Bayes net’s graph stand for direct cause-effect relationships. It is well-known that (CCP) is a consequence of assuming the causally interpreted Markov factorization. Note that Schurz (2008, 2016) only refers to the causal Bayes net framework in order to justify (CCP) in support of a realist interpretation of creative abduction. In contrast, we employ Bayes nets in order to analyze successful instances of creative abduction.

Let us now come to the question of how to model successful cases of creative abduction in the Bayes net framework. We represent pairwise empirically correlated lower-level dispositions by variables D1,...,Dn and the abduced higher-level disposition by a variable \(\mathcal {D}\). Evidence for one of the lower-level dispositions Di (with 1 ≤ i ≤ n) is represented by a variable Ei which stands for an inductive generalization of instances of test-reaction conditions such as (Ti(a1,t1) ∧ Ri(a1,t1)) ∧ ... ∧ (Ti(ak,tl) ∧ Ri(ak,tl)). The dependence of each lower-level disposition Di on its corresponding evidence Ei is represented the same way as the dependence of a hypothesis on its evidence is typically modeled in the Bayesian framework: For each pair Di,Ei we draw an arrow Di → Ei. Since the creative abductive step is conducted by applying (CCP) in Schurz’ (2008) original approach, we introduce the higher-level disposition variable \(\mathcal {D}\) as a common parent of the lower-level disposition variables D1,...,Dn. The resulting graph is depicted in Fig. 1.

Fig. 1 Bayes net for modeling successful instances of creative abduction

Probability flow between dispositions D1,...,Dn is established via \(\mathcal {D}\) if the following general conditions are satisfied:

  1. \(\mathcal {D}\) is not extreme, i.e., \(0<P(\mathcal {D})<1\).

  2. Each Di depends positively on \(\mathcal {D}\), i.e., \(P(D_{i}|\mathcal {D})>P(D_{i})\).

From 1. and 2. it follows that P(Di|Dj) > P(Di) if i ≠ j. (For a proof see, e.g., Dardashti et al. 2017.) To account for the corresponding correlations between the evidence E1,...,En, the following condition has to be satisfied as well:

  3. Each Ei depends positively on its corresponding Di, i.e., P(Ei|Di) > P(Ei).

From 1., 2., and 3. it follows that P(Ei|Ej) > P(Ei) if i ≠ j.
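Both implications can be made explicit by a short calculation. What follows is only a sketch under the Markov factorization (Eq. 6) for the graph in Fig. 1, not a reproduction of the cited proof; the abbreviations d, a, b, e, and f are introduced here purely for illustration. Let \(d=P(\mathcal {D})\), \(a_{i}=P(D_{i}|\mathcal {D})\), \(b_{i}=P(D_{i}|\overline {\mathcal {D}})\), \(e_{i}=P(E_{i}|D_{i})\), and \(f_{i}=P(E_{i}|\overline {D}_{i})\). Since \(P(D_{i})=d\,a_{i}+(1-d)\,b_{i}\), conditions 1. and 2. together amount to \(a_{i}>b_{i}\). Because Di and Dj (with i ≠ j) are independent conditional on \(\mathcal {D}\), we get

$$ P(D_{i},D_{j})-P(D_{i})P(D_{j})=d(1-d)(a_{i}-b_{i})(a_{j}-b_{j})>0, $$

which, given \(P(D_{j})\geq d\,a_{j}>0\), is equivalent to P(Di|Dj) > P(Di). In the same way, condition 3. amounts to \(e_{i}>f_{i}\), and since each Ei depends on the remaining variables only via Di, we get

$$ P(E_{i},E_{j})-P(E_{i})P(E_{j})=(e_{i}-f_{i})(e_{j}-f_{j})\left[P(D_{i},D_{j})-P(D_{i})P(D_{j})\right]>0, $$

which is equivalent to P(Ei|Ej) > P(Ei).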

Conditions 1., 2., and 3. are necessary conditions for successful creative abduction: They guarantee pairwise correlations among lower-level dispositions that have to be inductively inferred on the basis of observed evidence and form the basis for introducing the higher-level disposition \(\mathcal {D}\) which is then, in turn, used to explain these correlations.
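The role of conditions 1., 2., and 3. can also be checked numerically on a toy instance of the graph in Fig. 1 with n = 2. The following Python sketch enumerates the joint distribution given by the Markov factorization (Eq. 6); all probability values are hypothetical and were chosen only so that the three conditions hold, and the helper names are illustrative assumptions.

```python
from itertools import product

# Hypothetical parameters, chosen only so that conditions 1.-3. hold.
P_D = 0.3                        # P(D); condition 1: strictly between 0 and 1
P_DI = {True: 0.9, False: 0.2}   # P(D_i | D) and P(D_i | not-D), for i = 1, 2
P_EI = {True: 0.8, False: 0.1}   # P(E_i | D_i) and P(E_i | not-D_i), for i = 1, 2

NAMES = ("D", "D1", "D2", "E1", "E2")


def bern(p, value):
    """Probability that a binary variable with success chance p takes `value`."""
    return p if value else 1.0 - p


def joint(d, d1, d2, e1, e2):
    """Markov factorization (Eq. 6) for the graph E1 <- D1 <- D -> D2 -> E2."""
    return (bern(P_D, d)
            * bern(P_DI[d], d1) * bern(P_DI[d], d2)
            * bern(P_EI[d1], e1) * bern(P_EI[d2], e2))


def prob(**fixed):
    """Sum the joint over all worlds that agree with the fixed values."""
    total = 0.0
    for values in product([False, True], repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        if all(world[name] == v for name, v in fixed.items()):
            total += joint(*values)
    return total


def cond(target, given):
    """Conditional probability P(target | given); both are dicts name -> value."""
    return prob(**target, **given) / prob(**given)


# Conditions 2. and 3. hold for these numbers ...
print(cond({"D1": True}, {"D": True}) > prob(D1=True))
print(cond({"E1": True}, {"D1": True}) > prob(E1=True))
# ... and the derived pairwise correlations show up as well.
print(cond({"D1": True}, {"D2": True}) > prob(D1=True))
print(cond({"E1": True}, {"E2": True}) > prob(E1=True))
```

Setting `P_D` to 0 or 1, or making the two entries of `P_DI` equal, removes the correlation between D1 and D2 in this toy model, which illustrates why the conditions are needed.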

As in Schurz’ (2008) original approach, creative abduction provides unification if modeled Bayesian style. In the original approach (see Section 2) introducing the higher-level disposition \(\mathcal {D}\) provided unification of \(n^{2}\) empirical law statements establishing pairwise empirical correlations among n lower-level dispositions to n higher-level dispositional statements. In the Bayes net setting, pairwise empirical correlations between n lower-level dispositions D1,...,Dn consist in \(\binom {n}{2}\) probabilistic dependencies of the form P(Di|Dj) > P(Di), where 1 ≤ i ≠ j ≤ n. Similarly, for the dependencies among pairs of evidential variables there are \(\binom {n}{2}\) empirical correlation statements of the form

$$ P(E_{i}|E_{j})>P(E_{i})\text{, where \(1\leq i\not=j\leq n\).} $$
(7)

It follows from the Markov factorization (Eq. 6) that these \(\binom {n}{2}\) empirical correlation statements can be unified by the 2n + 1 probabilistic statements in conditions 1., 2. and 3.: n statements of the form P(Ei|Di) > P(Ei) (with 1 ≤ i ≤ n), n statements of the form \(P(D_{i}|\mathcal {D})>P(D_{i})\) (with 1 ≤ i ≤ n), and 1 statement \(0<P(\mathcal {D})<1\). To compare Schurz’ (2008) approach and the Bayesian approach w.r.t. their unificatory power, we introduce a simple measure u intended to capture the intuitions about unification outlined above. Given n correlated lower-level dispositions, u(n) measures the ratio between x(n) empirical statements to be unified and y(n) unifying theoretical statements. In order to shift the neutral case to 0, we subtract 1 from this ratio: \(u(n)=\frac {x(n)}{y(n)}-1\). Its output is in the interval [−1, ∞), where a negative value means that the theoretical description is more costly than simply listing the empirical statements, 0 means that there is no gain but also no cost in providing a theoretical description, and a positive value means that the theoretical description provides unification.

A comparison of the unificatory power of both the original and the Bayes net approach is provided in Fig. 2 (thin solid line and thin dotted line): In the case of strict (unconditional) correlations, the original approach fares better than the Bayesian approach. This is due to the theoretical power of the Bayesian framework which requires more parametrization. However, one can increase the performance of the Bayesian approach (see thin and thick dotted line in Fig. 2) by omitting the intermediate lower-level dispositions D1,...,Dn in the 2n + 1 statements used for unifying the correlations among the evidence E1,...,En and explaining these correlations directly by n statements of the form \(P(E_{i}|\mathcal {D})>P(E_{i})\) (with 1 ≤ i ≤ n) and 1 statement \(0<P(\mathcal {D})<1\) instead. While introducing the lower-level dispositions D1,...,Dn might be practically necessary to find a more general higher-level disposition \(\mathcal {D}\), the presence of these lower-level dispositions should not be counted against the unificatory value of the larger theory since all the theoretical gain achieved by the unification can eventually be traced back to the presence of the higher-level disposition \(\mathcal {D}\).

Fig. 2 Comparison of unificatory power in the original and in the Bayesian setting: n is the number of pairwise empirically correlated dispositions. u(n) measures the unificatory power given n such dispositions by taking the ratio between the number of their corresponding empirical law statements and the number of unifying statements with a shift of the neutral case to 0. In the original setting (thin solid line), u(n) is calculated via \(\frac {n^{2}}{n}-1\), where \(n^{2}\) is the number of empirical law statements in Eq. 4. The unifying statements consist of the n formulae in Eq. 5. In the Bayesian setting (thin dotted line), the corresponding u(n) is calculated via \(\frac {\binom {n}{2}}{2n + 1}-1\). The numerator \(\binom {n}{2}\) expresses the number of statements describing the strict (unconditional) empirical correlations in Eq. 7, and the denominator 2n + 1 is the number of unifying statements in conditions 1., 2., and 3. Omitting the lower-level dispositions D1,...,Dn results in a slight boost of unificatory power (thick dotted line): If one operates directly with the higher-level disposition \(\mathcal {D}\), u(n) is calculated via \(\frac {\binom {n}{2}}{n + 1}-1\). Again, \(\binom {n}{2}\) expresses the number of statements describing the empirical correlations, and n + 1 is the number of unifying statements (condition 1 and n statements of the form \(P(E_{i}|\mathcal {D})>P(E_{i})\), where 1 ≤ i ≤ n). The unificatory power u(n) in the Bayesian setting with conditional dependencies (thin dashed line) is calculated via \(\frac {2^{n-2}\cdot \binom {n}{2}}{2n + 1}-1\). The numerator expresses the number of statements describing the conditional and unconditional dependencies according to Eq. 8, and the denominator 2n + 1 is, again, the number of unifying statements in conditions 1., 2., and 3. When directly operating with \(\mathcal {D}\) in this setting (thick dashed line), again, a boost in unificatory power results. In this setting u(n) is calculated via \(\frac {2^{n-2}\cdot \binom {n}{2}}{n + 1}-1\). The latter two cases show that once one allows for non-strict (conditional) correlations, then abductive inference in the Bayes net setting receives a tremendous boost in terms of unificatory power. Note that the y-axis plots the logarithm of the ratio with a shift of the neutral case to 0.

Up to now we focused on comparing the unification of statements about unconditional empirical correlations. However, many more empirical correlations are possible in the Bayesian setting. If the evidential base is strictly correlated (i.e., P(Ei|Ej) and \(P(E_{i}|\overline {E}_{j})\) with 1 ≤ i,j ≤ n are extreme), then it follows from Eq. 6 and conditions 1., 2., and 3. that each two variables Ei,Ej (with i ≠ j) are independent conditional on any set of other evidential variables. Thus, the unconditional dependence statements in Eq. 7 capture all dependencies among variables E1,...,En in this setting. However, if some correlations among pieces of evidence cannot be screened off by some non-empty set of other evidential variables, then also many conditional empirical dependencies may hold among pairs of evidential variables. In particular, there can be up to \(2^{n-2}\cdot \binom {n}{2}\) empirical dependencies of the form

$$\begin{array}{@{}rcl@{}} &&P(E_{i}|E_{j},\mathbf{Z})>P(E_{i}|\mathbf{Z}),\text{ where}\\ &&1\leq i\neq j\leq n \text{ and } \mathbf{Z}\subseteq\{E_{k}:1\leq i\neq k\neq j\leq n\}. \end{array} $$
(8)

If these conditional dependencies are also taken into account, then creative abduction Bayesian style provides a tremendous gain in unificatory power (in Fig. 2, compare the thin dotted with the thin dashed line and the thick dotted with the thick dashed line). From 1., 2., and 3. it also follows that P(Ei|Y) > P(Ei|Z), where Z ⊂ Y and both are sets of evidential variables different from Ei. (For a proof see, e.g., Dardashti et al. 2017.) So, the Bayes net framework allows for a much more fine-grained modeling of non-strictly empirically correlated dispositions which can be found in many higher-level sciences such as economics, medicine, psychology, and sociology.
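The count \(2^{n-2}\cdot \binom {n}{2}\) of possible dependence statements of the form of Eq. 8 can be reproduced by direct enumeration. The following Python sketch assumes that a statement is identified by an unordered pair {i, j} together with a conditioning set Z drawn from the remaining n − 2 evidential variables (this reading matches the binomial coefficient in the count); the function name is hypothetical.

```python
from itertools import combinations
from math import comb


def conditional_dependence_statements(n):
    """List all (i, j, Z) with i < j and Z a subset of the remaining indices."""
    indices = range(1, n + 1)
    statements = []
    for i, j in combinations(indices, 2):
        others = [k for k in indices if k not in (i, j)]
        for size in range(len(others) + 1):
            for Z in combinations(others, size):
                statements.append((i, j, Z))
    return statements


for n in range(2, 7):
    count = len(conditional_dependence_statements(n))
    # The enumeration agrees with the closed form used in the text.
    assert count == 2 ** (n - 2) * comb(n, 2)
    print(n, count)
```

The empty conditioning set is included, so the unconditional statements of Eq. 7 appear as the special case Z = ∅.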

As the comparison in Fig. 2 shows, the original approach proposed by Schurz (2008) and our Bayesian approach each perform well in different settings. In the case without conditional correlations, the strict approach fares better. It provides more unificatory power and leads already to unification with only two empirically correlated dispositions, while our Bayes net approach requires at least four empirically correlated dispositions to produce positive unificatory power. In the non-strict setting with conditional correlations, on the other hand, Schurz’ approach is not applicable. This is the setting where the Bayesian approach excels. Although the version with 2n + 1 unifying statements also requires at least four empirically correlated dispositions to produce positive unificatory power, the amount of unificatory power provided explodes. The version with n + 1 unifying statements fares even better. Note that it already provides positive unificatory power with three empirically correlated dispositions. These results suggest that the two approaches might rather be seen as complementing each other than as competing accounts.
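The thresholds discussed above can be recomputed from the formulas given for u(n). The following Python sketch simply evaluates \(u(n)=\frac {x(n)}{y(n)}-1\) for the five statement counts named in the caption of Fig. 2 and reports the smallest n ≥ 2 at which u(n) becomes positive; the dictionary labels and function names are hypothetical, and the search bound of 50 is an arbitrary assumption.

```python
from math import comb


def u(x, y):
    """Unification measure: statements to be unified over unifying
    statements, shifted so that 0 is the neutral case."""
    return x / y - 1


SETTINGS = {
    "strict (original approach)": lambda n: u(n ** 2, n),
    "Bayes, unconditional, 2n+1": lambda n: u(comb(n, 2), 2 * n + 1),
    "Bayes, unconditional, n+1": lambda n: u(comb(n, 2), n + 1),
    "Bayes, conditional, 2n+1": lambda n: u(2 ** (n - 2) * comb(n, 2), 2 * n + 1),
    "Bayes, conditional, n+1": lambda n: u(2 ** (n - 2) * comb(n, 2), n + 1),
}


def first_positive(measure, max_n=50):
    """Smallest n >= 2 with measure(n) > 0, or None if there is none up to max_n."""
    for n in range(2, max_n + 1):
        if measure(n) > 0:
            return n
    return None


for name, measure in SETTINGS.items():
    print(name, first_positive(measure))
```

Under these formulas the strict approach turns positive at n = 2, the conditional variants at n = 4 (2n + 1 statements) and n = 3 (n + 1 statements), and the unconditional variant with n + 1 statements at n = 4; the unconditional variant with 2n + 1 statements turns positive only at n = 6.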

4 Possible applications and connections to other issues

In this section we outline possible applications of modeling creative abduction Bayesian style and connections to other topics from the philosophy of science literature. In particular, we discuss how abduced theoretical concepts allow for use-novel predictions, how the approach fits with a recent proposal to solve the problem of underdetermination, and how it provides new possibilities for confirmation. Finally, we briefly discuss how results from the causal discovery literature could be used to approach creative abduction from an epistemic perspective.

Use-novel predictions

Let us illustrate how creative abduction in a Bayes net model allows for generating use-novel predictions by means of the magnet example introduced in Section 2. Our line of reasoning here is in accordance with Schurz (2008). Although regarding use-novel facts our framework does not add anything to his argumentation, we think that it is good to see that the Bayesian approach can provide use-novel predictions as well. Assume that an empirical correlation between the two dispositions of attracting iron (D1) and producing electricity when being moved along a wire (D2) had been established by experimenting with lodestone. It is inferred by abductive inference that this correlation is brought about by the higher-order disposition of generating an electromagnetic field (\(\mathcal {D}\)). In our approach, this means that one subscribes to a dispositional pattern captured by a Bayes net model with the structure \(D_{1}\longleftarrow \mathcal {D}\longrightarrow D_{2}\). Now assume that one finds an object that is not a lodestone, but attracts iron anyway (D1). It follows from our model together with conditions 1. and 2. that this increases the probability that this object’s having disposition \(\mathcal {D}\) brought about its having disposition D1. Hence, the probability for \(\mathcal {D}\) is increased as well. But since \(\mathcal {D}\) also increases the probability of this object’s having the disposition to produce electricity by being moved along a wire, also the probability of D2 is increased. Thus, observing that the object has disposition D1 predicts that P(D2|D1) > P(D2) applies to it as well. Note that this prediction is use-novel since only lodestone was used in building the theoretical model.

Confirmation

Given that two dispositions D1 and D2 are empirically correlated, it seems to be commonly accepted that one can use evidence for one of these dispositions to confirm the presence of the other disposition. If, for example, one finds that an object attracts iron (E1), then one tends to accept this as evidence that it has the disposition of producing electricity when being moved along a wire (D2) as well. So E1 can be understood as a test for whether an object has disposition D2. This can be justified with the help of our model as follows: Once the model’s structure \(E_{1}\longleftarrow D_{1}\longleftarrow \mathcal {D}\longrightarrow D_{2}\) has been established via creative abduction, it follows with condition 3. that observing E1 increases the probability for the presence of D1 which, in turn, by conditions 1. and 2. increases the probability of the presence of \(\mathcal {D}\). Since \(\mathcal {D}\) is a positive factor for bringing about D2 as well, also the probability for D2’s presence will be increased. Thus, P(D2|E1) > P(D2) applies to our object and, according to Bayesian confirmation theory, E1 confirms D2. Below we will see that a qualitative model of such confirmation, which might be considered to be a straightforward application of the theory of creative abduction based on the common cause principle (CCP), has several problems. In this sense, expanding the account by switching to the Bayes net framework seems to allow for increased applicability.

The problem of underdetermination

This problem arises due to the fact that two different theories or hypotheses H1 and H2 can often account for some evidence E equally well. So, just considering E, it is underdetermined which hypothesis one should choose. One approach to this problem consists in employing indirect evidence E (Laudan and Leplin 1991, p. 464): Assume that H2, but not H1 is derivable from a more general theory \(\mathcal {H}\), which also entails another hypothesis H3. Assume further that E is direct evidence for H3. Now Laudan and Leplin propose that E can be employed not only for confirming H3 and \(\mathcal {H}\), but also for confirming H2. Their argument for cashing out E in order to confirm H2 can be stated as follows (cf. Okasha 1997, pp. 252f):

  (i) \(\mathcal {H}\) entails H2 and H3 (but not H1). Furthermore, E confirms H3.

  (ii) Hence: E also confirms \(\mathcal {H}\). (with i)

  (iii) Hence: E also confirms H2. (with i and ii)

However, Okasha (1997) has noted that Laudan and Leplin’s (1991) solution falls victim to problems that arise due to qualitative assumptions about confirmation. The underlying principle which grants the inference from i to ii is the so-called converse consequence condition (CCC):

(CCC): If A entails B and C confirms B, then C also confirms A.

And the underlying principle which grants the inference of iii is the so-called special consequence condition (SCC):

(SCC): If A entails B and C confirms A, then C also confirms B.

Both (CCC) and (SCC) were already discussed by Hempel (1965), who wrote:

“Special Consequence Condition: If an observation report confirms a hypothesis H, then it also confirms every consequence of H. [… The other condition is] the condition that whatever confirms a given hypothesis also confirms every stronger one. [… This principle might be called] ‘converse consequence condition’.” (Hempel 1965, pp. 31f)

Hempel (1965) also demonstrated that these two principles taken together trivialize the notion of qualitative confirmation because they imply that every statement confirms every other statement. The reason for this is simple:

  1) Trivially, A entails A.

  2) Hence, by (SCC): A confirms A.

  3) Trivially also A ∧ B entails A.

  4) Hence, by (CCC): A confirms A ∧ B.

  5) But then, again by (SCC): A confirms B.

Clearly, this problem does not show up for the (comparative and) quantitative notion of confirmation. If we take, for example, the positive relevance notion of confirmation, then for some A,B,C it is well possible that Pr(A|C) ≤ Pr(A) (C is not positively relevant for A) though Pr(A|B) > Pr(A) (B is positively relevant for A) and Pr(B|C) > Pr(B) (C is positively relevant for B). The question then arises how Laudan and Leplin’s (1991) proposal can be carried out with the help of a quantitative notion of confirmation. This is where our probabilistic Bayesian approach to model creative abduction comes into play. We can model Laudan and Leplin’s proposal in a quantitative (probabilistic) way by the Bayes net depicted in Fig. 3. In this model it follows that E confirms H2, but not H1: As in the paragraph about confirmation, E confirms H2 simply because P(H2|E) > P(H2) holds due to conditions 1., 2., and 3.: The mentioned theorem of Dardashti et al. (2017) shows that given these conditions probabilistic flow between E and H2 is guaranteed, and more generally that positive relevance is transmitted via such paths. Furthermore, E does not confirm H1 because P(H1|E) = P(H1) holds. This is a direct consequence of the Markov factorization (Eq. 6). In this way our approach can be used to justify a quantitative (probabilistic) reading of Laudan and Leplin’s solution to the problem of underdetermination. The quantitative model allows for avoiding problems a qualitative model of successful creative abduction might have when applied to the problem of underdetermination as outlined here.
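That positive relevance is not transitive in general can be seen from a minimal toy example. The following Python sketch assumes six equiprobable outcomes and three events A, B, and C chosen purely for illustration; none of these specifics come from the cited literature.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))  # six equiprobable outcomes (a toy assumption)
A = frozenset({1, 2})
B = frozenset({2, 3})
C = frozenset({3, 4})


def pr(event, given=OMEGA):
    """Probability of `event` conditional on `given` under the uniform measure."""
    return Fraction(len(event & given), len(given))


print(pr(A, B) > pr(A))   # B is positively relevant for A
print(pr(B, C) > pr(B))   # C is positively relevant for B
print(pr(A, C) <= pr(A))  # yet C is not positively relevant for A
```

Here C raises the probability of B and B raises the probability of A, but C excludes A altogether, so confirmation in the positive relevance sense does not simply chain along such links without further structural assumptions.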

Fig. 3 Bayes net modeling Laudan and Leplin’s (1991) solution to the problem of underdetermination
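The quantitative reading can be illustrated with a small numerical model. Since the figure itself is not reproduced in the text, the following Python sketch assumes a structure suggested by the description above: \(\mathcal {H}\) is a common parent of H2 and H3, E is a child of H3, and H1 has no edges to the other variables. All probability values and helper names are hypothetical; the value 1.0 for P(H2|\(\mathcal {H}\)) and P(H3|\(\mathcal {H}\)) encodes the assumption that \(\mathcal {H}\) entails both hypotheses.

```python
from itertools import product

# Assumed structure: H -> H2, H -> H3, H3 -> E, and H1 without any edges.
# All numbers are hypothetical.
P_H = 0.4
P_H1 = 0.5
P_H2 = {True: 1.0, False: 0.3}  # P(H2 | H) and P(H2 | not-H)
P_H3 = {True: 1.0, False: 0.2}  # P(H3 | H) and P(H3 | not-H)
P_E = {True: 0.9, False: 0.1}   # P(E | H3) and P(E | not-H3)

NAMES = ("H", "H1", "H2", "H3", "E")


def bern(p, value):
    """Probability that a binary variable with success chance p takes `value`."""
    return p if value else 1.0 - p


def joint(h, h1, h2, h3, e):
    """Markov factorization (Eq. 6) for the assumed graph."""
    return (bern(P_H, h) * bern(P_H1, h1)
            * bern(P_H2[h], h2) * bern(P_H3[h], h3)
            * bern(P_E[h3], e))


def prob(**fixed):
    """Sum the joint over all worlds that agree with the fixed values."""
    total = 0.0
    for values in product([False, True], repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        if all(world[name] == v for name, v in fixed.items()):
            total += joint(*values)
    return total


def cond(target, given):
    """Conditional probability P(target | given); both are dicts name -> value."""
    return prob(**target, **given) / prob(**given)


# E raises the probability of H2 ...
print(cond({"H2": True}, {"E": True}) > prob(H2=True))
# ... but leaves the probability of H1 unchanged (up to rounding).
print(abs(cond({"H1": True}, {"E": True}) - prob(H1=True)) < 1e-12)
```

In this toy model E is positively relevant for H2 because both are connected via \(\mathcal {H}\), while H1 is independent of E because the factorization contains no path between them.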

The epistemic challenge: search

In this paper we aimed at modeling creative abduction in the Bayes net framework. To this end we assumed that creative abduction had already been successfully applied. We did not provide an answer to the epistemic question of how and under which conditions creative abduction can be successfully applied in practice. So the epistemic challenge consists in developing reliable methods to abduce unifying dispositions on the basis of empirical data. As Glymour (2018) points out, this problem is tackled in the literature on search of latent variables (see, e.g., Silva et al. 2006; Kummerfeld and Ramsey 2016). Such procedures would, however, require continuous data rather than binary variables as we used them in this paper. So variables should rather represent the strengths of dispositions than simply the presence of such dispositions to get these approaches to work. How exactly such approaches to latent variable search fit with the classical literature on abduction within philosophy of science has to be investigated in future research.

5 Conclusion

This paper was about modeling successful cases of creative abduction on the basis of empirically correlated dispositions within a Bayes net framework. After introducing Schurz’ (2008) strict approach in Section 2, we developed a Bayes net representation of instances of successful creative abduction in the sense of Schurz in Section 3. This move allows for a more fine-grained investigation of the unificatory power gained by creative abduction. It also allows for identifying the relevant necessary conditions for successful cases of creative abduction. Note that our approach to creative abduction can, in a very limited way, be used for purposes of selective abduction as well. It suggests penalizing all dispositions of a given set of candidates that do not meet the necessary conditions for successful creative abduction, i.e., all those \(\mathcal {D}\)s that (i) are not positively correlated with one of the lower-level dispositions D1,...,Dn (or one of the pieces of evidence E1,...,En) to be explained or (ii) do not screen off all non-intersecting sets of lower-level dispositions (or pieces of evidence) from each other. If (i) were the case, then \(\mathcal {D}\) would not explain every lower-level disposition (or piece of evidence), and if (ii) were the case, the Markov condition would be violated and \(\mathcal {D}\) would not fully explain some correlations among lower-level dispositions (or pieces of evidence). In both cases, there might be a better dispositional explanation available. The approach does, however, not come with a criterion for how to select the best disposition(s) \(\mathcal {D}\) of a set of rivals all satisfying these necessary conditions. For this purpose, one could use one of the approaches to selective abduction already on the market (see, e.g., Lipton 2004; Niiniluoto 1999; Williamson 2016).

In Section 4 we then discussed several possible applications of modeling creative abduction Bayesian style. In particular, we spelled out how creative abductive inferences can generate use-novel predictions in our setting. We also presented a new possibility to apply Bayesian confirmation theory: Once a higher-level connection between lower-level dispositions has been established via creative abduction, one can confirm the presence of one of these lower-level dispositions by finding evidence for one of the other lower-level dispositions. Another result was that a quantitative (probabilistic) reading of Laudan and Leplin’s (1991) proposed solution to the problem of underdetermination can be supported once one is able to unify one of the competing hypotheses with an additional hypothesis via creative abduction.

This paper was about modeling successful instances of creative abduction and about which interesting conclusions one can draw from a Bayes net representation. An issue that has not been tackled is the epistemic question of how exactly theoretical concepts should be abduced on the basis of empirical data. If dispositions can be adequately represented by continuous variables, then this seems to open the door for a fruitful application of much more sophisticated search procedures from the literature on causal discovery.