1 Introduction: conditionals and ‘because’

Many philosophers have felt that there is a close connection between the connective ‘if’ on the one hand and the connectives ‘since’ and ‘because’ on the other. Frege (1892, p. 48), Ramsey (1931, p. 317), Goodman (1947, p. 114), Ryle (1950, pp. 339–340), von Kutschera (1974, pp. 265–268), Pizzi (1980, pp. 75–77), McCall (1983, p. 315) and Blau (2008, pp. 164–171) have all voiced opinions about this idea. For example, it has been argued that ‘because A, C’ implies the conditional ‘if A, then C’, and also the might counterfactual ‘if \(\lnot A\), it might have been the case that \(\lnot C\)’, or even the stronger ‘if \(\lnot A\), it would have been the case that \(\lnot C\)’. There is even a terminological encoding of the similarity of ‘if’ and ‘since’. Goodman called ‘since’ sentences factual conditionals, because their assertion presupposes that the antecedent is (believed to be) a fact. This label makes equally good sense for ‘because’. In this article, we will in fact assume that ‘since’ and ‘because’ are synonymous, and we will mostly talk about ‘because’. We take it that ‘because’ sentences do not necessarily express causal relations, but may express reason relations or explanatory relations in a broader sense.

Given the above-mentioned long tradition of philosophical notes on the relation between ‘because’ and ‘if’, it is a striking and surprising fact that although philosophy, logic, linguistics and psychology have produced a vast literature on the logic of ‘if’ sentences (i.e., on conditional logic), these investigations have not been used to gain more insight into ‘because’. Indeed, very little work has been done on the logic of sentences featuring ‘because’.

In this paper, we want to make up for this discrepancy by proposing a logic for ‘because’. We will not, however, use any of the bridges suggested by the philosophers mentioned above. We will follow our own route linking ‘because’ to existing work in conditional logic. We argue that there is a missing link between the suppositional conditionals that are typically the subject of conditional logics and ‘because’ sentences. This missing link is provided by difference-making conditionals. On the one hand, difference-making conditionals are like suppositional conditionals in that they allow for various truth or acceptance values of the antecedent (which is not true for ‘because’). In fact, we think that many conditionals uttered in ordinary discourse are intended as difference-making conditionals, and we will give some examples to support this claim. On the other hand, difference-making conditionals are like ‘because’ sentences in that they highlight that their antecedent is relevant for their consequent (which is not true for ‘if’ in its suppositional interpretation). That is, supposing the antecedent to be true has the effect of ‘raising’ the truth value or acceptance value of the consequent. This is consonant with the idea that ‘because’ sentences express explanations, that their antecedents name reasons for their consequents. In our approach, difference-making conditionals are stronger than suppositional conditionals, and ‘because’ sentences are in turn stronger than difference-making conditionals.

The plan of the paper is as follows. We first recapitulate some of the most popular logics for suppositional conditionals. We base our subsequent considerations on a minimal core logic and show how it can be extended to some of the most well-known logics for suppositional conditionals. Second, we argue that most of the usual principles for suppositional conditionals fail for ‘because’. Then we introduce several semantics known from suppositional conditionals, which we reinterpret for difference-making conditionals and for ‘because’. In a fourth step, we present our minimal logic for ‘because’ sentences and show how it can be extended in ways that parallel the extensions of the minimal logic of suppositional conditionals. We prove these logics to be sound with respect to the semantics. We then compare our account with related work. In this first and major part of the paper, we argue on the one hand from a number of natural language examples that certain principles should fail for reasoning with ‘because’. On the other hand, we argue from a semantic interpretation of ‘because’ that it should follow certain principles. At the end of our paper, we confront our analysis with a number of examples known to be problematic for models of causal reasoning. Some of them turn out to be problematic for our analysis of ‘because’ sentences, too, while others don’t.

2 The logic of suppositional conditionals

We will be using several object languages in this paper. All of them feature the usual truth-functional propositional operators \(\lnot \), \(\wedge \), \(\vee \), and \(\equiv \). We define the logical constants \(\top \) (verum) and \(\bot \) (falsum) in the obvious way: \(\top \) designates an arbitrary classical propositional tautology, for example \(p \vee \lnot p\), and \(\bot \) designates \(\lnot \top \). We use \(\vdash \) to denote classical consequence. We say that a sentence is factual if it is only composed of the above vocabulary, i.e., factual sentences are the sentences of classical propositional logic.

Our object languages also feature conditional connectives of three different kinds: the suppositional conditional >, the difference-making conditional \(\gg \) (both read ‘if ..., [then] ...’) and the ‘because’ connective . \(A \not > B\) is short for the negation of the statement \(A > B\), and similarly for the other conditionals. The semantics for these conditionals will be presented in Sect. 4. In this section, we are concerned with the most important inference principles known from suppositional conditional reasoning. They should be read as follows: If the antecedent conditionals hold, the consequent conditionals hold as well.Footnote 1

The most prominent principles of suppositional conditionals > are:

  • If \(\vdash A \equiv B\), then: if \(A>C\) then \(B>C\). LLE

  • If \(\vdash B \supset C\), then: if \(A>B\) then \(A>C\). RW

  • \(A>A\). ID

  • If \(A>B\) and \(A>C\), then \(A>B\wedge C\). AND

  • If \(A>C\) and \(A>B\), then \(A\wedge B>C\). CM

  • If \(A\wedge B>C\) and \(A>B\), then \(A>C\). CUT

  • If \(A>C\) and \(B>C\), then \(A\vee B>C\). OR

  • If \(A\vee B>C\), then \(A>C\) or \(B>C\). DR

  • If \(A>C\) and \(A\not >\lnot B\), then \(A\wedge B>C\). RM

LLE is Left Logical Equivalence, RW is Right Weakening, ID is Identity (also known as Reflexivity), AND is also known as Conjunction in the Consequent, CM is Cautious Monotonicity, CUT is a kind of reverse version of CM, OR is also known as Disjunction in the Antecedent, DR is Disjunctive Rationality and RM is Rational Monotonicity.

The four principles LLE–AND determine the System B for basic reasoning.Footnote 2 The six principles LLE–CUT determine the System C for cumulative reasoning, the seven principles LLE–OR determine the System P for preferential reasoning.Footnote 3 System P has often been considered to be the conservative core of reasoning with conditionals.Footnote 4 The eight principles LLE–DR determine the System D for disjunctively rational reasoning, the nine principles LLE–RM determine the System R for rational reasoning.Footnote 5

It should be noted that in the presence of Right Logical Equivalence

  • If \(\vdash B \equiv C\), then: if \(A > B\) then \(A > C\). RLE

RW is equivalent to either of the following two conditions:

  • If \(A > B\), then \(A > B \vee C\). DW

  • If \(A > B \wedge C\), then \(A > C\). CW

DW stands for Disjunctive Weakening, CW for Conjunctive Weakening. In fact, we can always replace RW by RLE + DW (Raidl, 2021b, p. 88) or by RLE + CW.
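
For illustration, here is the derivation of RW from RLE + CW (the converse directions are immediate, since RLE, DW and CW are all special cases of RW): suppose \(\vdash B \supset C\) and \(A > B\). Then \(\vdash B \equiv (B \wedge C)\), so RLE yields \(A > B \wedge C\), and CW gives \(A > C\), as RW demands.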

We define the inner necessity \({{\,\mathrm{\boxdot }\,}}A\) and the outer necessity \({{\,\mathrm{\square }\,}}A\) in the following wayFootnote 6:

  • \({{\,\mathrm{\boxdot }\,}}A \,= \ (\top > A)\).

  • \({{\,\mathrm{\square }\,}}A \,= \ (\lnot A > \bot )\).

The duals of these two operators are the inner and the outer possibility of A, respectively. In a possible worlds account (see Sect. 4 below), the inner necessity can be interpreted as expressing what is true in the closest possible world(s) and the outer necessity can be thought of as a metaphysical necessity. In a belief revision account (see Sect. 4), the inner necessity can be interpreted as a belief operator (provided we assume that revising by the tautology does not change the agent’s beliefs), and the outer necessity is a doxastic necessity.

The weakest logic that we will consider here, let us call it \({{\,\mathrm{\textbf{B}}\,}}^+\), is \({{\,\mathrm{\textbf{B}}\,}}\) augmented by the following principles:

  • \(\lnot {{\,\mathrm{\boxdot }\,}}\bot \). CONS

  • If \(A>C\), then . INC

  • If \({{\,\mathrm{\boxdot }\,}}A\) and \({{\,\mathrm{\boxdot }\,}}C\), then \(A > C\). CPRES

CONS is a Consistency condition, INC is similar to the belief revision principle of Inclusion and CPRES to the belief revision principle of Cautious Preservation.Footnote 7 The last is not to be confused with the (in)famous principle of Preservation

  • If , then \(A > C\). PRES

We do not necessarily assume PRES. And we do not require the following principle of Strong Consistency, which strengthens CONS:

  • If \(A > \bot \), then \(A\vdash \bot \). SCONS

In most systems \({{\,\mathrm{\square }\,}}A\) implies \({{\,\mathrm{\boxdot }\,}}A\). It suffices to apply RW and INC. It is easy to verify that given ID, LLE and RW, we can derive INC, CPRES, PRES from OR, CM, RM, respectively. Furthermore, PRES implies CPRES in the presence of CONS. CONS says that the inner modality (and the outer modality) is consistent. In terms of beliefs, this means that the prior beliefs (and the doxastic necessities) are consistent. In terms of closest worlds, it means that there are always closest possible worlds (and thus always accessible worlds).
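
To spell out the first of these observations, on the assumption that INC takes the usual Inclusion form ‘if \(A > C\), then \({{\,\mathrm{\boxdot }\,}}(A \supset C)\)’: suppose \({{\,\mathrm{\square }\,}}A\), i.e. \(\lnot A > \bot \). Since \(\vdash \bot \supset A\), RW yields \(\lnot A > A\). INC then gives \({{\,\mathrm{\boxdot }\,}}(\lnot A \supset A)\), i.e. \(\top > (\lnot A \supset A)\), and since \(\vdash (\lnot A \supset A) \supset A\), a second application of RW yields \(\top > A\), i.e. \({{\,\mathrm{\boxdot }\,}}A\).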

We call \(\textbf{B}'\) the system obtained from \(\textbf{B}\) by adding INC and CPRES, and \(\textbf{B}_{\tiny \text{ AGM }}\) the system obtained from \(\textbf{B}\) by adding INC and PRES. We call \(\textbf{BN}\), \(\textbf{PN}\), \(\textbf{DN}\) and \(\textbf{RN}\), the strengthenings of the systems \(\textbf{B}\), \(\textbf{P}\), \(\textbf{D}\) and \(\textbf{R}\) by the principle CONS (Fig. 1).Footnote 8 Thus our system \({{\,\mathrm{\textbf{B}}\,}}^+\) can equivalently be written as \({{\,\mathrm{\textbf{BN}}\,}}+\ \textrm{INC} +\ \textrm{CPRES}\), or as \({{\,\mathrm{\textbf{B}}\,}}'\textbf{N}\). \({{\,\mathrm{\textbf{RN}}\,}}\) can be seen as the non-nested fragment of the Lewisean System \(\textsf{VN} = \textsf{V} + \text{ CONS }\), analysed in Raidl (2019). A hierarchy including \(\textbf{P}\), \({{\,\mathrm{\textbf{D}}\,}}\), \({{\,\mathrm{\textbf{R}}\,}}\), \({{\,\mathrm{\textbf{PN}}\,}}\), \({{\,\mathrm{\textbf{DN}}\,}}\), \({{\,\mathrm{\textbf{RN}}\,}}\), or rather their extensions with unrestricted embeddings and nestings of conditionals, is analysed in Raidl (2021a, ch. 7).

Fig. 1

Hierarchy of systems: On the left: the system B for basic reasoning, C for cumulative reasoning, \(\textbf{B}_{\tiny \text{ AGM }}\) for basic reasoning according to the AGM theory, P for preferential reasoning, D for disjunctive rational reasoning, R for rational reasoning, and \(\textbf{B}'\) the system obtained from B by adding Inclusion (INC) and Cautious Preservation (CPRES). On the right: the strengthenings of these systems by Consistency (CONS) are denoted by adding the letter N. The basic system considered here is B\(^+\) = B\('\)N

3 Almost all traditional principles for suppositional conditionals fail for ‘because’

It will turn out that only two of the principles for suppositional conditionals remain valid in our modeling of ‘because’, namely LLE and AND. We discuss here a few examples that illustrate how some of the principles can come to fail.Footnote 9 In Sect. 4.3 below we will formalize these examples.

Against Right Weakening. It makes perfect sense to say ‘Because you pay an extra fee (p), your letter will be delivered (q) by express (r)’, since the extra fee will buy you a special service.Footnote 10 But it sounds odd to say ‘Because you pay an extra fee, your letter will be delivered’, since the letter would be delivered anyway, even if you did not pay the extra fee.

Rott (2022a) has argued that it is the hallmark of difference-making conditionals that they do not satisfy Right Weakening. Just as it is the most striking feature of the conditionals modeled by conditional logics in the wake of Stalnaker and Lewis that they don’t validate ‘Left Strengthening’ (also known as Strengthening the Antecedent), it is the most striking feature of difference-making conditionals that they invalidate Right Weakening (or Weakening the Consequent).

Against Cautious Monotony. A research project with two postdoc positions is about to start. I believe that Pam and Quinn will work on the project (p and q), and that the project will be successful (r). I know that Pam is an excellent and dedicated researcher, and if she is missing, the project might fail. On the other hand, I know that Quinn is neither the greatest researcher nor terribly interested in the topic of the project. But Quinn likes Pam a lot, and if Pam is not in, it is not certain that Quinn will be in. So I think ‘Because Pam works on the project, Quinn will work on it, too’, and I also think ‘Because Pam works on the project, the project will be successful’. It sounds strange, however, to say ‘Because Pam and Quinn work on the project, the project will be successful’, since should one of them not be in the project, it will most likely be Quinn who is missing—remember he is not keen on the topic—and the project will be a success anyway because of Pam’s work.

Against Cut. Another research project with two postdoc positions is about to start. There have been many highly qualified applicants. I believe that Peter and Quiana will work on this project (p and q), and that the project will be successful (r). I know that Peter is not the greatest researcher but an exceptionally nice person, and that Quiana is brilliant but the topic of the project is not her favourite one. However, Quiana likes Peter a lot, and if Peter is not in, it is very unlikely that Quiana will be in. Peter and Quiana form a very good team, but if one of them is missing, this will be Quiana and the project is likely to fail (it is Quiana’s contribution that is crucial for the success of the project). So I think ‘Because Peter works on the project, Quiana will work on it, too’, and I also think ‘Because Peter and Quiana work on the project, the project will be successful’. It sounds strange, however, to say ‘Because Peter works on the project, the project will be successful’, since should Peter not be in the project, it will be successful anyway, as there are many competent applicants for this project.

Against OR. Pam and Quinn live in a village with two pubs. They both prefer the Irish pub to the Spanish pub, but they don’t avoid the latter altogether. I know that they will go out tonight and they want to meet in a pub, but it is not quite clear in which pub. I believe that Pam will go to the Irish pub (p), that Quinn will go to the Irish pub (q), and that they will meet each other (r). It makes sense to say ‘Because Pam goes to the Irish pub, they will meet’, since if Pam does not go to the Irish pub (and goes to the Spanish pub instead), they will most likely miss each other. Similarly for ‘Because Quinn goes to the Irish pub, they will meet’. But it sounds odd to say ‘Because Pam goes to the Irish pub or Quinn goes to the Irish pub, they will meet’, since if neither of them goes to the Irish pub, they will meet each other anyway in the Spanish pub.Footnote 11

4 Semantics

In the debate about conditionals there is a controversy about whether conditionals have truth values or only acceptance values. We want to provide a semantics that is flexible enough to accommodate both positions. The most general models of this kind are multi-state models in which the states may either be doxastic states or worldly states. Each state is represented by a set of possible worlds and some choice function over this set.Footnote 12 More traditional possible-worlds models of the Stalnaker-Lewis kind are special cases of multi-state models: they can be identified with models having a multitude of states each of which is associated with a single world which is the “most plausible” world for that state. This then is the distinguished world of the state, and the “plausibility” of the other worlds is reinterpreted as their comparative closeness or similarity to the distinguished world. Such models can be understood as non-doxastic ones, and they provide truth conditions for conditionals at world-state pairs. In the following we will consider truth conditions and acceptance conditions in parallel.

In this paper, we do not want to commit ourselves to the view that sentences with conditionals or ‘because’ as the main connective express propositions. So we consider only flat (non-embedded) conditionals with a factual antecedent A and a factual consequent C. We refrain from insisting that it makes sense to embed conditionals in complex sentences.Footnote 13 For this reason, we can work with very simple single-state models, i.e., models containing only a single state.

4.1 Models, satisfaction, validity

A model is a triple \(M = \langle W, \sigma , v\rangle \), where W is a set of possible worlds, v is a valuation function over worlds and propositional variables extended to all factual sentences in the classical Boolean way,Footnote 14 and \(\sigma \) is a choice function over W which assigns to each subset V of W a (possibly empty) subset of V. Thus the following property is satisfied:

  • (id) \(\sigma (V) \subseteq V\).

Since we assume (id), we will also call these models (id)-models.

Our guiding idea is to think of \(\sigma \) as a plausibility function representing our dispositions to restrict attention to certain possible worlds, given certain suppositions. Intuitively, for a subset V of W, \(\sigma \) selects the most plausible elements of V; these are then given by \(\sigma (V)\). An agent whose dispositions are represented by \(\sigma \) and who supposes that V is true restricts her attention to the worlds in \(\sigma (V)\) for her further reasoning. This allows for both a doxastic interpretation and a metaphysical interpretation of M and \(\sigma \). In the doxastic interpretation, M represents an agent’s doxastic state, and \(\sigma (V)\) can be understood as the strongest believed proposition after (hypothetically) revising her beliefs by V. \(\sigma (W)\) can be identified with the strongest proposition currently believed to be true by the agent. In the metaphysical interpretation, M represents a possible-worlds scenario, and \(\sigma (V)\) can be understood as the set of V-worlds that are closest to the actual world. \(\sigma (W)\) then is the set containing only the world that is closest to the actual world, viz., the actual world itself.

If A is a factual sentence, we write for the A-worlds \(\{w\in W: v(w,A)=1\}\) and drop the subscript if there is no danger of confusion. Factual sentences A are satisfied in worlds as usual: \(w {{\,\mathrm{\vDash }\,}}A\) iff . For two factual sentences A and C, the satisfaction of a suppositional conditional \(A>C\) in a model \(M = \langle W, \sigma , v\rangle \) is defined by

  • \(M \;{{\,\mathrm{\vDash }\,}}\, A > C\) iff .

We also allow negated conditionals and define \(M \,{{\,\mathrm{\vDash }\,}}A \not > C\) as \(M \,{{\,\mathrm{\nvDash }\,}}\, (A > C)\). Let \(X_1, \dots , X_n\) be a (possibly empty) sequence of conditionals or negated conditionals, and let Y be a conditional or a negated conditional or a disjunction ‘\(Y_1\) or \(Y_2\)’ of conditionals. The inference from \(X_1, \dots , X_n\) to Y is valid in a model iff whenever \(\sigma \) satisfies \(X_1, \dots , X_n\), it also satisfies Y (or respectively, it also satisfies \(Y_1\) or satisfies \(Y_2\)).Footnote 15 The inference is valid in a class of models iff it is valid in all models of that class.
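
To make these definitions concrete, here is a minimal sketch of an (id)-model in Python (ours, not part of the paper’s formal apparatus). Worlds are represented as frozensets of the atoms true at them, so the valuation is implicit, and factual sentences are given as predicates on worlds; the names Model, ext and sup, and the toy ranking, are our own choices.

```python
# A minimal sketch of an (id)-model M = (W, sigma, v). All names are illustrative.
from itertools import chain, combinations

def powerset(atoms):
    """All worlds over the given atoms, as frozensets of the atoms true at them."""
    return [frozenset(s) for s in
            chain.from_iterable(combinations(atoms, k) for k in range(len(atoms) + 1))]

class Model:
    def __init__(self, worlds, sigma):
        self.W = set(worlds)
        self.sigma = sigma                      # choice function; must satisfy (id)

    def ext(self, A):
        """Extension of a factual sentence A, given as a predicate on worlds."""
        return {w for w in self.W if A(w)}

    def sup(self, A, C):
        """Suppositional conditional A > C: sigma of [A] is included in [C]."""
        return self.sigma(self.ext(A)) <= self.ext(C)

# Toy example: sigma picks the most plausible worlds of V under a ranking.
W = powerset(['p', 'q', 'r'])
rank = {w: 3 - len(w) for w in W}               # more true atoms = more plausible (arbitrary)
sigma = lambda V: {w for w in V if rank[w] == min(rank[u] for u in V)} if V else set()

M = Model(W, sigma)
p, q = (lambda w: 'p' in w), (lambda w: 'q' in w)
print(M.sup(p, q))                       # True: the most plausible p-world makes q true
print(M.sup(p, lambda w: not q(w)))      # False
```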

Our basic semantics, which we call 0-semantics, has the following additional properties:

(cons): \(\sigma (W) \ne \emptyset \).

(inc): If \(\sigma (V) \subseteq U\), then .

(cpres): If \(\sigma (W) \subseteq V \cap U\), then \(\sigma (V) \subseteq U\).

We will shortly see that the properties (id), (cons), (inc) and (cpres) correspond to the principles ID, CONS, INC and CPRES, respectively. We will not generally impose the stronger principles

(scons): If \(\sigma (V) = \emptyset \), then \(V = \emptyset \).

(pres): If \(\sigma (W) \subseteq U\) and , then \(\sigma (V) \subseteq U\).

We can now formulate properties corresponding to the principles OR, CM, DR and RM, and just as we extended our minimal logic \({{\,\mathrm{\textbf{B}}\,}}^+\) by the additional principles, we can strengthen our semantics by the corresponding properties:

(or): \(\sigma (U \cup V) \subseteq \sigma (U) \cup \sigma (V)\).

(cm): If \(\sigma (U) \subseteq V\), then \(\sigma (U \cap V) \subseteq \sigma (U)\).

(dr): If \(\sigma (U \cup V) \subseteq T\), then \(\sigma (U) \subseteq T\) or \(\sigma (V) \subseteq T\).

(rm): If , then \(\sigma (U \cap V) \subseteq \sigma (U)\).

The (id)-semantics with (cons), (or), (cm) and (rm) was called consistent Lewisean semantics by Raidl (2019). The weaker (id)-semantics with just (or) and (cm), or the additional (dr) were analysed in Raidl (2021a, ch. 7).
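
The correspondence between these semantic properties and the logical principles can also be explored mechanically. The following sketch (ours; restricted to the properties whose full statements appear above, so (inc), (pres) and (rm) are left out) brute-forces the checks over all subsets of a small W:

```python
# Brute-force checks of some choice-function properties over a finite W.
# The helper names are ours; sigma is any function from subsets of W to subsets of W.
from itertools import chain, combinations

def subsets(W):
    W = list(W)
    return [set(s) for s in chain.from_iterable(combinations(W, k) for k in range(len(W) + 1))]

def check_properties(W, sigma):
    Wset, subs = set(W), subsets(W)
    return {
        'id':    all(sigma(V) <= V for V in subs),
        'cons':  bool(sigma(Wset)),
        'cpres': all(sigma(V) <= U
                     for V in subs for U in subs if sigma(Wset) <= (V & U)),
        'or':    all(sigma(U | V) <= sigma(U) | sigma(V)
                     for U in subs for V in subs),
        'cm':    all(sigma(U & V) <= sigma(U)
                     for U in subs for V in subs if sigma(U) <= V),
        'dr':    all(sigma(U) <= T or sigma(V) <= T
                     for U in subs for V in subs for T in subs if sigma(U | V) <= T),
    }

# A ranking-based sigma over three worlds satisfies all of these properties.
W = {'w1', 'w2', 'w3'}
rank = {'w1': 0, 'w2': 1, 'w3': 1}
sigma = lambda V: {w for w in V if rank[w] == min(rank[u] for u in V)} if V else set()
print(check_properties(W, sigma))
```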

Lemma 1

Every (id)-model validates the principles LLE, RW, AND and ID.

Theorem 1

If a model has the property (cons), (inc), (cpres), (pres), (cm), (or), (dr) or (rm), respectively, it validates the corresponding principle CONS, INC, CPRES, PRES, CM, OR, DR or RM.

We can thus conclude that our 0-semantics validates \({{\,\mathrm{\textbf{B}}\,}}^+\), and that if it has a collection of properties from the above list then it validates the corresponding collection of principles.Footnote 16

We can properly reproduce the Lewisean account, restricted to the flat language, in our models. Lewis’ assumption that the possible worlds can be arranged in nested plausibility spheres is encoded by (id), (or), (cm) and (rm)—call this a Lewisean model. A centered model is a model where \(\sigma (W)\) is a singleton \(\{w_{\sigma }\}\). The world \(w_{\sigma }\) can be thought of as the ‘center’ of the state \(\sigma \) and thus of the Lewisean spheres. Intuitively (and in accordance with Lewis), for all \(V\subseteq W\), \(\sigma (V)\) is the set of V-worlds that are closest or most similar to the world \(w_\sigma \). The move from models to centered models marks a transition from a doxastic interpretation to a metaphysical interpretation of our models. Conditionals can then be interpreted as having truth values—and not just acceptance values—in possible worlds models.

4.2 Suppositional conditionals, difference-making conditionals and ‘because’

We extend the language \({{\,\mathrm{\mathcal {L}}\,}}_>\) so as to feature not only > but also the connectives \(\gg \) and . As before, we only consider flat conditionals.Footnote 17 Conditionals (with factual antecedent A and factual consequent C) are satisfied in a model \(M = \langle W, \sigma , v\rangle \), and the defining conditions are the followingFootnote 18:

Suppositional conditionals:

  • (RT) \(M \;{{\,\mathrm{\vDash }\,}}\, A > C\) iff .

Difference-making conditionals:

  • (RRT) \(M \;{{\,\mathrm{\vDash }\,}}\, A \gg C\) iff and .

‘Because’:

  • (RTB) iff and and .

A few comments are in order. We have repeated clause (RT) that is reminiscent of the famous Ramsey testFootnote 19 and encodes the suppositional conditional > (or Lewisean conditional). ‘\(M \,{{\,\mathrm{\vDash }\,}}A > C\)’ may generally be read as ‘\(A > C\) holds in M’. According to (RT), the condition for this is that C is true in all worlds from . In the doxastic interpretation, (RT) requires that C is believed after supposing A and minimally altering one’s beliefs in accordance with that supposition. In this interpretation, ‘\(A>C\) holds in M’ means that \(A > C\) is accepted by the agent whose doxastic state is represented by \(\sigma \). In the metaphysical interpretation, on the other hand, (RT) requires that C is true in the A-worlds that are closest or most similar to the world in \(\sigma (W)\). In this interpretation, ‘\(A>C\) holds in M’ means that \(A >C\) is true in the world in \(\sigma (W)\).

In most accounts, (RT) yields that whenever A and C are (believed to be) true in M, then the conditional ‘If A then C’ holds in M—regardless of whether A is in any way (considered to be) relevant for C or whether there is any substantive connection between A and C.Footnote 20 In many contexts, this inference, called conjunctive sufficiency or conjunction conditionalization, is counter-intuitive and should be blocked. Thus (RT) does not take into account a fundamental feature of conditionals as used in natural language: typically, such conditionals do express that the antecedent is relevant to the consequent.Footnote 21

Taking up this idea, (RRT) is reminiscent of the Relevant Ramsey test of Rott (2022a) and encodes the difference-making conditional. Such a conditional \(A\gg B\) holds in a model M just in case (i) B is true at all worlds in , (ii) but not at all worlds in . Roughly speaking, supposing A makes a difference to the metaphysical or doxastic status of the consequent. The idea of incorporating relevance into the interpretation of conditionals was the basis of Rott (1986), who introduced RRT in a belief-revision interpretation.Footnote 22 Difference-making conditionals express that the antecedent is a “sufficient reason” for the consequent in the sense of Spohn (1983), a notion that was suggested to be applied to conditionals in Spohn (2013, 2015). The logic of such conditionals was analysed by Raidl (2021b, 2021c).

The conditions of (RTB) for a ‘because’ sentence to hold are those of (RRT), namely (i) and (ii), extended by the requirement (iii) of the truth or acceptance of the antecedent and (iv) of the consequent. It is a kind of Ramsey test for ‘because’ sentences. We formally represent ‘because’ by the symbol in this paper.

Overall, (RRT) and (RTB) stepwise strengthen the original Ramsey test (RT). For a conditional to hold in \(\sigma \), (RT) imposes (i) that the consequent is true in the most plausible antecedent worlds, (RRT) adds (ii) that the consequent fails to be true in at least one of the most plausible non-antecedent worlds. (RTB) further adds (iii) that the antecedent is (believed to be) true in the state and that (iv) the consequent is (believed to be) true in the state.
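
Following the description of conditions (i)–(iv) just given, the three clauses can be put side by side in a small sketch (ours; worlds, sentences and the ranking-based sigma are as in the earlier sketch, and the function names rt, rrt and rtb are our own):

```python
# A sketch of the acceptance clauses (RT), (RRT) and (RTB), following the
# prose description of conditions (i)-(iv) above. All names are ours.
def ext(W, A):
    return {w for w in W if A(w)}

def rt(W, sigma, A, C):
    """(RT): the most plausible A-worlds are all C-worlds."""
    return sigma(ext(W, A)) <= ext(W, C)

def rrt(W, sigma, A, C):
    """(RRT): A > C holds, but not-A > C does not, i.e. A makes a difference to C."""
    return rt(W, sigma, A, C) and not rt(W, sigma, lambda w: not A(w), C)

def rtb(W, sigma, A, C):
    """(RTB): the (RRT) clauses plus acceptance of antecedent and consequent."""
    believed = sigma(set(W))                   # the strongest accepted proposition
    return (believed <= ext(W, A)              # (iii) the antecedent is accepted
            and believed <= ext(W, C)          # (iv) the consequent is accepted
            and rrt(W, sigma, A, C))           # (i) and (ii)

# Tiny usage: the most plausible world makes p and q true, but the most
# plausible not-p world still makes q true, so p is not relevant for q.
W = {frozenset(), frozenset({'p'}), frozenset({'q'}), frozenset({'p', 'q'})}
rank = {frozenset({'p', 'q'}): 0, frozenset({'p'}): 1, frozenset({'q'}): 1, frozenset(): 2}
sigma = lambda V: {w for w in V if rank[w] == min(rank[u] for u in V)} if V else set()
p, q = (lambda w: 'p' in w), (lambda w: 'q' in w)
print(rt(W, sigma, p, q), rrt(W, sigma, p, q), rtb(W, sigma, p, q))   # True False False
```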

Validity is defined as before. In Sect. 5, we will derive principles of reasoning with ‘because’ from the properties of the choice functions \(\sigma \).

Although widely taken for granted, the semantic postulate of Preservation (pres) and its conditional analogue (PRES) have repeatedly been criticised in belief revision theory and conditional logic alike.Footnote 23 There is no similar discussion of the weaker cousin—Cautious Preservation (CPRES)—nor of Inclusion (INC) for suppositional conditionals. In fact, the semantic conditions (inc) and (cpres) are unavoidable, if we model \(\sigma (V)\) as the minimal V-worlds according to an order relation. They are also unavoidable if we accept the highly entrenched laws of conditional reasoning OR and CM. For this reason, we prefer not to endorse (pres) generally, but do not quarrel with (inc) and (cpres).

In fact, one can show that we need not adopt Preservation at all. In the language (without > and \(\gg \)), it makes no difference whether the models do or do not satisfy (pres).

Lemma 2

For every model M satisfying (id), (cons) and (cpres), and violating (pres), there is a (pres) model \(M'\) with the same worlds and the same valuation which agrees with M on all -sentences.

Corollary 1

Let C be a class of models satisfying (id), (cons) and (cpres), and \(C'\) the restriction of C to those that satisfy (pres). Then C and \(C'\) have the same logic.

Our choice of 0-semantics as the minimal semantics for the modelling of ‘because’ is rooted in our assumption that actual belief is consistent (cons) and our endorsement of (inc) and (cpres), which are weak standard requirements in both doxastic and metaphysical interpretations of conditionals.

4.3 Formalizing the counterexamples

We here briefly formalize the examples from Sect. 3 and illustrate why the principles mentioned there are invalid. For this we use a system of spheres in the style of Grove (1988). In the center of such a system, we find the possible worlds compatible with the agent’s beliefs. These are the most plausible worlds. Should the agent learn, however, that none of these possible worlds is the actual one, she has a first fallback position with the ring of possible worlds around the center. Should she learn that none of these is the actual world either, her next fallback position is the next ring. And so on. Systems of spheres interpreted in this way are essentially equivalent to a total plausibility preordering of W, where w is more plausible than v, in symbols \(w < v\), iff there is a sphere that contains w but doesn’t contain v. It should be noted that, since all possible worlds are comparable in terms of plausibility, this representation makes the conditional > rather strong, and the related hypothetical belief revision is close to the standard AGM belief revision mechanism. But this causes no problems for the argument: if a principle is invalid for a strong semantics, it is invalid for any weaker semantics. Thus, to show that the principles RW, CM, CUT and OR are invalid in our ground semantics, it suffices to show that they are invalid in a stronger semantics for ‘because’.

Against Right Weakening. Recall our example: It is fine to say ‘Because you pay an extra fee (p), your letter will be delivered (q) by express (r)’, formalized as . But it sounds odd to say ‘Because you pay an extra fee, your letter will be delivered’, formalized as . Figure 2 gives a diagram representing this situation.

The figure is to be read as follows. The most plausible worlds are all in the inner circle. Each of them makes p, q and r true. That is, we believe p, q and r. We also believe \(q \wedge r\) given p, since the most plausible p-worlds (in the inner circle) are \(q \wedge r\)-worlds. And we don’t believe \(q \wedge r\) given \(\lnot p\), since one of the most plausible \(\lnot p\)-worlds (the one designated by the red spot in the second circle) is not an r-world. That is, we accept the suppositional conditional ‘if you hadn’t paid the extra fee, your letter might not have been delivered by express’. However, we also accept the suppositional conditional ‘if you hadn’t paid the extra fee, your letter would have been delivered’, since the most plausible \(\lnot p\)-worlds (in the second circle but outside the p-area) are in fact q-worlds. And thus we reject ‘if you hadn’t paid the extra fee, your letter might not have been delivered’. This is why we reject ‘your letter will be delivered, because you pay the extra fee’. We won’t repeat such a detailed analysis for the following illustrations.

Fig. 2

RW is invalid for ‘because’: , but not
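
As a cross-check, the situation of Fig. 2 can be reproduced with a small world ranking; the exact ranking below is our reconstruction of the figure, and the acceptance clause coded here is (RTB) from Sect. 4.2.

```python
# Checking the Right Weakening counterexample. World names such as 'pqr' list
# the atoms true at that world; lower rank = more plausible. The ranking is
# our reconstruction of Fig. 2.
rank = {'pqr': 0, 'qr': 1, 'q': 1, 'pq': 2, 'pr': 2, 'p': 2, 'r': 3, '': 3}
W = set(rank)
sigma = lambda V: {w for w in V if rank[w] == min(rank[u] for u in V)} if V else set()

def because(A, C):
    ext = lambda X: {w for w in W if X(w)}
    bel = sigma(W)
    return (bel <= ext(A) and bel <= ext(C)                      # antecedent and consequent accepted
            and sigma(ext(A)) <= ext(C)                          # most plausible A-worlds are C-worlds
            and not sigma(ext(lambda w: not A(w))) <= ext(C))    # supposing not-A makes a difference

p, q, r = (lambda w: 'p' in w), (lambda w: 'q' in w), (lambda w: 'r' in w)
print(because(p, lambda w: q(w) and r(w)))   # True:  'q and r, because p'
print(because(p, q))                         # False: 'q, because p' is rejected
```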

Against Cautious Monotony. Our example was that we accept ‘Because Pam works on the project (p), Quinn will work on it, too (q)’, formalized as . We also accept ‘Because Pam works on the project (p), the project will be successful (r)’, formalized as . But in the situation described, we would reject ‘Because Pam and Quinn work on the project, the project will be successful’, formalized as . The reason was that we think that Quinn is not competent and is not interested in the project, but wants to work with Pam. Figure 3 depicts this situation. We accept \(\lnot (\lnot p > q)\) and \(\lnot (\lnot p > r)\), but we reject \(\lnot (\lnot (p \wedge q) > r)\), since we accept \(\lnot (p \wedge q) > r\).

Fig. 3

Cautious Monotony is invalid for ‘because’: and , but not

Against Cut. We accept ‘Because Peter works on the project (p), Quiana will work on it, too (q)’, formalized as , and we also accept ‘Because Peter and Quiana work on the project, the project will be successful (r)’, formalized as . But we reject ‘Because Peter works on the project, the project will be successful’, formalized as . The project will be successful due to Quiana, and Quiana will work on the project because Peter works on it, but Peter will make no contribution to the project. Figure 4 presents a diagram illustrating this situation. We accept \(\lnot (\lnot p > q)\) and \(\lnot (\lnot (p \wedge q) > r)\), but we reject \(\lnot (\lnot p > r)\), since we accept \(\lnot p > r\).

Fig. 4

Cut is invalid for ‘because’: and , but not

Against OR. In our example we accepted ‘Because Pam goes to the Irish pub, they will meet’ (), and ‘Because Quinn goes to the Irish pub, they will meet’ (), but we would reject ‘Because Pam goes to the Irish pub or Quinn goes to the Irish pub, they will meet’ (). This situation is depicted in Fig. 5. We accept \(\lnot (\lnot p > r)\) and \(\lnot (\lnot q > r)\), but we reject \(\lnot (\lnot (p \vee q) > r)\), since we accept \((\lnot p \wedge \lnot q) > r\).

Fig. 5

OR is invalid for ‘because’: and , but not

5 Logics for ‘because’

5.1 Translation and backtranslation

Now we will concentrate on two sublanguages. The first language \({{\,\mathrm{\mathcal {L}}\,}}_>\) has just > as a non-classical connective. The other one, , has and \({{\,\mathrm{\boxdot }\,}}\) as non-classical connectives. We correspondingly denote the restricted semantic relations by \({{\,\mathrm{\vDash }\,}}_>\) and . We want to find the logic for , and for this we will use two translations between the languages and our knowledge about the logic for the suppositional conditional >. We need \({{\,\mathrm{\boxdot }\,}}\) in the language for ‘because’ for the following reason:

Lemma 3

\({{\,\mathrm{\boxdot }\,}}\) is not definable in terms of .

The relation between the two languages is given by:

Lemma 4

In all models we have (1), in all 0-models we have (2), and assuming additionally (pres) we have (3) and (4):

  (1) iff \(M {{\,\mathrm{\vDash }\,}}_> \top > A\).

  (2) iff \(M {{\,\mathrm{\vDash }\,}}_> (\top > A) \wedge (\top > C) \wedge (\lnot A \not > C)\).

  (3) iff \(M {{\,\mathrm{\vDash }\,}}_> (\top > C) \wedge (\lnot A \not > C)\).

  (4) \(M {{\,\mathrm{\vDash }\,}}_> A > C\) iff .

The meanings of some particular ‘because’ conditionals in the doxastic interpretation are collected in Table 1.

Table 1 The meanings of some basic sentences using ‘because’ in the doxastic interpretation

5.2 Systems for ‘because’

We are finally in a position to list the central principles of our systems for ‘because’.

  • If \(\vdash A\), then \({{\,\mathrm{\boxdot }\,}}A\). N

  • If \({{\,\mathrm{\boxdot }\,}}(A \supset B)\) and \({{\,\mathrm{\boxdot }\,}}A\), then \({{\,\mathrm{\boxdot }\,}}B\). K

  • Not \({{\,\mathrm{\boxdot }\,}}\bot \). D

  • If \(\vdash A \equiv B\), then: if then . LLE

  • If \(\vdash A \equiv B\), then: if then . RLE

  • If , then \({{\,\mathrm{\boxdot }\,}}A\). BA

  • If , then \({{\,\mathrm{\boxdot }\,}}B\). BC

  • . NTC

  • If , then . VDW

  • If and \({{\,\mathrm{\boxdot }\,}}B\), then . CW*

  • If , then or . AND*

  • If and \({{\,\mathrm{\boxdot }\,}}B\), then or . CUT*

  • If , then or . OR*

  • If and \({{\,\mathrm{\boxdot }\,}}B\), then or . CM*

  • If and , then . DR*

  • If and , then . RM*

The core of the logic is given by the first eleven principles from N to AND*. We denote this logic by \({{\,\mathrm{\textbf{BEC}}\,}}\). Extensions are given by adding any (combination) of the remaining principles CUT*, OR*, CM*, DR* or RM*.

The starred labels may strike the reader as surprising. The principles with starred labels are best understood as obtained by backtranslating the original principles for > and sometimes applying some further logical simplifications involving other principles. The first three principles (N)–(D) are standard assumptions for a belief modality. It follows that belief is monotone,Footnote 24 closed under conjunction and consistent. LLE and RLE are Left Logical Equivalence and Right Logical Equivalence. But in \({{\,\mathrm{\textbf{BEC}}\,}}\) the stronger RW is invalid. The reader might ask why we endorse LLE (and relatedly RLE). After all, ‘because’ might be hyperintensional, just as ‘if’ has been suggested to be hyperintensional, for instance by Fine (2012). We realize that this is an interesting topic for future research, but in this paper we take LLE and RLE as simplifying assumptions and assume ‘because’ to be an intensional connective.Footnote 25 BA and BC are our fundamental assumptions that the explanans (antecedent) and the explanandum (consequent) of an explanation are believed. NTC stands for No Tautological Consequent and says that there is no relevant antecedent for the tautology. It corresponds to the principle CN \(A > \top \), but in the language of it is expressed by the negation of CN. CW* is obtained as a backtranslation from CW, and thus in a way corresponds to RW. VDW stands for Very Weak Disjunctive Weakening and says that one can always weaken an explanatory conditional by introducing the antecedent as an additional disjunct in the consequent. AND* is obtained as a backtranslation of AND. Both VDW and AND* constitute weak replacements for Right Weakening, which is invalid for ‘because’.

CUT* and CM* are more complex. CUT* says that if A is a reason for C but not for the belief B, then the material implication \(B \supset A\) is a reason for C. CM* says that if A is neither a reason for C nor for the belief B, then the material conditional \(B \supset A\) is not a reason for C either. Taken together, CUT* and CM* imply that if A is not a reason for the belief B then: A is a reason for C if and only if the weaker \(B \supset A\) is already a reason for C. OR*, DR* and RM* are rather simple. OR* says that if neither A nor B is a reason for C, then they cannot jointly be a reason for C. DR* says that reasons can be conjunctively conjoined in the antecedents of factive conditionals with the same consequent. While Rational Monotonicity for suppositional conditionals is a rather complicated principle involving a negated conditional in the antecedent (a so-called non-Horn principle), its counterpart RM* for is rather simple and intuitive: If B because A, and C because A or B, then C because A. In general, the counterparts of Horn principles frequently become non-Horn, and vice versa.

Note that due to CW* and RLE, the conjunction of the three principles BA, BC and VDW can equivalently be expressed as a single more compact principle.

  • iff \({{\,\mathrm{\boxdot }\,}}A\) and \({{\,\mathrm{\boxdot }\,}}B\) and B

B together with CW* is the proper axiom of the logic of ‘because’ in the sense of Raidl (2021b). Together with LLE and RLE they form a minimal core for the logic of ‘because’.

Theorem 2

The above principles from \({{\,\mathrm{\textbf{BEC}}\,}}\) are valid in our 0-semantics. And if the semantics validates CUT, OR, CM, DR, RM in \({{\,\mathrm{\mathcal {L}}\,}}_>\), then it validates CUT*, OR*, CM*, DR*, RM* in .

One can in fact prove a stronger result: the principles of \({{\,\mathrm{\textbf{BEC}}\,}}\) determine the 0-semantics. And the reverse of the second statement holds, too.Footnote 26

5.3 Further derivable principles and validities

The following lemma lists further interesting principles.

Lemma 5

The following principles are derivable in BEC:

  • . AT

  • . NTA

If we additionally assume OR*, we can also derive:

  • If and \({{\,\mathrm{\boxdot }\,}}C\), then . COND*

AT stands for Aristotle’s Thesis. It is a fundamental principle of connexive logic and says that a sentence cannot explain its negation. NTA stands for No Tautological Antecedent and says that tautologies are not relevant for anything. NTA and NTC taken together tell us that tautologies do not enter into relevance relations. COND* says that if \(A\vee B\) is not a reason for a belief C, then A cannot be a reason for \(B \vee C\). It is a problematic principle that will be discussed in Sect. 7.

6 Comparison with other work

We believe that the systems introduced and discussed in this paper constitute the first systematic study of logics for ‘because’. Our semantics closely links the meaning of ‘because’ to the meaning of ‘if’. As a consequence, our logics for ‘because’ are closely related and comparable to the standard systems of conditional logic that have been studied over the last 50 years. To the best of our knowledge, there has so far been no comparable work on the logic of ‘because’ or ‘since’.

There are nonetheless some works that may be compared to our approach. Some works aim at a logic of ‘because’ rather directly; some works offer alternatives to the account of difference-making conditionals; some works address similar problems in a probabilistic framework. We briefly comment on each of these approaches in turn.

6.1 Work on ‘because’

Burks (1951) developed an early proposal for a kind of causal conditional called ‘causal implication’ and a kind of counterfactual called ‘counterfactual implication’, assuming the latter to be definable from the former by adding that the antecedent is false. Burks’s implication is supposed to express causal sufficiency, which he analyzes with the help of a strict implication, with the necessity behaving like a normal Kripke necessity. As a consequence, his causal implication validates rather undesirable laws, such as Strengthening the Antecedent, (weak) Transitivity, Contraposition and the paradoxes of strict implication. We hope to have made it clear that these principles cannot hold for the natural-language connective ‘because’.

Urchs (1994) investigates principles for a precausal connective, listing good principles that should be valid and bad principles that should be invalid. There are some axiomatic similarities to our account, since he rejects symmetry as well as asymmetry as general principles, and also rejects Contraposition and Strengthening the Antecedent. His connective is factive. However, he endorses Conjunctive Weakening, which fails for ‘because’, since Right Weakening fails.

Schnieder’s (2011) paper on “A logic for ‘because’ ” is very different from, and can hardly be compared to, the work in conditional logic. He motivates the axioms for his logic based on a particular kind of ‘because’—the noncausal ‘because’ appearing in the literature on grounding. A point of similarity with our account is Schnieder’s truth axioms, according to which ‘because’ implies the truth of the antecedent as well as of the consequent, and thus his grounding ‘because’ is factive. But his approach differs from ours significantly. On the one hand, Schnieder’s axiomatic system is rather weak. In particular, ‘because’ is treated as hyperintensional, so that equivalent sentences cannot be substituted in the scope of ‘because’ and LLE and RLE become invalid. On the other hand, Schnieder’s system is rather strong, since his ‘because’ is required to be asymmetric and transitive. In contrast, our ‘because’ invalidates asymmetry and transitivity. Concerning transitivity, this is just as it should be in a general analysis of ‘because’. From the premises ‘Because Hinckley fired at Reagan, agent McCarthy threw himself in front of Reagan’ and ‘Because agent McCarthy threw himself in front of Reagan, Reagan survived March 30, 1981’ it does not follow that ‘Because Hinckley fired at Reagan, Reagan survived March 30, 1981.’ Concerning asymmetry, we say more in Sect. 7.

Andreas and Günther (2019) offer a formal analysis of ‘because’ that has some similarities to our doxastic interpretation. Like Rott (2022a), they pick up on an idea contained in Rott (1986) and emphasize that ‘because’ is factive and not strictly asymmetric. However, they use a variant of the Ramsey Test, contracting first to ensure that any beliefs about the antecedent and about the consequent (or their negations) get eliminated, and then expanding by the antecedent. Andreas and Günther do not offer a logic for ‘because’, but such an analysis might be given if the properties of the contraction operation with respect to the antecedent and the consequent were analyzed in detail.

6.2 Alternatives to difference-making conditionals

Fariñas del Cerro and Herzig (1996) frame their work in the context of belief change theories. They use the formula ‘\(A \leadsto C\)’ to express that ‘C depends on A’. After non-trivial adaptations to our framework are made, this is equivalent to \({{\,\mathrm{\boxdot }\,}}C\) and \(\lnot A \not > C\).Footnote 27 If we assume PRES (and CONS), this in turn is equivalent to our definition of (see part 3 of Lemma 4). Thus the formula \(A \leadsto C\) seems to express exactly the same content as our ‘C, because A’.

However, the equivalence only holds in a rather strong semantics, and differences emerge as soon as we drop PRES or admit incomparabilities.Footnote 28 Consider the example given in Fig. 6 with a language having only two propositional variables p and q. The example violates Preservation, since we have \(\top > q\) and \(\top \not > p\), but \(\lnot p \not > q\). In such a context, Fariñas and Herzig’s analysis (FH) of dependency and our analysis (RTB) of ‘because’ diverge: we have \(p\leadsto q\) but not .Footnote 29 It is intuitively clear that an agent would not accept ‘q, because p’ in this situation because she does not believe that the antecedent p is true. ‘Because’ is factive in the antecedent. If we want to be able to do without full comparability (or without Preservation) and retain factivity, we must add, as we did, that the antecedent is believed as a defining condition for ‘because’.

Fig. 6

An example with incomparable worlds: the world denoted by ‘’, at which p is false and q is true, is neither more plausible nor less plausible than any of the other worlds. World pq is more plausible than world

A further difference to Fariñas and Herzig is that their semantics is rather strong. In our terminology, they consider models for the logic \(\textbf{RN} + \text{ SCONS }\). SCONS allows them to treat the outer modality as validity or logical truth. This, however, goes against the now standard precaution to distinguish metaphysical or doxastic necessity from the stronger logical necessity. For this reason, we consider SCONS as a dubious axiom.

We already mentioned that Spohn’s notion of sufficient reason has essentially the same structure as the difference-making conditional. And Spohn’s account can be captured by a semantics with the conditional logic \(\textbf{RN}\) (Raidl 2019). As a consequence, our strongest doxastic analysis of ‘C, because A’ captures essentially the same idea as A being a sufficient reason for C joined with having the reason A.Footnote 30

A structurally different account is the ‘evidential conditional’ studied by Crupi and Iacona (2022a) in the framework of a possible world semantics. Their evidential conditional can be defined from a Lewisean suppositional conditional > by putting \(A \vartriangleright B\) iff \(A > B\) and \(\lnot B > \lnot A\). The evidential conditional thus has Contraposition built into it, which is invalid for our ‘because’. Further differences are that the evidential conditional validates AND, CM, OR and even Supraclassicality, as well as the so-called paradoxes of strict implication, i.e., Necessary Consequent and Impossible Antecedent (Raidl, Iacona and Crupi 2022). The evidential conditional is not factive. The only similarity to difference-making conditionals and our ‘because’ is that Right Weakening is violated—this violation being the ‘hallmark of relevance’. One may very well, however, base a definition for ‘because’ on \(\vartriangleright \) by adding that the antecedent and the consequent are (believed to be) true; the logic of such a connective remains yet to be explored. Rott (2022b), however, raises serious doubts as to whether Contraposition is suitable for capturing the idea of evidence or support.

6.3 Probabilistic accounts

The study of a logic of conditionals that incorporates relevance in a probabilistic framework has only begun very recently, with the analysis of ‘evidential conditionals’ by Douven (2016, Ch. 5), Crupi and Iacona (2022b) and van Rooij and Schulz (2022).

Douven’s account of an evidential conditional \(A \Rightarrow C\) is a refinement of the combination of probabilistic relevance \(P(C|A) > P(C)\) and high conditional probability \(P(C|A)>t\) for an appropriate threshold t. Whereas high conditional probability is known to invalidate AND (cf. Hawthorne and Makinson 2007), probabilistic relevance is known to validate symmetry. Like our ‘because’, Douven’s evidential conditional violates RW and invalidates CM, CUT and OR; unlike ours, it also invalidates AND. In addition, it is not factive.

Crupi and Iacona’s probabilistic evidential conditional, although having a slightly different logic from their possible-worlds based account, also violates RW and validates AND, but in addition it validates CM, OR and Contraposition. It is also not factive.

According to van Rooij and Schulz (2022), a relevance-based conditional \(A \Rightarrow C\) holds when the causal power of A for C is high, i.e., \(\frac{P(C|A)-P(C|\lnot A)}{1-P(C|\lnot A)} > t\) for an appropriate threshold t. If \(P(C|\lnot A) = 0\) we obtain high conditional probability. If we disregard the subtraction in the denominator and set t to 0, we obtain probabilistic relevance, since \(P(C|A) > P(C|\lnot A)\) iff \(P(C|A) > P(C)\).Footnote 31 So far the logic for this notion of causal power has not been determined. Note, however, that AND will fail, and that this connective, too, will not be factive.
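
For concreteness, the quoted measure can be written down directly; this is only a sketch of the formula above, with the probabilities passed in as numbers and the threshold left open.

```python
# Causal power of A for C: (P(C|A) - P(C|not-A)) / (1 - P(C|not-A)).
def causal_power(p_c_given_a: float, p_c_given_not_a: float):
    if p_c_given_not_a == 1.0:
        return None                      # denominator would be zero
    return (p_c_given_a - p_c_given_not_a) / (1.0 - p_c_given_not_a)

def accepts(p_c_given_a, p_c_given_not_a, threshold):
    """A => C is accepted when the causal power of A for C exceeds the threshold."""
    power = causal_power(p_c_given_a, p_c_given_not_a)
    return power is not None and power > threshold

print(causal_power(0.9, 0.0))   # 0.9: with P(C|not-A) = 0 this is just P(C|A)
print(causal_power(0.9, 0.5))   # 0.8
```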

7 Problem cases

So far we have argued that certain first principles should hold for reasoning with ‘because’. We have built a formal framework for these natural ideas and outlined the logic for ‘because’ that flows naturally from our premises. In this section, we confront our analysis with a number of examples that are known to pose problems for models of causal reasoning. A word of caution is in order, though: problematic examples for causation are not necessarily problematic when transferred to the discussion of ‘because’ sentences. While a causal relation can always be expressed by a ‘because’ sentence, it must be borne in mind that ‘because’ sentences can be used to express non-causal explanatory relations, too. And thus, not every ‘because’ sentence expresses a causal relation.

Unless stated otherwise, we will make the simplifying assumption that the selection function \(\sigma \) is based on a plausibility order <. That is, there is a strict transitive relation < over \(W' \subseteq W\) such that \(\sigma (V) = \min _< (V \cap W')\), where \(\min _< X = \{y \in X:\) there is no \(x \in X\) such that \(x < y\}\). Here \(x<y\) is read as ‘x is more(!) plausible than y’.
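
Under this simplifying assumption the selection function can be computed directly from the order. The sketch below (ours) represents < as a set of pairs (x, y), read ‘x is more plausible than y’; the relation need not be total, so incomparabilities of the kind discussed in Sect. 6.2 are allowed.

```python
# sigma(V) = the <-minimal worlds of V intersected with W', where < is a strict
# transitive relation given as a set of pairs (x, y). Names are illustrative.
def make_sigma(W_prime, more_plausible):
    def sigma(V):
        candidates = V & W_prime
        return {y for y in candidates
                if not any((x, y) in more_plausible for x in candidates)}
    return sigma

# Example with an incomparability: w3 is neither more nor less plausible than w1 or w2.
W_prime = {'w1', 'w2', 'w3'}
order = {('w1', 'w2')}                   # w1 is more plausible than w2
sigma = make_sigma(W_prime, order)
print(sigma({'w1', 'w2', 'w3'}))         # the minimal worlds are w1 and w3
print(sigma({'w2', 'w3'}))               # both w2 and w3 are minimal here
```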

Reflexivity.Footnote 32 On our account, holds if and only if A is a belief the negation of which is possible. So ‘A, because A’ is a contingent sentence that may, but need not be true. In conditional logic many people think that Reflexivity should be axiomatic. This is in stark contrast to causation and explanation. It seems that A cannot possibly ever cause or explain itself. Not everyone agrees: Halpern (2016, p. 17), for instance, holds that reflexivity is a natural property of his ‘affects’ relation. Another possible reaction to this problem is to say that ‘A, because A’ is a limiting case in our formalization that can and should be dealt with just according to what one’s best theory tells us.

Asymmetry. Since causation is asymmetric, one may be inclined to think that ‘because’ should be asymmetric as well. This inference, however, would only be admissible if all ‘because’ sentences could be rephrased in terms of causing. As already mentioned, we think that this is not the case. We can always express the fact that p causes q by the sentence ‘q because p’; but not every such ‘because’ sentence expresses a causal relation. In contrast to causation, asymmetry is not always to be expected for explanation. Sometimes an effect may be regarded as a reason for, or an explanation of, the (fact that we infer the) cause, as for example in an inference to the best explanation. And this, too, can be expressed by ‘because’ sentences. We can say both ‘Because Carol is at home, her apartment is lit’ (which may track a causal relation) and ‘Carol is at home, because her apartment is lit’ (which may be seen as tracking a converse ‘is evidence for’ relation).Footnote 33 Our account makes room for such symmetric explanations, thus violating asymmetry. Yet it does not fall into the other extreme of validating symmetry, contrary to the probabilistic concept of dependency or relevance.

The problem, on our account, is rather that symmetric explanations are abundant. Consider a situation in which you believe, with good justification, that Pam and Quentin, a married couple, are at Frieda’s party. However, you are not entirely sure, because you know that they have also been invited by Ben. You think it is not impossible that either (i) they both went to Ben’s party () or that they split and (ii) Pam went to Frieda and Quentin went to Ben () or (iii) Quentin went to Frieda and Pam went to Ben (). Actually, you think that if they aren’t both at Frieda’s party, each of the three possibilities (i), (ii) and (iii) is equally plausible. In this case, our analysis then commits us to accepting both ‘Pam went to Frieda’s party because Quentin went’ () and ‘Quentin went to Frieda’s party because Pam went’ (); see part (1) of Fig. 7. This may seem strange. But perhaps it is an adequate result. They are a married couple after all, and both of them being at Ben’s party is just as plausible as only one of them being there. So p and q seem to explain each other. Parts (2) and (3) of Fig. 7 show how alternative plausibility orderings give rise to other acceptance and rejection patterns of ‘because’ sentences.

Fig. 7

Plausibility orderings for the worlds pq, , and . In (1), we have and , a violation of Asymmetry. In (2), and , and in (3), and , a violation of Symmetry

Sufficient reasons.Footnote 34 Consider a situation in which a reasoner accepts the sentence (i) ‘Peter went to the party because he was invited to the party and he wanted to get drunk.’ If we suppose that the ‘because’ clause specifies sufficient reasons for the main clause, it may seem that, on our analysis, the reasoner is committed to accepting also the reverse, (ii) ‘Peter was invited to the party and wanted to get drunk because he went to the party.’ If the reasons given in (i) are considered as sufficient, then supposing that Peter didn’t go to the party will result in giving up the reasons. Now it looks as if we are caught with a counter-intuitive result regarding the acceptance of (ii). However, the two reasons mentioned are only sufficient in the context of the particular situation at hand—which is captured, in our model, by the selection function. In order to see whether sentence (ii) is acceptable, we have to look at the situation carefully. There are situations in which (ii) is indeed acceptable, namely when the counterfactual assumption that Peter didn’t go to the party would lead the reasoner to abandon the conjunction ‘Peter was invited to the party and he wanted to get drunk’. But this need not be the case. If, for example, the reasoner is very sure for independent reasons that Peter was invited and wanted to get drunk, then the counterfactual assumption of his absence would make her rather believe that Peter was sick, had an accident or faced some other strong impediment that prevented him from coming. So the acceptability of (ii) depends on whether Peter’s presence is evidence for his being invited (i) and desiring to get drunk (d).

Now one may object: The example can be strengthened by appending to the conjunction in the ‘because’ clause of (i) and the main clause of (ii) that none of the potential impediments is present: that Peter isn’t sick, doesn’t have an accident, etc. (\(q_1\wedge \dots \wedge q_n\)). By using such an extended conjunction, the speaker takes care to specify ‘fully sufficient reasons’ for Peter’s going to the party (p). In our model, it makes sense then to render the extended ‘fully sufficient’ form of (i) as the modal (i\(^f\)) , and the question is whether (i\(^f\)) entails that (ii\(^f\)) . To this we reply that, first, extending the natural-language sentence (i) in such a way would result in a rather long and unnatural ‘because’ sentence, one that would hardly ever be uttered in normal conversations.Footnote 35 But second, it is true that our analysis predicts that (ii\(^f\)) be accepted if (i\(^f\)) is accepted, \(i, d, q_1, \ldots , q_n, p\) are believed and the side condition \(\lnot \Box p\) is accepted, too. Given that the reasoner believes that all parts of the fully specified conjunctive sufficient reason are true, the fact that Peter goes to the party is evidence for this sufficient reason (unless Peter goes to the party necessarily). Correspondingly, according to our analysis, ‘because’ comes out symmetric for all truly sufficient reasons in the above modal sense. We do not think that this is a counterintuitive result. A problem will arise, however, in the case of (presumed) overdetermination—another problem to which we now turn.

Fig. 8

Plausibility orderings for the worlds \(pqr\), \(p\bar{q}r\), \(\bar{p}qr\), \(\bar{p}\bar{q}\bar{r}\), .... Implausible worlds are not represented. (1) treats overdetermination but is problematic for productive preemption; (2) explains non-productive preemption; (3)–(4) explain productive preemption.

Overdetermination.Footnote 36 Overdetermination is known to be a problem for many causal theories. Consider a firing squad of two soldiers (Pam and Quinn), and assume that one shot is sufficient for the delinquent’s (Ron’s) death (r). Pam and Quinn both actually fire a shot (pq). What does our model say? This depends on the plausibilities of possible worlds. The situation suggests that we think that Ron dies (r) and that both fire, and that it is more plausible that both Pam and Quinn fire than that only one of them does (see part (1) of Fig. 8). The world in which neither fires but Ron dies, and the worlds in which at least one of them fires but Ron survives, are all implausible. Assuming this, our model says that neither ‘Ron dies because Pam fires’ nor ‘Ron dies because Quinn fires’ holds. ‘Ron dies because Pam and Quinn fire’ does not hold either, according to the model, but ‘Ron dies because Pam or Quinn fires’ does hold. We think that this is a satisfying result. It is not that Ron died because Pam shot at him—he would have died even if Pam had refrained from shooting. Similarly for Quinn. Ron died because at least one member of the squad shot at him. Our analysis produces a disjunctive explanation and excludes the single-factor explanations (and the conjunctive explanation, too).Footnote 37
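Under the sketch semantics given after Fig. 7 (whose helpers are reused here), one ranking that matches this description of part (1) of Fig. 8 reproduces exactly these verdicts; the numerical grades are again an illustrative choice.

```python
# Part (1) of Fig. 8, firing-squad version, reusing the helpers above.
# One ranking consistent with the description: pqr most plausible, the two
# single-shooter worlds next, 'nobody fires and Ron survives' after that;
# all remaining combinations are left out as implausible.
ranking = {(True, True, True): 0,     # both fire, Ron dies
           (True, False, True): 1,    # only Pam fires, Ron dies
           (False, True, True): 1,    # only Quinn fires, Ron dies
           (False, False, False): 2}  # nobody fires, Ron survives
worlds = [dict(zip(('p', 'q', 'r'), key)) for key in ranking]
rk = lambda w: ranking[(w['p'], w['q'], w['r'])]
p, q, r = (lambda w: w['p']), (lambda w: w['q']), (lambda w: w['r'])

print(because(r, p, worlds, rk))                            # False
print(because(r, q, worlds, rk))                            # False
print(because(r, lambda w: w['p'] and w['q'], worlds, rk))  # False
print(because(r, lambda w: w['p'] or w['q'], worlds, rk))   # True
```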

We think that the above plausibility order gets the conjunctive and the disjunctive reading right—the shooting of all soldiers doesn’t make a difference, but the shooting of at least one of them does. Yet, many contributors to the causal and legal literature agree that every single shot is a cause and thus provides an explanation. Our diagnosis is that such overdetermining causes do not provide an explanation, since one soldier’s refraining from shooting doesn’t make a difference. If this is correct, then we have a deviation here from the thesis mentioned above that a causal relation can always be expressed by a ‘because’ sentence.

Preemption. Preemption is also known from the causal literature. An event (p) may cause some effect (r), while another (slightly later) event (q) is merely a preempted potential cause, since the first already caused the effect. Usually, if ‘because’ is read as expressing (productive) causing, one would like to affirm ‘r because p’ but not ‘r because q’.

Let us consider the well-known case of late preemption (with new names) from Hall (2004). Both Pam and Quentin throw a stone at a glass bottle (pq). The bottle breaks (r). Pam throws just a fraction of a second earlier, so that it is her stone that actually hits the glass bottle and causes its shattering. Quentin’s throw is a preempted potential cause. In our simple model, which does not represent time, a possible representation is again part (1) of Fig. 8. Then our model says that both ‘The bottle shatters because Pam throws’ and ‘The bottle shatters because Quentin throws’ are false or unacceptable, but that ‘The bottle shatters because Pam or Quentin throws’ is true or acceptable. This is counterintuitive on one (perhaps the preferred!) reading of ‘because’. Indeed, one reading of ‘because’ is the causal reading in the production sense, using Hall’s terminology. In this sense, we expect ‘The bottle shatters because Pam throws a stone at it’ to hold, but we would reject the corresponding claim for Quentin.

This diagnosis cannot be reproduced if ‘because’ is interpreted along the lines suggested in this paper. Our connective can only represent causal relations in the dependence sense—which we think is an admissible interpretation. If Pam had refrained from throwing a stone at the bottle, it would still have shattered. So the shattering of the bottle does not depend on Pam’s throwing, just as it doesn’t depend on Quentin’s throwing. Although the situation represented by part (1) of Fig. 8 does not allow for a productive causal reading of ‘because’, it allows for a dependence reading.Footnote 38

This failure to model the productive reading in the preemptive case is due to the fact that the asymmetry of the example is not represented in the model. There are, however, at least two ways of getting an asymmetric model in which both the disjunction (\(p\vee q\)) and one of the disjuncts (p) explain, but the other disjunct (q) doesn’t.

The first option is essentially to assume that p is believed but q is not, and that the worlds \(\bar{p}qr\) and \(\bar{p}\bar{q}\bar{r}\) are equally plausible (or incomparable). Part (4) of Fig. 8 depicts this situation. In the bottle example, this means that we believe that Pam (and thus someone) throws, but we don’t believe that Quentin throws. After all, he throws after Pam, so he might refrain from throwing if Pam does not throw. The second option is essentially to assume that \(\bar{p}\bar{q}\bar{r}\) is not less plausible than \(\bar{p}qr\), while keeping the belief in both p and q (and r). Part (3) of Fig. 8 depicts this situation. In the bottle example, this means that we believe that both throw, and we think that if Pam hadn’t thrown, Quentin might not have thrown either. In both options, the disjunction explains the shattering of the bottle (r because \(p\vee q\)), and so does one of the disjuncts (r because p), while the other does not (not: r because q).
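For concreteness, here are rankings in the style of the earlier sketch that satisfy the constraints of the two options; the specific grades are our own illustrative choices among many that fit the descriptions.

```python
# Illustrative rankings for parts (3) and (4) of Fig. 8 (bottle example),
# reusing the helpers from the sketch after Fig. 7.
p, q, r = (lambda w: w['p']), (lambda w: w['q']), (lambda w: w['r'])
p_or_q = lambda w: w['p'] or w['q']

def model(ranking):
    worlds = [dict(zip(('p', 'q', 'r'), key)) for key in ranking]
    return worlds, (lambda w: ranking[(w['p'], w['q'], w['r'])])

# Part (4): p is believed, q is not; if Pam does not throw, Quentin's
# throwing and his refraining are equally plausible.
option4 = model({(True, True, True): 0, (True, False, True): 0,
                 (False, True, True): 1, (False, False, False): 1})

# Part (3): p, q and r are all believed; 'neither throws, bottle intact'
# is no less plausible than 'only Quentin throws, bottle breaks'.
option3 = model({(True, True, True): 0, (True, False, True): 1,
                 (False, True, True): 2, (False, False, False): 2})

for worlds, rk in (option4, option3):
    print(because(r, p, worlds, rk),       # True:  r because p
          because(r, q, worlds, rk),       # False: r because q
          because(r, p_or_q, worlds, rk))  # True:  r because p or q
```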

Both option (3) and option (4) of Fig. 8 represent situations with patterns of ‘because’ sentences that match the production sense of causation. However, it is debatable whether they represent our plausibilistic intuitions about the situation.

‘Conditionalization’. The principle of Conditionalization for suppositional conditionals (COND: If \(A \wedge B>C\), then \(A>(B\supset C)\)) gets transformed for factual difference-making conditionals into the following principle (COND*): If ‘\(B\vee C\) because A’ and \(\boxdot C\), then ‘C because \(A\vee B\)’ (see Lemma 5). This is a very surprising principle. Consider the following example concerning a much-desired job. Sam accepts ‘Bob gets the job or Carol gets the job because Ann makes the decision’, since he knows that Bob and Carol are Ann’s protégés. Actually, Sam believes that Carol will get the job. It seems strange, to say the least, that Sam is justified in inferring from these premises that ‘Carol gets the job because Ann makes the decision or Bob gets the job.’ After all, Bob getting the job would exclude Carol’s getting it! But this is what our analysis licenses. We acknowledge that this inference presents a serious challenge to our analysis. We may, however, note that ‘Carol gets the job because Ann makes the decision or Bob gets the job’ does not imply ‘Carol gets the job because Bob gets the job.’ Moreover, (COND*) requires (OR*) or the corresponding semantic principle (or), and we can do without that principle in our weakest semantics (see Fig. 1).
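A concrete model illustrates the point (again reusing the helpers from the sketch after Fig. 7; the particular ranking is only one way of filling in Sam’s beliefs): the premise and the surprising conclusion both come out accepted, while ‘Carol gets the job because Bob gets the job’ does not.

```python
# The job example, reusing the helpers above: a = Ann makes the decision,
# b = Bob gets the job, c = Carol gets the job. One ranking that fits Sam's
# beliefs: Ann decides and Carol gets the job; had Ann not decided, neither
# of her proteges would have got it.
a, b, c = (lambda w: w['a']), (lambda w: w['b']), (lambda w: w['c'])
ranking = {(True, False, True): 0,    # Ann decides, Carol gets the job
           (True, True, False): 1,    # Ann decides, Bob gets the job
           (False, False, False): 1}  # Ann does not decide, neither gets it
worlds = [dict(zip(('a', 'b', 'c'), key)) for key in ranking]
rk = lambda w: ranking[(w['a'], w['b'], w['c'])]

print(because(lambda w: w['b'] or w['c'], a, worlds, rk))  # True:  premise
print(because(c, lambda w: w['a'] or w['b'], worlds, rk))  # True:  conclusion
print(because(c, b, worlds, rk))                           # False: not 'c because b'
```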

In sum, we think that some of the six problem cases reviewed in this section are indeed problematic for our analysis, while others are not. We do not claim that our approach solves all the problems. The main goal of this paper has been to give an axiomatic characterization of an analysis of ‘because’ that is plausible prima facie. In this section, we have indicated some (real and potential) problems that we see in our account, with the idea of paving the way for a discussion between the modal logic and the causation communities. But our main contribution clearly lies in the development of a formal system (or actually, of formal systems) that can be put to such tests in the first place. Our systems are in a situation akin to that of the early conditional logics, which were known to be plagued by problems of a similar caliber (like the problem of Simplification of Disjunctive Antecedents) and yet were given a chance to grow and develop.

8 Conclusion

Different kinds of conditionals validate different sets of inference patterns. For natural-language conditionals, as opposed to material (or strict) conditionals, strengthening the antecedent is invalid. This has been one of the most important messages of conditional logic ever since the times of Goodman, Adams, Stalnaker and Lewis. For difference-making conditionals, as opposed to the conditionals normally studied in the field of conditional logic, the dual pattern of weakening the consequent (RW) is invalid, too. What raises the doxastic status of C does not necessarily raise the doxastic status of \(B\vee C\). Many other well-known patterns get lost as well, and conditionals appear to behave rather irregularly once the relevance idea is heeded. Still, there is a logic of difference-making conditionals, captured by the Relevant Ramsey Test, and there is a logic for ‘because’, captured by the Ramsey Test for ‘because’. We used the latter to provide a semantics for ‘because’.

To the best of our knowledge, we have presented in this paper the first logics for ‘because’ that can be compared, in status and elaboration, to the more orthodox logics for suppositional conditionals that have been dominant in the discussion for 50 years. We have shown how they relate to each other. For every conditional logic \(\textbf{L}_>\) (starting with \(\textbf{B}^+\)) there is a corresponding companion ‘because’ logic such that the two are valid in the same sets of models, the former with respect to the Ramsey Test (RT), the latter with respect to the Ramsey Test for ‘because’ (RTB). Difference-making conditionals, defined by the Relevant Ramsey Test (RRT), can be viewed as the missing link between them.
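In belief-revision terms, and writing \(K\) for the belief set, \(\ast\) for revision, \(>\) for the suppositional conditional and \(\gg\) for the difference-making conditional, the three tests can be glossed roughly as follows (informal glosses that suppress limiting cases and side conditions, not the official definitions):

\[
\begin{array}{lll}
(\mathrm{RT}) & A > C \text{ is accepted in } K & \text{iff}\ \ C \in K \ast A,\\[2pt]
(\mathrm{RRT}) & A \gg C \text{ is accepted in } K & \text{iff}\ \ C \in K \ast A \text{ and } C \notin K \ast \lnot A,\\[2pt]
(\mathrm{RTB}) & \text{‘$C$ because $A$’ is accepted in } K & \text{iff}\ \ A \in K,\ C \in K \text{ and } C \notin K \ast \lnot A.
\end{array}
\]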

Our analysis is not fully exhaustive yet and can be extended. We restrict ourselves to mentioning the technical aspects here. First, we have disregarded Boolean combinations of conditionals as well as nested conditionals. Correspondingly, we have disregarded multistate models and possible worlds semantics with multistates. Second, our validity results here constitute only one direction of the correspondence result. The other direction requires one to argue on the level of frames (instead of models). Third, although we obtain more general soundness results from our validity results, the related completeness part requires a proof of the full correspondence result and a property known as canonicity. This can also be done indirectly by backsimulating the suppositional conditional logic in the logic for ‘because’ (cf. Raidl 2021b). The logics arising from our principles for ‘because’ are in fact not only sound, but also complete for their corresponding semantics. This is shown by Raidl (2022).

It is, of course, a very good question to ask how far an analysis of ‘because’ can go if it is confined to modal logic broadly construed.Footnote 39 We have not answered this question. But we believe we have provided a solid logical basis for attacking it, by showing what follows from our analysis (the validities) and discussing what the limits of this analysis are (its invalidities and potential counter-examples).