# Declarations of independence

## Abstract

According to orthodox (Kolmogorovian) probability theory, conditional probabilities are by definition certain ratios of unconditional probabilities. As a result, orthodox conditional probabilities are regarded as undefined whenever their antecedents have zero unconditional probability. This has important ramifications for the notion of probabilistic independence. Traditionally, independence is defined in terms of unconditional probabilities (the factorization of the relevant joint unconditional probabilities). Various “equivalent” formulations of independence can be given using conditional probabilities. But these “equivalences” break down if conditional probabilities are permitted to have conditions with zero unconditional probability. We reconsider probabilistic independence in this more general setting. We argue that a less orthodox but more general (Popperian) theory of conditional probability should be used, and that much of the conventional wisdom about probabilistic independence needs to be rethought.


## Notes

1. To be sure, Kolmogorov was well aware of this problem, and he went on to offer a more sophisticated treatment of probability conditional on a sigma-algebra, $$\mathrm {P}(A \Vert \mathcal {F})$$, in order to address it. We will return to this point later; as we will see, this approach also faces some serious problems.

2. To name just a few of the most famous of these, we have Parzen (1960), Papoulis (1965), Feller (1968), Rozanov (1977), Loève (1977), Billingsley (1995), Ross (1998), and, of course, Kolmogorov (1933/1950) himself.

3. Axiom K3 (finite additivity) is often strengthened to require additivity over denumerably many mutually exclusive events. There is considerable controversy over the issue of countable additivity. Both Savage and de Finetti urged against the assumption of countable additivity in the context of personalistic probability. Moreover, the assumption of countable additivity has some surprising and paradoxical consequences in Kolmogorov’s more general theory of conditional probability (Seidenfeld et al. 2001). We will briefly comment on this issue below.

4. See Hájek (2003), Pruss (2013), and Easwaran (2014).

5. For instance, Carnap (1950, 1952), Popper (1959), Kolmogorov (1933/1950), Rényi (1955) and several others have proposed axiomatizations of conditional probability (as primitive). See Roeper and Leblanc (1999) for a very thorough survey and comparison of these alternative approaches (in which it is shown that Popper’s definition of conditional probability is the most general of the well-known proposals).

6. This axiomatization is slightly different (syntactically) from Popper’s original axiomatization. But the two are equivalent (as Roeper and Leblanc show). Moreover, we are defining Popper functions over sets rather than statements or propositions, which is non-standard. If you prefer, think of our sets as sets of possible worlds in which the corresponding propositions are true. This is an inessential difference (since it doesn’t change the formal consequences of the axiomatization), and we will use the terms “entailment” and “set inclusion” interchangeably. Our aim here is to frame the various axiomatizations as generally (and commensurably) as possible. We don’t want to restrict some axiomatizations (e.g., Popper’s) to logical languages or other structures that have limited cardinality. Popper’s aim was to provide a logically autonomous axiomatization of conditional probability. Ours is simply to compare various axiomatizations in various ways, with an eye toward independence judgments. So we don’t mind interpreting the connectives in both Popper’s axiomatization and Kolmogorov’s axiomatization, and doing so in the same (non-autonomous, set-theoretic) way. Given our set-theoretic reading of the connectives, axioms P5 and P6 above are redundant. We include them so that the reader can easily cross-check the above axiomatization with the (autonomous) axiomatic system given in Roeper and Leblanc. We also recommend that text for various key lemmas and theorems that are known to hold for Popper functions.

7. The case of causal dependence is a little different because there are built-in logical or mereological ‘no-overlap’ constraints on the relata of the causal relation. See Arntzenius (1992) for discussion.

8. Thanks to Leon Leontyev for this way of expressing the point.

9. Thanks here to Leon Leontyev.

10. It is shown in Fitelson (1999, 2001) that, despite the unified nature of the Kolmogorovian theory of probabilistic (in)dependence, there are many (radically) non-equivalent Kolmogorovian measures of degree of dependence (i.e., degree of correlation among propositions).
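The non-equivalence of such measures can be seen with a small numerical example. The sketch below is our own illustration (the probability values are invented, not drawn from Fitelson’s papers): it compares the difference measure $$\Pr (A \cap B) - \Pr (A)\Pr (B)$$ with the ratio measure $$\Pr (A \cap B)/(\Pr (A)\Pr (B))$$ and shows that the two can rank pairs of events in opposite orders of correlation.

```python
# Two Kolmogorovian measures of degree of correlation can disagree
# about which of two pairs of events is "more correlated".
# The probability values below are our own toy choices.

from fractions import Fraction as F

def d(p_ab, p_a, p_b):
    """Difference measure: P(A∩B) - P(A)P(B)."""
    return p_ab - p_a * p_b

def r(p_ab, p_a, p_b):
    """Ratio measure: P(A∩B) / (P(A)P(B))."""
    return p_ab / (p_a * p_b)

# Pair 1: P(A∩B) = 3/8, P(A) = P(B) = 1/2.
# Pair 2: P(A∩B) = 1/16, P(A) = P(B) = 1/8.
pair1 = (F(3, 8), F(1, 2), F(1, 2))
pair2 = (F(1, 16), F(1, 8), F(1, 8))

print(d(*pair1) > d(*pair2))  # True: difference measure ranks pair 1 higher
print(r(*pair1) < r(*pair2))  # True: ratio measure ranks pair 2 higher
```

Both pairs are positively correlated on either measure, yet the two measures disagree about which pair is more strongly correlated.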

11. See, for example, Pfeiffer (1990, pp. 73–84), who states 16 “equivalent” renditions of “A and B are probabilistically independent” (including our four, above) without mentioning that this “equivalence” depends on the assumption that the conditional probabilities are well-defined. He does the same thing in his discussion of conditional independence (pp. 89–113). Moreover, in the very same text (pp. 454–462), he discusses Kolmogorov’s more sophisticated definition of conditional probability. So he is clearly well aware of the problem of zero-probability conditions in the general case. This is not atypical.
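The breakdown of the “equivalence” is easy to exhibit concretely. In the minimal sketch below (the sample space, measure, and event names are our own toy choices, not Pfeiffer’s), a pair of events satisfies the factorization rendition of independence while the conditional-probability rendition cannot even be evaluated, because the condition has probability zero under the orthodox ratio definition.

```python
# Toy illustration: textbook renditions of "A and B are independent"
# come apart once a condition has unconditional probability zero.
# The space and measure are our own invented example.

from fractions import Fraction

# Four-point space; all probability mass sits on {1, 2}.
mass = {1: Fraction(1, 2), 2: Fraction(1, 2), 3: Fraction(0), 4: Fraction(0)}

def pr(event):
    """Unconditional probability of an event (a set of points)."""
    return sum(mass[w] for w in event)

def pr_given(a, b):
    """Orthodox ratio definition: undefined (None) when pr(b) == 0."""
    if pr(b) == 0:
        return None
    return pr(a & b) / pr(b)

A, B = {1, 3}, {3, 4}          # pr(A) = 1/2, pr(B) = 0

factorized = pr(A & B) == pr(A) * pr(B)  # 0 == (1/2)·0
conditional = pr_given(A, B)             # condition B has probability zero

print(factorized)   # True  -> independent by the factorization rendition
print(conditional)  # None  -> "Pr(A|B) = Pr(A)" is not evaluable at all
```

So the factorization rendition declares A and B independent, while the rendition in terms of conditional probability is silent, which is exactly the gap the “equivalence” claims paper over.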

12. Suppose that $$B = \lnot A$$. Is $$A$$ independent of $$\lnot A$$? The answer would seem to be no, paralleling our earlier discussion of every event’s or proposition’s self-dependence, with perhaps one exception. If $$A$$ has (the Popper analogue of) unconditional probability zero, then perhaps it is probabilistically insensitive to itself, since its probability is already minimal. It seems right that its probability is unmoved by its negation’s occurrence; it has nowhere lower to move! ($$\Omega$$ CONSTRUAL) delivers this result ($$0 = 0$$). (NEGATION CONSTRUAL) regards $$A$$ as dependent on $$\lnot A$$ ($$0 \ne 1$$). But as before, we may want to allow more than one concept of independence to accommodate this result.

    Suppose that $$A$$ has probability 1, so that $$\lnot A$$ has probability 0. This could happen in two ways: a non-trivial way, and the trivial way in which $$\Pr (\cdot , \lnot A)$$ is the constant function 1. (See axiom P3.) In the latter case we may call $$\lnot A$$ ‘anomalous’. In that case both construals judge $$A$$ to be independent of $$\lnot A$$ ($$1 = 1$$). This may seem surprising. However, we might simply bite the bullet, given how strange $$\lnot A$$ is: it doesn’t just have probability 0, but it does so anomalously. It might not be much of a bullet; after all, contradictions classically entail their own negations, so we have already been primed to expect anomalous propositions to behave anomalously! Or we might revise (P3), so that rather than defaulting to a value of 1, probabilities conditional on anomalous propositions get assigned some new non-numerical value, such as ANOMALY. Then our CONSTRUALS would no longer judge $$A$$ to be independent of $$\lnot A$$, since it is not the case that ANOMALY $$= 1$$. We are grateful to Hanti Lin for inspiring this paragraph.
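The anomalous case can be made concrete with a toy two-place probability function. In the sketch below (our own construction, not from the paper), the function is the ordinary ratio on a uniform two-point space, defaulting to 1 on the empty condition, which plays the role of the anomalous proposition $$\lnot A$$.

```python
# Toy illustration of the anomalous case in note 12: A has probability 1,
# ¬A is the empty proposition, and Pr(·, ¬A) is constantly 1 (axiom P3's
# default), so both construals judge A independent of ¬A (1 = 1).
# The concrete two-place function is our own invented example.

from fractions import Fraction

OMEGA = frozenset({1, 2})

def pr(event, cond=OMEGA):
    """Ratio on a uniform space; constantly 1 on the empty (anomalous) condition."""
    if not cond:
        return Fraction(1)
    return Fraction(len(event & cond), len(cond))

A = OMEGA            # Pr(A, Ω) = 1
not_A = OMEGA - A    # the empty set: probability 0, and anomalous

# (Ω CONSTRUAL): A is independent of ¬A iff Pr(A, ¬A ∩ Ω) = Pr(A, Ω).
print(pr(A, not_A & OMEGA) == pr(A, OMEGA))   # True (1 = 1)

# (NEGATION CONSTRUAL): A is independent of ¬A iff Pr(A, ¬A) = Pr(A, ¬¬A).
print(pr(A, not_A) == pr(A, OMEGA - not_A))   # True (1 = 1)
```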

13. Thanks to Hanti Lin for helpful discussion here.

14. An important special case occurs when C itself has zero unconditional probability. When this happens, no event can be conditionally independent of (or dependent on) any other event, given C. The example below is even more compelling than this special case, since none of its individual propositions has zero probability.

15. We thank especially Leon Leontyev and Hanti Lin for very helpful comments.

## References

1. Arntzenius, F. (1992). The common cause principle. Proceedings of the 1992 PSA conference (Vol. 2, pp. 227–237).

2. Billingsley, P. (1995). Probability and measure (3rd ed.). New York: Wiley.

3. Carnap, R. (1950). Logical foundations of probability. Chicago: University of Chicago Press.

4. Carnap, R. (1952). The continuum of inductive methods. Chicago: The University of Chicago Press.

5. Easwaran, K. (2014). Regularity and hyperreal credences. The Philosophical Review, 123(1), 1–41.

6. Etchemendy, J. (1990). The concept of logical consequence. Cambridge, MA: Harvard University Press.

7. Feller, W. (1968). An introduction to probability theory and its applications. New York: Wiley.

8. Fitelson, B. (1999). The plurality of Bayesian measures of confirmation and the problem of measure sensitivity. Philosophy of Science, S362–S378.

9. Fitelson, B. (2001). Studies in Bayesian confirmation theory. PhD Dissertation, University of Wisconsin-Madison.

10. Hájek, A. (2003). What conditional probability could not be. Synthese, 137(3), 273–323.

11. Kolmogorov, A. N. (1933/1950). Grundbegriffe der Wahrscheinlichkeitsrechnung, Ergebnisse Der Mathematik (Trans. Foundations of probability). New York: Chelsea Publishing Company.

12. Lewis, D. (1979). Counterfactual dependence and time’s arrow. Noûs, 13, 455–476.

13. Lewis, D. (1980). A subjectivist’s guide to objective chance. In R. Carnap & R. C. Jeffrey (Eds.), Studies in inductive logic and probability (Vol. 2, pp. 263–293). Berkeley: University of California Press. (Reprinted with added postscripts from Philosophical papers, Vol. 2, pp. 83–132, by D. Lewis, Ed., Oxford, UK: Oxford University Press.)

14. Loève, M. (1977). Probability theory. I (4th ed.). New York: Springer.

15. Papoulis, A. (1965). Probability, random variables, and stochastic processes. New York: McGraw-Hill.

16. Parzen, E. (1960). Modern probability theory and its applications. New York: Wiley.

17. Pfeiffer, P. (1990). Probability for applications. New York: Springer.

18. Popper, K. (1959). The logic of scientific discovery. London: Hutchinson & Co.

19. Pruss, A. (2013). Probability, regularity and cardinality. Philosophy of Science, 80, 231–240.

20. Rényi, A. (1955). On a new axiomatic theory of probability. Acta Mathematica Academiae Scientiarum Hungaricae, 6, 285–335.

21. Roeper, P., & Leblanc, H. (1999). Probability theory and probability logic. Toronto: University of Toronto Press.

22. Ross, S. (1998). A first course in probability (5th ed.). Upper Saddle River: Prentice Hall.

23. Rozanov, Y. A. (1977). Probability theory (Revised English ed., R. A. Silverman, Trans. from the Russian). New York: Dover.

24. Seidenfeld, T., Schervish, M. J., & Kadane, J. B. (2001). Improper regular conditional distributions. The Annals of Probability, 29(4), 1612–1624.

## Author information

### Corresponding author

Correspondence to Alan Hájek.

## Appendix

Proof of (NEGATION CONSTRUAL) $$\Rightarrow$$ ($$\Omega$$ CONSTRUAL) $$\Rightarrow$$ (POPPER FACTORIZATION).

First, we prove that ($$\Omega$$ CONSTRUAL) $$\Rightarrow$$ (POPPER FACTORIZATION). Indeed, we’ll prove the following more general result (the result in question is the special case with $$C = \Omega$$):

\begin{aligned} \Pr (A, B \cap C) = \Pr (A, C) \Rightarrow \Pr (A \cap B, C) = \Pr (A, C)\,\Pr (B, C) \end{aligned}

Assume $$\Pr (A, B \cap C) = \Pr (A, C)$$. Then, by Popper’s product axiom P4, we have

\begin{aligned} \Pr (A \cap B, C) = \Pr (A, B \cap C)\,\Pr (B, C) = \Pr (A, C)\,\Pr (B, C). \end{aligned}

$$\square$$
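This implication can be spot-checked numerically. The sketch below is our own construction: the two-place function is the ordinary ratio on a uniform four-point space, defaulting to 1 on the empty condition (the only zero-probability condition there), a function which can be checked to satisfy the product axiom P4.

```python
# Numerical spot-check of the implication proved above:
#   Pr(A, B ∩ C) = Pr(A, C)  =>  Pr(A ∩ B, C) = Pr(A, C) · Pr(B, C).
# The two-place function below is our own toy example: the ordinary
# ratio on a uniform four-point space, defaulting to 1 when the
# condition is empty.

from fractions import Fraction
from itertools import combinations

OMEGA = frozenset({1, 2, 3, 4})

def pr(event, cond=OMEGA):
    """Uniform ratio probability; constantly 1 on the empty condition."""
    if not cond:
        return Fraction(1)
    return Fraction(len(event & cond), len(cond))

def subsets(s):
    """All subsets of s, as frozensets."""
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

# Whenever the antecedent holds, the factorization must hold too.
for A in subsets(OMEGA):
    for B in subsets(OMEGA):
        for C in subsets(OMEGA):
            if pr(A, B & C) == pr(A, C):
                assert pr(A & B, C) == pr(A, C) * pr(B, C)
print("factorization verified for all 4096 triples of events")
```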

Now, we prove that (NEGATION CONSTRUAL) $$\Rightarrow$$ ($$\Omega$$ CONSTRUAL). That is, we prove:

$$\Pr (A, B \cap \Omega ) = \Pr (A, \lnot B \cap \Omega ) \Rightarrow \Pr (A, B \cap \Omega ) = \Pr (A, \Omega )$$.

Now, $$\Pr (A, B \cap \Omega ) = \Pr (A, \lnot B \cap \Omega )$$ [Assumption]

Thus,

$$\Pr (A, B \cap \Omega )\,\Pr (B, \Omega ) = \Pr (A, \lnot B \cap \Omega )\,\Pr (B, \Omega )$$ [algebra]

But also,

$$\Pr (A, B \cap \Omega )\,\Pr (B, \Omega ) = \Pr (A \cap B, \Omega )$$ [Popper’s product axiom P4]

Thus,

$$\Pr (A \cap B, \Omega ) = \Pr (A, \lnot B \cap \Omega )\,\Pr (B, \Omega )$$,

and so

$$\Pr (A \cap B, \Omega ) = \Pr (A, \lnot B \cap \Omega )\,(1 - \Pr (\lnot B, \Omega ))$$ [Popper’s additivity axiom P3]

(This axiom implies $$\Pr (B, \Omega ) + \Pr (\lnot B, \Omega ) = 1$$, since it is not the case that $$\Pr (X, \Omega ) = 1$$ for all $$X$$. That there exists an $$X$$ such that $$\Pr (X, \Omega ) \ne 1$$ is proven as Lemma 4(t) in Roeper and Leblanc (1999, p. 198).)

$$\Pr (A \cap B, \Omega ) = \Pr (A, \lnot B \cap \Omega ) - \Pr (A, \lnot B \cap \Omega )\,\Pr (\lnot B, \Omega )$$ [algebra]

$$\Pr (A \cap B, \Omega ) = \Pr (A, \lnot B \cap \Omega ) - \Pr (A \cap \lnot B, \Omega )$$ [Popper’s product axiom P4]

$$\Pr (A \cap B, \Omega ) + \Pr (A \cap \lnot B, \Omega ) = \Pr (A, \lnot B \cap \Omega )$$ [algebra]

$$\Pr (A, \Omega ) = \Pr (A, \lnot B \cap \Omega )$$

[Popper’s axioms imply $$\Pr (A, \Omega ) = \Pr (A \cap B, \Omega ) + \Pr (A \cap \lnot B, \Omega )$$, since it is not the case that $$\Pr (X, \Omega ) = 1$$ for all $$X$$ (as above). This is proved as Lemma 4(i) in Roeper and Leblanc (1999, p. 197).]

$$\Pr (A, \Omega ) = \Pr (A, B \cap \Omega )$$ [by our assumption] $$\square$$
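This second implication can also be spot-checked numerically. As before, the sketch is our own construction: the two-place function is the ordinary ratio on a uniform four-point space, defaulting to 1 on the empty condition.

```python
# Numerical spot-check of (NEGATION CONSTRUAL) => (Ω CONSTRUAL):
#   Pr(A, B ∩ Ω) = Pr(A, ¬B ∩ Ω)  =>  Pr(A, B ∩ Ω) = Pr(A, Ω).
# The two-place function is our own toy example: the ordinary ratio on
# a uniform four-point space, defaulting to 1 on the empty condition.

from fractions import Fraction
from itertools import combinations

OMEGA = frozenset({1, 2, 3, 4})

def pr(event, cond=OMEGA):
    """Uniform ratio probability; constantly 1 on the empty condition."""
    if not cond:
        return Fraction(1)
    return Fraction(len(event & cond), len(cond))

def subsets(s):
    """All subsets of s, as frozensets."""
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

for A in subsets(OMEGA):
    for B in subsets(OMEGA):
        not_B = OMEGA - B
        if pr(A, B) == pr(A, not_B):   # antecedent: negation construal
            assert pr(A, B) == pr(A)   # consequent: Ω construal
print("implication verified for all 256 pairs of events")
```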


Fitelson, B., Hájek, A. Declarations of independence. Synthese 194, 3979–3995 (2017). https://doi.org/10.1007/s11229-014-0559-2