Abstract
A widespread view among physicists is that Bell’s theorem rests on an implicit assumption of “classicality,” in addition to locality. According to this understanding, the violation of Bell’s inequalities poses no challenge to locality, but simply reinforces the fact that quantum mechanics is not classical. The paper provides a critical analysis of this view. First we characterize the notion of classicality in probabilistic terms. We argue that classicality thus construed has nothing to do with the validity of classical physics, nor of classical probability theory, contrary to what many believe. At the same time, we show that the probabilistic notion of classicality is not an additional premise of Bell’s theorem, but a mathematical corollary of locality in conjunction with the standard auxiliary assumptions of Bell. Accordingly, any theory that claims to get around the derivation of Bell’s inequalities by giving up classicality, in fact has to give up one of those standard assumptions. As an illustration of this, we look at two recent interpretations of quantum mechanics, Reinhard Werner’s operational quantum mechanics and Robert Griffiths’ consistent histories approach, that are claimed to be local and nonclassical, and identify which of the standard assumptions of Bell’s theorem each of them is forced to give up. We claim that while in operational quantum mechanics the Common Cause Principle is violated, the consistent histories approach is conspiratorial. Finally, we relate these two options to the idea of realism, a notion that is also often identified as an implicit assumption of Bell’s theorem.
1 Introduction
Many physicists are unimpressed by Bell’s theorem. A widespread view is that Bell’s reasoning rests upon an implicit assumption of “classicality” that directly goes against the fundamental principles of quantum mechanics (QM). According to such an understanding, the violation of Bell’s inequalities poses no challenge to our causal picture of the world (locality, in particular), but simply reinforces the fact that QM is not classical. One proponent of such a view is Reinhard Werner, who concisely puts it like this (Werner, 2014a, p. 4):
Bell showed (maybe against his own intentions ...) that classicality and locality together lead to false empirical conclusions. Of course, all the talk about the nonlocality of quantum mechanics really says [is] that any classical extension violates locality ...
In line with this picture, physicists have developed various interpretations of quantum theory that are claimed to be local and nonclassical. Among recent variants are Werner’s operational quantum mechanics (Werner, 2014a; Werner, 2014b) and Robert Griffiths’s consistent histories approach (Griffiths, 2020).
Others object to this view. Criticizing Werner’s position about the EPR argument and Bell’s theorem, Maudlin (2014b, pp. 1–2) writes:
Werner thinks that Bell and Einstein and I have all tacitly made an assumption of which we are unaware, an assumption he labels C for ‘classicality’. ... Werner concedes that Bell proved that any classical theory that violates his inequalities must be nonlocal. But deny classicality and the arguments no longer go through. ... The condition C is easily stated: it is that the state space of a theory forms a simplex. Good. The space of density matrices in quantum theory does not form a simplex, so if one takes the possible physical states of a system to be given by the density matrices, then one’s theory is not classical in this sense. That much is clear. But what is not at all clear is where the assumption that the state space is a simplex is presupposed in either Einstein’s or Bell’s reasoning.
It is interesting to compare this debate with an idea developed in the ’80s by scholars like Itamar Pitowsky, Arthur Fine, and others. On their interpretation, Bell’s inequalities have nothing to do with locality and causality. The violation of the inequalities, on their account, is an indication that the classical laws of probability are no longer applicable in the quantum domain. That is, there is a sense of classicality that directly entails Bell’s inequalities, without any further assumption about locality. In Pitowsky’s (1989, p. 8) words:
The set of axioms for classical probability entail that frequencies should obey an apriori set of constraints [Bell-type inequalities] that are often violated by quantum frequencies. The violation itself has apriori nothing to do with the principle of locality for it often occurs in cases where spatiotemporal aspects play no role whatever.
On Fine’s (1982, p. 294) reading:
... hidden variables and the Bell inequalities are all about ... imposing requirements to make well defined precisely those probability distributions for noncommuting observables whose rejection is the very essence of quantum mechanics.
Notice that Fine says not only that Bell’s theorem is about probability rather than locality and causality, but also that Bell’s inequalities follow from a condition (classicality) that contradicts QM, suggesting that the violation of the inequalities should not be surprising but should actually be expected in the first place.
In sum, the debate over the role of classicality in Bell’s theorem leaves us with a confusing disagreement among the following positions:

Bell’s argument presupposes classicality in addition to locality (Werner)

Classicality is nowhere referred to in Bell’s argument; the only substantial assumption is locality (Maudlin)

Bell’s inequalities follow only from classicality; no further locality assumption is needed (Pitowsky and Fine)
This situation poses two straightforward questions: 1) Is there a common notion of classicality shared by all parties? 2) If yes, what role exactly does classicality thus construed play in Bell’s theorem? The aim of this paper is to clarify these questions.
We will proceed as follows. Section 2 answers question 1 positively: we show that Werner’s notion of classicality (condition C above) is equivalent to Pitowsky’s and Fine’s probabilistic conditions of the existence of a Kolmogorovian representation of quantum probabilities and the existence of joint distributions, respectively. Then, with an unambiguous notion of classicality at hand, Section 3 answers question 2: we demonstrate that classicality is not a presupposition of Bell’s theorem but a consequence of the standard causal-statistical assumptions. Next, in Section 4, we investigate how the approaches of Werner and Griffiths can claim to get around Bell’s theorem. In light of what we will have shown about classicality, it is clear that in getting around the derivation of Bell’s inequalities each of the two approaches in question must violate one of the standard causal-statistical assumptions of Bell’s theorem. We claim that while in Werner’s operational quantum mechanics the Common Cause Principle is violated, in the consistent histories approach of Griffiths, the formulation of quantum theory turns out to be conspiratorial. Finally, in Section 5, we relate these two options to the idea of realism, a notion that is also often identified as an implicit assumption of Bell’s theorem. The Appendix contains the proofs of the two central mathematical propositions that we formulate in the main text.
2 Classicality as a probabilistic notion
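Recall Pitowsky’s (1989) formalism. Let n be a natural number and S be a subset of \(\left\{ (i,j)\mid i<j;\ i,j=1,2,...,n\right\} \). Suppose we are given \(n+\left| S\right| \) numbers (\(\left| S\right| \) denotes the number of elements of S):
$$\begin{aligned} p_{i},\quad i=1,2,...,n;\qquad p_{ij},\quad (i,j)\in S \end{aligned}$$(1)
with \(0\le p_{i},p_{ij}\le 1\). We arrange these numbers in a so-called correlation vector
$$\begin{aligned} \overrightarrow{p}=\left( p_{1},p_{2},...,p_{n},...,p_{ij},...\right) \end{aligned}$$(2)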
where the index pairs \((i,j)\in S\) are ordered lexicographically. \(\overrightarrow{p}\) will be thought of as an array of experimentally ascertained probabilities of n outcomes of some measurements performed on a given system, and some of the correlations of these outcomes (the probabilities of their conjunctions). \(\overrightarrow{p}\) can be seen as a (partial) description of the system’s “state,” characterizing how the system is disposed to react to certain measurements performed on it.
As an example consider a \(2\times 2\)-type EPR–Bohm (EPRB) scenario. In each wing of a \(2\times 2\) EPRB experiment one selects from two given measurement settings (directions). Label by \(a_{1},a_{2}\) the settings on the left, by \(b_{3},b_{4}\) the settings on the right. Let \(A_{1},A_{2},B_{3},B_{4}\) denote the corresponding spin-up outcomes. The probabilities of spin outcomes yielded by the experiment and predicted by QM can be arranged in a correlation vector of type \(n=4\), \(S_{\text {EPR}}=\left\{ (1,3),(1,4),(2,3),(2,4)\right\} \):
$$\begin{aligned} \overrightarrow{p}_{\text {EPR}}=\left( p_{1},p_{2},p_{3},p_{4},p_{13},p_{14},p_{23},p_{24}\right) \end{aligned}$$(3)
with^{Footnote 1}
\(\overrightarrow{p}_{\text {EPR}}\) provides a (partial) description of the spin state of the two-particle system (or an ensemble of such systems) prepared and measured in an EPRB experiment.
We now recall and precisely formulate notions of when such a description—and hence the system in question and its state—is regarded as “classical.”
The first notion (Pitowsky, 1989) requires that a correlation vector be composed of numbers that satisfy Kolmogorov’s axioms, so that they are classical probabilities.
Definition 1
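Correlation vector \(\overrightarrow{p}\) admits a classical probability space representation iff there exists a classical probability space \(\left( X,\mathcal {A},\mu \right) \) and \(E_{1},E_{2},...,E_{n}\in \mathcal {A}\) such that
$$\begin{aligned} p_{i}&=\mu \left( E_{i}\right) ,\quad i=1,2,...,n\\ p_{ij}&=\mu \left( E_{i}\cap E_{j}\right) ,\quad (i,j)\in S \end{aligned}$$(6)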
The second notion (Fine, 1982) requires that the probability values in a correlation vector arise from a joint distribution as marginal probabilities.
Definition 2
Correlation vector \(\overrightarrow{p}\) is extractable from a joint distribution iff there exist \(2^{n}\) numbers^{Footnote 2} \(p_{\alpha _{1}...\alpha _{n}},\alpha _{1},...,\alpha _{n}\in \left\{ +,-\right\} \) such that
and
Consider the set \(\Omega \) of all possible correlation vectors that can be experimentally realized, with a fixed type of measurements but varying ways in which the system is prepared before the measurements are carried out. One assumes that \(\Omega \) is a convex set in \(\mathbb {R}^{n+\left| S\right| }\), so that the statistical mixture of realizable probabilities is also realizable. \(\Omega \) can be associated with the system’s “state space.” The third notion of classicality (Barrett, 2007; Werner, 2014a) characterizes the state space of a system as a convex set: it requires that the state space of a classical system be a simplex, that is, that every state have a unique decomposition as a convex combination of extreme points of the state space. Since correlation vectors don’t necessarily provide a complete description of the system’s “state” in the sense of specifying the probabilities of atomic events, here we give a slightly modified formulation of this idea, one where \(\Omega \) itself is not required to be a simplex, but only to be obtainable as a projection (that is, partial description) of one.
Definition 3
Let \(\Omega \subset \mathbb {R}^{n+\left| S\right| }\) be a set of correlation vectors. \(\Omega \) is projectable from a probability simplex iff there exists a probability simplex \(\Delta _{d}\subset \mathbb {R}^{d}\) with d vertices for some positive integer d, a linear map \(\varphi :\mathbb {R}^{d}\rightarrow \mathbb {R}^{n+\left| S\right| }\), and sets of indices \(R_{i}\subseteq \left\{ 1,2,...,d\right\} ,i=1,2,...,n\) such that
and for all \(\overrightarrow{p}\in \Omega ,\varvec{\pi }\in \Delta _{d}\), if \(\varphi \left( \varvec{\pi }\right) =\overrightarrow{p}\) then
where \(\pi _{r}\) is the rth component of \(\varvec{\pi }\).^{Footnote 3}
In the foundations of QM literature the above notions of classicality are often used interchangeably. For special, EPRB-type correlation vectors, the equivalence of the first two notions is an immediate consequence of results by Fine (1982) and Pitowsky (1989). However, all three notions are in fact equivalent for generic correlation vectors (for the proof, see the Appendix):
Proposition 4
Consider a set of correlation vectors \(\Omega \subset \mathbb {R}^{n+\left| S\right| }\). The following conditions are equivalent:

(i)
For all \(\overrightarrow{p}\in \Omega \), \(\overrightarrow{p}\) admits a classical probability space representation.

(ii)
For all \(\overrightarrow{p}\in \Omega \), \(\overrightarrow{p}\) is extractable from a joint distribution.

(iii)
\(\Omega \) is projectable from a probability simplex.
The consequence of Proposition 4 is that classicality as a probabilistic notion has an unambiguous meaning. Many hold that classicality thus construed is an implicit assumption of Bell’s theorem, and that giving up this assumption provides a way to get around the violation of Bell’s inequalities; a particularly natural way, it is held, given that classicality is already in contradiction with the fundamental principles of QM. In the next section we will argue that this picture is mistaken: classicality is not a presupposition of Bell’s theorem in addition to the standard causal-statistical assumptions; rather, it is a corollary of them.
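Definition 2 can be made concrete for the \(2\times 2\) EPRB type with a short Python sketch. It marginalizes a joint distribution over \(\left\{ +,-\right\} ^{4}\) into a correlation vector and evaluates one Clauser–Horne expression, which lies between \(-1\) and 0 on every vector extractable from a joint distribution. The function names are hypothetical, and the quantum values are an assumption made for illustration (the familiar singlet-state probabilities \(p_{i}=1/2\), \(p_{ij}=\frac{1}{2}\sin ^{2}(\theta _{ij}/2)\) with a particular choice of angles), not a reproduction of the formulas in the text.

```python
from fractions import Fraction
from itertools import product

# Index pairs of the 2x2 EPRB correlation vector, as in the text.
S_EPR = [(1, 3), (1, 4), (2, 3), (2, 4)]


def correlation_vector(joint):
    """Marginals of a joint distribution over {+,-}^4 (Definition 2).

    `joint` maps 4-tuples of booleans (True means '+') to weights.
    Returns (p_1, p_2, p_3, p_4, p_13, p_14, p_23, p_24).
    """
    singles = [sum(w for alpha, w in joint.items() if alpha[i - 1])
               for i in (1, 2, 3, 4)]
    pairs = [sum(w for alpha, w in joint.items()
                 if alpha[i - 1] and alpha[j - 1])
             for i, j in S_EPR]
    return tuple(singles + pairs)


def clauser_horne(p):
    """One Clauser-Horne expression; it lies in [-1, 0] for classical vectors."""
    p1, p2, p3, p4, p13, p14, p23, p24 = p
    return p13 + p14 + p24 - p23 - p1 - p4


# The 16 deterministic joint distributions are the extreme points: every
# joint distribution is a mixture of them, and the expression is linear,
# so bounds that hold on these points hold on all classical vectors.
vertex_values = {
    clauser_horne(correlation_vector({alpha: Fraction(1)}))
    for alpha in product((True, False), repeat=4)
}

# Assumed singlet-state values: p_i = 1/2 and p_ij = (1/2) sin^2(theta_ij / 2),
# with angles of 120, 120, 0 and 120 degrees for the pairs in S_EPR.
p_quantum = (Fraction(1, 2),) * 4 + (
    Fraction(3, 8), Fraction(3, 8), Fraction(0), Fraction(3, 8))

print(sorted(vertex_values))
print(clauser_horne(p_quantum))
```

Exact rational arithmetic is used so that the comparison with the bounds is not blurred by rounding. Under the stated assumption about the angles, the quantum vector falls outside the classical range of the expression.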
3 Classicality and Bell’s theorem
Bell’s theorem can be and has been formulated in various different ways. Here we consider a commonly accepted derivation that is more general than Bell’s original 1964 reasoning in that it doesn’t presuppose perfect correlations and is based on the notion of a common cause.
The probabilities measured in an EPRB experiment and encoded in correlation vector \(\overrightarrow{p}_{\text {EPR}}\) ((3)–(5)) in general display statistical correlations between outcomes in the two wings. In general, we have
Since the two wings are spacelike separated, the only way these correlations can be explained is by assuming the existence of some correlated properties, commonly described by a “hidden variable,” that the particles carry with themselves right from their emission and that are responsible for the outcomes (even if in a probabilistic sense).^{Footnote 4} As many have rightly emphasized (e.g. Bell, 2004, pp. 143–144; Norsen, 2007, pp. 318–319; Maudlin, 2014a, p. 5), these preexisting properties are not presupposed but inferred in Bell’s reasoning. Given the experimentally verified statistics, the fundamental presuppositions from which their existence is derived are in most general terms captured by the following two principles.

1.
Locality: There can exist no direct causal connection between spacelike separated events.

2.
Common Cause Principle: Robust probabilistic correlations do not occur in nature as a matter of pure accident, or mere hap. Any such correlation must be brought about either by direct or by common causal connection.^{Footnote 5}
1–2 entail the existence of a common cause—which we may think of as the physical event that determines the physical properties of the particles after emission—in terms of which the EPRB correlations (11) can be explained.^{Footnote 6} What it means to explain these correlations is characterized by a second pair of assumptions.
Let \(C_{k}(k\in K)\) denote a partition of events, describing the common cause.^{Footnote 7}

3.
Factorization:
$$\begin{aligned} p\left( A_{i}\cap B_{j}\mid a_{i}\cap b_{j}\cap C_{k}\right) =p\left( A_{i}\mid a_{i}\cap C_{k}\right) p\left( B_{j}\mid b_{j}\cap C_{k}\right) ,\qquad (i,j)\in S_{\text {EPR}},\ k\in K \end{aligned}$$(12) 
4.
No-conspiracy:
$$\begin{aligned} p\left( C_{k}\mid a_{i}\cap b_{j}\right) =p\left( C_{k}\right) ,\qquad (i,j)\in S_{\text {EPR}},\ k\in K \end{aligned}$$(13)
Both factorization and no-conspiracy are statistical independence conditions. Factorization expresses the requirement that conditionalizing on the common cause \(C_{k}\) leaves no residual correlation between \(A_{i}\) and \(B_{j}\), given the chosen measurement angles on their respective sides. No-conspiracy expresses the assumption that the choice about which angles to measure, which is something that can be done at the last moment and by any selection procedure one likes, cannot influence, nor be influenced by, the common cause, and hence the two must be uncorrelated.
It is worth noting that the four conditions are in fact inextricably intertwined. Both factorization and no-conspiracy incorporate the Common Cause Principle in a trivial sense: if there could be robust correlations in the world occurring as a matter of pure accident, then requiring these independence conditions would have no ground, for any such accidental correlation could spoil these independencies. Further, factorization is often formulated as a joint result of two conditions: 1) outcome independence
$$\begin{aligned} p\left( A_{i}\cap B_{j}\mid a_{i}\cap b_{j}\cap C_{k}\right) =p\left( A_{i}\mid a_{i}\cap b_{j}\cap C_{k}\right) p\left( B_{j}\mid a_{i}\cap b_{j}\cap C_{k}\right) \end{aligned}$$(14)
which is a characterization of the common cause as a screener-off; and 2) parameter independence
$$\begin{aligned} p\left( A_{i}\mid a_{i}\cap b_{j}\cap C_{k}\right) =p\left( A_{i}\mid a_{i}\cap C_{k}\right) \end{aligned}$$(15)
$$\begin{aligned} p\left( B_{j}\mid a_{i}\cap b_{j}\cap C_{k}\right) =p\left( B_{j}\mid b_{j}\cap C_{k}\right) \end{aligned}$$(16)
which is taken to be required by locality. Finally, note that no-conspiracy, as a statement about statistical independence, is also a compound condition. It incorporates the idea of the autonomy of measurement choice (sometimes referred to as no-conspiracy in a narrower sense), according to which the measurement settings can neither be directly influenced by the \(C_{k}\)s nor share a common cause with them. But it also incorporates the idea of no retrocausation. This is because statistical independence (13) could also break down in the reverse way, with the measurement choices having an effect on the \(C_{k}\)s; and since the measurement choice can be made at the last moment before the measurement, while the \(C_{k}\)s, characterizing the common cause, are localized at the emission of the particles, this would involve a retrocausal connection.
Conditions 1–4 above will be referred to as the standard causal-statistical assumptions of Bell’s theorem. We do not claim that these conditions cover every detail that the derivation of Bell’s inequalities rests upon^{Footnote 8} (though we believe they condense the substantial assumptions), nor are these conditions independent or nonredundant, as we have just seen. None of this will be relevant to our argument, the main ingredient of which is the mathematical fact that, given these four assumptions, correlation vector \(\overrightarrow{p}_{\text {EPR}}\) must be classical. To formulate and prove this we will use Pitowsky’s characterization of classicality as encapsulated in Definition 1. The following statement is a consequence of results by Fine (1982) in conjunction with Proposition 4. In the Appendix we give a more explicit proof of it based on Hofer-Szabó (2020).
Proposition 5
Suppose that there is a partition of events \(C_{k}(k\in K)\) for which (12)–(13) hold. Then \(\overrightarrow{p}_{\text {EPR}}\) admits a classical probability space representation.
Conditions 1–2 plus the EPRB statistics imply the existence of a common cause (hidden variable) assumed to be characterized by conditions 3–4. Conditions 3–4 imply that \(\overrightarrow{p}_{\text {EPR}}\) must be classical. Thus, in sum, the standard causal-statistical assumptions of Bell’s theorem imply that \(\overrightarrow{p}_{\text {EPR}}\) is a classical correlation vector.
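The step from conditions 3–4 to classicality can be illustrated with a small Python sketch. It follows one standard construction in the spirit of Fine (1982), which is not necessarily the proof given in the Appendix: from weights \(p\left( C_{k}\right) \) and response probabilities \(p\left( A_{i}\mid a_{i}\cap C_{k}\right) \), \(p\left( B_{j}\mid b_{j}\cap C_{k}\right) \) it builds a joint distribution in which, for each \(C_{k}\), the four outcomes are treated as independent. All names and the numerical model below are hypothetical. The sketch also assumes that the single-outcome entries are given by \(\sum _{k}p\left( C_{k}\right) p\left( A_{i}\mid a_{i}\cap C_{k}\right) \), and similarly for \(B_{j}\).

```python
from fractions import Fraction
from itertools import product

S_EPR = [(1, 3), (1, 4), (2, 3), (2, 4)]


def joint_from_common_cause(weights, response):
    """Joint distribution over {+,-}^4 built from a common-cause model.

    weights[k]     plays the role of p(C_k)
    response[k][m] plays the role of p(outcome m+1 | its setting, C_k)
    Given C_k the four outcomes are treated as independent.
    """
    joint = {}
    for alpha in product((True, False), repeat=4):
        total = Fraction(0)
        for w, q in zip(weights, response):
            term = w
            for is_plus, q_m in zip(alpha, q):
                term *= q_m if is_plus else 1 - q_m
            total += term
        joint[alpha] = total
    return joint


def marginals(joint):
    """Correlation vector extracted from a joint distribution (Definition 2)."""
    singles = [sum(w for a, w in joint.items() if a[i - 1])
               for i in (1, 2, 3, 4)]
    pairs = [sum(w for a, w in joint.items() if a[i - 1] and a[j - 1])
             for i, j in S_EPR]
    return tuple(singles + pairs)


def predicted_vector(weights, response):
    """Conditional probabilities predicted by the model.

    Pair entries use factorization (12) to split the joint outcome
    probability and no-conspiracy (13) to replace p(C_k | a_i, b_j)
    by p(C_k).
    """
    singles = [sum(w * q[m] for w, q in zip(weights, response))
               for m in range(4)]
    pairs = [sum(w * q[i - 1] * q[j - 1] for w, q in zip(weights, response))
             for i, j in S_EPR]
    return tuple(singles + pairs)


# Hypothetical two-valued common cause, chosen only for illustration.
weights = [Fraction(1, 3), Fraction(2, 3)]
response = [
    [Fraction(1, 2), Fraction(1), Fraction(0), Fraction(1, 4)],
    [Fraction(1), Fraction(0), Fraction(3, 4), Fraction(1, 2)],
]

joint = joint_from_common_cause(weights, response)
print(marginals(joint) == predicted_vector(weights, response))
```

The marginals of the constructed joint distribution coincide with the model's predicted conditional probabilities, so the predicted vector is extractable from a joint distribution and hence, by Proposition 4, classical.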
In light of this result, the following remarks about the conceptual terrain are in order. Firstly, as Pitowsky (1989) and Fine (1982) proved, classicality (the mathematical condition) alone implies Bell’s inequalities. Therefore it is strictly speaking incorrect to say, as Werner (2014a) does, that classicality is an additional premise of Bell’s theorem, on top of the standard causal-statistical assumptions. Werner here seems to have in mind the introduction of the common cause \(C_{k}\) (“a hidden state \(\lambda \)”), which he takes to either imply, or be equivalent to, classicality. But as we have seen, the necessity of introducing \(C_{k}\) follows from the standard causal-statistical assumptions alone, so it is not an additional assumption.
Secondly, the Pitowsky–Fine derivation of Bell’s inequalities is often interpreted as a demonstration that Bell’s theorem has nothing to do with locality, causality, etc., but is instead essentially about probability. We have two remarks on this view. The first is a matter of simple logic: since Bell’s inequalities can be derived from two alternative sets of premises (the standard causal-statistical assumptions on the one hand, and classicality on the other), the violation of the inequalities implies that each set of premises must contain a false one. That is, both classicality and the standard causal-statistical assumptions have to be violated in the world.
But this picture is still, on its own, potentially misleading. For while the standard causal-statistical assumptions (locality, the Common Cause Principle, no-conspiracy, etc.) are all robust physical/metaphysical principles which we have strong reasons to assume, the mathematical condition of classicality in itself is completely unreasonable and unmotivated. This is because the components of correlation vector \(\overrightarrow{p}_{\text {EPR}}\), (5), are conditional probabilities:
$$\begin{aligned} p_{i}&=p\left( A_{i}\mid a_{i}\right) ,\quad i=1,2\\ p_{j}&=p\left( B_{j}\mid b_{j}\right) ,\quad j=3,4\\ p_{ij}&=p\left( A_{i}\cap B_{j}\mid a_{i}\cap b_{j}\right) ,\quad (i,j)\in S_{\text {EPR}} \end{aligned}$$(17)
Values of conditional probabilities pertaining to different conditions do not form a probability measure in general, and so it makes no sense in general to require that these values obey Kolmogorov’s axioms, that is, that they be representable in a classical probability space in accord with Definition 1.^{Footnote 9}
The only way classicality is motivated in EPRB is that, as stated by Proposition 5, it is a mathematical corollary of the standard causalstatistical assumptions of Bell’s theorem. So what the Pitowsky–Fine derivation of Bell’s inequalities provides is not a new understanding of Bell’s theorem, but yet another way of deriving Bell’s inequalities from the standard assumptions:
standard causal-statistical assumptions
\(\Downarrow \)
classicality
\(\Downarrow \)
Bell’s inequalities^{Footnote 10}
Therefore, the mathematical condition of classicality makes sense in cases where the standard causal-statistical assumptions apply: where we have spacelike separated subsystems that are assumed to behave locally, etc. Importantly, when one or more of those assumptions does not hold, classicality (again, the mathematical condition) may fail to hold, even in classical physical systems.
One way that this can happen is if the system in question is composed of timelike separated subsystems that are allowed to interact. As an example, imagine Laurel and Hardy on a teeter-totter. Assume that Hardy weighs twice as much as Laurel. We want to see if Laurel and Hardy go up or down, under the following conditions:
Introduce the following outcome events:
Suppose that the teeter-totter experiment is performed repeatedly. Elementary physics entails the following probabilities:^{Footnote 11}
Further, assuming that both Laurel and Hardy sit far from the center and close to the center half of the time, independently of each other, we have
Now, correlation vector \(\overrightarrow{p}_{\text {LH}}=\left( p_{1},p_{2},p_{3},p_{4},p_{13},p_{14},p_{23},p_{24}\right) \) is not classical since, for example, \(p_{1}<p_{13}\). There certainly cannot exist a classical probability space with events \(E_{1}\) and \(E_{3}\) in it such that
$$\begin{aligned} p_{1}=\mu \left( E_{1}\right) ,\qquad p_{13}=\mu \left( E_{1}\cap E_{3}\right) \end{aligned}$$
as that would entail \(\mu \left( E_{1}\right) <\mu \left( E_{1}\cap E_{3}\right) \), in contradiction with Kolmogorov’s laws of probability. But this fact by no means implies the breakdown of classical probability theory, let alone classical physics, in any sense. It is simply that the values of classical conditional probabilities pertaining to different conditions, here \(p_{1}=p\left( A_{1}\mid a_{1}\right) \) and \(p_{13}=p\left( A_{1}\cap B_{3}\mid a_{1}\cap b_{3}\right) \), do not form a probability measure.
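The point that conditional probabilities taken under different conditions need not behave like absolute probabilities can be checked in a few lines of Python. The model below is a deliberately simple, hypothetical toy and does not use the teeter-totter probabilities of the text: the two settings are chosen independently with probability 1/2 each, the left outcome \(A_{1}\) occurs exactly when settings \(a_{1}\) and \(b_{3}\) are both chosen, and the right outcome \(B_{3}\) occurs whenever \(b_{3}\) is chosen. Everything lives in a single classical (Kolmogorovian) probability space.

```python
from fractions import Fraction
from itertools import product

# Elementary events are pairs (left setting, right setting), chosen
# independently with probability 1/2 each. Hypothetical toy model.
space = {
    (a, b): Fraction(1, 4)
    for a, b in product(("a1", "a2"), ("b3", "b4"))
}


def prob(event):
    """Absolute probability of an event given as a predicate on outcomes."""
    return sum(p for omega, p in space.items() if event(omega))


def cond(event, given):
    """Conditional probability p(event | given) in the same space."""
    return prob(lambda omega: event(omega) and given(omega)) / prob(given)


def A1(omega):
    # The left outcome depends on the right-hand setting as well.
    return omega == ("a1", "b3")


def B3(omega):
    return omega[1] == "b3"


def setting_a1(omega):
    return omega[0] == "a1"


def settings_a1_b3(omega):
    return omega == ("a1", "b3")


p1 = cond(A1, setting_a1)
p13 = cond(lambda omega: A1(omega) and B3(omega), settings_a1_b3)

print(p1, p13, p1 < p13)
```

Here \(p_{1}<p_{13}\) even though every number is an ordinary conditional probability in one classical probability space. The inequality is possible only because the two values are conditioned on different events.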
At the same time, this example witnesses an obvious violation of the standard assumptions of Bell’s theorem.^{Footnote 12} The outcomes are correlated, for we have:
The obvious explanation of this correlation is the direct physical influence between the two ends of the teeter-totter, which ensures that when one end goes down, the other end goes up. Since there is direct causal connection, the Common Cause Principle no longer demands the existence of a common cause satisfying factorization and no-conspiracy. Indeed, there is simply no such event in the example. For instance, \(B_{4}\) as a potential direct cause of \(A_{2}\) does screen off correlation (21), meaning that partition \(\left\{ C_{k}\right\} _{k=1,2}=\left\{ B_{4},\lnot B_{4}\right\} \) satisfies outcome independence (14), as well as factorization (12). But \(B_{4}\) fails to satisfy no-conspiracy (13), for the obvious reason that \(B_{4}\) can only occur when the corresponding “measurement” \(b_{4}\) is performed, so there is strong correlation between \(B_{4}\) and \(b_{4}\). Again, the fact that there is no common cause \(C_{k}\) satisfying the standard Bell assumptions comes as no surprise since, unlike in the EPRB scenario, events on the two sides of the teeter-totter can, and in fact do, causally influence each other, with no contradiction to locality.
With these remarks in mind, it is worth mentioning a strand of approaches attempting to get around Bell’s theorem. The experimental violation of Bell’s inequalities in conjunction with the Pitowsky–Fine derivation of the inequalities implies that \(\overrightarrow{p}_{\text {EPR}}\) does not admit a classical probability space representation. A reaction shared by many scholars is that the way to evade this problem is to generalize the notion of probability space, relaxing some of Kolmogorov’s axioms, so that under the generalized notion of probability space \(\overrightarrow{p}_{\text {EPR}}\) does admit a probability space representation. There are many proposals in this direction; for an overview, see Feintzeig and Fletcher (2017). In light of what we said about classicality, here we briefly mention two possible concerns with these approaches. First of all, it must be emphasized that there is a clear sense in which the probability values in \(\overrightarrow{p}_{\text {EPR}}\) can be represented in a classical probability space: not as absolute probabilities of events, as Definition 1 requires, but as conditional probabilities conditioned on different conditioning events, in line with (17).^{Footnote 13} Similarly in our example: \(\overrightarrow{p}_{\text {LH}}\) is not a classical correlation vector, but no one would take this as evidence of Kolmogorov’s probability rules being violated by the teeter-totter system. If one wants to write down the numbers in \(\overrightarrow{p}_{\text {LH}}\) as values of probabilities in a probability space, they will be conditional probabilities in a classical probability space. Accordingly, to accommodate the Bell inequality violating correlation vectors in a probability space it is not necessary to generalize the notion of probability. Secondly, even if one did that, it must be clear that this move doesn’t save locality. 
This is because classicality (in the sense of Definition 1), as we have argued, is not among the premises of the standard derivation of Bell’s inequalities; it is simply not a condition that one could deny in order to evade the derivation of the inequalities from locality, no-conspiracy, etc. (without also having to deny one of these standard assumptions).^{Footnote 14}
Note that what we said about classicality is equally true for the existence of noncontextual hidden variables: 1) Bell’s inequalities can be derived from the existence of noncontextual hidden variables (Shimony, 1984, pp. 30–31). 2) Many believe that this makes the derivation from the standard causal-statistical assumptions irrelevant. 3) But the derivation from noncontextual hidden variables does not invalidate the derivation from the standard causal-statistical assumptions: the violation of Bell’s inequalities implies that both noncontextual hidden variables and locality (or another one of the standard causal-statistical assumptions) must go. 4) Furthermore, as with classicality, the existence of noncontextual hidden variables in itself is not well-motivated. Our teeter-totter system is again a good example, since it displays an obvious violation of noncontextuality in the following sense. Let \(C_{k}\) now describe not a common cause but the system’s “ontic state,” that is, for the teeter-totter system, all the physical factors together, including the weights of Laurel and Hardy and the small perturbations in play in the balance case (but excluding conditions \(a_{1},a_{2},b_{3},b_{4}\)), that go into determining which one of the two goes up and down. Noncontextuality is the condition that the ontic state determines the probability of the outcomes of each measurement independently of what other measurements are simultaneously performed;^{Footnote 15} which, in our case, is nothing but the condition of parameter independence (15)–(16). The violation of this condition is due to the fact that, even when \(C_{k}\) is given,^{Footnote 16} the outcome on one side (whether Laurel/Hardy goes up or down) depends on the measurement choice on the other side (where Hardy/Laurel sits, respectively). Again, contextuality in this sense comes as no surprise since the two ends can and do physically interact. 
5) That said, the existence of noncontextual hidden variables, just as classicality, does follow when we have spacelike separated subsystems that are assumed to behave locally, etc., that is, where the standard causal-statistical assumptions apply. For in that case \(C_{k}\) in assumptions 3–4 will just be a noncontextual hidden variable. The most well-known derivation of noncontextual hidden variables from locality, etc. is the EPR argument.
Indeed, the tight connection between classicality, noncontextual hidden variables, and the standard causal-statistical assumptions is especially transparent in the deterministic case. Suppose we have parallel measurement directions in the two wings of EPRB, with perfect correlation between outcomes of measurements in the same direction. Perfect correlation can only be explained by a deterministic common cause (Hofer-Szabó et al., 2013, p. 15, Proposition 2.7). This fact, together with no-conspiracy (13), implies that \(p\left( A_{i}\mid a_{i}\cap b_{j}\cap C_{k}\right) ,p\left( B_{j}\mid a_{i}\cap b_{j}\cap C_{k}\right) \in \left\{ 0,1\right\} ,(i,j)\in S_{\text {EPR}}\). Parameter independence (15)–(16) further entails
Now introduce the following events:
In conjunction with (22), the standard causal-statistical assumptions imply (see formula (32) in the Appendix):
That is, the conditional probabilities of outcome events figuring in \(\overrightarrow{p}_{\text {EPR}}\) must be equal to the absolute probabilities of the events that predetermine these outcomes. This gives Proposition 5 a straightforward interpretation: since on the right-hand side we have absolute probabilities of events that are representable in a classical probability space, the values of conditional probabilities on the left-hand side must also be so representable: \(\overrightarrow{p}_{\text {EPR}}\) must be classical. On the other hand, what (24) says is that the measurement outcomes \(A_{i},B_{j}\) simply reveal the preexisting properties \(C_{A_{i}},C_{B_{j}}\), which is exactly the idea behind noncontextual hidden variables. It must be stressed that equalities (24), and thus both classicality and the existence of noncontextual hidden variables, are consequences of the standard causal-statistical assumptions of Bell’s theorem, including, importantly, the causal separation of the subsystems we consider.
All this implies that when the system in question is not composed of spacelike separated subsystems then we have no automatic reason to expect that classicality, the existence of noncontextual hidden variables and Bell’s inequalities will be satisfied in the first place. A quantum example of such a system is precisely that of the spin-3/2 Neon atom, as discussed in Griffiths (2020): Looking at expected values of spins along different axes, one can derive predicted values that violate a Bell inequality (specifically the CHSH inequality). Griffiths (2020, p. 3) urges us, on the basis of this example, to view things in this way (similar to Pitowsky and Fine quoted in the introduction):
Thus the violation of the CHSH inequality in this case has nothing to do with nonlocality. Instead it has everything to do with the fact that in quantum mechanics, unlike classical mechanics, physical properties and variables are represented by noncommuting operators.
Let us make a few remarks here. Firstly, one way to characterize the significance of noncommuting operators is what Fine (1982) suggests in the passage quoted in the introduction: according to the accepted view physical variables represented by noncommuting operators cannot be measured simultaneously and so in general they need not, and in fact they do not, have a joint distribution, in contradiction with what’s required by classicality in the sense of Definition 2. Notice however that the failure of existence of joint distributions in that sense is not specific to QM. Indeed, the same holds in our Laurel and Hardy example: since all three definitions of classicality are equivalent, and classicality in the sense of Definition 1 is violated in the Laurel and Hardy case, this means that classicality in the sense of Definition 2 also fails to hold for this simple classical physical system. Therefore, the absence of joint distributions that comes with noncommuting operators in the formalism doesn’t seem to mark off quantum physics from classical physics.
Secondly, there indeed is an intuition behind noncontextual hidden variables that might come from (simple cases in) classical physics. In the Neon atom case, one might say “the value of its spin along any given direction ought to be well-defined at all times; after all, spin might be roughly analogous to a classical angular momentum vector, the projection of which along any direction always has a well-defined value.” But the idea of noncontextual hidden variables not only incorporates the assumption that there are real, well-defined properties existing at all times; but, crucially, it brings with it a specific, and rather simplified, picture of how these properties are revealed in measurements: that the outcome of a measurement only depends on the corresponding property at present, irrespective of what other measurements, that is physical interactions, take place. However, this latter picture of the measurement process is something that is generally, perhaps even typically, not true in classical physics. The Laurel and Hardy example is again a case in point. The physics of it incorporates real properties that are well-defined at all times: the weights of Laurel and Hardy, their distances from the center, the density distribution of the board, etc. All these real properties altogether will determine the measurement outcomes, that is which end will go up and down. But this determination is contextual: whether Laurel goes up or down will depend not only on a property of Laurel but also on where Hardy sits. Since noncontextual hidden variables do not exist in general even for classical physical systems, it doesn’t seem terribly surprising, in itself, that their existence is not provided for generic physical systems, including the Neon atom.^{Footnote 17}
Finally, we agree with Griffiths that the violation of a Bell inequality doesn’t necessarily have to do with nonlocality, as exemplified by the Neon case. Indeed, the same is true of the violation in the Laurel and Hardy example. While Laurel and Hardy are somewhat spatially separated, that is inessential; what matters is that the events \(A_{1},B_{3}\) etc. are (a) not spacelike separated and indeed are (b) causally connected. So the violation of a Bell inequality in general is neither surprising nor relevant to whether there is locality in the world. By contrast, when a Bell inequality has been derived for a setup like EPRB experiments, with locality as a fundamental physical assumption among the premises, and the inequality is found to be violated in actual experiments, this does bear on the principle of locality!^{Footnote 18} By analogy: to say that the violation of Bell’s inequalities in EPRB has nothing to do with nonlocality because the inequalities can be violated in cases where locality perfectly holds, is no better than saying that the violation of energy conservation in a closed system has no bearing on the laws of thermodynamics because energy conservation can be violated in an open system in complete harmony with those laws.
4 Two ways to get around Bell’s theorem
Classicality is thus not a condition that one could deny without also having to deny one of the standard causal-statistical assumptions of Bell’s theorem. If any theory of quantum phenomena is to avoid the derivation of Bell’s inequalities, it has to give up one of those standard premises. We will now look at two versions of standard QM, Werner’s operational quantum mechanics and Griffiths’ consistent histories approach, both of which are claimed to evade Bell’s theorem by giving up classicality, that is, claimed to be local nonclassical quantum theories. We describe what the two versions are claimed to say about the example of the EPRB scenario, and identify which one of the standard causal-statistical assumptions of Bell’s theorem each theory is in fact forced to give up.
Operational quantum mechanics (OQM) is basically a variant of standard QM in which quantum states \(\psi \) are treated as purely epistemic, i.e., as tools for calculating what results to expect from measurements made in various scenarios. In the EPRB case this means that the only role of \(\psi \) is to recover, through Born’s rule, the measurement statistics encapsulated in \(\overrightarrow{p}_{\text {EPR}}\):
where \(\hat{A}_{i},\hat{B}_{j}\) are projection operators pertaining to the outcomes \(A_{i},B_{j}\) respectively, and \(\psi _{s}\) is the singlet state.
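To make the role of Born’s rule concrete, here is a minimal numerical sketch of how the singlet state yields measurement statistics of this kind. It rests on several assumptions that are ours and not part of the text: spin is measured along directions in the x-z plane parametrized by an angle, each outcome event is taken to be “spin up” along the chosen direction, and the four angles are picked purely for illustration. The last function evaluates a Clauser-Horne-type expression, whose value cannot exceed 0 for a classical correlation vector.

```python
import math

def projector(theta):
    """Projector onto 'spin up' along a direction at angle theta in the x-z plane."""
    c, s = math.cos(theta), math.sin(theta)
    return [[(1 + c) / 2, s / 2],
            [s / 2, (1 - c) / 2]]

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

# Singlet state (|01> - |10>)/sqrt(2) in the basis |00>, |01>, |10>, |11>.
SINGLET = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]

def kron(a, b):
    """Kronecker product of two 2x2 matrices as a 4x4 nested list."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def born(op, psi=SINGLET):
    """Born rule <psi|op|psi> for a real state vector and a real matrix."""
    return sum(psi[r] * op[r][c] * psi[c] for r in range(4) for c in range(4))

def p_joint(theta_a, theta_b):
    """Probability of 'up' in both wings for the given pair of directions."""
    return born(kron(projector(theta_a), projector(theta_b)))

def p_left(theta_a):
    """Marginal probability of 'up' in the left wing."""
    return born(kron(projector(theta_a), IDENTITY))

def p_right(theta_b):
    """Marginal probability of 'up' in the right wing."""
    return born(kron(IDENTITY, projector(theta_b)))

def clauser_horne(a1=0.0, a2=2 * math.pi / 3, b3=2 * math.pi / 3, b4=4 * math.pi / 3):
    """p13 + p14 + p24 - p23 - p1 - p4 for the singlet statistics (illustrative angles)."""
    return (p_joint(a1, b3) + p_joint(a1, b4) + p_joint(a2, b4)
            - p_joint(a2, b3) - p_left(a1) - p_right(b4))
```

The marginals come out as 1/2 whatever direction is chosen in the other wing, which is the no-signaling property, while `clauser_horne()` evaluates to approximately 1/8 for the angles assumed here, above the classical bound of 0.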
It is not clear in what sense OQM is meant to be local. On the one hand, Werner seems to suggest that locality simply consists in no-signaling (4), that is in a statistical independence condition:
... in the operational approach no prediction about B changes when or if a measurement or other procedure is carried out on A. This independence is built into the structure of quantum theory. This is also the same as the no-signalling condition and the possibility of tracing out system A, getting a reduced state for B, which does not change (and so is undisturbed) whatever happens just to A. (Werner, 2014b, p. 4)
On the other hand, later he explicitly condemns the conflation of mere statistical dependence and physical disturbance:
... if you condition on the outcome of a measurement on A, you get a modified state for B. That is just another way to look at correlation, but never, not even in classical probability, can this be confused with a physical disturbance. The state change only becomes effective when the results from the two labs are brought together and are jointly analyzed, which can happen centuries later.^{Footnote 19} (Werner, 2014b, p. 4)
One can nonetheless accept that OQM is local in the sense that the theory doesn’t state—in fact, since it only talks about measurement statistics, it has no words to state—the existence of physical disturbance between the two wings. The problem is that in the very same way, the theory doesn’t contain any kind of causal mechanism that could serve to correlate the outcomes in the two wings. Indeed, since the ontology of OQM only consists of measurement events and perhaps information states of agents, there are no common causes in it either that could explain the EPRB correlations. Thus, OQM violates the Common Cause Principle, the demand that there be no robust regularities without some sort of causal explanation—which was one of the standard assumptions of Bell’s theorem. It is by giving up the Common Cause Principle’s requirement that OQM is able to evade the derivation of Bell’s inequalities. In effect, the defender of OQM says: “Two guys are flipping coins on opposite sides of the Earth, whenever they get a “Flip!” command from their cousin Alice. There is no causal connection—neither direct, nor common causal—between the outcomes. But the two coins always land oppositely. Deal with it.”
Griffiths’ consistent histories approach (CH) to QM is an expression of Bohr’s complementarity idea. In CH, every maximal set of compatible measurements is associated with a so-called “framework.” In the EPRB case we have four frameworks pertaining to the four measurement pairs in \(S_{\text {EPR}}\). Relative to a given framework, in CH, quantum systems possess real properties that measurements simply reveal. Mathematically, one can model this by assigning to each framework a “small” probability space \(\left( X_{ij},\mathcal {A}_{ij},\mu _{ij}\right) ,(i,j)\in S_{\text {EPR}}\).^{Footnote 20} In each of these spaces we have a partition of events \(C_{ij}^{++},C_{ij}^{+-},C_{ij}^{-+},C_{ij}^{--}\in \mathcal {A}_{ij}\) corresponding to the particles having spin properties \(+\) or \(-\) in the chosen pair of directions \((i,j)\in S_{\text {EPR}}\). Then Born’s rule is said to recover the probabilities of these properties:
These properties are thought to be revealed in measurements and hence the probabilities of these properties return the components of \(\overrightarrow{p}_{\text {EPR}}\):^{Footnote 21}
The essential tenet of CH is the “single framework rule”: properties associated with different frameworks, and represented in different probability spaces, do not coexist and cannot be talked about at the same time. In particular, properties \(C_{ij}^{\alpha \beta }\) do not coexist for different \((i,j)\in S_{\text {EPR}}\), and hence the particles possess spin only in one direction at a time. This notion is what encapsulates Bohr’s idea of complementary descriptions.
In CH, properties \(C_{ij}^{\alpha \beta }\) operate as (deterministic) common causes in the sense that they are assumed to obey outcome independence in each probability space separately:
Notice however that neither parameter independence nor no-conspiracy can be written down in the “frameworks” formalism. This is because that would require an identification of the \(C_{ij}^{\alpha \beta }\)s across different frameworks (probability spaces), and that’s exactly what the single framework rule forbids doing. Nonetheless, there is a clear sense in which no-conspiracy is violated in CH. Consider the ensemble of runs of an EPRB experiment. The presence of a spin property of the system, in a given run, depends on the framework in which we choose to describe the system, which in turn depends on the measurements we choose to perform in the given run. Indeed, spin property \(C_{ij}^{\alpha \beta }\) will only be assumed to be present in a given run if we choose to perform measurements \(a_{i}\) and \(b_{j}\). This means that there is a strong correlation, over the runs of the EPRB experiment, between the properties we ascribe to the system and the measurements we choose to perform, which is a violation of no-conspiracy. Again, here we use ‘correlation’ in a relative frequency sense rather than a probabilistic sense expressible in terms of probability spaces \(\left( X_{ij},\mathcal {A}_{ij},\mu _{ij}\right) \). One way to phrase the point is that in CH, the very existence of the \(C_{ij}^{\alpha \beta }\) properties depends on human choices and thus cannot have an antecedent probability. Therefore, the right-hand side of equation (13) cannot exist, per the single framework rule.
Thus, in the consistent histories approach, it is the violation of another standard premise of Bell’s theorem, that of no-conspiracy, which blocks the derivation of Bell’s inequalities.
5 Conclusion
As we mentioned in the introduction, many physicists adamantly insist that EPRB-type phenomena do not show that there is genuine nonlocality in natural phenomena. Werner and Griffiths are just two prominent voices making such claims in recent years. While they advocate different interpretations of the standard QM formalism, both options can be related to a form of antirealism. The instrumentalist attitude implicit in OQM is certainly a source of not asking for explanation of correlations, and of not being bothered by the violation of the Common Cause Principle (cf. Lewis, 2019). The violation of no-conspiracy we found in the CH approach can also be a result of some form of antirealism, in which the measurement process is thought to have a role in constituting the property of the system measured. Such a constitution relation brings about correlation between performing the measurement and ascribing the property, a correlation which is not due to a causal link but due to a logical/analytic connection. The idea is reminiscent of the Bohrian position that Einstein lamented: the moon is only there when you look at it.
In our view, adopting an antirealist stance of the above sorts does not save locality. The reason is that the very formulation of locality requires a sort of realism (cf. Norsen, 2007). One piece of evidence for this is that in neither of the two “antirealist” versions of QM in question can one meaningfully formulate the condition of parameter independence (15)–(16), a condition that is usually taken to be a requirement of locality. In OQM there are no \(C_{k}\)s, so parameter independence cannot be written down. Similarly, as we have seen above, to write down parameter independence in the histories formalism, we need to have an identification of the \(C_{ij}^{\alpha \beta }\)s across different “frameworks”; but that’s exactly what we are forbidden to have in the CH approach.
If this dialectical situation is acknowledged, then the philosophers impressed by Bell’s theorem and the EPRB experiments can come to a peaceful agreement with physicists who are not so impressed. Those physicists prefer a strong dose of antirealism in their physics, rather than a realistic physics that incorporates nonlocality explicitly. The CH advocate embraces a form of conspiracy between the preexisting properties that are revealed by measurement and the choices we make of what to measure. The OQM advocate must admit that nature somehow displays (enforces?) the inequality-violating correlations, and that nothing in the properties of the measured particles predetermines (or at least causally explains) what the results would be (cf. Alice’s two friends flipping coins on opposite sides of the world). In both approaches, it must remain a mystery how nature can display these correlations between chancy events at spacelike separation.^{Footnote 22} As philosophers, we would only ask that the physicists refrain from making two sorts of statements: (i) that the QM treatment of EPRB is perfectly local (though they can perfectly well say that the QM treatment is not overtly nonlocal!); and (ii) that Bell did not prove what many philosophers think he proved, because he made a tacit and inappropriate presupposition of “classicality” in his argument. As we have seen (Section 3), “classicality” is a consequence of the standard causal and statistical assumptions made in Bell’s argument, not a separate, tacit assumption.
Notes
Note that the no-signaling character of the EPRB scenario means that these probabilities obey
$$\begin{aligned} \begin{array}{rcl} p\left( A_{i}|a_{i}\right) & = & p\left( A_{i}|a_{i}\cap b_{j}\right) \\ p\left( B_{j}|b_{j}\right) & = & p\left( B_{j}|a_{i}\cap b_{j}\right) \\ & & (i,j)\in S_{\text {EPR}} \end{array} \end{aligned}\tag{5}$$
Numbers \(p_{\alpha _{1}...\alpha _{n}}\) encode the probabilities of the \(2^{n}\) atomic events constructed from the n events in question.
Numbers \(\pi _{r}\) (\(r=1,2,...,d\)) are probabilities of some atomic events. Index set \(R_{i}\) corresponds to the set of those atomic events where the ith event of the n events we talk about occurs.
As noted by Fine (1989), one could also simply take it as a brute fact that the laws of physics enforce certain correlations on measurement outcomes in EPRB-type systems, independently of whether the measurements are spacelike separated or not. Note that if the laws enforce correlations between spacelike separated events in this way, then the laws are, in an important sense, nonlocal in themselves: what the laws dictate for the future behavior of system B is not affected by just what has happened in the recent causal past of B, but instead may depend on what is happening in distant, spacelike separated regions. A second thing to note is that this brute law-explanation claim, if the nonlocality that it entails is not interpreted in a causal way, is in contradiction with the Common Cause Principle which we introduce below.
The first formulation of the argument from locality to preexisting properties is of course that of EPR. The EPR argument famously employs the Reality Criterion. As is argued by Gömöri and Hofer-Szabó (2021), the Reality Criterion is just a special case of the Common Cause Principle for perfect correlations.
In Bell’s argument this is where the so-called “hidden variable” terms \(\lambda \) come in. At times physicists such as Werner and Griffiths present this introduction of a factor accounting for measurement results as an unacceptable move amounting to the presupposition of realism, or “classical realism.” We feel it is better to keep the mathematical notion of classicality (Werner, 2014a, p. 3) separate from the more general idea that particles have properties that play a role in explaining measurement outcomes and the correlations observable in them, which Werner fails to do (as does Griffiths, 2020, p. 15). In Section 5 we briefly return to the question of how realism in the latter sense is related to the assumptions of Bell’s argument.
The notion of common cause we employ here is what Hofer-Szabó et al. (2013, Sec. 7) term a common cause system.
This observation was first made by Szabó (1995).
When Hardy sits twice as close to the center as Laurel, they in principle balance. In this case assume small perturbations to determine how they move, so that half of the time Laurel, and half of the time Hardy, ends up going up. That’s why we have 1/2 in the last row of (18).
The thesis that quantum probabilities can always be interpreted as classical conditional probabilities is called the Kolmogorovian Censorship Hypothesis by Szabó (1995). The fact that this is indeed always possible mathematically has been proved in various forms (Bana & Durt, 1997; Szabó, 2001; Rédei, 2010).
Note that the function \(p(\cdot )\) itself (but not \(p(\cdot |X)\) for varying X!) is indeed assumed to obey Kolmogorov’s axioms. That is, in the derivation of Bell’s inequalities we assume that there is a classical probability space \(\left( X,\mathcal {A},p\right) \) where \(a_{1},a_{2},b_{3},b_{4},A_{1},A_{2},B_{3},B_{4}\) and \(C_{k}(k\in K)\) are represented as events, and probability measure p, in terms of which we formulate conditions (12)–(13), obeys Kolmogorov’s axioms. This is explicitly assumed in the proof of Proposition 5 (see Appendix). It must be emphasized however that the classicality of \(p(\cdot )\) (but not of \(p(\cdot |X)\) for varying X!) is an analytic consequence of p being interpretable as relative frequency of events occurring in the runs of an EPRB experiment; and without such an interpretation the violation of Bell’s inequalities cannot be said to be empirically confirmed (cf. Szabó, 2001, Sec. 2.2).
This notion of noncontextuality is what Hofer-Szabó (2022) calls simultaneous noncontextuality.
Note that in this simple example not only noncontextuality but also no-signaling (4) is violated. One can give examples of more fine-tuned classical interacting systems where the measurement statistics obey no-signaling, but the underlying ontic state fails to satisfy noncontextuality (and Bell’s inequalities and classicality are also violated). For such an example see e.g. Hofer-Szabó (2021, Sec. 4).
To see how spin is understood as a contextual property in the Bohmian version of QM see e.g. Daumer et al. (1996).
As we noted in footnote 4, one could take the results as showing that we should give up the Common Cause Principle, as Fine urges, but this leaves us with a sort of nonlocality built into the laws of nature themselves. The only other escape route runs through rejecting no-conspiracy, which is a price so high that only a handful of physicists have been willing to consider paying it (see e.g. Hossenfelder & Palmer, 2020).
If quantum states are epistemic, one may wonder how the measurement at A can fail to change the state for B, since it clearly does change the epistemic state of the observer at A who does the measurement! Werner does not say what it means to say that a state change has “become effective,” nor what happens to the state for B if no measurement is made in that wing (so that the results cannot be jointly analyzed).
As Lewis (2019) points out, the dialectical situation may be different in an Everettian framework, although there is no consensus among philosophers about whether EPRB phenomena involve nonlocality in that framework, or not.
References
Bana, G., & Durt, T. (1997). Proof of Kolmogorovian censorship. Foundations of Physics, 27, 1355–1373.
Barrett, J. (2007). Information processing in generalized probabilistic theories. Physical Review A, 75, 032304.
Bell, J. S. (2004). Speakable and unspeakable in quantum mechanics. Cambridge University Press.
Butterfield, J. (1992). Bell’s theorem: What it takes. British Journal for the Philosophy of Science, 43, 41–83.
Daumer, M., et al. (1996). Naive realism about operators. Erkenntnis, 45, 379–397.
Feintzeig, B. H., & Fletcher, S. C. (2017). On Noncontextual, Non-Kolmogorovian Hidden Variable Theories. Foundations of Physics, 47, 294–315.
Fine, A. (1982). Hidden variables, joint probability, and the Bell inequalities. Physical Review Letters, 48, 291–295.
Fine, A. (1989). Do correlations need to be explained?. In J. Cushing & E. McMullin (Eds.), Philosophical consequences of quantum theory (pp. 175–194). University of Notre Dame Press.
Gömöri, M., & Placek, T. (2017). Small probability space formulation of Bell’s theorem. In G. Hofer-Szabó & L. Wroński (Eds.), Making it formally explicit. European studies in philosophy of science (Vol. 6, pp. 109–127). Springer.
Gömöri, M., & Hofer-Szabó, G. (2021). On the meaning of EPR’s reality criterion. Synthese, 199, 13441–13469.
Griffiths, R. B. (2020). Nonlocality claims are inconsistent with Hilbert-space quantum mechanics. Physical Review A, 101, 022117.
Hofer-Szabó, G., Rédei, M., & Szabó, L. E. (2013). The principle of the common cause. Cambridge University Press.
Hofer-Szabó, G. (2020). On the three types of Bell’s inequalities. In M. Hemmo & O. Shenker (Eds.), Quantum, probability, logic. Jerusalem studies in philosophy and history of science (pp. 353–374). Springer.
Hofer-Szabó, G. (2021). Causal contextuality and contextuality-by-default are different concepts. Journal of Mathematical Psychology, 104, 102590.
Hofer-Szabó, G. (2022). Two concepts of noncontextuality in quantum mechanics. Studies in History and Philosophy of Science, 93, 21–29.
Hossenfelder, S., & Palmer, T. (2020). Rethinking superdeterminism. Frontiers in Physics, 8, 139.
Lewis, P. J. (2019). Bell’s theorem, realism, and locality. In A. Cordero (Ed.), Philosophers look at quantum mechanics. Synthese library (Studies in epistemology, logic, methodology, and philosophy of science) (Vol. 406, pp. 33–43). Springer.
Maudlin, T. (2014a). What Bell did. Journal of Physics A: Mathematical and Theoretical, 47, 424010.
Maudlin, T. (2014b). Reply to comment on what Bell did. Journal of Physics A: Mathematical and Theoretical, 47, 424012.
Norsen, T. (2007). Against realism. Foundations of Physics, 37, 311–340.
Pitowsky, I. (1989). Quantum probability – quantum logic. Springer.
Rédei, M. (2010). Kolmogorovian censorship hypothesis for general quantum probability theories. Manuscrito – Revista Internacional de Filosofia, 33, 365–380.
Santos, E. (1986). The Bell inequalities as tests of classical logic. Physics Letters A, 115, 363–365.
Shimony, A. (1984). Contextual hidden variables theories and Bell’s inequalities. The British Journal for the Philosophy of Science, 35, 25–45.
Szabó, L. E. (1995). Is quantum mechanics compatible with a deterministic universe? Two interpretations of quantum probabilities. Foundations of Physics Letters, 8, 421–440.
Szabó, L. E. (2001). Critical reflections on quantum probability theory. In M. Rédei & M. Stoeltzner (Eds.), John von Neumann and the foundations of quantum physics (pp. 201–219). Kluwer Academic Publishers.
Weingartner, P. (2009). Matrix-based logic for application in physics. The Review of Symbolic Logic, 2(1), 132–163.
Werner, R. (2014a). Comment on ‘What Bell did’. Journal of Physics A: Mathematical and Theoretical, 47, 424011.
Werner, R. (2014b). What Maudlin replied to. arXiv:1411.2120
Acknowledgements
Márton Gömöri acknowledges the support of the following Hungarian research grants: Hungarian Eötvös Scholarship and grants no. K115593 and K134275 of the National Research, Development and Innovation Office. Carl Hoefer acknowledges the support of the following Catalan and Spanish research grants: 2021SGR00276, FFI201676799P, PID2020115114GBI00, and CEX2021001169M funded by MCIN/AEI/10.13039/501100011033.
We wish to thank Győző Egri, Balázs Gyenis and Gábor HoferSzabó for valuable feedback.
Funding
Open access funding provided by Eötvös Loránd University.
Ethics declarations
Ethics approval
The manuscript complies with the Ethical Rules applicable for European Journal for Philosophy of Science. All authors agree with the content of the manuscript and give consent to submit it.
Conflict of interest
We declare that there is no conflict of interest.
Appendix
Here we give the proofs of Propositions 4 and 5.
Proof of Proposition 4
We show that (a) (i) implies (ii), (b) (ii) implies (iii), and (c) (iii) implies (i).
(a) Let \(\overrightarrow{p}\) be an arbitrary element of \(\Omega \) and suppose that \(\overrightarrow{p}\) admits a classical probability space representation. We demonstrate that \(\overrightarrow{p}\) is extractable from a joint distribution. Since \(\overrightarrow{p}\) admits a classical probability space representation, there is a classical probability space \(\left( X,\mathcal {A},\mu \right) \) and \(E_{1},E_{2},...,E_{n}\in \mathcal {A}\) such that (6) holds. As \(\left( X,\mathcal {A},\mu \right) \) is a classical probability space, events \(E_{1},E_{2},...,E_{n}\in \mathcal {A}\) in it have a joint distribution which marginalizes to (6). More precisely, with notation
let
Since \(\mu \) is a Kolmogorovian probability measure, (7) will obviously hold. Moreover, due to (6) and the law of total probability,
that is, (8) is also satisfied. Thus, \(\overrightarrow{p}\) is extractable from a joint distribution.
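Part (a) can be illustrated on a finite space. The sketch below is a toy under stated assumptions: the four-point space, the two events and the helper names `joint_distribution` and `marginal` are all hypothetical, chosen only to show how atomic probabilities are read off from a Kolmogorovian measure and how summing them recovers the single and pair probabilities.

```python
from fractions import Fraction
from itertools import product

def joint_distribution(mu, events):
    """Atomic probabilities p_alpha, alpha in {+,-}^n, of a finite probability space.

    mu: dict mapping each point of X to its probability.
    events: list of sets E_1, ..., E_n of points of X.
    """
    joint = {}
    for alpha in product("+-", repeat=len(events)):
        # A point belongs to the atom alpha iff it lies in E_i exactly when alpha_i is '+'.
        joint[alpha] = sum(
            (w for x, w in mu.items()
             if all((x in e) == (s == "+") for e, s in zip(events, alpha))),
            Fraction(0),
        )
    return joint

def marginal(joint, *indices):
    """Sum the atomic probabilities over all alpha with '+' at the given positions."""
    return sum((p for alpha, p in joint.items()
                if all(alpha[i] == "+" for i in indices)), Fraction(0))

# Hypothetical toy space: four equiprobable points and two overlapping events.
mu = {x: Fraction(1, 4) for x in "wxyz"}
events = [{"w", "x"}, {"x", "y"}]
joint = joint_distribution(mu, events)
```

Here `marginal(joint, 0)` returns the probability of the first event and `marginal(joint, 0, 1)` that of the intersection, so the atomic probabilities sum to one and marginalize to the original correlation vector, as the proof requires.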
(b) Suppose that for all \(\overrightarrow{p}\in \Omega \), \(\overrightarrow{p}\) is extractable from a joint distribution. We will demonstrate that \(\Omega \) is projectable from the \(2^{n}\)-vertex probability simplex \(\Delta _{2^{n}}\subset \mathbb {R}^{2^{n}}\). Instead of \(r\in \left\{ 1,2,...,2^{n}\right\} \), it will be convenient to label the components of vectors in \(\mathbb {R}^{2^{n}}\) by \(\left( \alpha _{1},...,\alpha _{n}\right) \in \left\{ +,-\right\} ^{n}\), where indices \(\left( \alpha _{1},...,\alpha _{n}\right) \) are ordered lexicographically, with \(+\) preceding \(-\). With this notation, introduce the following sets of indices: \(R_{i}:=\left\{ \left( \alpha _{1},...,\alpha _{n}\right) \in \left\{ +,-\right\} ^{n}\,|\,\alpha _{i}=+\right\} ,i=1,2,...,n\); and let map \(\varphi :\mathbb {R}^{2^{n}}\rightarrow \mathbb {R}^{n+\left| S\right| }\) be defined by
for all \(\varvec{x}\in \mathbb {R}^{2^{n}}\). \(\varphi \) is obviously linear. Now, since for all \(\overrightarrow{p}\in \Omega \), \(\overrightarrow{p}\) is extractable from a joint distribution, for all \(\overrightarrow{p}\in \Omega \) there exist \(2^{n}\) numbers \(p_{\alpha _{1}...\alpha _{n}},\alpha _{1},...,\alpha _{n}\in \left\{ +,-\right\} \) such that (7) and (8) hold. For each \(\overrightarrow{p}\in \Omega \), consider the vector \(\varvec{\pi }\in \mathbb {R}^{2^{n}}\) for which \(\pi _{\alpha _{1}...\alpha _{n}}=p_{\alpha _{1}...\alpha _{n}}\). Due to (7), \(\varvec{\pi }\in \Delta _{2^{n}}\) for all \(\overrightarrow{p}\in \Omega \). Moreover, due to (8) and the definition of \(\varphi \), (29), \(\varphi \left( \varvec{\pi }\right) =\overrightarrow{p}\). Hence, \(\Omega \subseteq \varphi \left( \Delta _{2^{n}}\right) \). Furthermore, suppose that for a \(\varvec{\pi }'\in \Delta _{2^{n}}\), \(\varphi \left( \varvec{\pi }'\right) =\overrightarrow{p}\) for some \(\overrightarrow{p}\in \Omega \). Then, again due to (29),
which is nothing but (10) in terms of indices \(\left( \alpha _{1},...,\alpha _{n}\right) \) instead of r. Thus, \(\Omega \) is projectable from the probability simplex \(\Delta _{2^{n}}\subset \mathbb {R}^{2^{n}}\).
(c) Suppose that \(\Omega \) is projectable from a probability simplex \(\Delta _{d}\subset \mathbb {R}^{d}\). Consider the set of vertices of \(\Delta _{d}\), \(X=\left\{ \varvec{e}_{1},\varvec{e}_{2},...,\varvec{e}_{d}\right\} \), and its subset algebra \(\mathcal {A}\). We show that each \(\overrightarrow{p}\in \Omega \) admits a classical probability space representation over measurable space \(\left( X,\mathcal {A}\right) \). Since \(\Omega \) is projectable from the probability simplex \(\Delta _{d}\), there is a linear map \(\varphi :\mathbb {R}^{d}\rightarrow \mathbb {R}^{n+\left| S\right| }\), and sets of indices \(R_{i}\subseteq \left\{ 1,2,...,d\right\} ,i=1,2,...,n\) such that (9) and (10) hold. Let \(E_{i}:=\left\{ \varvec{e}_{r}\,|\,r\in R_{i}\right\} \in \mathcal {A},i=1,2,...,n\). Consider an arbitrary \(\overrightarrow{p}\in \Omega \). Since \(\Omega \subseteq \varphi \left( \Delta _{d}\right) \), there is a \(\varvec{\pi }\in \Delta _{d}\) such that \(\varphi \left( \varvec{\pi }\right) =\overrightarrow{p}\). Now, define \(\mu \) in the following way:
\(\mu \) is obviously a probability measure on \(\mathcal {A}\), and
that is, (6) is satisfied. Thus, \(\left( X,\mathcal {A},\mu \right) \) with \(E_{1},E_{2},...,E_{n}\in \mathcal {A}\) provides a classical probability space representation of \(\overrightarrow{p}\). \(\square \)
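The construction in part (c) can be sketched in the same spirit. The code below assumes, as our reading rather than a quotation of the definitions, that the projection sums simplex coordinates over \(R_{i}\) for single components and over \(R_{i}\cap R_{j}\) for pair components, and that the measure assigns to each vertex its simplex weight; the three-vertex instance and the names `phi` and `measure` are hypothetical.

```python
from fractions import Fraction

def phi(pi, index_sets, pairs):
    """Linear projection: sum simplex coordinates over R_i and over R_i ∩ R_j."""
    singles = [sum((pi[r] for r in R), Fraction(0)) for R in index_sets]
    doubles = [sum((pi[r] for r in index_sets[i] & index_sets[j]), Fraction(0))
               for i, j in pairs]
    return singles + doubles

def measure(pi):
    """Probability measure on subsets of the vertex set {0, ..., d-1}."""
    return lambda subset: sum((pi[r] for r in subset), Fraction(0))

# Hypothetical instance: d = 3 vertices, n = 2 events, one correlated pair.
pi = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
index_sets = [{0, 1}, {1, 2}]
pairs = [(0, 1)]

mu = measure(pi)
events = index_sets  # E_i is the set of vertices e_r with r in R_i
representation = ([mu(E) for E in events]
                  + [mu(events[i] & events[j]) for i, j in pairs])
```

With these choices `representation` coincides with `phi(pi, index_sets, pairs)`: the probabilities of the events and of their pairwise intersections reproduce the projected vector, which is all that a classical probability space representation demands.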
Proof of Proposition 5
As Pitowsky (1989) showed, the fact that a correlation vector \(\overrightarrow{p}\in \mathbb {R}^{n+\left| S\right| }\) admits a classical probability space representation is equivalent to the fact that \(\overrightarrow{p}\) lies within the so-called classical correlation polytope
the convex hull of the classical vertex vectors \(\overrightarrow{u}^{\varepsilon }\in \mathbb {R}^{n+\left| S\right| }\,(\varepsilon \in \left\{ 0,1\right\} ^{n})\) whose components are defined as
We will show that \(\overrightarrow{p}_{\text {EPR}}\in c\left( 4,S_{\text {EPR}}\right) \).
First, note that any correlation vector \(\overrightarrow{p}\in \mathbb {R}^{n+\left| S\right| }\) for which
is an element of c(n, S). Indeed, it is easy to verify that such a vector can be expressed as a convex combination of classical vertex vectors \(\overrightarrow{u}^{\varepsilon }\) with coefficients
Now suppose that there is a classical probability space \(\left( X,\mathcal {A},p\right) \) where \(a_{1},a_{2},b_{3},b_{4},\) \(A_{1},A_{2},B_{3},B_{4}\) and \(C_{k}(k\in K)\) are represented as events (\(a_{1},a_{2},b_{3},b_{4},A_{1},A_{2},B_{3},B_{4},\) \(C_{k}\in \mathcal {A},k\in K\)), \(\left\{ C_{k}\right\} _{k\in K}\) is a partition of \(X\), and probability measure p satisfies (12)–(13). Since \(C_{k}(k\in K)\) partition \(X\), one can apply the law of total probability with respect to conditional measures \(p\left( \cdot |a_{i}\cap b_{j}\right) \) to obtain
Taking into account no-signaling (4) and assumptions (12)–(13), equations (31) simplify to
Now introduce correlation vectors \(\overrightarrow{p}^{k}\in \mathbb {R}^{4+4}\,(k\in K)\) with components
Observe that correlation vectors \(\overrightarrow{p}^{k}\) are of the form (30), and thus they all lie in \(c\left( 4,S_{\text {EPR}}\right) \). Furthermore, notice that what (32) says is
where \(p\left( C_{k}\right) \ge 0,\underset{k\in K}{\sum }p\left( C_{k}\right) =1\) since \(\left\{ C_{k}\right\} _{k\in K}\) is a partition. This means that \(\overrightarrow{p}_{\text {EPR}}\) is a convex combination of elements of \(c\left( 4,S_{\text {EPR}}\right) \), which entails that \(\overrightarrow{p}_{\text {EPR}}\in c\left( 4,S_{\text {EPR}}\right) \) as \(c\left( 4,S_{\text {EPR}}\right) \) is a convex set.\(\square \)
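The two geometric facts used in this proof can be checked numerically. The following sketch makes assumptions that should be read as ours: vertex vectors are taken in Pitowsky’s standard form (single components \(\varepsilon _{i}\), pair components \(\varepsilon _{i}\varepsilon _{j}\)), the convex weights are taken to be the product weights \(\prod _{i}p_{i}^{\varepsilon _{i}}(1-p_{i})^{1-\varepsilon _{i}}\), indices are 0-based, and the Clauser-Horne-type expression is included only as one example of a linear bound that holds on \(c\left( 4,S_{\text {EPR}}\right) \).

```python
from itertools import product

# S_EPR = {(1,3), (1,4), (2,3), (2,4)}, written with 0-based indices.
PAIRS = [(0, 2), (0, 3), (1, 2), (1, 3)]

def vertex(eps):
    """Classical vertex vector: u_i = eps_i, u_ij = eps_i * eps_j."""
    return list(eps) + [eps[i] * eps[j] for i, j in PAIRS]

def product_vector(p):
    """Correlation vector whose pair components factorize: p_ij = p_i * p_j."""
    return list(p) + [p[i] * p[j] for i, j in PAIRS]

def convex_combination(p):
    """Mix the 16 vertices with weights prod_i p_i^eps_i * (1 - p_i)^(1 - eps_i)."""
    total = [0.0] * (4 + len(PAIRS))
    for eps in product((0, 1), repeat=4):
        weight = 1.0
        for p_i, e in zip(p, eps):
            weight *= p_i if e else 1 - p_i
        for n, u in enumerate(vertex(eps)):
            total[n] += weight * u
    return total

def clauser_horne(v):
    """p13 + p14 + p24 - p23 - p1 - p4 for v ordered (p1, p2, p3, p4, p13, p14, p23, p24)."""
    return v[4] + v[5] + v[7] - v[6] - v[0] - v[3]

# Hypothetical single probabilities, chosen only to exercise the functions.
p = [0.2, 0.7, 0.5, 0.9]
mixed = convex_combination(p)
```

Up to rounding, `mixed` agrees with `product_vector(p)`, and `clauser_horne` stays between −1 and 0 on all sixteen vertices, hence on every convex combination of them; a correlation vector for which the expression is positive therefore lies outside the polytope.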
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Gömöri, M., Hoefer, C. Classicality and Bell’s theorem. Euro Jnl Phil Sci 13, 45 (2023). https://doi.org/10.1007/s13194-023-00531-y