# A Rigorous Analysis of the Clauser–Horne–Shimony–Holt Inequality Experiment When Trials Need Not be Independent

## Abstract

The Clauser–Horne–Shimony–Holt (CHSH) inequality is a constraint that local hidden variable theories must obey. Quantum Mechanics predicts a violation of this inequality in certain experimental settings. Treatments of this subject frequently make simplifying assumptions about the probability spaces available to a local hidden variable theory, such as assuming the state of the system is a discrete or absolutely continuous random variable, or assuming that repeated experimental trials are independent and identically distributed. In this paper, we do two things: first, show that the CHSH inequality holds even for completely general state variables in the measure-theoretic setting, and second, demonstrate how to drop the assumption of independence of subsequent trials while still being able to perform a hypothesis test that will distinguish Quantum Mechanics from local theories. The statistical strength of such a test is computed.

### Keywords

Quantum theory · Bell’s theorem · Measure-theoretic probability · Bell inequalities · Hypothesis test · Hidden-variable theories

### Mathematics Subject Classification

81P15 · 81Qxx

## 1 Introduction

It has been known since the 1964 publication by Bell [1] that Quantum Mechanics makes predictions incompatible with any so-called local hidden variable theory (LHVT). The conflict can be tested experimentally with an instrument that generates entangled particles and two particle detectors that can measure certain properties, such as spin or polarization. For such an experiment, the Clauser–Horne–Shimony–Holt (CHSH) inequality [2] provides a constraint on the possible outcomes under a LHVT; according to the prediction of Quantum Mechanics, the constraint will be violated. The profound physical implications of the CHSH experiment have been long discussed, and more recently, the experiment has been found to have new applications in the field of device-independent quantum key distribution [3, 4] and device-independent randomness expansion [5].

*probabilistic* statement, asserting that under locality, a particular function of the probabilities of various experimental outcomes cannot exceed a certain quantity. According to the predictions of Quantum Mechanics, this quantity will be exceeded.

The probabilistic nature of the constraint (1) raises two issues. The first issue is: how does one build an appropriate mathematical model for the experiment? In the original proofs of the Bell [1] and CHSH [2] inequalities, it is tacitly assumed that the random variable that models the state of the system can be taken to be absolutely continuous, in the sense that it has a probability density function. Though this is a fairly reasonable assumption to make about a random variable modeling a real-world phenomenon, in the interest of full generality it would be best to not make such a claim. In some recent work on hidden variable models [6, 7], authors have worked in a more general measure-theoretic setting, though the frameworks set out in [6, 7] have not been used to prove the original CHSH inequality or model repeated trials of the experiment.

The second issue is: how does one draw a conclusion from the experimental data? As the constraints on LHVTs are probabilistic, any single execution of the experiment does not provide evidence for or against any one particular theory. (This is for the same reason that the result of a single coin toss does not tell you if a coin is biased.) The standard strategy for dealing with this is to run many trials of the experiment and compare the sample means to the predicted expectations. There is a problem, though—the sample means needn’t converge to the predicted expectations. One could expect convergence if one could assume that subsequent trials are independent and identically distributed (i.i.d.)—but plausible though this assumption seems, it need *not* be satisfied by a LHVT. Indeed, it is not hard to devise a mechanism for a LHVT to violate this assumption: detected particles could leave some sort of residue in the particle detectors that biases the outcome of the next incoming particle. This complication has been referred to as the “memory loophole” in [8], and it has also been addressed in [9]. (Possible interdependence between experimental trials can also cause security problems for quantum key distribution protocols, as seen in [10, 11].) [8] concludes that, even allowing for time dependence, quantum mechanical experimental data can be reliably distinguished from the data produced by any LHVT; however, the paper uses some informal justifications and assumes that the state random variable is absolutely continuous. [9] reaches the same conclusion with more rigor, but the exact bound on the statistical p-value derived from the Azuma–Hoeffding inequality [16, 17] can be improved on. (Here, the “*p*-value” is the probability of seeing data as or more extreme than what is observed experimentally, under a LHVT.)

In this paper, we resolve these two issues simultaneously. We present a completely general measure-theoretic model for the Bell test experiment, making no unnecessary assumptions about the random variables involved. Using this framework, we show that the CHSH inequality can still be derived. The framework can be extended in a natural way to accommodate repeated trials that need not be independent and/or identically distributed. In the extended framework, we prove that a hypothesis test can reliably distinguish between Quantum Mechanics and LHVTs, where the null hypothesis is that nature is governed by a LHVT. Interestingly, the *p*-value for rejecting the null hypothesis is shown to be the same as it would be if we restricted the null hypothesis to the narrower class of LHVTs that are i.i.d. That is, allowing for LHVTs with memory does not increase the probability of violating the CHSH inequality under the null hypothesis. The calculated *p*-value of the hypothesis test described in this paper compares favorably to other calculations of *p*-values in Bell-inequality experiments [12, 13].

The paper uses the formalism of measure-theoretic probability (see, e.g., [14]). The structure is as follows: in Sect. 2, we describe the mathematical model for the CHSH experiment; in Sect. 3, we derive the CHSH inequality in this setting; and in Sect. 4, we extend the framework to the multiple-trial, non-i.i.d. setting and show how to set up an appropriate hypothesis test, which is then analyzed. There is also an appendix in which we provide some context for our mathematical model by comparing it to another recent model of hidden variable theories given by Brandenburger and Yanofsky in [15].

## 2 The Setting and the Mathematical Model

We now model a single trial of the experiment, and leave the repeated-trial scenario to Sect. 4. The following definition contains the necessary elements for the model. Standard concepts such as “probability measure” are defined in [14].

**Definition 1**

The most general of the five random variables above is \(\lambda \). This generality is fitting, because \(\lambda \) describes the portion of the experiment that we don’t directly observe: the state of the photon pair that is theorized to be travelling towards the detectors. Quantum Mechanics has a well-defined description of \(\lambda \) and how it triggers the detectors. But we also want to be able to model any conceivable LHVT, so we define the state of the system, \(\lambda \), with complete generality.

The other four random variables are more straightforward because they model aspects of the experiment that we can directly observe. We model the detector settings as random variables, as we want the experimenter to toggle the detector settings randomly and independently of anything else going on in the experiment, with the choice of setting occurring just before the detection event.

The following three assumptions encapsulate a set of requirements that an experimenter can satisfy in order to properly test Bell’s theorem. The notation “\(X \perp \!\!\!\perp Y\)” means “\(X\) is independent of \(Y\).”

(4) is closely related to the “\(\lambda \)-independence” assumption that appears in Brandenburger and Yanofsky [15]. Note that, unlike [15], we don’t make a slightly stronger assumption that the *joint* distribution of \(A\) and \(B\) is independent from \(\lambda \), written \((A, B) \perp \!\!\!\perp \lambda \); this stronger assumption turns out to be unnecessary in our framework. This contrast is explored in the appendix.

*beyond* the effects of the shared history of what happened between them prior to detection, represented by \(\lambda \).

The conditional probabilities in (6) are themselves random variables, as defined in [14]. Theoretically these can be complicated constructions, but if \(\lambda \) is a discrete random variable, the dependent random variable \(P(E|\lambda )\) is also discrete, taking the value \(P(E|\{\lambda =x\})\) when \(\lambda = x\). This simplified situation has the benefit of being highly intuitive, and it is explored in the appendix. For now, we make no such simplifying assumption about \(\lambda \).

We refer to a Bell experiment satisfying (2)–(5) as being governed by a LHVT. Experimental results inconsistent with assumptions (2)–(5) can be considered violations of the LHVT hypothesis, implying that one of the assumptions must not hold. We will further explore how to interpret a violation in the conclusion.

## 3 Deriving the CHSH Inequality

*and* the random variable \(\xi \). Using this shorthand, we can derive the following expression,

**Lemma 1**

Let \(a\), \(b\), \(D_1\), \(D_2\) be as in Definition 1. Then, under (2) and (3), Eq. (9) holds.

*Proof*

**Lemma 2**

*Proof*

This follows in a straightforward manner from the measure-theoretic definition of conditional probability. \(\square \)

**Proposition 1**

*Proof*

**Lemma 3**

*Proof*

*Example 1*

Under (3) and (4), Lemma 3 applies to expressions such as \(\mu _a(+_1|\lambda )\), \(\mu _{b'}(-_2|\lambda )\), etc.

**Proposition 2**

*Proof*

As a consequence of Proposition 2, in any LHVT, the quantity \( K^{CHSH}\) must satisfy the simple inequality (16). On the other hand, Quantum Mechanics predicts \(K^{CHSH} = 2 \sqrt{2} > 2\). If we repeat the Bell test experiment many times and assume that the results of repeated trials are independent and identically distributed, we can calculate the \(K^{CHSH}\) quantity empirically and draw an appropriate conclusion about the theory describing the experiment.
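The gap between the two predictions can be checked numerically. The following is a minimal sketch, assuming the standard form of the CHSH quantity, \(K = E(a,b) + E(a,b') + E(a',b) - E(a',b')\), where \(E\) denotes the expectation of \(D_1D_2\) at the given pair of settings. The correlation function \(E(x,y) = \cos (\theta _x - \theta _y)\), the particular angles, and all function names are illustrative assumptions (one common convention for a maximally entangled pair), not definitions taken from this paper. The local side enumerates only deterministic outcome assignments, which suffices for the maximum because \(K\) is linear in the mixing weights of a probabilistic local strategy.

```python
import itertools
import math

def chsh_value(E):
    """Assumed CHSH form: K = E(a,b) + E(a,b') + E(a',b) - E(a',b')."""
    return E("a", "b") + E("a", "b'") + E("a'", "b") - E("a'", "b'")

# Assumed measurement angles (radians); hypothetical choice that attains 2*sqrt(2).
ANGLES = {"a": 0.0, "a'": math.pi / 2, "b": math.pi / 4, "b'": -math.pi / 4}

def quantum_correlation(x, y):
    """Assumed quantum correlation E(x, y) = cos(theta_x - theta_y)."""
    return math.cos(ANGLES[x] - ANGLES[y])

def max_local_deterministic():
    """Maximum of K over all 16 deterministic local outcome assignments."""
    best = -math.inf
    for values in itertools.product((+1, -1), repeat=4):
        outcome = dict(zip(("a", "a'", "b", "b'"), values))
        # Each detector's outcome depends only on its own setting.
        k = chsh_value(lambda x, y: outcome[x] * outcome[y])
        best = max(best, k)
    return best

k_quantum = chsh_value(quantum_correlation)
k_local = max_local_deterministic()
print(f"quantum K = {k_quantum:.6f}, best local deterministic K = {k_local}")
```

Under these assumptions the enumeration never exceeds 2, while the assumed quantum correlations reach \(2\sqrt{2}\).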

However, as earlier noted, the assumptions of a LHVT do not require repeated trials to be independent and identically distributed, and so we have no reason to assert that the relative frequencies of various outcomes will converge to some underlying probability. A priori, we cannot even rule out the (pathological) possibility that each successive trial individually obeys the CHSH inequality (as required by a LHVT), but that the relative frequencies over many trials converge to the quantum values! In the next section, we address this problem.

## 4 A Hypothesis Test When Trials are Not Independent

If we run the experiment one time, we will randomly select one particular setting result for \(A\) and \(B\), and we will observe \(D_1D_2\) equal to \(+1\) or \(-1\). This one result tells us nothing about the satisfaction or violation of (18). We must run the experiment many times to discern a pattern.

Luckily, we can perform a cogent hypothesis test, even without the assumption of independent, identically distributed trials. Here is a useful analogy that will illustrate how we do this. Suppose we were to flip 10,000 different coins, and 80 % of them were to come up “heads.” Then we could reasonably conclude that at least *some* of the 10,000 coins were biased towards heads. The coins needn’t be identically distributed—indeed, perhaps some of the coins were fair—but it is intuitively clear that *some* of them must have been biased.

Analogously, each trial of the Bell test is like a coin flip, resulting in the product \(D_1D_2\) being equal to \(+1\) or \(-1\). In the previous section, we showed that the assumption of a LHVT puts certain constraints on the probabilities of getting \(+1\) or \(-1\). If the universe is governed by a LHVT, then the constraint must be satisfied on *every* trial. On the other hand, if Quantum Mechanics is obeyed, the constraint is *violated* on every trial. In terms of the analogy, the locality assumption is like the assumption that *every one* of the 10,000 coins is fair, whereas Quantum Mechanics corresponds to the prediction of 80 % heads. The Bell test is of course a little more complicated than coin tossing, but the analogy is a good one to keep in mind as we design the hypothesis test.

*Remark 1*

(22) can be satisfied by appropriate calibration of the experimental apparatus. Earlier, we assumed only that these probabilities were positive; to prove an analogue of the CHSH inequality that holds over repeated trials, it is useful to assume that all the setting probabilities are calibrated to 1/2.

*filtration*—i.e., a sequence of nested \(\sigma \)-algebras:

Time Sequentiality:

For the final step, we establish a locality assumption corresponding to (5):

**Proposition 3**

*Proof*

The following lemma rules out possibilities like (31), and it will allow us to demonstrate that \(\alpha \) decreases as \(n\) increases.

**Lemma 4**

*Proof*

Lemma 4 allows us to formulate an upper-limit distribution for \(\overline{C_n}\), as shown in the following proposition. The result shows us that over many repetitions of the experiment, \(C_i\) cannot do any better at accumulating “\(+1\)” outcomes than an independent, identically distributed process that has a \({3\over 4}\) chance of success each time (i.e., a Binomial random variable). In light of Lemma 4, this may seem intuitive, but the proof does take some effort.

**Proposition 4**

*Proof*

To show this holds for any fixed positive integer \(n\), we use mathematical induction.

Case 1: \(n=1\).

Now, \(k\) can range from \(0\) to \(n+1\). First, let us prove it for \(k\) between \(1\) and \(n\), and later we will prove the boundary cases of \(k=0\) and \(k=n+1\).

*exactly* \(k-1\) successes after \(n\) trials, and \(p_{C_{n+1}}\) denotes the probability that \(C_{n+1}=+1\), given exactly \(k-1\) successes after \(n\) trials. As we are temporarily omitting the possibility that \(k=n+1\) or \(k=0\), it follows that \(P_{n,k}(C)\) and \(P_{n,k-1}(C)\) are well-defined and included in the scope of the inductive hypothesis.

So, under the null hypothesis, the probability of getting at least \(k\) “\(C_i=+1\)” results over the course of \(n\) trials is bounded above by the probability of getting at least \(k\) “successes” over the course of \(n\) Bernoulli trials with probability of success \({3\over 4}\). The bound is sharp: the i.i.d. case with \(p_i = {3\over 4}\) is allowed (just not *implied*) by assumptions (21)–(25). Note that this result directly pertains to the behavior of \(\overline{C_n}\), as the event “\(\overline{C_n}>z\)” is equivalent to the event “at least \(k\) of the \(C_i\) equal \(+1\)”, where \(k\) is an integer determined by the particular value of \(z\).
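The translation from a threshold \(z\) to a success count \(k\), and the resulting Binomial tail, can be sketched as follows. This is a minimal illustration, assuming that \(\overline{C_n}\) is the sample mean of the \(\pm 1\)-valued \(C_i\), so that with \(S\) successes \(\overline{C_n} = (2S-n)/n\) and “\(\overline{C_n}>z\)” means \(S > n(1+z)/2\). The function names are hypothetical, and exact rational arithmetic is used so that no rounding enters the bound.

```python
from fractions import Fraction
import math

def min_successes(n, z):
    """Smallest k such that 'mean of n values in {+1, -1} exceeds z'
    is the same event as 'at least k of the values equal +1'."""
    # mean = (2*S - n)/n > z  <=>  S > n*(1 + z)/2
    return math.floor(Fraction(n) * (1 + Fraction(z)) / 2) + 1

def binomial_upper_tail(n, k, p=Fraction(3, 4)):
    """Exact P(S >= k) for S ~ Binomial(n, p)."""
    return sum(
        (math.comb(n, j) * p**j * (1 - p) ** (n - j)
         for j in range(max(k, 0), n + 1)),
        Fraction(0),
    )

def p_value_bound(n, z):
    """Upper bound on P(mean > z) under the null, via the Binomial(n, 3/4) tail."""
    return binomial_upper_tail(n, min_successes(n, z))

# Example: 100 trials with an observed mean above 0.7 (at least 86 successes).
print(min_successes(100, Fraction(7, 10)), float(p_value_bound(100, Fraction(7, 10))))
```

Passing \(z\) as a `Fraction` (or an integer) avoids floating-point surprises when \(n(1+z)/2\) lands exactly on an integer.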

**Corollary 1**

*p*-values calculated numerically in [13]. The martingale-based analysis of [9] would result in a larger *p*-value, as discussed in [13]; this is due to [9]’s use of the loose (though computationally simple) Azuma–Hoeffding inequality [16, 17] to bound the upper tail probabilities, as opposed to the exact figures that can be obtained from the Binomial distribution. Tighter Azuma–Hoeffding bounds can be applied, such as expression (8) in [18], which in our setting simplifies to

As the table reveals, the difference between the upper bound and the exact calculation is roughly two orders of magnitude for larger values of \(n\).
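To give a feel for how a Hoeffding-type bound compares with the exact Binomial tail, the sketch below evaluates both for a Binomial\((n, {3\over 4})\) count. It uses the basic Hoeffding bound \(\exp (-2nt^2)\) from [16], not expression (8) of [18], so its numbers are not those of the table; the sample sizes, the observed success rate of 85 %, and the function names are illustrative assumptions.

```python
import math

def exact_upper_tail(n, k):
    """Exact P(S >= k) for S ~ Binomial(n, 3/4), using big-integer arithmetic."""
    # P(S = j) = C(n, j) * 3**j / 4**n, so the tail is a single integer ratio.
    numerator = sum(math.comb(n, j) * 3**j for j in range(k, n + 1))
    return numerator / 4**n

def hoeffding_upper_bound(n, k):
    """Basic Hoeffding bound exp(-2 n t^2) on P(S >= k), with t = k/n - 3/4."""
    t = k / n - 0.75
    if t <= 0:
        return 1.0  # the bound is only informative above the null mean
    return math.exp(-2 * n * t * t)

def compare(sample_sizes=(100, 400, 1600), percent=85):
    """Rows of (n, k, exact tail, Hoeffding bound) for k = ceil(percent% of n)."""
    rows = []
    for n in sample_sizes:
        k = (percent * n + 99) // 100  # integer ceiling of percent*n/100
        rows.append((n, k, exact_upper_tail(n, k), hoeffding_upper_bound(n, k)))
    return rows

rows = compare()
for n, k, exact, bound in rows:
    print(f"n={n:5d}  k={k:5d}  exact={exact:.3e}  Hoeffding={bound:.3e}")
```

In this sketch the ratio between the bound and the exact tail grows with \(n\), which is the qualitative behavior that motivates using exact Binomial figures.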

*Remark 2*

To calculate the power of the test, we used our knowledge of the quantum predictions. \(H_A\) could be extended to include *any* violation of locality; from a hypothesis test standpoint, our knowledge of the precise quantum predictions is not necessary. Smaller (sub-quantum) violations of the CHSH inequality would take more trials to detect. And violations of the inequality on *some* trials, balanced by trials that obey the inequality, could be statistically undetectable if the trials obeying the inequality were to do so by a large enough margin.
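A power calculation of this kind can be sketched numerically. The sketch rests on assumptions beyond what is stated above: that with the four setting pairs equally likely, \(C_i=+1\) occurs with probability \({1\over 2} + K^{CHSH}/8\), so that \(K^{CHSH}=2\) gives the null bound \({3\over 4}\) and the quantum value \(2\sqrt{2}\) gives roughly 0.854; and that the quantum trials are i.i.d. The trial count, the significance level, and the function names are hypothetical.

```python
import math

# Assumed quantum success probability: 1/2 + (2*sqrt(2))/8.
Q_QUANTUM = 0.5 + math.sqrt(2) / 4

def upper_tail(n, k, p):
    """P(S >= k) for S ~ Binomial(n, p); floats, so keep n modest (a few hundred)."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

def rejection_threshold(n, alpha, p_null=0.75):
    """Smallest k with P(S >= k) <= alpha under Binomial(n, p_null)."""
    for k in range(n + 2):  # k = n + 1 gives an empty tail, so the loop always returns
        if upper_tail(n, k, p_null) <= alpha:
            return k

def power(n, alpha, q=Q_QUANTUM):
    """Probability of rejecting the null when each trial succeeds with probability q."""
    return upper_tail(n, rejection_threshold(n, alpha), q)

n_trials, alpha = 200, 0.01
k_reject = rejection_threshold(n_trials, alpha)
print(f"reject if successes >= {k_reject}; power = {power(n_trials, alpha):.3f}")
```

Lowering `q` toward \({3\over 4}\) in this sketch shows how a sub-quantum violation requires more trials for the same power.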

## 5 Conclusion

We have shown that the CHSH inequality can be proved in a completely general measure-theoretic framework, and furthermore that a hypothesis test can definitively test locality in an experimental setting.

By working in a precise setting, we gain the benefit of clearly delineating all of the assumptions being made. If \(H_A\) is supported by experiment, one of the various assumptions must be false. Under most standard interpretations of the quantum description of a Bell experiment, (2)–(4) can be satisfied and it is the locality assumption, (25), that is violated. As Quantum Mechanics is a successful theory upheld by countless experiments, it would be logical to attribute the failure of \(H_0\) to a quantum violation of (25).

However, the formulation of \(H_0\) and the derivation of the CHSH inequality (16) also rest on four other assumptions: the “experimental assumptions” (21), (22), and (23), and time sequentiality (24). A physical theory could violate \(H_0\) but still satisfy locality, so long as one of the other assumptions turned out not to hold.

It is not clear that a violation of the time sequentiality assumption (24) would have any physical interpretation, as (24) is really a technical detail of how to model the problem—akin to the more basic assumption that we can model the problem with a probability space and random variables to begin with. As for the two assumptions (21) and (22), these can be compared to observed data and confirmed to any desired degree of certainty.^{1} On the other hand, (23) is a different creature. Equation (23) states that two observable random variables, \(A_i\) and \(B_i\), are independent of an unobservable random variable, \(\lambda _i\), and therefore this assumption cannot be directly tested.

What would a violation of (23) imply? This would mean that whatever process you were using to randomly set the detector settings was influenced by the state of the system prior to detection, \(\lambda _i\). Since we can choose any source of randomness—a separate quantum process, a random number generator on a computer, random fluctuations of the cosmic background radiation—to toggle the detector settings, the state of the system \(\lambda _i\) would have to be correlated with all sorts of seemingly unrelated processes. However, this would be the only alternative explanation, if we are to keep the locality assumption.

Sometimes it is claimed that it is not locality, but *realism* that must be abandoned. However, there is some debate about whether realism is a well-defined, required concept in the context of Bell experiments [19], and there is no clear invocation of realism at any point in this paper (assumption (23) is more aptly referred to as a free-will assumption, and (5) is of course a locality assumption). It could be argued that modeling the problem using the usual notions of probability fundamentally presupposes a realist viewpoint, but then it is not clear what a non-realist—but local—theory would be, or how such a theory could be modeled. In any case, to claim that the CHSH inequality rests on an assumption of realism requires being able to identify which of the assumptions and/or deductive steps in Sects. 2–4 should be identified with realism.

This paper assumes that every trial results in a detection event at both ends of the laboratory. In practice, however, there are limits on the detection efficiency of real-world particle detectors that result in most photons going undetected, so many trials end with only one detector detecting a photon, or no detections at all: see, for example, [20], where detection efficiency was only 5 %. To properly model a real-world experiment with this constraint, one would have to allow for a third outcome, “undetected” or “0”, in addition to the two outcomes “\(+1\)” and “\(-1\)”. Previous papers [21, 22, 23] have analyzed how to model this additional-outcome experiment and it has been found that, for a CHSH experiment using the singlet state, Quantum Mechanics is distinguishable from any LHVT so long as the detection efficiency exceeds a critical cut-off of about 83 %, an efficiency that has not yet been achieved in a CHSH experiment. Detection-efficiency issues can be addressed in a completely general measure-theoretic framework without making i.i.d. assumptions about repeated trials; this is done in a separate work [24].

## Footnotes

- 1. The reader may note that confirming these two assumptions by appealing to experimental data would require an assumption that the random variable sequences \(\{A_i\}\) and \(\{B_i\}\) are i.i.d., which is exactly the sort of assumption we are trying to avoid in this paper. However, the difference is this: we *observe* \(\{A_i\}\) and \(\{B_i\}\), and we may come to a reasonable conclusion that we are observing an i.i.d. sequence, whereas we will never be able to conclude this about the unobserved sequence \(\{\lambda _i\}\).

## Notes

### Acknowledgments

The author would like to thank Michael Mislove and Keye Martin for their support and guidance, as well as Gustavo Didier and Lev Kaplan for their helpful comments and suggestions. This work was partially supported by grant FA9550-13-1-0135 from the US Air Force Office of Scientific Research and Grant N00014-10-1-0329 P00004 from the US Office of Naval Research.

### References

- 1. Bell, J.: On the Einstein Podolsky Rosen paradox. Physics **1**, 195–200 (1964)
- 2. Clauser, J., Horne, M., Shimony, A., Holt, R.: Proposed experiment to test local hidden-variable theories. Phys. Rev. Lett. **23**, 880–884 (1969)
- 3. Barrett, J., Hardy, L., Kent, A.: No signaling and quantum key distribution. Phys. Rev. Lett. **95**, 010503 (2005)
- 4. Acín, A., Brunner, N., Gisin, N., Massar, S., Pironio, S., Scarani, V.: Device-independent security of quantum cryptography against collective attacks. Phys. Rev. Lett. **98**, 230501 (2007)
- 5. Pironio, S., et al.: Random numbers certified by Bell’s theorem. Nature **464**, 1021–1024 (2010)
- 6. Fritz, T.: Beyond Bell’s theorem: correlation scenarios. New J. Phys. **14**(10), 103001 (2012)
- 7. Brandenburger, A., Keisler, H.J.: Fiber products of measures and quantum foundations. URL http://pages.stern.nyu.edu/abranden/fpmqf-10-29-12.pdf. To appear in Logic and Algebraic Structures in Quantum Computing and Information. Lecture Notes in Logic, Association for Symbolic Logic, Cambridge University Press, Cambridge (2012)
- 8. Barrett, J., Collins, D., Hardy, L., Kent, A., Popescu, S.: Quantum nonlocality, Bell inequalities, and the memory loophole. Phys. Rev. A **66**, 042111 (2002)
- 9. Gill, R.D.: Accardi Contra Bell (Cum Mundi): the impossible coupling. Math. Stat. Appl. Festschr. Constance van Eeden IMS Lect. Notes Monogr. **42**, 133–154 (2003)
- 10. Hänggi, E., Renner, R., Wolf, S.: The impossibility of non-signaling privacy amplification. Theor. Comput. Sci. **486**, 27–42 (2013)
- 11. Barrett, J., Colbeck, R., Kent, A.: Memory attacks on device-independent quantum cryptography. Phys. Rev. Lett. **110**, 010503 (2013)
- 12. van Dam, W., Gill, R.D., Grunwald, P.D.: The statistical strength of nonlocality proofs. IEEE Trans. Inf. Theory **51**, 2812–2835 (2005)
- 13. Zhang, Y., Glancy, S., Knill, E.: Asymptotically optimal data analysis for rejecting local realism. Phys. Rev. A **84**, 062118 (2011)
- 14. Chung, K.L.: A Course in Probability Theory, 2nd edn. Academic Press, San Diego (1974)
- 15. Brandenburger, A., Yanofsky, N.: A classification of hidden-variable properties. J. Phys. A **41**, 425302 (2008)
- 16. Hoeffding, W.: Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. **58**, 13–30 (1963)
- 17. Azuma, K.: Weighted sums of certain dependent random variables. Tohoku Math. J. **19**(3), 357–367 (1967)
- 18. Zhang, Y., Glancy, S., Knill, E.: Efficient quantification of experimental evidence against local realism. Phys. Rev. A **88**, 052119 (2013)
- 19. Gisin, N.: Non-realism: deep thought or a soft option? Found. Phys. **42**, 80–85 (2012)
- 20. Weihs, G., Jennewein, T., Simon, C., Weinfurter, H., Zeilinger, A.: Violation of Bell’s inequality under strict Einstein locality conditions. Phys. Rev. Lett. **81**, 5039–5043 (1998)
- 21. Pearle, P.M.: Hidden-variable example based upon data rejection. Phys. Rev. D **2**, 1418–1425 (1970)
- 22. Clauser, J., Horne, M.: Experimental consequences of objective local theories. Phys. Rev. D **10**(2), 526–535 (1974)
- 23. Mermin, N.D., Garg, A.: Detector inefficiencies in the Einstein–Podolsky–Rosen experiment. Phys. Rev. D **35**(12), 3831–3835 (1987)
- 24. Bierhorst, P.: A mathematical foundation for locality. Ph.D. thesis, Tulane University (2014)