
This book will not include an exhaustive discussion of all proposed interpretations of what quantum mechanics actually is. Existing approaches have been described in excessive detail in the literature [1, 68, 10, 11, 35], but we think they all contain weaknesses. The most conservative attitude is what we shall call the Copenhagen Interpretation. It is also a very pragmatic one, and some mainstream researchers insist that it contains all we need to know about quantum mechanics.

Yet it is the things that are not explained in the Copenhagen picture that often capture our attention. Below, we begin by indicating how the Cellular Automaton Interpretation will address some of these questions.

1 The Copenhagen Doctrine

It must have been a very exciting period of early modern science when researchers began to understand how to handle quantum mechanics, in the late 1920s and subsequent years [64]. The first coherent picture of how one should think of quantum mechanics is what we now shall call the Copenhagen Doctrine. In the early days, physicists were still struggling with the equations and the technical difficulties. Today, we know precisely how to handle all of these, so that we can now rephrase the original starting points much more accurately. Originally, quantum mechanics was formulated in terms of wave functions, which referred to the states electrons are in; ignoring spin for a moment, these were the functions \(\psi (\vec{x}, t)=\langle \vec{x}|\psi (t)\rangle \). We may still use the words ‘wave function’ when we really mean ket states in more general terms.

Leaving aside who said exactly what in the 1920s, here are the main points of what one might call the Copenhagen Doctrine. Somewhat anachronistically, we employ Dirac’s notation:

A system is completely described by its ‘wave function’ \(|\psi (t)\rangle \), which is an element of Hilbert space, and any basis in Hilbert space can be used for its description. This wave function obeys a linear first order differential equation in time, to be referred to as Schrödinger’s equation, of which the exact form can be determined by repeated experiments.

A measurement can be made using any observable \(\mathcal{O}\) that one might want to choose (observables are Hermitian operators in Hilbert space). The theory then predicts the average measured value of \(\mathcal{O}\), after many repetitions of the experiment, to be

$$\begin{aligned} \langle \mathcal{O}\rangle =\langle \psi (t)|\mathcal{O}|\psi (t) \rangle . \end{aligned}$$
(3.1)

As soon as the measurement is made, the wave function of the system collapses to a state in the subspace of Hilbert space that is an eigenstate of the observable \(\mathcal{O}\), or a probabilistic distribution of eigenstates, according to Eq. (3.1).

When two observables \(\mathcal{O}_{1}\) and \(\mathcal{O}_{2}\) do not commute, they cannot both be measured accurately. The commutator \([\mathcal{O}_{1}, \mathcal{O}_{2}]\) indicates how large the product of the ‘uncertainties’ \(\delta \mathcal{O}_{1}\) and \(\delta \mathcal{O}_{2}\) should be expected to be. The measuring device itself must be regarded as a classical object, and for large systems the quantum mechanical measurement approaches closely the classical description.

Implicitly included in Eq. (3.1) is the element of probability. If we expand the wave function \(|\psi \rangle \) into eigenstates \(|\varphi \rangle \) of an observable \(\mathcal{O}\), then we find that the probability that the experiment on \(|\psi \rangle \) actually gives as a result that the eigenvalue of the state \(|\varphi \rangle \) is found, will be given by \(P=|\langle \varphi |\psi \rangle |^{2}\). This is referred to as Born’s probability rule [12, 13].
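The rules above can be made concrete in a small numerical sketch (our own illustration, not part of the original text; the observable and the state are arbitrary choices):

```python
import numpy as np

# A minimal numerical sketch of the Copenhagen rules for a spin-1/2 system.
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)   # observable O

alpha, beta = 0.6, 0.8                                 # |alpha|^2 + |beta|^2 = 1
psi = np.array([alpha, beta], dtype=complex)           # state |psi>

# Eq. (3.1): <O> = <psi|O|psi>
expectation = np.vdot(psi, sigma_z @ psi).real

# Born's rule: P = |<phi|psi>|^2 for each eigenstate |phi> of O
eigvals, eigvecs = np.linalg.eigh(sigma_z)
probs = np.abs(eigvecs.conj().T @ psi) ** 2

# The probability-weighted average of the eigenvalues reproduces <O>,
# and the Born probabilities add up to one
assert np.isclose(expectation, probs @ eigvals)
assert np.isclose(probs.sum(), 1.0)
```

Here the expectation value \(0.36-0.64=-0.28\) emerges equally well from Eq. (3.1) directly and from weighting the eigenvalues with the Born probabilities.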

We note that the wave function may not be given any ontological significance. The existence of a ‘pilot wave’ is not demanded; one cannot actually measure \(\langle \varphi |\psi \rangle \) itself; only by repeated experiments, one can measure the probabilities, with intrinsic margins of error. We say that the wave function, or more precisely, the amplitudes, are psi-epistemic rather than psi-ontic.

An important element in the Copenhagen interpretation is that one may only ask what the outcome of an experiment will be. In particular, it is forbidden to ask: what is it that is actually happening? It is exactly the latter question that sparks endless discussions; the important point made by the Copenhagen group is that such questions are unnecessary. If one knows the Schrödinger equation, one knows everything needed to predict the outcomes of an experiment, no further questions should be asked.

This is a strong point of the Copenhagen doctrine, but it also yields severe limitations. If we know the Schrödinger equation, we know everything there is to be known; however, what if we do not yet know the Schrödinger equation? How does one arrive at the correct equation? In particular, how do we arrive at the correct Hamiltonian if the gravitational force is involved?

Gravity has been a major focus point of the last 30 years and more, in elementary particle theory and the theory of space and time. Numerous wild guesses have been made. In particular, (super)string theory has made huge advances. Yet no convincing model that unifies gravity with the other forces has been constructed; models proposed so far have not been able to explain, let alone predict, the values of the fundamental constants of Nature, including the masses of many fundamental particles, the fine structure constant, and the cosmological constant. And here it is, in the author’s opinion, that we do have to ask: what is it, or what could it be, that is actually going on?

One strong feature of the Copenhagen approach to quantum theory was that it was also clearly shown how a Schrödinger equation can be obtained if the classical limit is known:

If a classical system is described by the (continuous) Hamilton equations, this means that we have classical variables \(p_{i}\) and \(q_{i}\), for which one can define Poisson brackets. Replacing these by commutators, one obtains a quantum model whose classical limit (\(\hbar\rightarrow 0\)) corresponds to the given classical system.
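As an illustration of this quantization recipe (our own sketch, not from the text; we set \(\hbar =m=\omega =1\) and truncate the Hilbert space), one can verify numerically that replacing Poisson brackets by commutators for the harmonic oscillator reproduces the spectrum \(E_{n}=n+{1\over 2}\):

```python
import numpy as np

# Canonical quantization sketch: the harmonic oscillator H = p^2/2 + q^2/2,
# with q and p built from truncated ladder operators so that [q,p] ~ i.
N = 60                                     # truncation of Hilbert space
a = np.diag(np.sqrt(np.arange(1, N)), 1)   # annihilation operator
q = (a + a.T) / np.sqrt(2)                 # position operator
p = (a - a.T) / (1j * np.sqrt(2))          # momentum operator

H = (p @ p + q @ q).real / 2
levels = np.sort(np.linalg.eigvalsh(H))

# Low-lying spectrum reproduces E_n = n + 1/2 (truncation spoils the top)
assert np.allclose(levels[:10], np.arange(10) + 0.5)

# [q,p] = i holds except at the truncation boundary
comm = q @ p - p @ q
assert np.allclose(comm[:N-1, :N-1], 1j * np.eye(N-1))
```

The truncation is the only approximation here; far from the boundary the commutator is exactly \(i\), as the Poisson-bracket replacement demands.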

This is a very powerful trick, but unfortunately, in the case of the gravitational force, it is not good enough to give us ‘quantum gravity’. The problem with gravity is not just that the gravitational force appears not to be renormalizable, or that it is difficult to define the quantum versions of space and time coordinates, and the physical aspects of non-trivial space-time topologies; some authors treat these problems as merely technical ones, which can be handled by using some tricks. The real problem is that space-time curvature runs out of control at the Planck scale. We will be forced to turn to a different bookkeeping system for Nature’s physical degrees of freedom there.

A promising approach was to employ local conformal symmetry [59, 111, 112] as a more fundamental principle than usually thought; this could be a way to make distance and time scales relative, so that what was dubbed as ‘small distances’ ceases to have an absolute meaning. The theory is recapitulated in Appendix B. It does need further polishing, and it too could eventually require a Cellular Automaton interpretation of the quantum features that it will have to include.

2 The Einsteinian View

This section is called ‘The Einsteinian View’, rather than ‘Einstein’s view’, because we do not want to go into a discussion of what it actually was that Einstein thought. It is well known that Einstein was uncomfortable with the Copenhagen Doctrine. The notion that there might be ways to rephrase things such that all phenomena in the universe are controlled by equations that leave nothing to chance, will now be referred to as the Einsteinian view. We do ask further questions, such as ‘Can quantum-mechanical description of physical reality be considered complete?’ [33, 53], or: does the theory tell us everything we might want to know about what is going on?

In the Einstein–Podolsky–Rosen discussion of a Gedanken experiment, two particles (photons, for instance), are created in a state

$$\begin{aligned} x_{1}-x_{2}=0, \qquad p_{1}+p_{2}=0. \end{aligned}$$
(3.2)

Since \([x_{1}-x_{2},p_{1}+p_{2}]=0\), both equations in (3.2) can be simultaneously sharply imposed.
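Writing the commutator out using the canonical relations \([x_{i},p_{j}]=i\hbar \delta _{ij}\) makes this explicit:

$$\begin{aligned} [x_{1}-x_{2},\,p_{1}+p_{2}]=[x_{1},p_{1}]+[x_{1},p_{2}]-[x_{2},p_{1}]-[x_{2},p_{2}]=i\hbar +0-0-i\hbar =0. \end{aligned}$$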

What bothered Einstein, Podolsky and Rosen was that, long after the two particles ceased to interact, an observer of particle #2 might decide either to measure its momentum \(p_{2}\), after which we know for sure the momentum \(p_{1}\) of particle #1, or its position \(x_{2}\), after which we would know for sure the position \(x_{1}\) of particle #1. How can such a particle be described by a quantum mechanical wave function at all? Apparently, the measurement at particle #2 affected the state of particle #1, but how could that have happened?

In modern quantum terminology, however, we would say that the measurements proposed in this Gedanken experiment disturb the wave function of the entangled particles. The measurement on particle #2 affects the probability distributions for particle #1, which should in no way be considered as the effect of a spooky signal from one system to the other.

In any case, even Einstein, Podolsky and Rosen had no difficulty in computing the quantum mechanical probabilities for the outcomes of the measurements, so that, in principle, quantum mechanics emerged unharmed out of this sequence of arguments.

It is much more difficult to describe the two EPR photons in a classical model. Such questions will be the topic of Sect. 3.6.

Einstein had difficulties with the relativistic invariance of quantum mechanics (“does the spooky information transmitted by these particles go faster than light?”). These, however, are now seen as technical difficulties that have been resolved. It may be considered part of the Copenhagen Doctrine that the transmission of information over a distance can only take place if we can identify operators \(A\) at space-time point \(x_{1}\) and operators \(B\) at space-time point \(x_{2}\) that do not commute: \([A,B]\ne0\). We now understand that, in elementary particle theory, all space-like separated observables mutually commute, which precludes any signalling faster than light. This is a built-in feature of the Standard Model, to which it actually owes much of its success.

So, with the technical difficulties out of the way, we are left with the more essential Einsteinian objections against the Copenhagen doctrine for quantum mechanics: it is a probabilistic theory that does not tell us what actually is going on. It is sometimes even suggested that we have to put our “classical” sense of logic on hold. Others deny that: “Keep remembering what you should never ask, while reshaping your sense of logic, and everything will be fine.” According to the present author, the Einstein–Bohr debate is not over. A theory must be found that does not force us to redefine any aspect of classical, logical reasoning.

What Einstein and Bohr did seem to agree about is the importance of the role of an observer. Indeed, this was the important lesson learned in the 20th century: if something cannot be observed, it may not be a well-defined concept—it may even not exist at all. We have to limit ourselves to observable features of a theory. It is an important ingredient of our present work that we propose to part from this doctrine, at least to some extent: Things that are not directly observable may still exist and as such play a decisive role in the observable properties of an object. They may also help us to construct realistic models of the world.

Indeed, there are big problems with the dictum that everything we talk about must be observable. While observing microscopic objects, an observer may disturb them, even in a classical theory. Moreover, in gravity theories, observers carry gravitational fields that disturb the system they are looking at. We therefore cannot afford to make an observer infinitely heavy (carrying large bags full of “data”, whose sheer weight gravitationally disturbs the environment), but also not infinitely light (light particles do not transmit large amounts of data at all). If the mass of an observer has to be “somewhere in between”, this could entail that our theory will be inaccurate from its very inception.

An interesting blow was given to the doctrine that observability should be central, when quark theory was proposed. Quarks cannot be isolated to be observed individually, and for that reason the idea that quarks would be physical particles was attacked. Fortunately, in this case the theoretical coherence of the evidence in favour of the quarks became so overwhelming, and experimental methods for observing them, even while they are not entirely separated, improved so much, that all doubts evaporated.

In short, the Cellular Automaton Interpretation tells us to return to classical logic and build models. These models describe the evolution of large sets of data, which eventually may bring about classical phenomena that we can observe. The fact that these data themselves cannot be directly observed, and that our experiments will provide nothing but statistical information, including fluctuations and uncertainties, can be fully explained within the settings of the models; if the observer takes no longer part in the definition of physical degrees of freedom and their values, then his or her limited abilities will no longer stand in the way of accurate formalisms.

We suspect that this view is closer to Einstein’s than it can be to Bohr, but, in a sense, neither of them would fully agree. We do not claim the wisdom that our view is obviously superior, but rather advocate that one should try to follow such paths, and learn from our successes and failures.

3 Notions Not Admitted in the CAI

It is often attempted to attach a physical meaning to the wave function beyond what it is according to Copenhagen. Could it have an ontological significance as a ‘pilot wave function’ [10, 11, 26]? It should be clear from nearly every page of this book that we do not wish to attach any ontological meaning to the wave function, if we are using it as a template.

In an ontological description of our universe, in terms of its ontological basis, there are only two values a wave function can take: 1 and 0. A state is actually realized when the wave function is 1, and it does not describe our world when the wave function is zero. It is only this ‘universal wave function’, that for that reason may be called ontological.

It is only for mathematical reasons that one might subsequently want to equip this wave function with a phase, \(e^{i\varphi }\). In the ontological basis, this phase \(\varphi \) has no physical meaning at all, but as soon as one considers operators, such as the time-evolution operator \(U(t)\) and the Hamiltonian, these phases have to be chosen. From a physical point of view, any phase is as good as any other, but for keeping the mathematical complexity under control, precise definitions of these phases are crucial. One can then perform the unitary transformations to any of the basis choices usually employed in physics. The template states subsequently introduced all come with precisely defined phases.

A semantic complication is caused as soon as we apply second quantization. Where a single particle state is described by a wave function, the second-quantized version of the theory sometimes replaces this by an operator field. Its physical meaning is then completely different. Operator fields are usually not ontological since they are superimposables rather than beables (see Sect. 2.1.1), but in principle they could be; wave functions, in contrast, are elements of Hilbert space and as such should not be confused with operators, let alone beable operators.

How exactly to phrase the so-called ‘Many World Interpretation’ [35] of quantum mechanics, is not always agreed upon [29, 30]. When doing ordinary physics with atoms and elementary particles, this interpretation may well fulfill the basic needs of a researcher, but from what has been learned in this book it should be obvious that our theory contrasts strongly with such ideas. There is only one single world that is being selected out in our theory as being ‘the real world’, while all others simply are not realized.

The reader may have noticed that the topic in this book is being referred to alternately as a ‘theory’ and as an ‘interpretation’. The theory we describe consists not only of the assumption that an ontological basis exists, but also that it can be derived, so as to provide an ontological description of our universe. It suggests pathways to pin down the nature of this ontological basis. When we talk of an interpretation, this means that, even if we find it hard or impossible to identify the ontological basis, the mere assumption that one might exist suffices to help us understand what the quantum mechanical expressions normally employed in physics, are actually standing for, and how a physical reality underlying them can be imagined.

There is another aspect of our theory that is different from ordinary quantum mechanics: the notion that our ontological variables, the beables, probably only apply to the most basic degrees of freedom of the physical world, that is, the ones that are relevant at the Planck scale. This is the smallest distance scale relevant for physics, and we shall return to it. It is not at all clear whether we can transform ontological variables into ones that still make sense at scales where physicists can do experiments today, and this may well be the reason why such variables still play practically no role in existing models of Nature such as the Standard Model.

We have reasons to suspect that this is the very reason why we have quantum mechanics rather than an ontological theory, describing the known particles and forces today: physics was not yet ready to identify the truly ontological degrees of freedom.

4 The Collapsing Wave Function and Schrödinger’s Cat

The following ingredient in the Copenhagen interpretation, Sect. 3.1, is often the subject of discussions:

As soon as an observable \(\mathcal{O}\) is measured, the wave function of the system collapses to a state in the subspace of Hilbert space that is an eigenstate of the observable \(\mathcal{O}\), or a probabilistic distribution of eigenstates.

This is referred to as the “collapse of the wave function”. It appears as if the action of the measurement itself causes the wave function to attain its new form. The question then asked is what physical process is associated to that.

Again, the official reply according to the Copenhagen doctrine is that this question should not be asked. Do the calculation and check your result with the experiments. However, there appears to be a contradiction, and this is illustrated by Erwin Schrödinger’s Gedanken experiment with a cat [75–77]. The experiment is summarized as follows:

In a sealed box, one performs a typical quantum experiment. It could be a Stern–Gerlach experiment where a spin-\(1/2\) particle with spin up is sent through an inhomogeneous magnetic field that splits the wave function according to the values of the spin in the \(y\) direction, or it could be a radioactive atom that has probability \(1/2\) to decay within a certain time. In any case, the wave function is well specified at \(t=0\), while at \(t=1\) it is in a superposition of two states, which are sent to a detector that determines which of the two states is realized. It is expected that the wave function ‘collapses’ into one of the two possible final states.

The box also contains a live cat (and air for the cat to breathe). Depending on the outcome of the measurement, a capsule with poison is broken, or kept intact. The cat dies when one state is found, and otherwise the cat stays alive. At the end of the experiment, we open the box and inspect the cat.

Clearly, the probability that we find a dead cat is about \(1/2\), and otherwise we find a live cat. However, we could also regard the experiment from a microscopic point of view. The initial state was a pure, ‘conventional’, quantum state. The final state appears to be a mixture. Should the cat, together with the other remains of the experiment, upon opening the box, not be found in a superimposed state: dead and alive?

The collapse axiom tells us that the state should be ‘dead cat’ or ‘live cat’, whereas the first parts of our description of the quantum mechanical states of Hilbert space, clearly dictates that if two states, \(|\psi _{1}\rangle \) and \(|\psi _{2}\rangle \) are possible in a quantum system, then we can also have \(\alpha |\psi _{1}\rangle +\beta |\psi _{2}\rangle \). According to Schrödinger’s equation, this superposition of states always evolves into a superposition of the final states. The collapse seems to violate Schrödinger’s equation. Something is not quite right.

An answer that several investigators have studied [65, 66] is that, apparently, Schrödinger’s equation is only an approximation, and that tiny non-linear ‘correction terms’ bring about the collapse [4, 41, 73]. One of the problems with this is that observations can be made at quite different scales of space, time, energy and mass. How big should the putative correction terms be? Secondly, how do the correction terms know in advance which measurements we are planning to perform?

Some authors try to attribute the splitting of the dead cat state and the live cat state to ‘decoherence’. But then, what exactly is decoherence? Why can we not consider the entire box with the cat in it, in perfect isolation from its environment?

This, we believe, is where the cellular automaton interpretation of quantum mechanics will come to the rescue. It is formulated using no wave function at all, but there are ontological states instead. It ends up with just one wave function, taking the value 1 if we have a state the universe is in, and 0 if that state is not realized. There are no other wave functions, no superposition.

How this resolves the collapse phenomenon will be explained in Chap. 4. In summary: quantum mechanics is not the basic theory but a tool to solve the mathematical equations. This tool works just as well for superimposed states (the templates) as for the ontological states, but they are not the same thing. The dead cat is in an ontological state and so is the live one. The superimposed cat solves the equations mathematically in a perfectly acceptable way, but it does not describe a state that can occur in the real world. We postpone the precise explanation to Chap. 4. It will sound very odd to physicists who have grown up with standard quantum mechanics, but it does provide the logical solution to the Schrödinger cat paradox.

One may ask what this may imply when we have transformations between ontological states and template states. Our experience tells us that all template states that are superpositions \(\alpha |\psi _{1}\rangle +\beta |\psi _{2}\rangle \) of ontological states may serve as suitable approximations describing probabilistic situations in the real world. How can it be that, sometimes, they do seem to be ontological? The most likely response will be that the transformation does not always have to be entirely local, but in practice may involve many spectator states in the environment. What we can be sure of is that all ontological states form an orthonormal set. So, whenever we use \(\alpha |\psi _{1}\rangle +\beta |\psi _{2}\rangle \) to describe an ontological state, there must be other wave functions in the environment which must be chosen differently for any distinct pair \(\alpha \) and \(\beta \), such that the entire set that we use to describe physical situations is always orthonormal.

This should be kept in mind in the next sections, where we comment on the Alice and Bob Gedanken experiments.

In Ref. [96], it is argued that the collapse axiom in conventional descriptions of quantum mechanics, essentially leads to requiring the existence of a preferred orthonormal set of basis states. Our reasoning is the other way around: we start with the fundamental orthonormal basis and derive from that the emergence of the collapse of the wave function.

5 Decoherence and Born’s Probability Axiom

The cellular automaton interpretation does away with one somewhat murky ingredient of the more standard interpretation schemes: the role of ‘decoherence’. It is the argument often employed to explain why macroscopic systems are never seen in a quantum superposition. Let \(|\psi _{1}\rangle \) and \(|\psi _{2}\rangle \) be two states a classical system can be in, such as a cat being dead and a cat being alive. According to Copenhagen, in its pristine form, quantum mechanics would predict the possibility of a third state, \(|\psi _{3}\rangle = \alpha |\psi _{1}\rangle +\beta |\psi _{2}\rangle \), where \(\alpha \) and \(\beta \) can be any pair of complex numbers with \(|\alpha |^{2}+|\beta |^{2}=1\).

Indeed, it seems almost inevitable that a system that can evolve into state \(|\psi _{1}\rangle \) or into state \(|\psi _{2}\rangle \), should also allow for states that evolve into \(|\psi _{3}\rangle \). Why do we not observe such states? The only thing we do observe is a situation whose probability of being in \(|\psi _{1}\rangle \) might be \(|\alpha |^{2}\) and the probability to be in \(|\psi _{2}\rangle \) is \(|\beta |^{2}\). But that is not the same as state \(|\psi _{3}\rangle \).

The argument often heard is that, somehow, the state \(|\psi _{3}\rangle \) is unstable. According to Copenhagen, the probability of a state \(|\psi \rangle \) to be in state \(|\psi _{3}\rangle \) is

$$\begin{aligned} P_{3}=|\langle \psi _{3}|\psi \rangle |^{2}=| \alpha |^{2} |\langle \psi _{1}|\psi \rangle |^{2}+|\beta |^{2}|\langle \psi _{2}|\psi \rangle |^{2}+2\mathrm{Re} \bigl(\alpha ^{*}\beta \langle \psi _{1}|\psi \rangle \langle \psi |\psi _{2}\rangle \bigr). \end{aligned}$$
(3.3)

The last term here is the interference term. It distinguishes the real quantum theory from classical theories. Now it is said that, if \(|\psi _{1}\rangle \) and \(|\psi _{2}\rangle \) become classical, they cannot stay immune to interactions with the environment. In the presence of such interactions, the energies of \(|\psi _{1}\rangle \) and \(|\psi _{2}\rangle \) will not exactly match, and consequently the interference term will oscillate badly. This term might then be seen to average out to zero. The first two terms are just the probabilities to have either \(|\psi _{1}\rangle \) or \(|\psi _{2}\rangle \), which would be the classical probabilities.

If indeed the last term becomes unobservable, we say that the two states decohere [73, 95], so that the interference term should be replaced by zero. The question is: if we include the environment in our description, the energies are still exactly conserved, and there is no rapid oscillation. Is it then legitimate to say that the interference term disappears? Note that its absolute value on average does remain large.
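The averaging argument can be sketched numerically (our own illustration; the overlaps \(A\), \(B\) and the random-phase model of the environment are arbitrary choices, not from the text): a fluctuating relative phase makes the interference term of Eq. (3.3) average to zero, while its absolute value stays large.

```python
import numpy as np

# Decoherence sketch: environment-induced jitter of the relative phase theta
# between |psi_1> and |psi_2> makes the interference term average out.
rng = np.random.default_rng(0)
alpha, beta = 1/np.sqrt(2), 1/np.sqrt(2)
A, B = 0.9, 0.8                  # stand-ins for <psi_1|psi> and <psi_2|psi>

theta = rng.uniform(0, 2*np.pi, 100_000)   # random relative phases
term = 2*np.real(np.conj(alpha)*beta * np.conj(A)*B * np.exp(1j*theta))

assert abs(term.mean()) < 0.01     # ensemble average ~ 0: decoherence
assert np.abs(term).mean() > 0.4   # but |term| on average remains large
```

The two assertions express exactly the tension noted above: the average vanishes, the typical magnitude does not.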

The CAI will give a much more direct answer: if states \(|\psi _{1}\rangle \) and \(|\psi _{2}\rangle \) are classical, then they are ontological states. State \(|\psi _{3}\rangle \) will then not be an ontological state, and the states of the real universe, describing what happens if an actual experiment is carried out, never include state \(|\psi _{3}\rangle \). It is merely a template, useful for calculations, but not describing reality. What it may describe is a situation where, from the very start, the coefficients \(\alpha \) and \(\beta \) were declared to represent probabilities.

Copenhagen quantum mechanics contains an apparently irreducible axiom: the probability that a state \(|\psi \rangle \) is found to agree with the properties of another state \(|\varphi \rangle \), must be given by

$$\begin{aligned} P=|\langle \varphi |\psi \rangle |^{2}. \end{aligned}$$
(3.4)

This is the famous Born rule [12, 13]. What is the physical origin of this axiom?

Note that Born did not have much of a choice. The completeness theorem of linear algebra implies that the eigenstates \(|\varphi \rangle \) of a Hermitian operator span the entire Hilbert space, and therefore,

$$\begin{aligned} \sum_{\varphi}|\varphi \rangle \langle \varphi |= \mathbb{I};\qquad\sum_{\varphi}|\langle \varphi |\psi \rangle |^{2}=\sum_{\varphi}\langle \psi | \varphi \rangle \langle \varphi |\psi \rangle =\langle \psi |\psi \rangle =1 , \end{aligned}$$
(3.5)

where \(\mathbb {I}\) stands for the identity operator. Had Born chosen any other expression to represent probabilities, then, according to Gleason’s theorem [43], they would not have added up to one. The expression (3.4) turns out to be ideally suited to serve as a probability.
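Equation (3.5) is easy to check numerically (our own sketch, not part of the text; the operator and state are random): for any Hermitian observable, the eigenbasis resolves the identity, so the Born probabilities automatically sum to one.

```python
import numpy as np

# Verify Eq. (3.5): sum_phi |phi><phi| = I, hence sum_phi |<phi|psi>|^2 = 1.
rng = np.random.default_rng(1)
n = 8

# Random Hermitian operator and random normalized state
M = rng.standard_normal((n, n)) + 1j*rng.standard_normal((n, n))
O = (M + M.conj().T) / 2
psi = rng.standard_normal(n) + 1j*rng.standard_normal(n)
psi /= np.linalg.norm(psi)

_, phi = np.linalg.eigh(O)            # columns are eigenstates |phi>
P = np.abs(phi.conj().T @ psi) ** 2   # Born probabilities for each outcome

assert np.allclose(phi @ phi.conj().T, np.eye(n))   # completeness
assert np.isclose(P.sum(), 1.0)                     # normalization
```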

Yet this is a separate axiom, and the question why it works so well is not misplaced. In a hidden variable theory, probabilities may have a different origin. The most natural explanation as to why some states are more probable than others may be traced to their initial states much earlier in time. One can ask which initial states may have led to a state seen at present, and how probable these may have been. There may be numerous answers to that question. One now could attempt to estimate their combined probabilities. The relative probabilities of some given observed final states could then be related to the ratios of the numbers found. Our question then is, can we explain whether and how the expression (3.4) is related to these numbers? This discussion is continued in Sect. 3.7 and in Sect. 4.3.

6 Bell’s Theorem, Bell’s Inequalities and the CHSH Inequality

One of the main reasons why ‘hidden variable’ theories are usually dismissed, and emphatically so when the theory obeys local equations, is the apparent difficulty in such theories to represent entangled quantum states. Because the De Broglie–Bohm theory (not further discussed here) is intrinsically non-local, it is often concluded that all hidden variable theories are either non-local or unable to reproduce quantum features at all. When J.S. Bell was investigating the possibility of hidden variable theories, he hit upon the same difficulties, upon which he attempted to prove that local hidden variable theories are impossible.

As before, we do not intend to follow precisely the historical development of Bell’s theory [7, 8], but limit ourselves to a summary of the most modern formulation of the principles. Bell designed a Gedanken experiment, and at the heart of it is a pair of quantum-entangled particles. They could be spin-\({1\over 2}\) particles, which each can be in just two quantum states described by the Pauli matrices (1.7), or alternatively spin-1 photons. There are a few subtle differences between these two cases. Although these are not essential to the argument, let us briefly pause at these differences. We return to the more crucial aspects of the Bell inequalities after Eq. (3.9).

The two orthonormal states for photons are the ones where they are polarized horizontally or vertically, while the two spin-\({1\over 2}\) states are polarized up or down. Indeed, quite generally when polarized particles are discussed, the angles for the photons are handled as being half the angles for spin-\({1\over 2}\) particles.

A second difference concerns the entangled state, which in both cases has total spin 0. For spin-\({1\over 2}\), this means that \((\vec{\sigma}_{1}+\vec{\sigma}_{2})|\psi \rangle =0\), where \(\vec{\sigma}\) are the Pauli matrices, Eqs. (1.7), so that

$$\begin{aligned} |\psi \rangle ={\textstyle{1\over \sqrt{2}}}(|{\uparrow}{\downarrow} \rangle -|{\downarrow} {\uparrow}\rangle ), \end{aligned}$$
(3.6)

which means that the two electrons are polarized in opposite directions.

For spin 1, this is different. Let us have these photons move in the \(\pm z\)-direction. Defining \(A_{\pm}={1\over \sqrt{2}}(A_{x}\pm iA_{y})\) as the operators that create or annihilate one unit of spin in the \(z\)-direction, and taking into account that photons are bosons, the two-photon state with zero spin in the \(z\)-direction is

$$\begin{aligned} |\psi \rangle ={\textstyle{1\over \sqrt {2}}}\bigl(A^{(1)}_{+}A^{(2)}_{-}+A^{(1)}_{-}A^{(2)}_{+} \bigr) |\,\rangle ={\textstyle{1\over \sqrt{2}}} |z,-z\rangle +{\textstyle{1\over \sqrt{2}}}|{-}z, z\rangle , \end{aligned}$$
(3.7)

and since helicity is spin in the direction of motion, while the photons go in opposite directions, we can rewrite this state as

$$\begin{aligned} |\psi \rangle ={\textstyle{1\over \sqrt{2}}}(|{+}{+}\rangle +|{-}{-}\rangle ), \end{aligned}$$
(3.8)

where ± denote the helicities. Alternatively one can use the operators \(A_{x}\) and \(A_{y}\) to indicate the creators of linearly polarized photons, and then we have

$$\begin{aligned} |\psi \rangle ={\textstyle{1\over \sqrt {2}}}\bigl(A_{x}^{(1)}A_{x}^{(2)}+A_{y}^{(1)}A_{y}^{(2)} \bigr) |\,\rangle ={\textstyle{1\over \sqrt{2}}}(|xx\rangle +|yy\rangle ). \end{aligned}$$
(3.9)

Thus, the two photons are linearly polarized in the same direction.
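The equivalence of the helicity-basis state (3.7) and the linear-polarization state (3.9) can be checked by a direct basis change. Below is a minimal numerical sketch (the variable names and matrix representation are ours, not from the text): the two-photon state is written as a \(2\times2\) coefficient matrix, with index 0 for \(|x\rangle\) and 1 for \(|y\rangle\).

```python
import numpy as np

# Helicity modes expressed in the linear-polarization basis (0 = |x>, 1 = |y>):
# u_+ = (|x> + i|y>)/sqrt(2),  u_- = (|x> - i|y>)/sqrt(2).
u_plus = np.array([1.0, 1.0j]) / np.sqrt(2)
u_minus = np.array([1.0, -1.0j]) / np.sqrt(2)

# State (3.7): (A_+^(1) A_-^(2) + A_-^(1) A_+^(2))|0>/sqrt(2), as a
# coefficient matrix psi[i, j] for photon 1 (index i) and photon 2 (index j).
psi_helicity = (np.outer(u_plus, u_minus) + np.outer(u_minus, u_plus)) / np.sqrt(2)

# State (3.9): (|xx> + |yy>)/sqrt(2), written directly in the linear basis.
psi_linear = np.eye(2) / np.sqrt(2)

same_state = np.allclose(psi_helicity, psi_linear)
```

The cross terms \(A_x^{(1)}A_y^{(2)}\) cancel in the symmetric combination, leaving exactly \(|xx\rangle + |yy\rangle\), so `same_state` comes out true.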

Since the experiment is mostly done with photons, we will henceforth describe the entangled photon state.

Back to business. The Bell experiment is illustrated in Fig. 3.1. At the point \(S\), an atom \(\varepsilon \) is prepared to be in an unstable \(J=0\) state at \(t=t_{1}\), such that it can decay only into another \(J=0\) state, by simultaneously emitting two photons with \(\Delta J=0\); the two photons, \(\alpha \) and \(\beta \), must therefore be in the entangled \(S_{\mathrm{tot}}=0\) state, at \(t=t_{2}\).

Fig. 3.1

A Bell-type experiment. Space runs horizontally, time vertically. Single lines denote single bits or qubits travelling; widened lines denote classical information carried by millions of bits. Light travels along \(45^{\circ}\), as indicated by the light cone on the right. Meaning of the variables: see text

After having travelled for a long time, at \(t=t_{3}\), photon \(\alpha \) is detected by observer \(A\) (Alice), and photon \(\beta \) is detected by \(B\) (Bob). Ideally, the observers use a polarization filter oriented at an angle \(a\) (Alice) and \(b\) (Bob). If the photon is transmitted by the filter, it is polarized in the direction of the polarization filter’s angle, if it is reflected, its polarization was orthogonal to this angle. At both sides of the filter there is a detector, so both Alice and Bob observe that one of their two detectors gives a signal. We call Alice’s signal \(A=+1\) if the photon passed the filter, and \(A=-1\) if it is reflected. Bob’s signal \(B\) is defined the same way.

According to quantum theory, if \(A=1\), Alice’s photon is polarized in the direction \(a\), so Bob’s photon must also be polarized in that direction, and the intensity of the light going through Bob’s filter will be \(\cos^{2}(a-b)\). Therefore, according to quantum theory, the probability that \(B=1\) is \(\cos^{2}(a-b)\). The probability that \(B=-1\) is then \(\sin^{2}(a-b)\), and the same reasoning can be set up if \(A=-1\). The expectation value of the product \(AB\) is thus found to be

$$ \langle AB\rangle =\cos^{2}(a-b)-\sin ^{2}(a-b)=\cos2(a-b), $$
(3.10)

according to quantum mechanics.

In fact, these correlation functions can now be checked experimentally. Beautiful experiments [2, 3] confirmed that correlations can come close to Eq. (3.10).

The point made by Bell is that it seems to be impossible to reproduce this strong correlation between the findings \(A\) and \(B\) in any theory where classical information is passed on from the atom \(\varepsilon \) to Alice (A) and Bob (B). All one needs to assume is that the atom emits a signal to Alice and one to Bob, regarding the polarization of the photons emitted. It could be the information that both photons \(\alpha \) and \(\beta \) are polarized in direction \(c\). Since this information is emitted long before either Alice or Bob decided how to orient their polarization filters, it is obvious that the signals in \(\alpha \) and \(\beta \) should not depend on that. Alice and Bob are free to choose their polarizers.

The correlations then directly lead to a contradiction, regardless of the classical signals’ nature. The contradiction is arrived at as follows. Consider two choices that Alice can make: the angles \(a\) and \(a'\). Similarly, Bob can choose between angles \(b\) and \(b'\). Whatever the signal is that the photons carry along, it should entail an expectation value for the four observations that can be made: Alice observes \(A\) or \(A'\) in the two cases, and Bob observes \(B\) or \(B'\). If both Alice and Bob make large numbers of observations, every time using either one of their two options, they can compare notes afterwards, and measure the averages of \(AB, A'B, AB',\) and \(A'B'\). They calculate the value of

$$\begin{aligned} S=\langle AB\rangle +\bigl\langle A'B\bigr\rangle +\bigl\langle AB'\bigr\rangle -\bigl\langle A'B'\bigr\rangle , \end{aligned}$$
(3.11)

and see how it depends on the polarization angles \(a\) and \(b\).

Now suppose we assume that, every time photons are emitted, they have well-defined values for \(A,A',B\), and \(B'\). Whatever the signals are that are carried by the photons, at each measurement these four quantities will take the values \(\pm1\), but they can never all contribute to the quantity \(S\) with the same sign (because of the one minus sign in (3.11)). Because of this, it is easy to see that \(S\) is always \(\pm2\), and its average value will therefore obey:

$$ | \langle S\rangle |\le2; $$
(3.12)

this modern version of Bell’s original observation is called the Clauser–Horne–Shimony–Holt (CHSH) inequality [20, 78]. However, if we choose the angles

$$\begin{aligned} a=22.5^{\circ},\qquad a'=-22.5^{\circ},\qquad b=0^{\circ},\qquad b'=45^{\circ}, \end{aligned}$$
(3.13)

then, according to Eq. (3.10), quantum mechanics gives for the expectation value

$$\begin{aligned} S=3\cos\bigl(45^{\circ}\bigr)-\cos135^{\circ}=2\sqrt{2}>2. \end{aligned}$$
(3.14)

How can this be? Apparently, quantum mechanics does not give explicit values \(\pm1\) for the measurements of \(A\) and \(A'\); it only gives a value to the quantity actually measured, which in each case is either \(A\) or \(A'\) and also either \(B\) or \(B'\). If Alice measures \(A\), she cannot also measure \(A'\) because the operator for \(A'\) does not commute with \(A\); the polarizers differ by an angle of \(45^{\circ}\), and a photon polarized along one of these angles is a quantum superposition of a photon polarized along the other angle and a photon polarized orthogonally to that. So the quantum outcome is completely in accordance with the Copenhagen prescriptions, but it seems that it cannot be realized in a local hidden variable theory.
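Both halves of this argument are easy to check numerically. The sketch below (function names are ours) enumerates all sixteen deterministic assignments \(A,A',B,B'=\pm1\) to confirm that \(S\) is always \(\pm2\), and evaluates the quantum prediction (3.10) at the angles (3.13) to recover \(S=2\sqrt{2}\).

```python
import itertools
import math

def classical_S_values():
    """All possible values of S = AB + A'B + AB' - A'B' when A, A', B, B'
    each take a definite value +1 or -1 (the local hidden variable case)."""
    return {A*B + Ap*B + A*Bp - Ap*Bp
            for A, Ap, B, Bp in itertools.product((+1, -1), repeat=4)}

def quantum_S(a, ap, b, bp):
    """Quantum prediction for S, using <AB> = cos 2(a - b), Eq. (3.10)."""
    E = lambda x, y: math.cos(2.0 * (x - y))
    return E(a, b) + E(ap, b) + E(a, bp) - E(ap, bp)

# Angles of Eq. (3.13), converted to radians:
a, ap, b, bp = map(math.radians, (22.5, -22.5, 0.0, 45.0))
```

Here `classical_S_values()` returns `{2, -2}`, so any average over deterministic assignments obeys \(|\langle S\rangle |\le2\), while `quantum_S(a, ap, b, bp)` gives \(2\sqrt{2}\approx 2.83\), violating the CHSH bound.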

We say that, if \(A\) is actually measured, the measurement of \(A'\) is counterfactual, which means that we imagine measuring \(A'\) but cannot actually do it, just as we are unable to measure the position of a particle if we have already found out exactly what its momentum is. If two observables do not commute, one can measure one of them, but the measurement of the other is counterfactual.

Indeed, in the arguments used, it was assumed that the hidden variable theory should allow an observer actually to carry out counterfactual measurements. This is called counterfactual definiteness. Local hidden variable theories that allow for counterfactual observations are said to have local counterfactual definiteness. Quantum mechanics forbids local counterfactual definiteness.

However, to use the word ‘definiteness’, or ‘realism’, for the possibility of performing counterfactual observations is not very accurate. ‘Realism’ should mean that there is actually something happening, not a superposition of things; something is happening for sure, while something else is not happening. That is not the same as saying that both Alice and Bob can, at all times, choose to modify their polarization angles without allowing for any modification at other places [85].

It is here that the notion of “free will” is introduced [22, 23], often in an imprecise manner.Footnote 3 The real extra assumption made by Bell, and more or less tacitly by many of his followers, is that both Alice and Bob should have the “free will” to modify their settings at any moment, without having to consult the settings at the system \(S\) that produces an unstable atom. If this were allowed to happen in a hidden variable theory, we would get local counterfactual definiteness, which was ruled out.

The essence of the argument that now follows has indeed been raised before. The formulation by C.H. Brans [14] is basically correct, but we add an extra twist, something to be called the ‘ontology conservation law’ (Sect. 3.7.1), in order to indicate why violation of Bell’s theorem does not require ‘absurd physics’.

How can we deny Alice and/or Bob their free will? Well, precisely in a deterministic hidden variable theory, Alice and Bob can only change their minds about the setting of their polarizers if their brains follow different laws than they did before, and, like it or not, Alice’s and Bob’s actions are determined by laws of physics [118], even if these are only local laws. Their decisions, logically, have their roots in the distant past, going back all the way to the Big Bang. So why should we believe that they can perform counterfactual observations?

The way this argument is usually countered is that the correlations between the photon polarizations \(c\) from the decaying atom and the settings \(a\) and \(b\) chosen by Alice and Bob would have to be amazingly strong. A gigantically complex algorithm could make Alice and Bob take their decisions, and yet the decaying atom, long before Alice and Bob applied this algorithm, knew about the result. This is called ‘conspiracy’, and conspiracy is said to be “disgusting”. “One could better stop doing physics than believe such a weird thing”, is what several investigators quipped.

In Sects. 3.7.1, 5.7.3, and 10.3.3, we go to the core of this issue.

7 The Mouse Dropping Function

To illustrate how crazy things can get, a polished version of Bell’s experiment was proposed: both Alice and Bob carry along with them a mouse in a cage,Footnote 4 with food. Every time they want to set the angles of their polarizers, they count the number of mouse droppings. If it is even, they choose one angle, if it is odd, they choose the other. “Now, the decaying atom has to know ahead of time how many droppings the mouse will produce. Isn’t this disgusting?”

To see what is needed to obtain this “disgusting” result, let us consider a simple model. We assume that there are correlations between the joint polarization of the two entangled photons, called \(c\), and the settings \(a\) chosen by Alice and \(b\) chosen by Bob. All these angles are taken to be in the interval \([0,180^{\circ}]\). Define the function \(W(c|a,b)\) as the conditional probability to have both photons polarized in the direction \(c\), given \(a\) and \(b\). Assume that Alice’s outcome is \(A=+1\) whenever her ‘ontological’ photon has \(|a-c|<45^{\circ}\) or \(>135^{\circ}\), and \(A=-1\) otherwise. For Bob’s measurement we assume the same, with \(a\leftrightarrow b\). Everything will be periodic in \(a\), \(b\), and \(c\) with period \(\pi\) \((180^{\circ})\).

It is reasonable to expect that \(W\) only depends on the relative angles \(c-a\) and \(c-b\):

$$ W(c|a,b)=W(c-a,c-b);\qquad \int_{0}^{\pi}\mathrm {d}c\,W(c-a,\, c-b)=1. $$
(3.15)

Introduce a sign function \(s_{(\varphi )}\) as follows:Footnote 5

$$\begin{aligned} s_{(\varphi )}\equiv\mathrm{sign}\bigl(\cos(2\varphi )\bigr);\qquad A=s_{(c-a)},\qquad B=s_{(c-b)}. \end{aligned}$$
(3.16)

The expectation value of the product \(AB\) is

$$\begin{aligned} \langle AB\rangle = \int\mathrm{d}c\,W(c|a,b) s_{(c-a)}s_{(c-b)}. \end{aligned}$$
(3.17)

How should \(W\) be chosen so as to reproduce the quantum expression (3.10)?

Introduce the new variables

$$\begin{aligned} x=c-{\textstyle{1\over 2}}(a+b),\qquad z={\textstyle{1\over 2}}(b-a),\qquad W=W(x+z,x-z). \end{aligned}$$
(3.18)

Quantum mechanics demands that

$$\begin{aligned} \int_{0}^{\pi}\mathrm{d}x\, W(x+z,x-z)s_{(x+z)} s_{(x-z)}=\cos4z. \end{aligned}$$
(3.19)

Writing

$$\begin{aligned} s_{(x+z)}s_{(x-z)}=\mathrm{sign}\bigl(\cos2(x+z)\cos 2(x-z)\bigr)= \mathrm{sign}(\cos 4x+\cos4z), \end{aligned}$$
(3.20)

we see that the equations are periodic with period \(\pi/2\), but looking more closely, we see that both sides of Eq. (3.19) change sign if \(x\) and \(z\) are shifted by an amount \(\pi/4\). Therefore, one first tries solutions with periodicity \(\pi/4\). Furthermore, we have the symmetry \(x\leftrightarrow-x,z\leftrightarrow-z\).

Equation (3.19) contains more unknowns than equations, but if we assume \(W\) to depend only on \(x\) and not on \(z\), then the equation can readily be solved. Differentiating Eq. (3.19) with respect to \(z\), one only gets Dirac delta functions inside the integral, all adding up to yield:

$$\begin{aligned} 4 \int_{0}^{\pi/4}W(x)\,\mathrm{d}x\, \bigl(-2 \delta (x+z-\pi /4)\bigr)= -4 \sin4z,\quad \hbox{if}\quad 0< z< {\textstyle{1\over 4}}\pi \end{aligned}$$
(3.21)

(the four parts of the integral in Eq. (3.19) each turn out to give the same contribution, hence the first factor 4). Thus one arrives at

$$\begin{aligned} W(c\,|\,a,b)=W(x+z,x-z)={\textstyle{1\over 2}}|\sin4x|={\textstyle{1\over 2}}\bigl|\sin (4c-2a-2b)\bigr|. \end{aligned}$$
(3.22)

This also yields a normalized 3-point probability distribution,

$$\begin{aligned} W(a,b,c)={\textstyle{1\over 2\pi^{2}}}\bigl|\sin(4c-2a-2b)\bigr|. \end{aligned}$$
(3.23)

By inspection, we find that this correlation function \(W\) indeed leads to the quantum expression (3.10). We could call this the ‘mouse dropping function’ (see Fig. 3.2). If Alice wants to perform a counterfactual measurement, she modifies the angle \(a\), while \(b\) and \(c\) are kept untouched. She thereby chooses a configuration that is less probable, or more probable, than the configuration she had before. Bearing in mind a possible interpretation of the Born probabilities, as expressed in Sects. 3.5 and 4.3, this means that the configuration of initial states where Alice’s mouse produced a different number of droppings may be more probable or less probable than the state she had before. In quantum mechanics, we have learned that this is acceptable. If we say this in terms of an underlying deterministic theory, there seem to be problems with it.
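The inspection can be made explicit by numerical integration. In the sketch below (the function names are ours), a midpoint rule over \(c\in[0,\pi]\) checks both the normalization condition in Eq. (3.15) and that Eq. (3.17), with \(W\) taken from Eq. (3.22), reproduces \(\cos 2(a-b)\):

```python
import math

def W(c, a, b):
    """Mouse dropping function, Eq. (3.22): W(c|a,b) = (1/2)|sin(4c - 2a - 2b)|."""
    return 0.5 * abs(math.sin(4.0*c - 2.0*a - 2.0*b))

def s(phi):
    """Sign function of Eq. (3.16): s_(phi) = sign(cos 2 phi)."""
    return 1.0 if math.cos(2.0 * phi) >= 0.0 else -1.0

def AB_expectation(a, b, n=200_000):
    """Midpoint-rule evaluation of Eq. (3.17), integrating c over [0, pi]."""
    h = math.pi / n
    return h * sum(W((k + 0.5)*h, a, b) * s((k + 0.5)*h - a) * s((k + 0.5)*h - b)
                   for k in range(n))

def normalization(a, b, n=200_000):
    """Left-hand side of the normalization condition in Eq. (3.15)."""
    h = math.pi / n
    return h * sum(W((k + 0.5)*h, a, b) for k in range(n))
```

For any choice of the angles one finds `normalization(a, b)` close to 1 and `AB_expectation(a, b)` close to \(\cos 2(a-b)\), i.e. the quantum expression (3.10).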

Fig. 3.2

The mouse dropping function, Eq. (3.23). Horizontally the variable \(x=c-{1\over 2}(a+b)\). Averaging over any of the three variables \(a\), \(b\), or \(c\), gives the same result as the flat line; the correlations then disappear

7.1 Ontology Conservation and Hidden Information

In fact, all we have stated here is that, even in a deterministic theory obeying local equations, configurations of template states may have non-trivial space-like correlations. It is known that this happens in many physical systems. A liquid close to its thermodynamical critical point shows a phenomenon called critical opalescence: large fluctuations in the local density. This means that the density correlation functions are non-trivial over relatively large space-like distances. This does not entail a violation of relativity theory or any other principle in physics such as causality; it is a normal phenomenon. A liquid does not have to be a quantum liquid to show critical opalescence.

It is still true that the effect of the mouse droppings seems to be mysterious, since, in a sense, we do deny that Alice, Bob, and their mice have “free will”. What exactly is ‘free will’? We prefer a mathematical definition rather than an emotional one (see Sect. 3.8), and we also return to this subject in our final discussion in Sects. 10.2 and 10.3. All we can bring forward now is that the guts of mice also obey energy, momentum, and angular momentum conservation, because these are general laws. In our ‘hidden variable’ theory, a general law will have to be added to this: an ontological state evolves into an ontological state; superpositions evolve into superpositions. If the mouse droppings are ontological in one description and counterfactual in another, the initial state from which they came was also ontological or counterfactual, accordingly. This should not be a mysterious statement.

There is a problematic element in this argument however, which is that, somehow, the entangled photons leaving the source, already carry information about the settings that will be used by Alice and Bob. They don’t tell us the settings themselves, but they do carry information about the correlation function of these settings. Thus, non-local information about the future is already present in the ‘hidden’ ontological data of the photons, information that disappears when we rephrase what happens in terms of standard quantum mechanical descriptions. Thus, there is non-trivial information about the future in the ontological description of what happens. We claim that, as long as this information is subject to a rigorous conservation law—the law of the conservation of ontology as described above—there is no contradiction here, but we do suspect that this may shed at least an unusual light on the idea of superdeterminism.

Actually, there is an even stronger arrangement than the mouse droppings by which Alice and Bob can make their decisions. They could both monitor the light fluctuations caused by light coming from different quasars, at opposite points in the sky [38], and use these to decide about the settings \(a\) and \(b\) of their filters. These quasars, indicated as \(Q_{A}\) and \(Q_{B}\) in Fig. 3.1, may have emitted their photons shortly after the Big Bang, at time \(t=t_{0}\) in the Figure, when they were at billions of light years separation from one another. The fluctuations of these quasars should also obey the mouse dropping formula (3.22). How can this be? The only possible explanation is the one offered by the inflation theory of the early universe: these two quasars, together with the decaying atom, do have a common past, and therefore their light is correlated.

Note that the correlation generated by the probability distribution (3.23) is a genuine three-body correlation. Integrating over any one of the three variables gives a flat distribution. The quasars are correlated only through the state the decaying atom is in, but not directly with one another. It clearly is a mysterious correlation, but it is not at odds with what we know about the laws of physics, see remarks at the end of Sects. 20.6 and 20.7 in Part II.
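The flatness of the marginals can be verified directly. In the sketch below (names are ours), Eq. (3.23) is integrated over any one of its three angles on \([0,\pi]\); the result is the constant \(1/\pi^{2}\) regardless of the two fixed angles, so all two-point correlations vanish:

```python
import math

def W3(a, b, c):
    """Three-point distribution of Eq. (3.23)."""
    return abs(math.sin(4.0*c - 2.0*a - 2.0*b)) / (2.0 * math.pi**2)

def marginal(fixed1, fixed2, which, n=100_000):
    """Integrate W3 over one variable ('a', 'b' or 'c') on [0, pi] by the
    midpoint rule, with the other two angles held fixed (in that order)."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        if which == 'a':
            total += W3(t, fixed1, fixed2)
        elif which == 'b':
            total += W3(fixed1, t, fixed2)
        else:
            total += W3(fixed1, fixed2, t)
    return total * h
```

Whichever variable is integrated out, the result is \(1/\pi^{2}\approx 0.1013\): the flat line of Fig. 3.2.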

In fact, these space-like correlations must be huge: who knows where else in the universe some alien Alices and Bobs are doing experiments using the same quasars….

8 Free Will and Time Inversion

The notion of “free will” can be confusing. On some occasions, the discussion seems to border on a religious one. It should be quite clear that the theories of Nature discussed in this book have nothing to do with religion, and so we must formulate in a more concrete manner what is meant by “free will”.

The idea behind what is often called ‘free will’ is actually extremely simple, in principle. Imagine that we have a model that describes what is going on in Nature, for instance the thought experiment for which Bell and CHSH wrote their inequalities. Suppose we have described the decaying atom, emitting two photons, in terms of some kinematic variables. All we want to know is how the system in the model would react if we make a small change in the settings chosen by Alice while keeping Bob and the entangled photons untouched. How would the stream of information carriers interact to yield results that violate these inequalities?

We know that there exists no divine model maker who can make such changes, but that is not the point. We know that, strictly speaking, Bob does not have the free will to make the changes all by himself. But what would the model say? What one might expect in a theory is that:

the theory predicts how its variables evolve, in an unambiguous way, from any chosen initial state.

In a Bell-type experiment, suppose we start from a configuration with given settings \(a\) of Alice’s filters and \(b\) of Bob’s. We see entangled particles moving from the source to the two detectors. We need our model to prescribe what happens when we look at the state that was modified as described above. Thus, it is actually the freedom to choose the initial state at any time \(t\) that one wishes to impose on a theory.

Note that this comes at a price: one assumes that one can ignore the question how these states could have evolved from configurations in the past. The whole ‘free will’ argument assumes that we do not need to check which modifications would be needed in past events to realize the new modification. No matter what the state is, the theory should produce a prediction. Bell derived his inequalities for the outcomes of different initial states that he chose, and these inequalities appear to be violated by quantum mechanics.

We derived in Sect. 3.7 that, in order to reproduce the quantum mechanical result, the probabilities of the angles \(a\), \(b\), and \(c\) must be correlated, and the correlation function associated with one simple model was calculated. Here we see how, in principle, the notion of free will as given above can be obstructed:

If a modification is made in the given values of the kinematic variables, they might have a much larger or smaller probability than the original configuration.

The correlation function we found describes 3-point correlations. All two-point correlations vanish.

Now the situation can be further clarified if we make use of an important property that Schrödinger’s equation shares with Newton’s classical equations: the microscopic laws can be solved backwards in time.

Suppose that only Alice, not Bob, modifies her settings, while the photons are left as they are. We then may arrive at a configuration that appears to violate Bell’s, or the CHSH, inequality.Footnote 6 In the older discussions of superdeterminism, it would have been stated that Alice does not have the ‘free will’ to do this, but now we say: maybe not, but we can certainly allow ourselves to study the state obtained, and ask what the physical structure was in its past, by solving the microscopic equations backwards in time.

And now, it is not difficult to see what then happens. The quantum state of the entangled photons will no longer be as it was originally prepared: the photon leaving Alice (backwards in time) now has a different polarization. The original state where the total spin of the two photons was zero (an S-state), will now partly contain a D-state, with total spin \(s=2\). Thus, by choosing a different setting, either Alice or Bob modified the states of the photons they are detecting. Following this \(s=2\) state back to the past, we do not see a simple decaying atom, but a much less probable state of photons bouncing off the atom, refusing to perform the assumed decay process backwards in time. Thermodynamically, this is a much less probable initial state; it is a counterfactual initial state.

This counterfactual initial state will be an entirely legal one in terms of the microscopic laws of physics, but probably not at all in terms of the macroscopic laws, in particular thermodynamics. What this argument shows is that Bell’s theorem requires more hidden assumptions than usually thought: the quantum theory only contradicts the classical one if we assume that the ‘counterfactual modification’ does not violate the laws of thermodynamics.

In our models, we must assume that it does. Inevitably, a more ‘probable’ modification of the settings does turn the photon state into a different one. At first sight, this seems odd: the modification was made in one of the settings, not in the approaching photons. However, we must admit that the photons described in quantum mechanical language are in template states; the ontological states, forming an orthonormal set, must involve many more ontological degrees of freedom than just these two photons in order to stay orthonormal.

Note that, here finally, the cause of the violation of the CHSH inequalities is pinned down. In summary: the notion of ‘free will’ must be replaced with the notion that a useful model of Nature should give correct predictions concerning what happens for any given initial state (freedom of choosing the initial state), while the counterfactual initial state considered in Bell’s Gedanken experiment causes the original entangled photons to obtain a spin 2 admixture, which would significantly suppress the probability of this state.

In Sects. 4.2 and 4.3 we shall see that quantum mechanical probabilities can actually be traced back to probabilities in the initial state. So when, at time \(t=t_{3}\), an amplitude is found to be low and the Born probability is small, this can actually be attributed to a smaller probability of the initial state chosen at \(t=t_{1}\). The spin-2 state for the decaying atom has a much smaller probability than the spin-0 state.