Abstract
Although retrocausality might be involved in quantum mechanics in a number of ways, the focus here is on the delayed-choice arguments popularized by John Archibald Wheeler. There is a common fallacy that is often involved in the interpretation of quantum experiments involving a certain type of separation, such as double-slit experiments, which-way interferometer experiments, polarization analyzer experiments, Stern-Gerlach experiments, and quantum eraser experiments. The fallacy leads not only to flawed textbook accounts of these experiments but also to flawed inferences about retrocausality in the context of delayed-choice versions of separation experiments.
Introduction: retrocausality in QM
There are a number of ways that the idea of retrocausality arises in quantum mechanics (QM). One way, which is analyzed here, is the argument largely due to Wheeler [19] that delayed-choice experiments reveal a type of retrocausality.
There is also the two-vector approach to QM pioneered by Aharonov et al.:
in which a quantum system is described, at a given time, by two (instead of one) quantum states: the usual one evolving toward the future and the second evolving backwards in time from a future measurement [1, p. 1].
Cramer’s [4] transactional interpretation of QM also involves the idea of a second wave travelling backwards in time. The idea of QM as involving a wave travelling backwards in time goes back at least to Arthur Eddington’s Gifford Lectures in 1927:
The probability is often stated to be proportional to \(\psi ^{2}\), instead of \(\psi \), as assumed above. The whole interpretation is very obscure, but it seems to depend on whether you are considering the probability after you know what has happened or the probability for the purposes of prediction. The \(\psi ^{2}\) is obtained by introducing two symmetrical systems of \(\psi \)-waves travelling in opposite directions in time; one of these must presumably correspond to probable inference from what is known (or is stated) to have been the condition at a later time [6, fn. pp. 216–217].
Finally the idea of retrocausality might arise when spacelike separated entangled systems are viewed from different inertial frames of reference. Abner Shimony popularized the idea of “peaceful coexistence” [17, p. 388] in spite of the “tension” between QM and special relativity.
In order to explore further the tension between quantum mechanics and relativity theory, let us consider an experimental arrangement in which [system 1] and [system 2] are tested by observers at rest in different inertial frames, and suppose that the tests are events of spacelike separation. If the reduction of \(\psi \) is to be interpreted causally, then which of the events is the cause and which is the effect? There is obviously no relativistically invariant way to answer this question. It could happen that in one frame of reference the testing of [system 1] is earlier than the testing of [system 2], and in the other frame the converse is the case [17, p. 387].
If a measurement of system 1 was taken as the “cause” and the reduction of system 2 the “effect” then in certain inertial frames the effect would precede the cause. This might be interpreted as a type of retrocausality or rather as a type of causal connection where the usual notions of “cause” and “effect” do not apply. As Shimony put it:
The wiser course is to say that quantum mechanics presents us with a kind of causal connection which is generically different from anything that could be characterised classically, since the causal connection cannot be unequivocally analysed into a cause and an effect [17, p. 387].
These other ways in which retrocausality might arise in QM are mentioned solely to emphasize that this paper is only concerned with Wheeler’s delayed-choice arguments.
Wheeler’s delayed-choice argument for retrocausality
Following Wheeler’s paper on delayed choice experiments, a number of quantum physicists, philosophers, and popular science writers seem to have just accepted the implication of retrocausality as part of quantum “weirdness.” In Wheeler’s own words:
There is an inescapable sense in which we, in the here and now, by a delayed setting of our analyzer of polarization to one or other angle, have an inescapable, an irretrievable, an unavoidable influence on what we have the right to say about what we call the past [21, p. 486].
Similar examples abound in the literature. For instance, concerning the quasar-galaxy version of Wheeler’s delayed choice experiment, Anton Zeilinger remarks:
We decide, by choosing the measuring device, which phenomenon can become reality and which one cannot. Wheeler explicates this by example of the well-known case of a quasar, of which we can see two pictures through the gravity lens action of a galaxy that lies between the quasar and ourselves. By choosing which instrument to use for observing the light coming from that quasar, we can decide here and now whether the quantum phenomenon in which the photons take part is interference of amplitudes passing on both side of the galaxy or whether we determine the path the photon took on one or the other side of the galaxy [24, pp. 191–192].
Occasionally instead of stating that future actions can determine whether the particle passes “on both sides of the galaxy” (or through both slits of a two-slit experiment) or only “on one or the other side” (or through only one slit), the euphemism is used of saying the photon acts like a wave or particle depending on the future actions.
The important conclusion is that, while individual events just happen, their physical interpretation in terms of wave or particle might depend on the future; it might particularly depend on decisions we might make in the future concerning the measurement performed at some distant spacetime location in the future [23, p. 207].
The purpose of this paper is to show how the delayed-choice experiments can be interpreted without involving retrocausality. Initially, the arguments are informal and then slowly a more formal analysis (far short of a full-blown mathematical treatment) is introduced to make the necessary points. The delayed-choice experiments in question involve a certain type of separation such as the:

double-slit experiments,

which-way interferometer experiments,

polarization analyzer experiments,

Stern-Gerlach experiments, and

quantum eraser experiments.
The fallacy can be first described in general terms. In each case, given an incoming quantum particle, the “separation” apparatus creates a superposition of certain eigenstates; it is not a measurement. Detectors can be placed in certain positions so that when the evolving superposition state is finally projected or collapsed by the detectors, then only one of the eigenstates can register at each detector. It is then fallaciously assumed that the particle had already been projected or collapsed to an eigenstate at the entrance to the separation apparatus—as if it had been a measurement.
And if, after the particle had entered the apparatus, the delayed choice is made to suddenly remove the detectors (prior to arrival of the particle), then the superposition would continue to evolve and have distinctive effects (e.g., interference patterns in the two-slit experiment). Hence the fallacy makes it seem that by the delayed choice to insert or remove the appropriately positioned detectors or measurement devices, one can retrocause either a collapse to an eigenstate or not at the particle’s entrance into the apparatus.
Double-slit experiments
In the well-known setup for the double-slit experiment, if detectors \(D_{1}\) and \(D_{2}\) are placed a small finite distance after the slits so a particle “going through the other slit” cannot reach the detector, then this is seen as “measuring which slit the particle went through” and a hit at a detector is usually interpreted as “the particle went through that slit.”
The natural image for this behaviour is that of a particle that passes either through hole [slit 1] or through [slit 2], but not through both holes [2, p. 45].
But this is incorrect. The particle is in a superposition state, which we might represent schematically as \(\left| S1\right\rangle +\left| S2\right\rangle \), that evolves until it hits the detector which projects (or collapses) the superposition to one of (the evolved versions of) the slit-eigenstates. The particle’s state was not collapsed earlier so it was not previously in the \(\left| S1\right\rangle \) or \(\left| S2\right\rangle \) eigenstate, i.e., it did not “go through slit 1” or “go through slit 2” (Fig. 1).
Thus what is called “detecting which slit the particle went through” is a misinterpretation. It is only placing a detector in such a position so that when the superposition projects to an eigenstate, only one of the eigenstates can register in that detector. It is about the spatial detector placement; it is not about which-slit information.
By erroneously talking about the detector “showing the particle went through slit 1,” we imply a type of retrocausality. If the detector is suddenly removed after the particle has passed the slits, then the superposition state continues to evolve and shows interference on the far wall (not shown)—in which case people say “the particle went through both slits.” Thus the “bad talk” makes it seem that by removing or inserting the detector after the particle is beyond the slits, one can retrocause the particle to go through “both slits” or one slit only.
This sudden removal or insertion of detectors that can only detect one of the slit-eigenstates is a version of Wheeler’s delayed choice thought-experiment [19] (Fig. 2).
In Wheeler’s version of the experiment, there are two detectors which are positioned behind the removable screen so they can only detect one of the projected (evolved) slit eigenstates when the screen is removed. The choice to remove the screen or not is delayed until after a photon has traversed the two slits.
In the one case [screen in place] the quantum will transform a grain of silver bromide and contribute to the record of a two-slit interference fringe. In the other case [screen removed] one of the two counters will go off and signal in which beam—and therefore from which slit—the photon has arrived [19, p. 13].
The fallacy is involved when Wheeler infers from the fact that one of the specially placed detectors went off that the photon had come from one of the slits, as if there had been a projection or collapse to one of the slit eigenstates at the slits rather than later at the detectors. Wheeler makes a similar mistake when he infers, from a photon now registering a certain polarization upon measurement, that the photon always had that polarization. Hence by changing the angle of the polarization detector we could seem to change the polarization in the past.
Which-way interferometer experiments
Consider a Mach-Zehnder-style interferometer with only one beamsplitter (e.g., a half-silvered mirror), the “separation,” which creates the photon superposition \(\left| T1\right\rangle +\left| R1\right\rangle \) (which stand for “Transmit” to the upper arm or “Reflect” into the lower arm at the first beamsplitter) (Fig. 3).
When detector \(D_{1}\) registers a hit, it is said that “the photon was reflected and thus took the lower arm” of the interferometer and similarly for \(D_{2}\) and passing through into the upper arm.
Indeed, if we want to visualize what happens in this experiment, the only possible image is that “something” is either reflected, or transmitted, on the beamsplitter, but it is not split: this corresponds to the behavior of a classical particle [2, p. 58].
So we can say in this case, without fear of paradox, that each photon went through just one path through the beamsplitter. In fact, if the photon were to take both paths, it would be hard to understand why it should appear to have taken just one or the other paths, why, that is, it is detected at [D1] (say) rather than at both [D1 and D2] [9, p. 40].
This is the interferometer analogue of putting two up-close detectors after the two slits in the two-slit experiment.
And this standard description is incorrect for the same reasons. The photon stays in the superposition state until the detectors force a projection to one of the (evolved) eigenstates. If the projection is to the evolved \(\left| R1\right\rangle \) eigenstate then only \(D_{1}\) will get a hit, and similarly for \(D_{2}\) and the evolved version of \(\left| T1\right\rangle \). The point is that the placement of the detectors (like in the double-slit experiment) only captures one or the other of the projected eigenstates, but that does not mean the photon was in that eigenstate prior to the detection/measurement.
Now insert a second beamsplitter as in the following diagram (Fig. 4).
It is said that the second beamsplitter “erases” the “which-way information” so that a hit at either detector could have come from either arm, and thus an interference pattern emerges by varying the phase \(\phi \).
But this is also incorrect. There is no “which-way information” to be erased in the superposition state \(\left| T1\right\rangle +\left| R1\right\rangle \), which is further transformed at the second beamsplitter (where \(\left| T2\right\rangle \), \(\left| R2\right\rangle \) refer to transmit or reflect at the second beamsplitter) to the superposition \(\left| T1,T2\right\rangle +\left| T1,R2\right\rangle +\left| R1,T2\right\rangle +\left| R1,R2\right\rangle \) that can be regrouped according to what can register at each detector:
\[\left[ \left| T1,R2\right\rangle +\left| R1,T2\right\rangle \right] _{D_{1}}+\left[ \left| T1,T2\right\rangle +\left| R1,R2\right\rangle \right] _{D_{2}}.\]
The so-called “which-way information” was not there to be “erased” since the particle did not take one way or the other in the first place. The second beamsplitter only allows the superposition state \(\left[ \left| T1,R2\right\rangle +\left| R1,T2\right\rangle \right] _{D_{1}}\) to be registered at \(D_{1}\) or the superposition state \(\left[ \left| T1,T2\right\rangle +\left| R1,R2\right\rangle \right] _{D_{2}}\) to be registered at \(D_{2}\). By using a phase shifter \(\phi \), an interference pattern can be recorded at each detector since each one is now detecting a superposition that will involve interference.
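The grouping by detector can be checked with a few lines of arithmetic. The following is a minimal sketch rather than part of the original analysis. It assumes a symmetric 50/50 beamsplitter with transmission amplitude \(1/\sqrt{2}\) and reflection amplitude \(i/\sqrt{2}\) (one common convention; the text does not fix one) and a phase shifter \(\phi \) placed in the upper arm. The function names are purely illustrative.

```python
import cmath
import math

# Assumed convention for a symmetric 50/50 beamsplitter.
T = 1 / math.sqrt(2)   # transmission amplitude
R = 1j / math.sqrt(2)  # reflection amplitude

def one_beamsplitter(phi):
    """Detection probabilities (D1, D2) with only the first beamsplitter."""
    upper = T * cmath.exp(1j * phi)  # evolved |T1>, phase shifter in upper arm
    lower = R                        # evolved |R1>
    # D1 can only register the evolved |R1>, D2 only the evolved |T1>.
    return abs(lower) ** 2, abs(upper) ** 2

def two_beamsplitters(phi):
    """Detection probabilities (D1, D2) with the second beamsplitter inserted."""
    upper = T * cmath.exp(1j * phi)
    lower = R
    d1 = upper * R + lower * T  # |T1,R2> + |R1,T2> registers at D1
    d2 = upper * T + lower * R  # |T1,T2> + |R1,R2> registers at D2
    return abs(d1) ** 2, abs(d2) ** 2
```

Under these assumptions the single-beamsplitter probabilities stay at 1/2 for every \(\phi \), while with the second beamsplitter they vary as \(\cos ^{2}(\phi /2)\) at \(D_{1}\) and \(\sin ^{2}(\phi /2)\) at \(D_{2}\). At no step does the calculation assign the photon to one arm.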
By inserting or removing the second beamsplitter after the particle has traversed the first beamsplitter (as in [19] or [20]), the fallacy makes it seem that we can retrocause the particle to go through both arms or only one arm.
Polarization analyzers and loops
Another common textbook example of the fallacy is the treatment of polarization analyzers such as calcite crystals that are incorrectly said to create two orthogonally polarized beams in the upper and lower channels, say \(\left| v\right\rangle \) and \(\left| h\right\rangle \), from an arbitrary incident beam (Fig. 5).
The output from the analyzer \(P\) is routinely described as a “vertically polarized” beam and “horizontally polarized” beam as if the analyzer was itself a measurement that collapsed or projected the incident beam to either of those polarization eigenstates. This seems to follow because if one positions a detector in the upper beam then only vertically polarized photons are observed and similarly for the lower beam and horizontally polarized photons. A blocking mask in one of the beams has the same effect as a detector to project the photons to eigenstates. If a blocking mask is inserted in the lower beam, then only vertically polarized photons will be found in the upper beam, and vice versa.
But here again, the story is about the spatial placement of the detector (or blocking mask); it is not about the analyzer supposedly projecting a photon into one or the other of the eigenstates. The analyzer puts the incident photons into a superposition state, a superposition state that correlates the compatible polarization and spatial modes for a particle. This could be schematically represented as:
\[\left| \text {vertical, upper}\right\rangle +\left| \text {horizontal, lower}\right\rangle .\]
There is a certain ambiguity in the practice of representing two eigenstates or eigenvalues in a ket: \(\left| \text {state 1, state 2}\right\rangle \). This could be interpreted as shorthand for:

1. the tensor product of two states \(\left| \text {state 1}\right\rangle \otimes \left| \text {state 2}\right\rangle \) of two particles, so that a superposition would be an entangled state, or

2. two eigenstates of one particle for two compatible observables (e.g., as in Dirac’s complete set of commuting observables), in which case we could consider a superposition that correlates those single-particle states.
We are using in this section the second, single-particle version of the correlated superposition:
\[\left| \text {vertical, upper}\right\rangle +\left| \text {horizontal, lower}\right\rangle .\]
This sort of a superposition state is thus formally similar to an entangled state but only involves a single particle.
If a polarization detector is spatially placed in, say, the upper channel and it registers a hit, then that is the measurement that collapses the evolved superposition state to \(\left| \text {vertical, upper}\right\rangle \) so only a vertically polarized photon will register in the upper detector, and similarly for the lower channel. Thus it is misleadingly said that the “upper beam” was already vertically polarized and the “lower beam” was already horizontally polarized as if the analyzer had already done the projection to one of the eigenstates.
If the analyzer had in fact induced a collapse to the eigenstates, then any prior polarization of the incident beam would be lost. Hence assume that the incident beam was prepared in a specific polarization of, say, \(\left| 45^{\circ }\right\rangle \) halfway between the states of vertical and horizontal polarization. Then follow the \(vh\)-analyzer \(P\) with its inverse \(P^{-1}\) to form an analyzer loop [8] (Fig. 6).
The characteristic feature of an analyzer loop is that it outputs the same polarization, in this case \(\left| 45^{\circ }\right\rangle \), as the incident beam. This would be impossible if the \(P\) analyzer had in fact rendered all the photons into a vertical or horizontal eigenstate thereby destroying the information about the polarization of the incident beam. But since no collapsing measurement was in fact made in \(P\) or its inverse, the original beam can be the output of an analyzer loop.
Some texts do not realize there is a problem with presenting a polarization analyzer such as a calcite crystal as creating two beams with orthogonal eigenstate polarizations—rather than creating a superposition state so that appropriately positioned detectors can detect only one eigenstate when the detectors cause the projections to the eigenstates.
One (partial) exception is Dicke and Wittke’s text [5]. At first they present polarization analyzers as if they measured polarization and thus “destroyed completely any information that we had about the polarization” [5, p. 118] of the incident beam. But then they note a problem:
The equipment [polarization analyzers] has been described in terms of devices which measure the polarization of a photon. Strictly speaking, this is not quite accurate [5, p. 118].
They then go on to consider the inverse analyzer \(P^{-1}\) which combined with \(P\) will form an analyzer loop that just transmits the incident photon unchanged.
They have some trouble squaring this with their prior statement about the \(P\) analyzer destroying the polarization of the incident beam but they struggle with getting it right.
Stating it another way, although [when considered by itself] the polarization \(P\) completely destroyed the previous polarization \(Q\) [of the incident beam], making it impossible to predict the result of the outcome of a subsequent measurement of \(Q\), in [the analyzer loop] the disturbance of the polarization which was effected by the box \(P\) is seen to be revocable: if the box \(P\) is combined with another box of the right type, the combination can be such as to leave the polarization \(Q\) unaffected [5, p. 119].
They then go on to correctly note that the polarization analyzer \(P\) did not in fact project the incident photons into polarization eigenstates.
However, it should be noted that in this particular case [sic!], the first box \(P\) in [the first half of the analyzer loop] did not really measure the polarization of the photon: no determination was made of the channel \(\ldots \) which the photon followed in leaving the box \(P\) [5, p. 119].
Their phrase “in this particular case” makes it seem that the delayed choice to omit or add the second half \(P^{-1}\) of the analyzer loop will retrocause a measurement to (respectively) be made or not made in the first box \(P\).
There is some classical imagery (like Schrödinger’s cat running around one side or the other side of a tree) that is sometimes used to illustrate quantum separation experiments when in fact it only illustrates how classical imagery can be misleading. Suppose an interstate highway separates at a city into both northern and southern bypass routes, like the two channels in a polarization analyzer loop. One can observe the bypass routes while a car is in transit and find that it is in one bypass route or another. But after the car transits whichever bypass it took without being observed and rejoins the undivided interstate, then it is said that the which-way information is erased so an observation cannot elicit that information.
This is not a correct description of the corresponding quantum separation experiment since the classical imagery does not contemplate superposition states. The particle-as-car is in a superposition of the two routes until an observation (e.g., a detector or “road block”) collapses the superposition to one eigenstate or the other. Thus when the undetected particle rejoins the undivided “interstate,” there was no which-way information to be erased. Correct descriptions of quantum separation experiments require taking superposition seriously, so classical imagery should only be used cum grano salis.
This analysis might be rendered in a more technical but highly schematic way. The photons in the incident beam have a particular polarization \(\left| \psi \right\rangle \) such as \(\left| 45^{\circ }\right\rangle \) in the above example. This polarization state can be represented or resolved in terms of the \(vh\)-basis as:
\[\left| \psi \right\rangle =\left\langle v|\psi \right\rangle \left| v\right\rangle +\left\langle h|\psi \right\rangle \left| h\right\rangle .\]
The effect of the \(vh\)-analyzer \(P\) might be represented as correlating the vertical and horizontal polarization states with the upper (\(U\)) and lower (\(L\)) channels so the \(vh\)-analyzer puts an incident photon into the one-particle correlated superposition state:
\[\left\langle v|\psi \right\rangle \left| v,U\right\rangle +\left\langle h|\psi \right\rangle \left| h,L\right\rangle ,\]
not into a mixture of an eigenstate \(\left| v\right\rangle \) in the upper channel (\(\left| v,U\right\rangle \)) or an eigenstate \(\left| h\right\rangle \) in the lower channel (\(\left| h,L\right\rangle \)).
If a blocker or detector were inserted in either channel, then this superposition state would project to one of the eigenstates, and then, as indicated by the spatial modes that bring detector placement into play, only vertically polarized photons would be found in the upper channel and horizontally polarized photons in the lower channel.
The separation fallacy is to describe the \(vh\)-analyzer as if the analyzer’s effect by itself was to project an incident photon either into \(\left| v\right\rangle \) in the upper channel or \(\left| h\right\rangle \) in the lower channel (a mixed state), instead of only creating the above correlated superposition state.
In the analyzer loop, no measurement (detector or blocker) is made after the \(vh\)-analyzer. It is followed by the inverse \(vh\)-analyzer \(P^{-1}\) which has the inverse effect of removing the \(U\) and \(L\) spatial modes from the superposition state \(\left\langle v|\psi \right\rangle \left| v,U\right\rangle +\left\langle h|\psi \right\rangle \left| h,L\right\rangle \) so that a photon exits the loop in the superposition state:
\[\left\langle v|\psi \right\rangle \left| v\right\rangle +\left\langle h|\psi \right\rangle \left| h\right\rangle .\]
The inverse \(vh\)-analyzer does not “erase” the which-polarization information since there was no measurement (to reduce the superposition state to eigenstate polarizations in the channels of the analyzer loop) in the first place. The inverse \(vh\)-analyzer does remove the correlation with the two spatial modes so the original state \(\left\langle v|\psi \right\rangle \left| v\right\rangle +\left\langle h|\psi \right\rangle \left| h\right\rangle =\left| \psi \right\rangle \) is restored.
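The contrast between the correlated superposition and the mixture assumed by the fallacy can be made concrete. The sketch below is illustrative only: it represents a polarization state as a pair of amplitudes for \(v\) and \(h\), models the analyzer and its inverse as simple relabelings, and compares the chance of passing a final \(45^{\circ }\) filter in the two readings. The function names and the dictionary representation are assumptions made for this example.

```python
import math

S = 1 / math.sqrt(2)
PSI_45 = (S, S)  # incident |45 deg> as (amplitude of v, amplitude of h)

def analyzer(psi):
    """P: correlate v with the upper channel U and h with the lower channel L."""
    a_v, a_h = psi
    return {("v", "U"): a_v, ("h", "L"): a_h}

def inverse_analyzer(state):
    """P^{-1}: remove the channel labels, leaving a polarization state."""
    return (state.get(("v", "U"), 0.0), state.get(("h", "L"), 0.0))

def pass_probability(psi, test):
    """Probability that polarization psi passes a filter set to `test`."""
    overlap = sum(t.conjugate() * a for t, a in zip(test, psi))
    return abs(overlap) ** 2

# Analyzer loop acting on the superposition: the 45 degree state is recovered.
loop_output = inverse_analyzer(analyzer(PSI_45))
p_superposition = pass_probability(loop_output, PSI_45)

# Fallacious reading: P already collapsed each photon to v or h,
# giving a half-half mixture for a 45 degree input.
p_mixture = (0.5 * pass_probability((1.0, 0.0), PSI_45)
             + 0.5 * pass_probability((0.0, 1.0), PSI_45))
```

With the superposition, every photon leaving the loop passes the \(45^{\circ }\) filter; with the supposed mixture only half would. The loop therefore distinguishes the two descriptions.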
Stern-Gerlach experiments
We have seen the fallacy in the standard treatments of the double-slit experiment, which-way interferometer experiments, and in polarization analyzers. In spite of the differences between those separation experiments, there was that common (mis)interpretative theme. Since the “logic” of the polarization analyzers is followed in the Stern-Gerlach experiment (with spin playing the role of polarization), it is not surprising that the same fallacy occurs there. Many texts represent the Stern-Gerlach apparatus as separating particles into spin eigenstates denoted by, say, \(+S\), \(0S\), \(-S\) (Fig. 7).
But a careful “expert” analysis of the experiment (e.g., [10, p. 171]) shows the apparatus does not project the particles to eigenstates. Instead it creates a single particle superposition state that correlates spin with spatial location so that with a detector in a certain position, it will only see particles of one spin state (analogous to the previous polarization example). If the collapse is caused by placing blocking masks over two of the beams, then the particles in the third beam will all be those that have collapsed to the same eigenstate. It is the detectors or blockers that cause the collapse or projection to eigenstates, not the prior separation apparatus.
And again, the fallacy is revealed by considering the Stern-Gerlach analogue of an analyzer loop. The idea of a Stern-Gerlach loop seems to have been first broached by Bohm [3, 22.11] and was later used by Eugene Wigner [22]. One of the few texts to consider such a Stern-Gerlach analyzer loop is The Feynman Lectures on Physics: Quantum Mechanics (Vol. III) where it is called a “modified Stern-Gerlach apparatus” [7, p. 5–12] (Fig. 8).
We previously saw how a polarization analyzer, contrary to the statement in many texts, does not lose the polarization information of the incident beam when it “separates” the beam (into a positionally correlated superposition state). In the context of the Stern-Gerlach apparatus, Feynman similarly remarks:
“Some people would say that in the filtering by \(T\) we have ‘lost the information’ about the previous state (\(+S\)) because we have ‘disturbed’ the atoms when we separated them into three beams in the apparatus \(T\). But that is not true. The past information is not lost by the separation into three beams, but by the blocking masks that are put in...” [7, p. 5–19 (italics in original)].
The separation fallacy
We have seen the same fallacy of interpretation in two-slit experiments, which-way interferometer experiments, polarization analyzers, and Stern-Gerlach experiments. The common element in all the cases is that there is some separation apparatus that puts a particle into a certain superposition of spatially “entangled” or correlated eigenstates in such a manner that when an appropriately spatially positioned detector induces a collapse to an eigenstate, then the detector will only register one of the eigenstates. The separation fallacy is that this is misinterpreted as showing that the particle was already in that eigenstate in that position as a result of the previous “separation.” In fact the superposition evolves until some distinction is made that constitutes a measurement, and only then is the state reduced to an eigenstate. The quantum erasers are more elaborate versions of these simpler experiments, and a similar separation fallacy arises in that context.
One photon quantum eraser experiment
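A simple quantum eraser can be devised using a single beam of photons as in [12]. We start with the two-slit setup where a \(+45^{\circ }\) polarizer is placed in front of the slits to control the incoming polarization. Here we will represent the system after the polarizer as a tensor product with the second component giving the polarization state. Only one particle is involved but spatial mode and polarization are compatible observables, so the tensor product can be used for the more technical calculations below. The evolving state after the two slits is the superposition (Fig. 9):
\[\frac{1}{\sqrt{2}}\left( \left| S1\right\rangle \otimes \left| 45^{\circ }\right\rangle +\left| S2\right\rangle \otimes \left| 45^{\circ }\right\rangle \right) .\]
Then a horizontal polarizer is inserted after slit 1 and a vertical polarizer after slit 2. This will change the evolving state to \(\frac{1}{\sqrt{2}}\left( \left| S1\right\rangle \otimes \left| H\right\rangle +\left| S2\right\rangle \otimes \left| V\right\rangle \right) \), but since these new polarizers involve some measurements, not just unitary evolution, it may be helpful to go through the calculation in some detail. The state that “hits” the \(H,V\) polarizers is:
\[\frac{1}{\sqrt{2}}\left( \left| S1\right\rangle \otimes \left| 45^{\circ }\right\rangle +\left| S2\right\rangle \otimes \left| 45^{\circ }\right\rangle \right) .\]
The \(45^{\circ }\) polarization state can be resolved by inserting the identity operator \(I=\left| H\right\rangle \left\langle H\right| +\left| V\right\rangle \left\langle V\right| \) to get:
\[\left| 45^{\circ }\right\rangle =\left( \left| H\right\rangle \left\langle H\right| +\left| V\right\rangle \left\langle V\right| \right) \left| 45^{\circ }\right\rangle =\left\langle H|45^{\circ }\right\rangle \left| H\right\rangle +\left\langle V|45^{\circ }\right\rangle \left| V\right\rangle =\frac{1}{\sqrt{2}}\left[ \left| H\right\rangle +\left| V\right\rangle \right] .\]
Substituting this for \(\left| 45^{\circ }\right\rangle \), we have the state that hits the \(H,V\) polarizers as:
\[\frac{1}{\sqrt{2}}\left( \left| S1\right\rangle \otimes \frac{1}{\sqrt{2}}\left[ \left| H\right\rangle +\left| V\right\rangle \right] +\left| S2\right\rangle \otimes \frac{1}{\sqrt{2}}\left[ \left| H\right\rangle +\left| V\right\rangle \right] \right) =\frac{1}{2}\left[ \left| S1\right\rangle \otimes \left| H\right\rangle +\left| S1\right\rangle \otimes \left| V\right\rangle +\left| S2\right\rangle \otimes \left| H\right\rangle +\left| S2\right\rangle \otimes \left| V\right\rangle \right] \]
which can be regrouped in two parts as:
\[\frac{1}{2}\left[ \left| S1\right\rangle \otimes \left| H\right\rangle +\left| S2\right\rangle \otimes \left| V\right\rangle \right] +\frac{1}{2}\left[ \left| S1\right\rangle \otimes \left| V\right\rangle +\left| S2\right\rangle \otimes \left| H\right\rangle \right] .\]
Then the \(H,V\) polarizers are making a degenerate measurement that gives the first state \(\left| S1\right\rangle \otimes \left| H\right\rangle +\left| S2\right\rangle \otimes \left| V\right\rangle \) with probability \(\left( \frac{1}{2}\right) ^{2}+\left( \frac{1}{2}\right) ^{2}=\frac{1}{2}\). The other state \(\left| S1\right\rangle \otimes \left| V\right\rangle +\left| S2\right\rangle \otimes \left| H\right\rangle \) is obtained with the same probability, and it is blocked by the polarizers. Thus the state that evolves is the state (after being normalized):
\[\frac{1}{\sqrt{2}}\left( \left| S1\right\rangle \otimes \left| H\right\rangle +\left| S2\right\rangle \otimes \left| V\right\rangle \right) .\]
After the two slits, a photon is in a state that entangles the spatial slit states and the polarization states (for a discussion of this type of entanglement, see [14]). But as this superposition evolves, it cannot be separated into a superposition of the slit-states as before, so the interference disappears (Fig. 10).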
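Technically, if \(P_{\Delta y}\) is the projection operator representing finding a particle in the region \(\Delta y\) along the wall, then that probability in the state \(\frac{1}{\sqrt{2}}\left[ \left| S1\right\rangle \otimes \left| H\right\rangle +\left| S2\right\rangle \otimes \left| V\right\rangle \right] \) is:
\[\begin{aligned}
&\frac{1}{2}\left( \left\langle S1\right| \otimes \left\langle H\right| +\left\langle S2\right| \otimes \left\langle V\right| \right) \left( P_{\Delta y}\otimes I\right) \left( \left| S1\right\rangle \otimes \left| H\right\rangle +\left| S2\right\rangle \otimes \left| V\right\rangle \right) \\
&\quad =\frac{1}{2}\left[ \left\langle S1|P_{\Delta y}|S1\right\rangle \left\langle H|H\right\rangle +\left\langle S1|P_{\Delta y}|S2\right\rangle \left\langle H|V\right\rangle +\left\langle S2|P_{\Delta y}|S1\right\rangle \left\langle V|H\right\rangle +\left\langle S2|P_{\Delta y}|S2\right\rangle \left\langle V|V\right\rangle \right] \\
&\quad =\frac{1}{2}\left[ \left\langle S1|P_{\Delta y}|S1\right\rangle +\left\langle S2|P_{\Delta y}|S2\right\rangle \right]
\end{aligned}\]
which is the average of separate slit probabilities that shows no interference. The key step is how the orthogonal polarization markings decohered the state since \(\left\langle H|V\right\rangle =0=\left\langle V|H\right\rangle \) and thus eliminated the interference between the \(S1\) and \(S2\) terms. The state-reduction occurs only when the evolved superposition state hits the far wall which measures the positional component (i.e., \(P_{\Delta y}\)) of the entangled state and shows the non-interference pattern.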
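The role of the polarization inner product in the cross terms can be illustrated numerically. The following is a rough sketch under simplifying assumptions that are not in the text: at a given point on the wall the two slit amplitudes are taken to have unit modulus and a relative phase \(\delta \), and the function returns a relative intensity rather than a normalized probability. All names are hypothetical.

```python
import cmath
import math

H = (1.0, 0.0)  # horizontal polarization as (H amplitude, V amplitude)
V = (0.0, 1.0)  # vertical polarization

def inner(p, q):
    """Inner product <p|q> of two polarization vectors."""
    return sum(a.conjugate() * b for a, b in zip(p, q))

def wall_intensity(delta, pol1, pol2):
    """Relative intensity at a wall point for (|S1>|pol1> + |S2>|pol2>)/sqrt(2)."""
    a1 = 1.0 + 0j               # assumed slit-1 amplitude at this point
    a2 = cmath.exp(1j * delta)  # assumed slit-2 amplitude, relative phase delta
    # The S1-S2 cross term is weighted by the polarization overlap <pol1|pol2>.
    cross = 2 * (a1.conjugate() * a2 * inner(pol1, pol2)).real
    return 0.5 * (abs(a1) ** 2 + abs(a2) ** 2 + cross)
```

With orthogonal markings, `wall_intensity(delta, H, V)` is the same for every `delta`, which is the flat sum of the separate slit contributions. With identical polarizations, `wall_intensity(delta, H, H)` oscillates as \(1+\cos \delta \).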
The key point is that in spite of the bad terminology of “which-way” or “which-slit” information, the polarization markings do NOT create a half-half mixture of horizontally polarized photons going through slit 1 and vertically polarized photons going through slit 2. They create the superposition (pure) state \(\frac{1}{\sqrt{2}}\left[ \left| S1\right\rangle \otimes \left| H\right\rangle +\left| S2\right\rangle \otimes \left| V\right\rangle \right] \) which evolves until measured at the wall.
This can be seen by inserting a \(+45^{\circ }\) polarizer between the two-slit screen and the far wall.
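Each of the horizontal and vertical polarization states can be represented as a superposition of \(+45^{\circ }\) and \(-45^{\circ }\) polarization states. Just as the horizontal polarizer in front of slit 1 threw out the vertical component so we have no \(\left| S1\right\rangle \otimes \left| V\right\rangle \) term in the superposition, so now the \(+45^{\circ }\) polarizer throws out the \(-45^{\circ }\) component of each of the \(\left| H\right\rangle \) and \(\left| V\right\rangle \) terms so the state transformation is:
\[\frac{1}{\sqrt{2}}\left[ \left| S1\right\rangle \otimes \left| H\right\rangle +\left| S2\right\rangle \otimes \left| V\right\rangle \right] \rightarrow \frac{1}{\sqrt{2}}\left( \left| S1\right\rangle +\left| S2\right\rangle \right) \otimes \left| +45^{\circ }\right\rangle .\]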
It might be useful to again go through the calculation in some detail.

1. \(\left| H\right\rangle =\left( \left| +45^{\circ }\right\rangle \left\langle +45^{\circ }\right| +\left| -45^{\circ }\right\rangle \left\langle -45^{\circ }\right| \right) \left| H\right\rangle =\left\langle +45^{\circ }|H\right\rangle \left| +45^{\circ }\right\rangle +\left\langle -45^{\circ }|H\right\rangle \left| -45^{\circ }\right\rangle \) and since a horizontal vector at \(0^{\circ }\) is the (normalized) sum of the \(+45^{\circ }\) vector and the \(-45^{\circ }\) vector, \(\left\langle +45^{\circ }|H\right\rangle =\left\langle -45^{\circ }|H\right\rangle =\frac{1}{\sqrt{2}}\) so that: \(\left| H\right\rangle =\frac{1}{\sqrt{2}}\left[ \left| +45^{\circ }\right\rangle +\left| -45^{\circ }\right\rangle \right] \).
2. \(\left| V\right\rangle =\left( \left| +45^{\circ }\right\rangle \left\langle +45^{\circ }\right| +\left| -45^{\circ }\right\rangle \left\langle -45^{\circ }\right| \right) \left| V\right\rangle =\left\langle +45^{\circ }|V\right\rangle \left| +45^{\circ }\right\rangle +\left\langle -45^{\circ }|V\right\rangle \left| -45^{\circ }\right\rangle \) and since a vertical vector at \(90^{\circ }\) is the (normalized) sum of the \(+45^{\circ }\) vector and the negative of the \(-45^{\circ }\) vector, \(\left\langle +45^{\circ }|V\right\rangle =\frac{1}{\sqrt{2}}\) and \(\left\langle -45^{\circ }|V\right\rangle =-\frac{1}{\sqrt{2}}\) so that: \(\left| V\right\rangle =\frac{1}{\sqrt{2}}\left[ \left| +45^{\circ }\right\rangle -\left| -45^{\circ }\right\rangle \right] \).
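Hence making the substitutions gives:
\[\frac{1}{\sqrt{2}}\left[ \left| S1\right\rangle \otimes \left| H\right\rangle +\left| S2\right\rangle \otimes \left| V\right\rangle \right] =\frac{1}{\sqrt{2}}\left[ \left| S1\right\rangle \otimes \frac{1}{\sqrt{2}}\left[ \left| +45^{\circ }\right\rangle +\left| -45^{\circ }\right\rangle \right] +\left| S2\right\rangle \otimes \frac{1}{\sqrt{2}}\left[ \left| +45^{\circ }\right\rangle -\left| -45^{\circ }\right\rangle \right] \right] .\]
We then regroup the terms according to the measurement being made by the \(+45^{\circ }\) polarizer:
\[=\frac{1}{2}\left( \left| S1\right\rangle +\left| S2\right\rangle \right) \otimes \left| +45^{\circ }\right\rangle +\frac{1}{2}\left( \left| S1\right\rangle -\left| S2\right\rangle \right) \otimes \left| -45^{\circ }\right\rangle .\]
Then with probability \(\left( \frac{1}{2}\right) ^{2}+\left( \frac{1}{2}\right) ^{2}=\frac{1}{2}\), the \(+45^{\circ }\) polarization measurement passes the state \(\left( \left| S1\right\rangle +\left| S2\right\rangle \right) \otimes \left| +45^{\circ }\right\rangle \) and blocks the state \(\left( \left| S1\right\rangle -\left| S2\right\rangle \right) \otimes \left| -45^{\circ }\right\rangle \). Hence the normalized state that evolves is: \(\frac{1}{\sqrt{2}}\left( \left| S1\right\rangle +\left| S2\right\rangle \right) \otimes \left| +45^{\circ }\right\rangle \), as indicated above.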
Then at the wall, the positional measurement \(P_{\Delta y}\) acts on the first component, the evolved superposition \(\left| S1\right\rangle +\left| S2\right\rangle \), which again shows an interference pattern. But it is not the same as the original interference pattern before the \(H,V\) or \(+45^{\circ }\) polarizers were inserted. This “shifted” interference pattern is called the fringe pattern of Fig. 11.
Alternatively we could insert a \(-45^{\circ }\) polarizer, which would transform the state \(\frac{1}{\sqrt{2}}\left[ \left| S1\right\rangle \otimes \left| H\right\rangle +\left| S2\right\rangle \otimes \left| V\right\rangle \right] \) into \(\frac{1}{\sqrt{2}}\left( \left| S1\right\rangle -\left| S2\right\rangle \right) \otimes \left| -45^{\circ }\right\rangle \). This produces the interference pattern from the “other half” of the photons, which is called the antifringe pattern (Fig. 12).
The all-the-photons sum of the fringe and antifringe patterns reproduces the “mush” noninterference pattern of Fig. 10.
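As a numerical sanity check on this sum, here is a minimal Python sketch. It assumes a toy model that is not part of the experiment's description: two point-like slits whose evolved amplitudes at wall position `y` are unit phasors. The function names and the geometry parameters (`slit_sep`, `wavelength`, `distance`) are hypothetical. The amplitudes \((\psi _{1}\pm \psi _{2})/2\) come from projecting the entangled state onto the \(\pm 45^{\circ }\) polarizations.

```python
import cmath
import math


def slit_amplitudes(y, slit_sep=1.0, wavelength=0.1, distance=50.0):
    """Toy amplitudes at wall position y evolved from slit 1 and slit 2.

    Assumption: unit-magnitude phasors whose phases are set by the path
    lengths from two point-like slits located at +/- slit_sep / 2.
    """
    k = 2 * math.pi / wavelength
    r1 = math.hypot(distance, y - slit_sep / 2)
    r2 = math.hypot(distance, y + slit_sep / 2)
    return cmath.exp(1j * k * r1), cmath.exp(1j * k * r2)


def fringe(y):
    """Relative density behind the +45 degree polarizer: |(psi1 + psi2) / 2|^2."""
    psi1, psi2 = slit_amplitudes(y)
    return abs((psi1 + psi2) / 2) ** 2


def antifringe(y):
    """Relative density behind the -45 degree polarizer: |(psi1 - psi2) / 2|^2."""
    psi1, psi2 = slit_amplitudes(y)
    return abs((psi1 - psi2) / 2) ** 2


def mush(y):
    """No-interference density: the average of the separate slit densities."""
    psi1, psi2 = slit_amplitudes(y)
    return 0.5 * (abs(psi1) ** 2 + abs(psi2) ** 2)


ys = [i * 0.5 for i in range(-20, 21)]
for y in ys:
    # The cross terms of the two patterns have opposite signs and cancel.
    assert math.isclose(fringe(y) + antifringe(y), mush(y), rel_tol=1e-9)
```

The identity holds for any pair of complex amplitudes, so the particular geometry chosen here does not matter for the check.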
This is one of the simplest examples of a quantum eraser experiment.

1. The insertion of the horizontal and vertical polarizers marks the photons with “which-slit” information that eliminates the interference pattern.
2. The insertion of a \(+45^{\circ }\) or \(-45^{\circ }\) polarizer “erases” the which-slit information so an interference pattern reappears.
But there is a mistaken interpretation of the quantum eraser experiment that leads one to infer that there is retrocausality.
[Start of erroneous example]

1. The markings by insertion of the horizontal and vertical polarizers create the half-half mixture where each photon is reduced to either a horizontally polarized photon that went through slit 1 or a vertically polarized photon that went through slit 2. Hence the photon “goes through one slit or the other.”
2. The insertion of the \(+45^{\circ }\) polarizer erases that which-slit information so interference reappears, which means that the photon had to “go through both slits.”
Hence the delayed choice to insert or not insert the \(+45^{\circ }\) polarizer, made after the photons have traversed the screen and \(H,V\) polarizers, retrocauses the photons either to go through both slits or to go through only one slit or the other.
[End of erroneous example]
Now we can see the importance of realizing that prior to inserting the \(+45^{\circ }\) polarizer, the photons were in the superposition (pure) state \(\frac{1}{\sqrt{2}}\left[ \left| S1\right\rangle \otimes \left| H\right\rangle +\left| S2\right\rangle \otimes \left| V\right\rangle \right] \), not a half-half mixture of the reduced states \(\left| S1\right\rangle \otimes \left| H\right\rangle \) or \(\left| S2\right\rangle \otimes \left| V\right\rangle \). The proof that the system was not in that mixture is obtained by inserting the \(+45^{\circ }\) polarizer which yields the (fringe) interference pattern.

1. If a photon had been, say, in the state \(\left| S1\right\rangle \otimes \left| H\right\rangle \) then, with \(50{\%}\) probability, the photon would have passed through the filter in the state \(\left| S1\right\rangle \otimes \left| +45^{\circ }\right\rangle \), but that would not yield any interference pattern at the wall since there was no contribution from slit 2.
2. The same holds if a photon in the state \(\left| S2\right\rangle \otimes \left| V\right\rangle \) hits the \(+45^{\circ }\) polarizer.
The fact that the insertion of the \(+45^{\circ }\) polarizer yielded interference proved that the incident photons were in a superposition (pure) state \(\frac{1}{\sqrt{2}}\left[ \left| S1\right\rangle \otimes \left| H\right\rangle +\left| S2\right\rangle \otimes \left| V\right\rangle \right] \) which, in turn, means there was no mixture of “going through one slit or the other” in case the \(+45^{\circ }\) polarizer had not been inserted.
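The contrast between the pure state and the supposed mixture can be illustrated with a short Python sketch. It is only a toy calculation under stated assumptions: the evolved slit amplitudes at the wall are modelled as the unit phasors \(e^{\pm iy}\), and the names `wall_amplitudes`, `pure_state_density` and `mixture_density` are hypothetical labels introduced here.

```python
import cmath
import math

SQRT_HALF = 1 / math.sqrt(2)
H = (1.0, 0.0)                      # polarization components in the (H, V) basis
V = (0.0, 1.0)
PLUS45 = (SQRT_HALF, SQRT_HALF)


def overlap(a, b):
    """Inner product <a|b> of two real polarization vectors."""
    return a[0] * b[0] + a[1] * b[1]


def wall_amplitudes(y):
    """Assumed unit phasors for the evolved |S1> and |S2> at wall position y."""
    return cmath.exp(1j * y), cmath.exp(-1j * y)


def pure_state_density(y):
    """Pure state (|S1>|H> + |S2>|V>)/sqrt(2) followed by the +45 polarizer.

    The two slit amplitudes are added before squaring, so a cross term appears.
    """
    psi1, psi2 = wall_amplitudes(y)
    amp = SQRT_HALF * (overlap(PLUS45, H) * psi1 + overlap(PLUS45, V) * psi2)
    return abs(amp) ** 2


def mixture_density(y):
    """Half-half mixture of |S1>|H> and |S2>|V> followed by the +45 polarizer.

    The two cases are squared separately and then averaged, so no cross term.
    """
    psi1, psi2 = wall_amplitudes(y)
    p1 = abs(overlap(PLUS45, H) * psi1) ** 2
    p2 = abs(overlap(PLUS45, V) * psi2) ** 2
    return 0.5 * p1 + 0.5 * p2


for y in (0.0, math.pi / 4, math.pi / 2):
    print(f"y={y:.3f}  pure={pure_state_density(y):.3f}  mixture={mixture_density(y):.3f}")
```

In this toy model the pure state gives a density that varies as \(\cos ^{2}y\), while the mixture gives a flat density, so only the pure state is compatible with an observed fringe pattern.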
Thus a correct interpretation of the quantum eraser experiment removes any inference of retrocausality and fully accounts for the experimentally verified facts given in the figures.
Two photon quantum eraser experiment
We now turn to one of the more elaborate quantum eraser experiments [18] (Fig. 13).
A photon hits a down-converter which emits a “signal” \(p\)-photon entangled with an “idler” \(s\)-photon with a superposition of orthogonal \(\left| x\right\rangle \) and \(\left| y\right\rangle \) polarizations so the overall state is:
The lower \(s\)-photon hits a double-slit screen, and will show an interference pattern on the \(D_{s}\) detector as the detector is moved along the \(x\)-axis.
Next two quarter-wave plates are inserted before the two-slit screen with the fast axis of the one over slit \(1\) oriented at \(\left| +45^{\circ }\right\rangle \) to the \(x\)-axis and the one over slit \(2\) with its fast axis oriented at \(\left| -45^{\circ }\right\rangle \) to the \(x\)-axis (Fig. 14).
Then Walborn et al. [18] give the overall state of the system as (where \(S1\) and \(S2\) refer to the two slits):
Then by measuring the linear polarization of the \(p\)-photon at \(D_{p}\) and the circular polarization at \(D_{s}\), “which-slit information” is said to be obtained and no interference pattern is recorded at \(D_{s}\).
For instance, measuring \(\left| x\right\rangle \) at \(D_{p}\) and \(\left| L\right\rangle \) at \(D_{s}\) implies \(S2\), i.e., slit \(2\). But as previously explained, this does not mean that the \(s\)-photon went through slit \(2\). It means we have positioned the two detectors in polarization space, say to measure \(\left| x\right\rangle \) polarization at \(D_{p}\) and \(\left| L\right\rangle \) polarization at \(D_{s}\), so only when the superposition state collapses to \(\left| x\right\rangle \) for the \(p\)-photon and \(\left| L\right\rangle \) for the \(s\)-photon do we get a hit at both detectors.
This is the analogue of the one-beam-splitter interferometer where the positioning of the detectors would only record one collapsed state, which did not imply the system was all along in that particular arm-eigenstate. The phrase “which-slit” or “which-arm information” is a misnomer in that it implies the system was already in a slit- or way-eigenstate and the so-called measurement only revealed the information. Instead, it is only at the measurement that there is a collapse or projection to an evolved slit-eigenstate (not at the previous separation due to the two slits).
Walborn et al. indulge in the separation fallacy when they discuss what the so-called “which-path information” reveals.
Let us consider the first possibility [detecting \(p\) before \(s\)]. If photon \(p\) is detected with polarization \(x\) (say), then we know that photon \(s\) has polarization \(y\) before hitting the \(\lambda /4\) plates and the double slit. By looking at [the above formula for \(\left| \Psi \right\rangle \)], it is clear that detection of photon \(s\) (after the double slit) with polarization \(R\) is compatible only with the passage of \(s\) through slit \(1\) and polarization \(L\) is compatible only with the passage of \(s\) through slit \(2\). This can be verified experimentally. In the usual quantum mechanics language, detection of photon \(p\) before photon \(s\) has prepared photon \(s\) in a certain state [18, p. 4].
Firstly, the measurement that \(p\) has polarization \(x\) after the \(s\)-photon has traversed the \(\lambda /4\) plates and two slits [see their Fig. 1] does not retrocause the \(s\)-photon to already have “polarization \(y\) before hitting the \(\lambda /4\) plates.” When photon \(p\) is measured with polarization \(x\), then the two-particle system is in the superposition state:
which means that the \(s\)-photon is still in the slit-superposition state: \(i\left| R,S1\right\rangle -i\left| L,S2\right\rangle \). Then only with the measurement of the circular polarization states \(L\) or \(R\) at \(D_{s}\) do we have the collapse to (the evolved version of) one of the slit eigenstates \(S1\) or \(S2\). It is an instance of the separation fallacy to infer “the passage of \(s\) through slit 1” or “slit 2”, i.e., \(S1\) or \(S2\), instead of the photon \(s\) being in the entangled superposition state \(\left| \Psi \right\rangle \) after traversing the slits.
Let us take a new polarization space basis of \(\left| +\right\rangle =+45^{\circ }\) to the \(x\)-axis and \(\left| -\right\rangle =-45^{\circ }\) to the \(x\)-axis. Then the overall state can be rewritten in terms of this basis as (see original paper for the details): (Fig. 15)
Then a \(\left| +\right\rangle \) polarizer or a \(\left| -\right\rangle \) polarizer is inserted in front of \(D_{p}\) to select \(\left| +\right\rangle _{p}\) or \(\left| -\right\rangle _{p}\) respectively. In the first case, this reduces the overall state \(\left| \Psi \right\rangle \) to \(\left| +,S1\right\rangle -i\left| +,S2\right\rangle \) which exhibits an interference pattern, and similarly for the \(\left| -\right\rangle _{p}\) selection. This is misleadingly said to “erase” the so-called “which-slit information” so that the interference pattern is restored.
The first thing to notice is that two complementary interference patterns, called “fringes” and “antifringes,” are being selected. Their sum is the no-interference pattern obtained before inserting the polarizer. The polarizer simply selects one of the interference patterns out of the mush of their merged noninterference pattern. Thus instead of “erasing which-slit information,” it selects one of two interference patterns out of the both-patterns mush.
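That the two selected patterns sum to the no-interference pattern can be checked in one line. As a sketch, write \(\psi _{1}\) and \(\psi _{2}\) for the amplitudes at the detector position evolved from slits \(1\) and \(2\) (symbols introduced here for illustration, not taken from Walborn et al.), and assume each selection carries amplitude \(\frac{1}{2}\) per slit term. Then the two selected states give the densities

\[\frac{1}{4}\left| \psi _{1}-i\psi _{2}\right| ^{2}=\frac{1}{4}\left( \left| \psi _{1}\right| ^{2}+\left| \psi _{2}\right| ^{2}\right) +\frac{1}{2}\,\mathrm {Im}\left( \psi _{1}^{*}\psi _{2}\right) ,\qquad \frac{1}{4}\left| \psi _{1}+i\psi _{2}\right| ^{2}=\frac{1}{4}\left( \left| \psi _{1}\right| ^{2}+\left| \psi _{2}\right| ^{2}\right) -\frac{1}{2}\,\mathrm {Im}\left( \psi _{1}^{*}\psi _{2}\right) ,\]

so the interference terms enter with opposite signs and cancel in the sum:

\[\frac{1}{4}\left| \psi _{1}-i\psi _{2}\right| ^{2}+\frac{1}{4}\left| \psi _{1}+i\psi _{2}\right| ^{2}=\frac{1}{2}\left( \left| \psi _{1}\right| ^{2}+\left| \psi _{2}\right| ^{2}\right) .\]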
Even though the polarizer may be inserted after the \(s\)-photon has traversed the two slits, there is no retrocausation of the photon going through both slits or only one slit, as previously explained.
One might also notice that the entangled \(p\)-photon plays little real role in this setup (as opposed to the “delayed erasure” setup considered next). Instead of inserting the \(\left| +\right\rangle \) or \(\left| -\right\rangle \) polarizer in front of \(D_{p}\), insert it in front of \(D_{s}\) and it would have the same effect of selecting \(\left| +,S1\right\rangle -i\left| +,S2\right\rangle \) or \(\left| -,S1\right\rangle +i\left| -,S2\right\rangle \), each of which exhibits interference. Then it is very close to the one-photon eraser experiment of the last section.
Delayed quantum eraser
If the upper arm is extended so the \(D_{p}\) detector is triggered last (“delayed erasure”), the same results are obtained. The entangled state is then collapsed at \(D_{s}\). A coincidence counter (not pictured) is used to correlate the hits at \(D_{s}\) with the hits at \(D_{p}\) for each fixed polarizer setting, and the same interference pattern is obtained (Fig. 16).
The interesting point is that the \(D_{p}\) detections could be years after the \(D_{s}\) hits in this delayed erasure setup. If the \(D_{p}\) polarizer is set at \(\left| +\right\rangle _{p}\), then out of the mush of hits at \(D_{s}\) obtained years before, the coincidence counter will pick out the ones from \(\left| +,S1\right\rangle -i\left| +,S2\right\rangle \) which will show interference.
Again, the years-later \(D_{p}\) detections do not retrocause anything at \(D_{s}\), e.g., do not “erase which-way information” years after the \(D_{s}\) hits are recorded (in spite of the “delayed erasure” talk). They only pick (via the coincidence counter) one or the other interference pattern out of the years-earlier mush of hits at \(D_{s}\).
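The post-selection reading can be illustrated with a small Monte Carlo sketch in Python. It is a toy model only, under assumptions that are not in the experiment's description: the evolved slit amplitudes at \(D_{s}\) are the unit phasors \(e^{\pm i\phi }\), detector positions are reduced to eight phase bins, and all function names are hypothetical. The hits at \(D_{s}\) are recorded first; the \(D_{p}\) outcome is attached to each record afterwards and is used only to sort the existing hits.

```python
import cmath
import math
import random


def s_amplitudes(phi):
    """Assumed unit phasors for the evolved |S1> and |S2> terms at D_s."""
    return cmath.exp(1j * phi), cmath.exp(-1j * phi)


def joint_density(phi, p_outcome):
    """Relative density of an s-hit at phase phi jointly with p-outcome '+' or '-'."""
    psi1, psi2 = s_amplitudes(phi)
    sign = -1 if p_outcome == "+" else 1  # |+,S1> - i|+,S2>  or  |-,S1> + i|-,S2>
    return abs((psi1 + sign * 1j * psi2) / 2) ** 2


def record_coincidences(n_pairs, phases, rng):
    """For each pair, record the D_s bin first and the later D_p outcome second."""
    marginal = [joint_density(p, "+") + joint_density(p, "-") for p in phases]
    records = []
    for _ in range(n_pairs):
        # Earlier D_s hit: drawn from the marginal, which has no fringes.
        idx = rng.choices(range(len(phases)), weights=marginal)[0]
        # Later D_p click: correlated with the hit that is already recorded.
        p_plus = joint_density(phases[idx], "+") / marginal[idx]
        outcome = "+" if rng.random() < p_plus else "-"
        records.append((idx, outcome))
    return records


def histogram(records, n_bins, keep=None):
    """Count D_s hits per bin, optionally post-selected on the D_p outcome."""
    counts = [0] * n_bins
    for idx, outcome in records:
        if keep is None or outcome == keep:
            counts[idx] += 1
    return counts


phases = [k * math.pi / 4 for k in range(-4, 4)]
records = record_coincidences(4000, phases, random.Random(0))
mush = histogram(records, len(phases))
fringes = histogram(records, len(phases), keep="+")
antifringes = histogram(records, len(phases), keep="-")
print(mush, fringes, antifringes, sep="\n")
```

The `mush` histogram is built from the \(D_{s}\) records alone and is the same whichever subset is later selected; `fringes` and `antifringes` are simply the two complementary subsets of those same records.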
We must conclude, therefore, that the loss of distinguishability is but a side effect, and that the essential feature of quantum erasure is the postselection of subensembles with maximal fringe visibility [15, p. 79].
The same sort of analysis could be made of the delayed choice quantum eraser experiment described in the papers by Scully et al. [16] or Kim et al. [13]. Brian Greene [11, pp. 194–199] gives a good popular analysis of the Kim et al. experiment which avoids the separation fallacy and thus avoids any implication of retrocausality.
References
1. Aharonov, Y., Vaidman, L.: Protective measurements of two-state vectors. In: Cohen, R.S., Horne, M., Stachel, J. (eds.) Potentiality, Entanglement and Passion-at-a-Distance: Quantum Mechanical Studies for Abner Shimony, vol. Two, pp. 1–8. Springer Science+Business Media, Dordrecht (1997)
2. Aspect, A., Grangier, P.: Wave-particle duality: a case study. In: Miller, A.I. (ed.) Sixty-Two Years of Uncertainty, pp. 45–59. Plenum Press, New York (1990)
3. Bohm, D.: Quantum Theory. Prentice-Hall, Englewood Cliffs (1951)
4. Cramer, J.: The transactional interpretation of quantum mechanics. Rev. Modern Phys. 58(July), 647–688 (1986)
5. Dicke, R.H., Wittke, J.P.: Introduction to Quantum Mechanics. Addison-Wesley, Reading (1960)
6. Eddington, A.S.: The Nature of the Physical World: Gifford Lectures 1927. Macmillan, New York (1929)
7. Feynman, R.P., Leighton, R.B., Sands, M.: The Feynman Lectures on Physics: Quantum Mechanics, vol. III. Addison-Wesley, Reading (1965)
8. French, A.P., Taylor, E.F.: An Introduction to Quantum Physics. Norton, New York (1978)
9. Gibbins, P.: Particles and Paradoxes: The Limits of Quantum Logic. Cambridge University Press, Cambridge (1987)
10. Gottfried, K.: Quantum Mechanics. Addison-Wesley, Reading (1989)
11. Greene, B.: The Fabric of the Cosmos. Alfred A. Knopf, New York (2004)
12. Hilmer, R., Kwiat, P.G.: A do-it-yourself quantum eraser. Sci. Am. 296(5), 90–95 (2007)
13. Kim, Y.H., Yu, R., Kulik, S.P., Shih, Y.H., Scully, M.O.: Delayed choice quantum eraser. Phys. Rev. Lett. 84(1) (2000)
14. Kwiat, P.G., Steinberg, A.M., Chiao, R.Y.: Observation of a “quantum eraser”: a revival of coherence in a two-photon interference experiment. Phys. Rev. A 45(11(1)), 7729–7736 (1992)
15. Kwiat, P.G., Schwindt, P.D.D., Englert, B.G.: What does a quantum eraser really erase? In: Bonifacio, R. (ed.) Mysteries, Puzzles, and Paradoxes in Quantum Mechanics, pp. 69–80. American Institute of Physics, Woodbury (1999)
16. Scully, M.O., Englert, B.G., Walther, H.: Quantum optical tests of complementarity. Nature 351, 111–116 (1991)
17. Shimony, A.: Conceptual foundations of quantum mechanics. In: Davies, P. (ed.) The New Physics, pp. 373–395. Cambridge University Press, Cambridge (1989)
18. Walborn, S.P., Terra Cunha, M.O., Padua, S., Monken, C.H.: Double-slit quantum eraser. Phys. Rev. A 65(3), 1–6 (2002)
19. Wheeler, J.A.: The “past” and the “delayed-choice” double-slit experiment. In: Marlow, A.R. (ed.) Mathematical Foundations of Quantum Theory, pp. 9–48. Academic Press, New York (1978)
20. Wheeler, J.A.: Law without law. In: Wheeler, J.A., Zurek, W.H. (eds.) Quantum Theory and Measurement, pp. 182–213. Princeton University Press, Princeton (1983)
21. Wheeler, J.A.: Hermann Weyl and the unity of knowledge. In: Deppert, W., Hübner, K., Oberschelp, A., Weidemann, V. (eds.) Exact Sciences and their Philosophical Foundations: Vorträge des Internationalen Hermann-Weyl-Kongresses, Kiel 1985, pp. 469–503. Verlag Peter Lang, Frankfurt am Main (1988)
22. Wigner, E.P.: The problem of measurement. In: Symmetries and Reflections, pp. 153–170. Ox Bow Press, Woodbridge (1979)
23. Zeilinger, A.: Why the quantum? “It” from “bit”? A participatory universe? Three far-reaching challenges from John Archibald Wheeler and their relation to experiment. In: Barrow, J., Davies, P., Harper, C. (eds.) Science and Ultimate Reality: Quantum Theory, Cosmology, and Complexity, pp. 201–220. Cambridge University Press, Cambridge (2004)
24. Zeilinger, A.: On the interpretation and philosophical foundation of quantum mechanics. In: Daub, H. (ed.) Grenzen menschlicher Existenz. Michael Imhof Verlag, Petersberg (2008)
Cite this article
Ellerman, D. Why delayed choice experiments do Not imply retrocausality. Quantum Stud.: Math. Found. 2, 183–199 (2015). https://doi.org/10.1007/s40509-014-0026-2
Keywords
 Retrocausality
 Delayed choice experiments
 Quantum eraser experiments