1 Introduction

Delayed choice experiments constitute a class of experiments with the general feature that “quantum effects can mimic an influence of future actions on past events” [13, Conclusion and outlook]. The keywords in this characterisation are ‘can mimic’. As the following remark by Ma et al. [14] in their article on delayed-choice entanglement swapping perfectly highlights, the status of the retro-causation of delayed choice experiments is generally understood to be dependent on the interpretation of quantum mechanics one adheres to:

If one viewed the quantum state as a real physical object, one could get the paradoxical situation that future actions seem to have an influence on past and already irrevocably recorded events. However, there is never a paradox if the quantum state is viewed as no more than a ‘catalogue of our knowledge’. Then the state is a probability list for all possible measurement outcomes, the relative temporal order of the three observers’ events is irrelevant and no physical interactions whatsoever between these events, especially into the past, are necessary to explain the delayed-choice entanglement swapping.

Thus, according to Ma et al., whether delayed choice experiments show apparent retro-causality cannot be decided solely from their outcome data as such. It depends on the role we ascribe to the state. If the state is considered to correspond to an objective part of reality, then these experiments seem to show retro-causality; if, on the other hand, the state is considered merely a catalogue of our knowledge, then the feature of retro-causality disappears.

In this paper, we challenge this conclusion. Our central claim is that the outcomes of delayed choice experiments can be fully explained in terms of a step-by-step mathematical analysis in forward time. This analysis involves no retro-causal steps, regardless of the ontological status one wishes to ascribe to the quantum mechanical state as such (real or epistemological). We substantiate our claim by providing such step-by-step analyses for two celebrated delayed choice experiments: Wheeler’s original gedanken-experiment [20, 21] and the ‘delayed quantum eraser’ of Scully and Drühl [16]. For both experiments, we provide a parallel analysis of their delayed and non-delayed counterparts and show that both lead to the same final quantum state. This is neither ‘obvious’ nor interpretation-dependent but represents the outcome of a careful analysis of the steps involved. It demonstrates that, on an operational level, the delayed and non-delayed versions of the experiments cannot be distinguished. As we will argue in the final section, the puzzling aspects of delayed choice experiments seem to arise from the use of notions such as ‘wave-particle duality’ and ‘which path information’ as explanatory rather than descriptive tools. As these notions make sense only once the completed experiment can be overseen in its entirety, their use entails a certain degree of backward-in-time reasoning about the experiments. By abandoning these descriptive notions in favour of a forward mathematical analysis, the puzzling aspects of the delayed choice experiments disappear.

There is extensive literature discussing theoretical aspects and experimental realisations of delayed choice experiments. A (quantum) Wheeler’s delayed choice experiment, in which the second beam splitter can be in a superposition of being present and absent, was proposed about a decade ago by [8]; for a fairly recent review, the reader is referred to [13]. Additional references to some recent papers can be found in, e.g. [15]. General aspects of the problem of retro-causality in quantum mechanics are discussed in [3, 5, 18, 19], where further references can be found.

The arguments presented here can be placed in line with some of these earlier studies. In her discussion of the delayed quantum eraser, Hossenfelder [7] points out that the data of the first, signal particle necessarily remain unchanged during the whole experiment; the purported retro-causality of the delayed eraser therefore arises only at the level of retro-active selections in the signal data based on the actions of the second, idler photon. Our mathematical analysis confirms this position. Kastner [10] argues that the delayed quantum eraser is in essence no different from a standard EPR-pair, in that the order of space-like separated operations is irrelevant; the quantum eraser neither ‘delays’ nor ‘erases’. Gaasbeek [6] argues that, from the point of view of the theory of special relativity, the sequence of the space-like separated measurements is relative to the observer overseeing the experiment, and therefore the order between these measurement operations should have no influence on the eventual outcomes. We agree with Kastner and Gaasbeek. However, our point is not made by analogy to an EPR-pair, but through direct analysis of the delayed quantum eraser itself, and we arrive at our conclusion mathematically, using standard textbook quantum mechanics only; our arguments do not invoke special relativity. In [2], Donker et al. argue that the outcomes of delayed choice experiments can be explained entirely in terms of event-by-event models involving objects travelling one-by-one through the experimental set-up and generating clicks of a detector. Let us finally mention the work [1] by Dieguez et al., which analyses delayed choice experiments by adopting an operational quantifier of realism, enabling them to argue that the visibility at the output has no connection whatsoever with wave and particle elements of reality as defined in accordance with the adopted criterion of realism. In the same paper it is observed that “To date, a detailed analysis is lacking which would allow one to track the behaviour of the system at every stage of the experiment”. We provide a framework for such an analysis here.

2 Wheeler’s delayed choice experiment

Wheeler’s delayed choice experiment [20, 21] (as cited from [13, sect. II.D]) makes use of two Mach–Zehnder interferometers, experimental devices that can be used to demonstrate certain quantum mechanical phenomena involving superposition. A Mach–Zehnder interferometer consists of a single photon source and a sequence of mirrors and beamsplitters which eventually steer the emitted photon towards (one of) two detectors. In the first set-up, only one beamsplitter is used. A single photon is emitted towards the beamsplitter, which brings the photon into a superposition of travelling via the upper or lower path (see Fig. 1, left). After being directed towards the detectors by mirrors, the photon is then detected in one of the detectors with equal probability. The second set-up of the experiment follows the same description, but after having been deflected by the mirrors the photon encounters a second beamsplitter, which causes both paths to interfere. This causes only the bottom detector to detect incoming photons (see Fig. 1, right).

Fig. 1

The Mach–Zehnder interferometer without (left) and with (right) a second beamsplitter. The black dot denotes the single-photon source, the open rectangle denotes a beamsplitter, the closed rectangles are mirrors, and the open-half circles are the detectors. The arrows represent the possible ‘paths’ of the photons emitted by the laser. The beamsplitter creates a superposition describing the two possible paths depicted in the figure. As explained in the main text, the second beamsplitter has the effect that no incoming photons are detected by the top detector due to destructive interference

In 1978, Wheeler proposed to combine the two scenarios by having a quantum random bit generator decide between them. If the quantum random generator gives output ‘off’, the second beamsplitter is deactivated and the first scenario is followed; if the quantum random generator gives output ‘on’, the second beamsplitter is activated and the second scenario is followed (see Fig. 2). This set-up becomes a ‘delayed-choice’ experiment by letting the quantum random generator decide between the two scenarios only after the photon has passed the first beamsplitter. Experimental realisation of this gedanken-experiment was reported in [9].

Fig. 2

A delayed choice experiment with a Mach–Zehnder Interferometer. The square labelled R denotes a quantum random bit generator which, depending on its two possible outputs, activates or deactivates the second beamsplitter

2.1 Wheeler’s original argument

Wheeler’s original interest in the above experiment came from an argument based on the Copenhagen interpretation. In this view, the separate clicks of the two detectors in the first scenario are interpreted as revealing the ‘particle’ nature of the photon, whereas the presumed interference explaining the second scenario is interpreted as revealing the ‘wave’ nature of the photon. From this point of view, the absence or presence of the second beamsplitter “forces” the photon to behave like a particle or like a wave, respectively. In the delayed choice experiment, this choice can only be made “in flight”, once it is known whether or not a second beamsplitter will be encountered. Thus Wheeler interprets this experiment as a manifestation of retro-causation:

“In this sense, we have a strange inversion of the normal order of time. We, now, by moving the mirror in or out have an unavoidable effect on what we have a right to say about the already past history of that photon.” \((\dots )\) “Thus one decides whether the photon ‘shall have come by one route [as particle] or by both routes [as an interfering wave]’ after it has already done its travel” [21] (cited directly from [13]).

2.2 Analysis of the experiment

In this section, we provide our analysis of Wheeler’s gedanken-experiment. We first provide an analysis of the experiment in the language of standard quantum mechanics and, subsequently, recast it in mathematical language. We use the latter to show the equivalence of the delayed and non-delayed versions of the experiment. Lastly, we ground this equivalence in the physical principle that operations on space-like separated arms of the experiment necessarily commute.

2.2.1 Quantum mechanical description of the gedanken-experiment

Let us first analyse the scenario without a quantum random bit generator as depicted on the left in Fig. 1. In this scenario, no second beamsplitter is present. The states of the photon before passing the beamsplitter, after passing the beamsplitter and before arriving at the mirrors, and after being deflected by the mirrors but before arriving at the detectors, can be described as follows:

$$\begin{aligned} |1, t = 1\rangle= & {} |\downarrow \rangle ,\nonumber \\ |1, t = 2\rangle= & {} \frac{1}{\sqrt{2}}(|\uparrow \rangle - |\downarrow \rangle ), \nonumber \\ |1,t = 3\rangle= & {} \frac{1}{\sqrt{2}}(|\downarrow \rangle - |\uparrow \rangle ). \end{aligned}$$
(2.1)

The number ‘1’ refers to this first scenario.

In the second scenario, with a second beamsplitter inserted between the mirrors and the detectors as depicted on the right in Fig. 1, the first three stages are the same, but we must add a fourth stage describing the state of the photon after it has encountered the second beamsplitter. The second beamsplitter changes the state \(|\uparrow \rangle \) to \(\frac{1}{\sqrt{2}} (|\downarrow \rangle +|\uparrow \rangle )\) and \(|\downarrow \rangle \) to \(\frac{1}{\sqrt{2}} (|\uparrow \rangle -|\downarrow \rangle ).\) Inserting this into the third line of Eq. (2.1), we arrive at

$$\begin{aligned} |2, t = 4\rangle = \frac{1}{\sqrt{2}}\Biggl (\Bigl (\frac{1}{\sqrt{2}} (|\uparrow \rangle -|\downarrow \rangle )\Bigr ) -\Bigl (\frac{1}{\sqrt{2}} (|\downarrow \rangle +|\uparrow \rangle ) \Bigr )\Biggr ) =-|\downarrow \rangle . \end{aligned}$$
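The stages of the first two scenarios can be checked numerically. The following is a minimal sketch, assuming the identification \(|\uparrow \rangle = (1,0)^T,\) \(|\downarrow \rangle = (0,1)^T\) and modelling each beamsplitter by a Hadamard matrix and the mirrors by a swap of the two paths; the variable names are ours and purely illustrative.

```python
import numpy as np

# Assumed basis convention: |up> = (1, 0), |down> = (0, 1).
up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])

# Beamsplitter modelled as a Hadamard matrix, mirrors as a swap of the two paths.
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])

state_t1 = down            # photon leaves the source
state_t2 = H @ state_t1    # after the first beamsplitter
state_t3 = X @ state_t2    # after the mirrors
state_t4 = H @ state_t3    # after the second beamsplitter (second scenario only)

# Detection probabilities, ordered as (up, down), in the two scenarios.
probs_scenario_1 = np.abs(state_t3) ** 2
probs_scenario_2 = np.abs(state_t4) ** 2
```

In this sketch the first scenario yields equal probabilities for both detectors, whereas in the second scenario all probability ends up in the \(|\downarrow \rangle \) component, in agreement with the final state \(-|\downarrow \rangle .\)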

In the third scenario (as in Fig. 2), a quantum random bit generator chooses between these two scenarios after a photon has been deflected by the mirrors. We model this by coupling the photon to the quantum random bit generator, whose output we represent by the ‘off’ state \(|\text {off}\rangle \) (no second beamsplitter is introduced) and the ‘on’ state \(|\text {on}\rangle \) (the second beamsplitter is introduced). The random generator starts in the ‘off’ state and, after the photon has passed the mirrors, is triggered to switch to the ‘on’ state in half of the runs. This results in the states

$$\begin{aligned} |3, t = 1\rangle= & {} |\downarrow \rangle |\text {off}\rangle ,\nonumber \\ |3, t = 2\rangle= & {} \frac{1}{\sqrt{2}}(|\uparrow \rangle - |\downarrow \rangle )|\text {off}\rangle , \nonumber \\ |3,t = 3\rangle= & {} \frac{1}{\sqrt{2}}(|\downarrow \rangle - |\uparrow \rangle )|\text {off}\rangle ,\nonumber \\ |3, t = 4\rangle= & {} \frac{1}{2} (|\downarrow \rangle -|\uparrow \rangle )|\text {off}\rangle - \frac{1}{\sqrt{2}}|\downarrow \rangle |\text {on}\rangle . \end{aligned}$$
(2.2)

In the last step, the results of the first and second scenarios are realised with equal probability.
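The last line of Eq. (2.2) can be reproduced by following the two branches of the random generator explicitly. The sketch below is illustrative only: it assumes that the generator is triggered by a Hadamard operation on its own two-level state and stores the joint state as a \(2\times 2\) array of amplitudes, a bookkeeping choice of ours.

```python
import numpy as np

H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
down = np.array([0.0, 1.0])   # photon basis: index 0 = up, 1 = down
off = np.array([1.0, 0.0])    # generator basis: index 0 = off, 1 = on

# psi[p, r] is the amplitude for photon state p and generator state r.
psi = np.outer(X @ H @ down, off)   # t = 3: photon past the mirrors, generator 'off'

psi = psi @ H.T                     # generator triggered: Hadamard on the generator index
psi[:, 1] = H @ psi[:, 1]           # second beamsplitter acts on the 'on' branch only

# Rows: up/down component of the photon; columns: generator off/on.
joint_probabilities = np.abs(psi) ** 2
```

The resulting amplitudes are \(-\frac{1}{2}\) and \(\frac{1}{2}\) in the ‘off’ branch and \(-\frac{1}{\sqrt{2}}\) for \(|\downarrow \rangle |\text {on}\rangle ,\) so each branch carries total probability one half.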

2.2.2 Mathematical formulation and equivalence with the non-delayed experiment

We now turn to casting the analysis above in the language of quantum operators. For this purpose we use the following set-up and notation. We introduce the Hilbert spaces \(H_{\textrm{ph}} = {\mathbb {C}}^2\) and \(H_{\textrm{r}} = {\mathbb {C}}^2\) modelling the photon and the quantum random bit generator. The composite system of photon and random bit is modelled on the Hilbert space \(H_{\textrm{ph}}\otimes H_{\textrm{r}}\eqsim {\mathbb {C}}^4,\) which we think of as being endowed with the orthonormal basis

$$\begin{aligned} \left\{ |\uparrow \rangle |\textrm{off}\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0\end{pmatrix}, \ |\uparrow \rangle |\textrm{on}\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \ |\downarrow \rangle |\textrm{off}\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \ |\downarrow \rangle |\textrm{on}\rangle = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} \right\} . \end{aligned}$$

As before, the states \(|\textrm{on}\rangle \) and \(|\textrm{off}\rangle \) of the quantum random bit generator correspond to the presence, respectively absence, of the second beamsplitter. We model the various steps of the delayed choice experiment as unitary operators acting on \(H_{\textrm{ph}}\otimes H_{\textrm{r}}.\) With respect to the above basis, the first beamsplitter and the mirrors act on this tensor product respectively as \(H\otimes I\) and \(X\otimes I,\) where the Hadamard operator H and the ‘X-gate’ operator X act on \(H_{\textrm{ph}}\) by the unitary matrices

$$\begin{aligned} H =\frac{1}{\sqrt{2}} \begin{pmatrix} 1 &{} 1 \\ 1 &{} -1 \end{pmatrix},\quad X = \begin{pmatrix} 0 &{} 1 \\ 1 &{} 0 \end{pmatrix}, \end{aligned}$$

respectively. The random number generator can be modelled as \(I\otimes H,\) with the Hadamard matrix acting on \(H_{\textrm{r}}.\) Lastly, the dependence of the second beamsplitter on the state of the random generator can be modelled by the controlled Hadamard operator R on \(H_{\textrm{ph}}\otimes H_{\textrm{r}}\) given by

$$\begin{aligned} R =\begin{pmatrix} 1 &{} 0 &{} 0 &{} 0 \\ 0 &{} \frac{1}{\sqrt{2}} &{} 0 &{} \frac{1}{\sqrt{2}} \\ 0 &{} 0 &{} 1 &{} 0 \\ 0 &{} \frac{1}{\sqrt{2}} &{} 0 &{}-\frac{1}{\sqrt{2}} \\ \end{pmatrix}. \end{aligned}$$
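As a cross-check of the matrix R, the controlled Hadamard can be assembled from projectors onto the ‘off’ and ‘on’ states of the random generator. This is a sketch under the basis ordering given above (photon factor first, generator factor second); the names `P_off` and `P_on` are our own.

```python
import numpy as np

I2 = np.eye(2)
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)

# Projectors on the generator states, with |off> = (1, 0) and |on> = (0, 1).
P_off = np.diag([1.0, 0.0])
P_on = np.diag([0.0, 1.0])

# Photon untouched when the generator is 'off', Hadamard on the photon when it is 'on'.
R = np.kron(I2, P_off) + np.kron(H, P_on)

s = 1 / np.sqrt(2)
R_printed = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, s,   0.0, s],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, s,   0.0, -s],
])
```

Writing R as a sum over the two generator states makes explicit that it leaves the generator state unchanged and only conditions the action on the photon.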

The experiment can then be described by following the sequence of operations on the original state to arrive at the final (to be measured) state. The photon first encounters a beamsplitter \((H\otimes I),\) then a mirror \((X\otimes I),\) after which the random generator is initiated \((I\otimes H)\) and, lastly the photon encounters the second beamsplitter controlled by the random generator (R). Therefore the complete experiment can be described by

$$\begin{aligned} A := R \circ (I\otimes H) \circ (X\otimes I) \circ (H\otimes I). \end{aligned}$$
(2.3)

Next, we show the equivalence of the delayed experiment to the non-delayed experiment. In this case, we operate the quantum random bit generator first, that is, after the photon has left the laser but before it arrives at the beamsplitter. In line with the explanations provided in the previous paragraph, this scenario can be modelled as

$$\begin{aligned} A' := R\circ (X\otimes I) \circ (H\otimes I) \circ (I\otimes H). \end{aligned}$$
(2.4)

Then by direct computation of both operators (or by noting that the matrices \((X\otimes I) \circ (H\otimes I)\) and \(I\otimes H\) commute) we see that

$$\begin{aligned} A = \frac{1}{2} \begin{pmatrix} 1 &{} 1 &{} -1 &{} -1 \\ {\sqrt{2}} &{} -{\sqrt{2}} &{} 0 &{} 0 \\ 1 &{} 1 &{} 1 &{} 1 \\ 0 &{} 0 &{} -{\sqrt{2}} &{} {\sqrt{2}} \\ \end{pmatrix} = A' . \end{aligned}$$

From this result, we conclude that the delayed and non-delayed experiments are equivalent with respect to their mathematical description in the sense that, as far as the final state is concerned, it is irrelevant at which moment we operate the quantum random generator.
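The equality \(A = A'\) can also be verified numerically. The following sketch builds both operators from Kronecker products, using the basis ordering introduced above; it is meant as an independent check, and the variable names are ours.

```python
import numpy as np

I2 = np.eye(2)
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])

# Controlled Hadamard: acts on the photon only when the generator is 'on'.
R = np.kron(I2, np.diag([1.0, 0.0])) + np.kron(H, np.diag([0.0, 1.0]))

first_beamsplitter = np.kron(H, I2)
mirrors = np.kron(X, I2)
random_generator = np.kron(I2, H)

A_delayed = R @ random_generator @ mirrors @ first_beamsplitter       # Eq. (2.3)
A_non_delayed = R @ mirrors @ first_beamsplitter @ random_generator   # Eq. (2.4)

# The photon operations and the generator operation act on different tensor factors.
photon_part = mirrors @ first_beamsplitter
commutator = photon_part @ random_generator - random_generator @ photon_part

s = np.sqrt(2)
A_printed = 0.5 * np.array([
    [1.0, 1.0, -1.0, -1.0],
    [s,   -s,  0.0,  0.0],
    [1.0, 1.0, 1.0,  1.0],
    [0.0, 0.0, -s,   s],
])

# Final state for the initial state |down>|off>, the third basis vector.
final_state = A_delayed @ np.array([0.0, 0.0, 1.0, 0.0])
```

Applied to the initial state \(|\downarrow \rangle |\text {off}\rangle ,\) both operators return the last line of Eq. (2.2).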

2.2.3 The operational equivalence of the delayed and non-delayed experiment in relation to special relativity

Although our arguments do not depend on special relativity and are independent of certain parts of the experiment being space-like separated from others, it is of some interest to cast both the delayed and non-delayed formulations of Wheeler’s gedanken-experiment into the space-time format presented in Fig. 3. Not only does this figure isolate which operations are reversed between the delayed and non-delayed set-ups, but it also shows that the equality is consistent with the relativistic principle that space-like separated operations should commute. Figure 4 displays the descriptions of the experiments given in Eqs. (2.3) and (2.4) in standard quantum computing language. These figures further clarify the equivalence \(A = A'.\)

Fig. 3

A schematic space-time diagram of Wheeler’s delayed choice experiment with delay. As in Fig. 2, the black dot denotes the single-photon source, the open rectangle denotes a beamsplitter, the closed rectangles are mirrors, and the open-half circles denote the detectors. The two figures on the right display the temporal sequence of Wheeler’s experiment without delay in A and with delay in B. The dotted lines in both figures indicate how the regions crucial for the change in temporal order are space-like separated. In the schematic, only the horizontal axis is included and therefore only one mirror and detector are depicted in the right-hand figures, but these represent both mirrors and both detectors

Fig. 4

A quantum computing schematic of Wheeler’s delayed choice experiment without delay in A and with delay in B, matching the temporal sequences in Fig. 3

2.2.4 Conclusion

Combining the arguments from the preceding sections, we conclude that the experimental set-ups of Wheeler’s gedanken-experiment with and without delay are indistinguishable on the mathematical level and equivalent on the physical level. Therefore, from an operational point of view, in terms of predictions using standard quantum mechanics, no such thing as ‘delayed’ choice exists in this experiment.

3 The delayed quantum eraser

The most prevalent formulation of a delayed choice experiment today appears to be the ‘delayed quantum eraser’ of Scully and Drühl [16], experimentally realised by Kim et al. [12]. In contrast to Wheeler’s original idea, its formulation is not tied to any (Copenhagen-like) interpretation of the experiment. The delayed quantum eraser makes use of pairs of entangled particles. Its key feature is that after the first particle has been measured, it is possible to draw certain conclusions about it that seemingly depend on the outcome of the measurement of the second particle, which is performed at a later time. More concretely, upon repeating the experiment multiple times, we are able to identify subsets in our data of the first-measured particles which display either wave-like or particle-like behaviour, using the information obtained by measuring their entangled twins at a later moment. The puzzling thing is that the wave-like or particle-like behaviour of the first-measured particles appears to be already hidden in the data before we measure the entangled twins. As such, future events seem to retro-actively exert an influence on past events, and this retro-active influence even seems to affect already measured data.

3.1 Set-up and results

The delayed quantum eraser experiment starts by sending a single photon towards a standard double slit, reminiscent of Young’s famous double slit experiment. Behind the slits, a nonlinear crystal converts the incoming photon into an entangled pair of photons of half the frequency. The first photon of this pair is called the signal photon and is sent towards a screen. The second photon is called the idler photon and is sent towards a set-up of (half-) mirrors and detectors to achieve the delayed erasure. As was the case with the delayed choice experiment, this set-up is used to combine two experiments.

The first experiment (see Fig. 5) places the detectors \(D_1\) and \(D_2\) directly in the path of the idler photon.

Fig. 5

The first quantum eraser experiment. The black dot denotes the single-photon source, from which a photon is sent through two slits towards a nonlinear crystal denoted by the blue rectangle. From there one photon is sent towards the screen (down) and one photon is sent towards two detectors which can be used to determine the path it took (right) (colour figure online)

After completion of the experiment, we split the dots on the screen created by the signal photons into two groups—one group consisting of the dots corresponding to those pairs whose idler photons were detected in detector \(D_1\) and the other group consisting of the dots associated with detector \(D_2.\) The result of the experiment is that each of these two groups shows a pattern reminiscent of photons moving through a single slit. Neither one of the two groups of dots shows interference.

Fig. 6

The second quantum eraser experiment. The black dot denotes the single-photon source, from which a photon is sent through two slits towards a nonlinear crystal denoted by the blue rectangle. From there, one photon is sent towards the screen (down). The other photon is sent towards the mirror, denoted by black rectangles, and the beamsplitter, denoted by the open rectangles (right). This side ends with the detectors, denoted by the open-half circles. The red and green arrows follow the possible ‘paths’ of the particle. The red and green lines indicate the two possible paths taken by the photon; the last steps are made black to indicate the ‘erased’ path information. The path of the photon should then be read as a superposition between the two (colour figure online)

In the second experiment (see Fig. 6), the idler photons are directed towards a beamsplitter before being directed towards the detectors \(D_1\) and \(D_2.\) This causes both idler paths to interfere. When we again split the dots created by the signal photons into two groups, one corresponding to the clicks of detector \(D_1\) and one corresponding to the clicks of detector \(D_2,\) both groups show an interference pattern reminiscent of a double-slit experiment.

In the third experiment, the beamsplitters are set up in such a way that the decision as to which of the above two experiments is performed is the result of pure chance. In contrast to Wheeler’s experiment, no random generator is used; instead, the randomness is introduced by replacing the mirrors in the first experiment by beamsplitters (see Fig. 7). The (half-)mirrors are set up in such a way that all detectors click with equal probability. When we now group the screen positions of the signal photons according to which detector clicked in that round, we see the same patterns as in the previous experiments: when detector \(D_1\) or \(D_2\) clicked, the corresponding data show interference, and when detector \(D_3\) or \(D_4\) clicked, the corresponding data do not show interference.
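The grouping step described here is a sorting of already recorded screen data by the detector that clicked in the same run. A minimal sketch of this bookkeeping is given below; the records are made-up placeholder values chosen only to illustrate the procedure, and the function name `group_by_detector` is hypothetical.

```python
from collections import defaultdict

# Hypothetical run records: (screen position of the signal photon, idler detector that clicked).
records = [(0.12, "D1"), (-0.40, "D3"), (0.05, "D2"), (0.33, "D4"), (-0.07, "D1")]

def group_by_detector(records):
    """Sort the recorded screen positions into one list per idler detector."""
    groups = defaultdict(list)
    for position, detector in records:
        groups[detector].append(position)
    return dict(groups)

groups = group_by_detector(records)

# The grouping only relabels the recorded dots; taken together, the groups
# contain exactly the same screen positions as the ungrouped data.
all_positions = sorted(position for position, _ in records)
regrouped_positions = sorted(x for positions in groups.values() for x in positions)
```

Nothing in this procedure alters the screen data themselves: the grouping merely attaches a label to each dot once the corresponding idler detection is known.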

Fig. 7

The third quantum eraser experiment. The black dot denotes the single-photon source, from which a photon is sent through two slits towards a nonlinear crystal denoted by the blue rectangle. From there, one photon is sent towards the screen (down). The other photon is sent towards the mirrors, denoted by black rectangles, and the beamsplitters, denoted by the open rectangles (right). This side ends with the detectors, denoted by the open-half circles. The red and green arrows follow the possible ‘paths’ of the particle. The red and green lines indicate the two possible paths taken by the photon; the last steps are made black to indicate the ‘erased’ path information. The path of the photon should then be read as a superposition between the two (colour figure online)

3.2 The argument for retro-causality

In their discussion of the quantum eraser experiment, both Scully and Drühl [16] and Kim et al. [12] refrain from any specific interpretative argumentation, although their interest in these set-ups clearly is motivated by an argument of this sort. However, an argument in the spirit of Wheeler is not hard to reconstruct.

In the first set-up, the lack of displayed interference is often “explained” by saying that detection at the detectors retro-actively reveals which path the photon took. The absence of interference in the joint detection at the screen and at each one of the detectors is said to “reveal retroactively the particle-like nature of the photon”. Likewise, the displayed interference pattern in the second set-up is often “explained” by saying that the interference retro-actively erased the ‘which path’ information. The observed interference in the joint detection at the screen and each one of the detectors is said to “reveal retroactively the wave-like nature of the photon”. By extending the arm of the idler path, the choice between which aspect of the photon is revealed is crucially delayed. Therefore, the third case can be understood as a retro-active influence on the wave or particle nature of the photon.

The apparent improvement of the delayed quantum eraser over Wheeler’s original experiment is that even if one takes an agnostic position as to whether the photon “was” a particle or a wave, the experiment still appears somewhat puzzling at first sight. As the grouping of data of the signal photon based on the clicked detector of the idler photons exposes previously hidden patterns, we can still group the data of an experimenter placed at the screen in seemingly locally random sets (according to the outcomes of the idler photon) that seem to enforce a certain interpretation (cf. [14, p. 483]). That is, for the locally random subsets of signal photon data, we seem to be able to make an interference pattern appear or not appear.

3.3 Analysis of the experiment

The key observation for the analysis of the delayed quantum eraser presented below is summarised in Fig. 8: patterns of interference show up, or fail to show up, only when we group the dots on the screen according to which of the detectors clicked, after many runs of the experiment [12]. In what follows, we build a forward analysis of the experiment based on this principle. We first provide a time evolution of the quantum state describing the three experiments, as in Figs. 5, 6, and 7, using textbook quantum mechanics only. After this first analysis, we recast it in the mathematical language of operators, as we have done for Wheeler’s experiment above. We use this analysis to again show the equivalence of the delayed and non-delayed versions of the experiment. Lastly, we ground this equivalence in the physical principle that operations on space-like separated arms of the experiment necessarily commute.

Fig. 8

The key difference between marking and not marking the outcome of the data on the screen based on the detector

3.3.1 Analysis of the first experiment

In line with our explanation above, we begin by analysing the first experiment (see Fig. 5).

Fig. 9

A complete and zoomed-in schematic of the delayed choice quantum eraser with bundles of light instead of rays

To describe the state of the signal photon, we keep track of the angles of the paths of the photon after having passed the respective slits (see Fig. 9). To good approximation, we assume that these angles can only take finitely many discrete values \(\theta _{R,1},\ldots ,\theta _{R,N}\) (for photons passing through the right slit), respectively \(\theta _{L,1},\ldots ,\theta _{L,N}\) (for photons passing through the left slit), in the interval \((-\frac{1}{2}\pi ,\frac{1}{2}\pi ).\) These angles are subject to a constraint formulated shortly. The approximation is used to allow for an analysis of the experiment in basic (finite-dimensional) quantum mechanics, which simplifies the (already sufficiently complex) calculation to follow. The original situation can be recovered by taking the limit \(N \rightarrow \infty .\)

The state of the system (photon plus screen) after the photon has passed through the double slit can then be described by the superposition

$$\begin{aligned} |1, t = 1\rangle = \frac{1}{\sqrt{2}}\left( \sum _{n=1}^N p_R(\theta _{R,n}) |R,n\rangle + \sum _{m=1}^N p_L(\theta _{L,m})|L,m\rangle \right) |\text {ready}\rangle _S. \end{aligned}$$

Here, \(|R,n\rangle \) and \(|L,m\rangle \) describe the state of a photon that passed through the right slit and continued at angle \(\theta _{R,n},\) respectively through the left slit and continued at angle \(\theta _{L,m}.\) These states are distributed with intensities \(p_{R, n} := p_R(\theta _{R,n})\) and \(p_{L, m} := p_L(\theta _{L,m}),\) respectively, which are simply the normalised intensities after diffraction from the slit.

After the photon has passed through the nonlinear crystal, the state of the system can be described as

$$\begin{aligned} |1, t = 2\rangle = \frac{1}{\sqrt{2}} \left( \sum _{n=1}^N p_{R,n} |R,n\rangle _i|R,n\rangle _s + \sum _{m=1}^N p_{L,m} |L,m\rangle _i|L,m\rangle _s\right) |\text {ready}\rangle _S, \end{aligned}$$

where \(|R,n\rangle _i|R,n\rangle _s\) and \(|L,m\rangle _i|L,m\rangle _s\) refer to the newly created entangled pairs of idler photon and signal photon.

Next, the signal photon will be measured at the screen. We assume that the angles \(\theta _{L,1}, \ldots , \theta _{L,N}\) and \(\theta _{R,1}, \ldots , \theta _{R,N}\) are such that, for each \(k=1,\ldots ,N,\) the photons corresponding to angles \(\theta _{L,k}\) and \(\theta _{R,k}\) both arrive at the same position \(x_k\) on the screen. See Fig. 9.

Suppose now that the signal photon is measured to arrive at the screen in the interval \(x_k.\) As the signal photon gets absorbed, we can model the state as

$$\begin{aligned} |1, t = 3\rangle = \frac{1}{\sqrt{2}} \Bigl ( \tilde{p}_{R,k} |R,k\rangle _i + \tilde{p}_{L,k} |L,k\rangle _i\Bigr )|x_k\rangle _S, \end{aligned}$$
(3.1)

where \(|x_k\rangle _S\) denotes the state of the screen having registered the signal photon in the interval \(x_k.\) The numbers \(|\tilde{p}_{R,k}|^2\) and \(|\tilde{p}_{L,k}|^2\) represent the probabilities of the states \(|R,k\rangle \) and \(|L,k\rangle ,\) respectively, given that the photon reached \(x_k.\) The amplitudes follow from our initial distribution by simple conditioning:

$$\begin{aligned} \tilde{p}_{R,k} = \frac{p_{R,k}}{\sqrt{|p_{R,k}|^2 + |p_{L,k}|^2}},\quad \tilde{p}_{L,k} = \frac{p_{L,k}}{\sqrt{|p_{R,k}|^2 + |p_{L,k}|^2}}. \end{aligned}$$
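The conditioning step can be illustrated with a few lines of code. The sketch below assumes arbitrary complex amplitudes for a single screen position; the numerical values and the function name `conditional_amplitudes` are placeholders of ours, not taken from the experiment.

```python
import numpy as np

def conditional_amplitudes(p_R_k, p_L_k):
    """Rescale the two amplitudes for screen position x_k to unit total probability."""
    norm = np.sqrt(abs(p_R_k) ** 2 + abs(p_L_k) ** 2)
    return p_R_k / norm, p_L_k / norm

# Hypothetical amplitudes for one position x_k (illustrative values only).
p_R_k, p_L_k = 0.3, 0.4j
p_tilde_R, p_tilde_L = conditional_amplitudes(p_R_k, p_L_k)

# After conditioning, the two probabilities sum to one.
total_probability = abs(p_tilde_R) ** 2 + abs(p_tilde_L) ** 2
```

The rescaling changes only the common normalisation; the ratio of the two amplitudes, and hence their relative phase, is left untouched.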

Expression (3.1) clearly shows that the presence of the idler photon prevents interference from taking place. Upon arrival of the signal photon at position \(x_k,\) the idler photon is left in a superposition of states whose momentum is perpendicular to the path of the signal photon on its way to position \(x_k.\) It is in this sense that the idler photon, after the arrival location of the signal photon has been recorded at the screen, is in a superposition of idler states carrying ‘which path’ information. In the limit \(N\rightarrow \infty ,\) this corresponds to the idler photon being in a superposition of two states with well-defined angles \(\vartheta _{R,x}\) and \(\vartheta _{L,x}\) relative to the crystal. These angles are uniquely determined by the position x on the screen. Effectively, the distribution on the screen will now be described by the normalised addition of the distributions corresponding to the two ‘which path’ scenarios corresponding to passage through slit L (with slit R closed) and R (with slit L closed) (see Fig. 10).
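The absence of an interference term can be made concrete in a toy model. The sketch below is a hedged illustration only: the Gaussian envelope, the linear path-dependent phase and all numerical values are assumptions of ours, chosen so that the comparison case without idler photons shows fringes.

```python
import numpy as np

N = 200
x = np.linspace(-1.0, 1.0, N)            # screen positions x_k (arbitrary units)

# Hypothetical amplitudes: common Gaussian envelope, opposite path-dependent phases.
envelope = np.exp(-x ** 2 / 0.5)
envelope /= np.linalg.norm(envelope)     # sum_k |p_k|^2 = 1 for each slit
phase = 20.0 * x
p_R = envelope * np.exp(1j * phase)
p_L = envelope * np.exp(-1j * phase)

# Comparison case without idler photons: amplitudes for the same x_k add
# before squaring (toy model, not renormalised).
prob_no_idler = 0.5 * np.abs(p_R + p_L) ** 2

# With idler photons: the idler states |R,k> and |L,k> are orthogonal,
# so the probabilities of the two paths add instead.
prob_with_idler = 0.5 * (np.abs(p_R) ** 2 + np.abs(p_L) ** 2)

# The difference is the interference term Re(conj(p_R) * p_L).
cross_term = prob_no_idler - prob_with_idler
```

In this model the distribution with idler photons reduces to the smooth envelope, while the oscillating cross term appears only in the comparison case.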

Fig. 10 Experiment 1: The two groups of dots

The phenomenon that the interference pattern disappears in the presence of idler photons, regardless of whether or not one actually measures them, was described by Zeilinger [22] in an equivalent set-up as follows (additions in brackets by the present authors):

“(...) whenever particle 1 [the signal photon] is found in beam a, particle 2 [the idler photon] is found in beam b and whenever particle 1 is found in beam \(a',\) particle 2 is found in beam \(b'.\) The quantum state is

$$\begin{aligned} |\psi \rangle = \frac{1}{\sqrt{2}}(|a\rangle _1|b\rangle _2+|a'\rangle _1|b'\rangle _2). \end{aligned}$$

Will we now observe an interference pattern for particle 1 behind its double slit? The answer has again to be negative because by simply placing detectors in the beams b and \(b'\) of particle 2 [the idler photon] we can determine which path particle 1 took. Formally speaking, the states \(|a\rangle _1\) and \(|a'\rangle _1\) again cannot be coherently superposed because they are entangled with the two orthogonal states \(|b\rangle _2\) and \(|b'\rangle _2.\)

Obviously, the interference pattern can be obtained if one applies a so-called quantum eraser which completely erases the path information carried by particle 2. That is, one has to measure particle 2 in such a way that it is not possible, even in principle, to know from the measurement which path it took, \(a'\) or \(b'.\)”

The above analysis, which is similar in spirit to the one presented for the double-double-slit experiment in [11], makes this precise.

3.3.2 Analysis of the second experiment

In the second experiment (of Fig. 6), the ‘which path’ information contained in the angles \(\vartheta _{R,x}\) and \(\vartheta _{L,x}\) of the idler photons is erased through the introduction of a beamsplitter. Naively, one would therefore expect an interference pattern to build up after all in this configuration, and contextual information to be required: to be able to decide which probability distribution to “use”, the photon must “know” in advance the experimental context in which it finds itself. As the ensuing analysis will show, no such information is needed and, in fact, no interference builds up in the overall pattern on the screen.

Let us partition the arrivals at the screen into two groups, \(G_1\) and \(G_2;\) the first consists of those arrivals whose idler partners made detector \(D_1\) click, and the second of those arrivals whose idler partners made detector \(D_2\) click. To include the two-detector system comprising \(D_1\) and \(D_2\) in the considerations, we introduce a ready-to-measure state for this system. Thus we replace the state \(|1, t = 3\rangle \) of Eq. (3.1) by

$$\begin{aligned} |2, t = 3\rangle = \frac{1}{\sqrt{2}}\Bigl ( \tilde{p}_{R,k}|R,k\rangle _i + \tilde{p}_{L,k}|L,k\rangle _i\Bigr )|x_k\rangle _S|\text {ready}\rangle _D, \end{aligned}$$

and similarly for \(|1, t = 2\rangle .\) Having passed the beamsplitter, before reaching the detector system the state can be described as

$$\begin{aligned} |2, t = 4\rangle&= \frac{1}{2}\Bigl ((\tilde{p}_{R,k}|R,k,1\rangle _i + \tilde{p}_{R,k}|R,k,2\rangle _i)\\&\quad + (\tilde{p}_{L,k}|L,k,1\rangle _i - \tilde{p}_{L,k}|L,k,2\rangle _i)\Bigr )|x_k\rangle _S|\text {ready}\rangle _D, \end{aligned}$$

with newly labelled idler states indicating which detectors lie in their paths. Upon arrival at the detectors the idler photons are absorbed, leaving the system in the (as yet unobserved) state

$$\begin{aligned} |2, t = 5\rangle&= \frac{1}{2} \Bigl ( (\tilde{p}_{R,k}|1\rangle _D + \tilde{p}_{R,k}|2\rangle _D) +(\tilde{p}_{L,k}|1\rangle _D - \tilde{p}_{L,k}|2\rangle _D)\Bigr )|x_k\rangle _S\\&= \frac{1}{2} \Bigl ((\tilde{p}_{R,k} + \tilde{p}_{L,k})|1\rangle _D + (\tilde{p}_{R,k}- \tilde{p}_{L,k})|2\rangle _D\Bigr )|x_k\rangle _S, \end{aligned}$$

where \(|1\rangle _D\) and \(|2\rangle _D\) describe the states of the detector system in which \(D_1\) respectively \(D_2\) has registered a photon.

To arrive at an expression for the distribution of the \(G_1\)-arrivals we write

$$\begin{aligned} |1\rangle _D = |\text {click}\rangle _{D_1} |\text {no click}\rangle _{D_2}, \quad |2\rangle _D = |\text {no click}\rangle _{D_1} |\text {click}\rangle _{D_2} \end{aligned}$$

and trace out detector \(D_2.\) This results in the reduced state (density matrix)

$$\begin{aligned} |2, t = 5\rangle \langle 2, t = 5|_{D_2 \, \text {traced out}}&= \frac{1}{4}\Bigl ( |\tilde{p}_{R,k}|^2 + 2\ \Re \big (\tilde{p}_{R,k}\overline{\tilde{p}_{L,k}}\big )\nonumber \\&\quad + |\tilde{p}_{L,k}|^2\Bigr )|\text {click}\rangle _{D_1}|x_k\rangle _S\langle x_k|_S\langle \text {click}|_{D_1} \nonumber \\&\quad + \frac{1}{4}\Bigl ( |\tilde{p}_{R,k}|^2 - 2\ \Re \big (\tilde{p}_{R,k}\overline{\tilde{p}_{L,k}}\big ) \nonumber \\&\quad + |\tilde{p}_{L,k}|^2\Bigr )|\text {no click}\rangle _{D_1}|x_k\rangle _S\langle x_k|_S\langle \text {no click}|_{D_1}. \end{aligned}$$
(3.2)

On the basis of this state, we expect that the distribution of the \(G_1\)-arrivals, that is, the joint detection of the events \(\{\)arrival at \(x_k\) and \(D_1\) clicked\(\}\) shows the same interference as in the Young double slit experiment. Likewise, tracing out \(D_1\) results in the reduced state

$$\begin{aligned} |2, t = 5\rangle \langle 2, t = 5|_{D_1 \, \text {traced out}}&= \frac{1}{4}\Bigl ( |\tilde{p}_{R,k}|^2 + 2\ \Re \big (\tilde{p}_{R,k}\overline{\tilde{p}_{L,k}}\big ) \nonumber \\&\quad + |\tilde{p}_{L,k}|^2\Bigr )|\text {no click}\rangle _{D_2}|x_k\rangle _S\langle x_k|_S\langle \text {no click}|_{D_2} \nonumber \\&\quad + \frac{1}{4}\Bigl ( |\tilde{p}_{R,k}|^2 - 2\ \Re \big (\tilde{p}_{R,k}\overline{\tilde{p}_{L,k}}\big ) \nonumber \\&\quad + |\tilde{p}_{L,k}|^2\Bigr )|\text {click}\rangle _{D_2}|x_k\rangle _S\langle x_k|_S\langle \text {click}|_{D_2}. \end{aligned}$$
(3.3)

From this we infer that the distribution of the \(G_2\)-arrivals also shows the interference of the Young double slit experiment, but shifted in the \(x_k\)-variable: the minus sign in front of the cross term amounts to a phase shift of \(\pi \) (see Fig. 11).

The distribution of all arrivals, comprising both the \(G_1\)-arrivals and the \(G_2\)-arrivals, is obtained by tracing out both detectors, which results in the reduced state

$$\begin{aligned} |2, t = 5\rangle \langle 2, t = 5|_{D \, \text {traced out}} = |x_k\rangle _S\langle x_k|_S. \end{aligned}$$
(3.4)

We see that in this case no interference is built up.

On the basis of the above calculations, and in line with the heuristic argument of [7], we conclude that if the data is grouped in subsets based on whether detector 1 or detector 2 clicked, (3.2) and (3.3) predict the emergence of the interference patterns depicted in Fig. 11. If, on the other hand, the data is not grouped on the basis of which detector clicked, then no interference is detected and the resulting distribution equals that given in Fig. 8a.
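The structure of (3.2) and (3.3) can be checked with a short numerical sketch. It is an illustration under stated assumptions rather than a simulation of the experiment: the conditional amplitudes are taken to have equal moduli and a hypothetical relative phase `phi` that grows linearly with the screen position.

```python
import numpy as np

# Hypothetical conditional amplitudes at the screen bins x_k: equal moduli
# and a relative phase linear in x_k (an illustrative assumption).
x = np.linspace(-1.0, 1.0, 201)
phi = 6.0 * x
pt_R = np.exp(+1j * phi) / np.sqrt(2)
pt_L = np.exp(-1j * phi) / np.sqrt(2)

# Coefficients of |1>_D and |2>_D in the state at t = 5
amp_D1 = 0.5 * (pt_R + pt_L)
amp_D2 = 0.5 * (pt_R - pt_L)

# Weights of the G_1- and G_2-arrivals, cf. the coefficients in (3.2), (3.3)
G1 = np.abs(amp_D1) ** 2
G2 = np.abs(amp_D2) ** 2

# Weight obtained when the detector record is ignored
total = G1 + G2
```

With these assumptions `G1` is proportional to \(\cos ^2\varphi \) and `G2` to \(\sin ^2\varphi ,\) so the two fringe patterns are shifted by half a period, while `total` does not depend on the position at all.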

Fig. 11 Experiment 2: The two groups of dots

3.3.3 Analysis of the third experiment

In this experiment, the previous two experiments are combined by introducing a second beamsplitter in such a way that the idler photons reflected by the beamsplitter to one of the detectors \(D_1\) or \(D_2\) will not reveal ‘which path’ information, since the ‘which path’ information of the idler photons passing through the beamsplitter will be erased by the next beamsplitter in their path. Clicks of the detectors \(D_3\) and \(D_4\) do reveal ‘which path’ information. The reader will have no difficulty working out the formulas describing the succession of states in this scenario; the reasoning follows the same patterns as in the preceding two cases.

This time we can group the dots on the screen into four groups, corresponding to which of the detectors \(D_1,\ldots ,D_4\) clicked. The results for these groups are depicted in Fig. 12. In line with the second experiment, both the \(H_1\)-arrivals corresponding to registrations at detector \(D_1\) and the \(H_2\)-arrivals corresponding to registrations at \(D_2\) show interference (as they reveal no ‘which path’ information), but the combined arrivals of \(H_1\) and \(H_2\) add up to a pattern without interference. In line with the first experiment, the \(H_3\)-arrivals and the \(H_4\)-arrivals corresponding to registrations at \(D_3\) and \(D_4,\) respectively, show no interference (as they reveal ‘which path’ information). Figure 13 depicts the resulting data on the screen after all dots are matched to a corresponding detector.
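The four groups can be illustrated along the same lines. The sketch below rests on assumptions that are not taken from the text: each idler photon is routed with probability 1/2 directly to a ‘which path’ detector (R-idlers to \(D_3,\) L-idlers to \(D_4;\) this assignment is hypothetical) and with probability 1/2 to the erasing beamsplitter feeding \(D_1\) and \(D_2.\) The envelopes and the phase are again illustrative.

```python
import numpy as np

# Hypothetical conditional amplitudes at the screen bins x_k
x = np.linspace(-1.0, 1.0, 201)
phi = 6.0 * x
env_R = np.exp(-(x - 0.2) ** 2)
env_L = np.exp(-(x + 0.2) ** 2)
norm = np.sqrt(env_R ** 2 + env_L ** 2)
pt_R = env_R / norm * np.exp(+1j * phi)
pt_L = env_L / norm * np.exp(-1j * phi)

s = 1 / np.sqrt(2)  # assumed 50/50 splitting at every beamsplitter

# Amplitude for each detector to click, given arrival at x_k
amplitudes = {
    "D1": s * s * (pt_R + pt_L),  # erased, plus sign
    "D2": s * s * (pt_R - pt_L),  # erased, minus sign
    "D3": s * pt_R,               # 'which path': R (assumed assignment)
    "D4": s * pt_L,               # 'which path': L (assumed assignment)
}
weights = {name: np.abs(a) ** 2 for name, a in amplitudes.items()}

erased = weights["D1"] + weights["D2"]  # H_1 and H_2 combined
total = sum(weights.values())           # all four groups combined
```

Under these assumptions only `weights["D1"]` and `weights["D2"]` contain the oscillating cross term, with opposite signs, so that `erased` and `total` are free of interference.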

Fig. 12 Experiment 3: The four groups of dots

Fig. 13 Experiment 3: All dots combined

3.3.4 Mathematical analysis and equivalence with the non-delayed experiment

This section gives a mathematical analysis of the delayed quantum eraser similar to that presented in Sect. 2.2.2, both for the scenario with delayed eraser and for a version of it without delay. Again these scenarios turn out to produce the same final state.

Throughout the subsequent analysis, we fix a positive integer N; only after performing all calculations do we interpret the results by passing to the limit \(N\rightarrow \infty .\) To good approximation, we assume that the photon, when passing the double slit, chooses between N fixed angles \(\theta _1,\ldots ,\theta _N\in (-\frac{1}{2}\pi ,\frac{1}{2}\pi ).\) The state of the photon, once it has passed through the double slit, can be modelled by an element of the Hilbert space \({\mathbb {C}}^2\otimes {\mathbb {C}}^N.\) In this representation, the standard basis vectors \(|R,n\rangle := |R\rangle |n\rangle \) and \(|L,n\rangle := |L\rangle |n\rangle \) describe the state of a photon passing through the right, respectively left, slit and emanating from it at angle \(\theta _n.\)

The nonlinear crystal is modelled by a \((4N^2\times 2N)\)-matrix C acting from \({\mathbb {C}}^2\otimes {\mathbb {C}}^N\) to \(({\mathbb {C}}^2\otimes {\mathbb {C}}^N)\otimes ({\mathbb {C}}^2\otimes {\mathbb {C}}^N)\) with action

$$\begin{aligned} C: {\left\{ \begin{array}{ll} |R\rangle |n\rangle &{} \mapsto \ \ |R_s\rangle |n\rangle \otimes |R_i\rangle |n\rangle , \\ |L\rangle |n\rangle &{} \mapsto \ \ |L_s\rangle |n\rangle \otimes |L_i\rangle |n\rangle , \end{array}\right. }\end{aligned}$$

where the indices s and i on the right-hand side are nothing but a notational device to keep track of the signal and idler photons, respectively. We ignore the fact that the idler and signal photons have halved frequencies; this plays no role in the present qualitative analysis.

The action of the screen is modelled by any \((2N\times 4N^2)\)-matrix S acting from \(({\mathbb {C}}^2\otimes {\mathbb {C}}^N)\otimes ({\mathbb {C}}^2\otimes {\mathbb {C}}^N)\) to \({\mathbb {C}}^2\otimes {\mathbb {C}}^N\) with action

$$\begin{aligned} S: {\left\{ \begin{array}{ll} |R_s\rangle |n\rangle \otimes |R_i\rangle |n\rangle &{} \mapsto \ \ \frac{1}{\sqrt{N}}\sum _{m=1}^N \tilde{p}_{R,m}|R_i\rangle |m\rangle , \\ |L_s\rangle |n\rangle \otimes |L_i\rangle |n\rangle &{} \mapsto \ \ \frac{1}{\sqrt{N}}\sum _{m=1}^N \tilde{p}_{L,m}|L_i\rangle |m\rangle . \end{array}\right. } \end{aligned}$$

The action of the beamsplitters and mirrors may be lifted to the linear operators \(H\otimes I\) and \(X\otimes I\) on \({\mathbb {C}}^2\otimes {\mathbb {C}}^N.\) The configuration of two beamsplitters behind the nonlinear crystal (see Fig. 7) acts as one beamsplitter, provided we interpret R-photons as ‘up’ and L-photons as ‘down’, and interpret the photons deflected towards detector \(D_1\) as ‘up’ and those deflected towards \(D_2\) as ‘down’.

By following the sequence of elements encountered by the photon, we can determine the operator representing the delayed quantum eraser. After passing through the double slit, the photon encounters the crystal C, the screen S, then the first beamsplitter \((H\otimes I),\) the mirrors \((X \otimes I)\) and, lastly, the second beamsplitter \((H\otimes I).\) The delayed choice experiment may therefore be represented by the composition

$$\begin{aligned} A = (H\otimes I)\circ (X\otimes I)\circ (H\otimes I) \circ S \circ C = (Y \otimes I)\circ S\circ C, \end{aligned}$$
(3.5)

where \(Y = \bigl ({\begin{matrix} 1 & 0 \\ 0 & -1\end{matrix}}\bigr ).\)
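The identity \(H X H = Y\) used in (3.5) is easily verified. The sketch below assumes the standard matrix forms for H (a Hadamard-type beamsplitter) and X (the exchange of ‘up’ and ‘down’ by the mirrors); these forms are assumptions of the example, chosen to be consistent with the matrix Y given above, and the value of `N` is arbitrary.

```python
import numpy as np

# Assumed matrix forms (not spelled out in this section)
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # beamsplitter
X = np.array([[0, 1], [1, 0]])                # mirrors: swap 'up' and 'down'
Y = np.array([[1, 0], [0, -1]])               # matrix Y as given in the text

# Two-level identity H X H = Y
HXH = H @ X @ H

# The same identity lifted to C^2 (x) C^N for a small, arbitrary N
N = 3
I_N = np.eye(N)
lifted = np.kron(H, I_N) @ np.kron(X, I_N) @ np.kron(H, I_N)
```

Both `HXH` and `lifted` agree with Y and \(Y\otimes I\) up to floating-point rounding, which can be confirmed with `np.allclose`.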

As in the case of Wheeler’s delayed choice experiment, we may rearrange the order of steps in such a way that no delayed choice takes place. This is done by making the distance from the crystal to the screen long, and the distance to the detectors short. Here, the beamsplitters are interpreted as acting on \(({\mathbb {C}}^2\otimes {\mathbb {C}}^N)\otimes ({\mathbb {C}}^2\otimes {\mathbb {C}}^N)\) as \((I\otimes I)\otimes (H\otimes I),\) since the beamsplitters act trivially on the signal photons; the mirrors can be represented in the same way. In this case, after the photon has passed the crystal C, the idler photon will pass the first beamsplitter \(((I\otimes I)\otimes (H\otimes I))\), the mirrors \(((I\otimes I)\otimes (X\otimes I))\) and the second beamsplitter \(((I\otimes I)\otimes (H\otimes I)).\) Lastly, the signal photon will encounter the screen S. This version of the experiment may now be represented by the composition

$$\begin{aligned} A'&= S \circ ((I\otimes I)\otimes (H\otimes I))\circ ((I\otimes I)\otimes (X\otimes I))\circ ((I\otimes I)\otimes (H\otimes I))\circ C\nonumber \\&= S\circ ((I\otimes I)\otimes (Y\otimes I))\circ C\nonumber \\&= (Y\otimes I)\circ S\circ C, \end{aligned}$$
(3.6)

where the last equality is immediate from the definition of S: in both orderings the R-component is left unchanged and the L-component acquires a factor \(-1.\) Combining Eqs. (3.5) and (3.6), we see again that \(A = A'.\) Therefore, as in the case of Wheeler’s delayed choice experiment, we conclude that the scenarios with and without delayed choice result in the same final state.
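The equality \(A = A'\) can also be checked numerically for small N. The following sketch is an illustration under explicit assumptions: H and X are taken in their standard matrix forms, the amplitudes `pt` are random placeholders for \(\tilde{p}_{R,n}\) and \(\tilde{p}_{L,n},\) and S is one admissible choice, set to zero on all basis states other than those listed in its definition. The helper names `idx` and `on_idler` are introduced for this example only.

```python
import numpy as np

N = 3            # small number of angles, for illustration only
dim = 2 * N      # dimension of C^2 (x) C^N
rng = np.random.default_rng(0)

def idx(a, n):
    """Index of |a>|n> in C^2 (x) C^N, with a = 0 for R and a = 1 for L."""
    return a * N + n

# Random placeholder amplitudes pt[a, n] (hypothetical values)
pt = rng.normal(size=(2, N)) + 1j * rng.normal(size=(2, N))

# Crystal C: |a>|n> -> |a_s>|n> (x) |a_i>|n>, signal factor first
C = np.zeros((dim * dim, dim), dtype=complex)
for a in range(2):
    for n in range(N):
        j = idx(a, n)
        C[j * dim + j, j] = 1.0

# Screen S: one admissible choice, zero outside the states in its definition
S = np.zeros((dim, dim * dim), dtype=complex)
for a in range(2):
    image = np.zeros(dim, dtype=complex)
    image[a * N:(a + 1) * N] = pt[a] / np.sqrt(N)
    for n in range(N):
        j = idx(a, n)
        S[:, j * dim + j] = image

# Assumed matrix forms of beamsplitter and mirrors, lifted to C^2 (x) C^N
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]])
HI = np.kron(H, np.eye(N))
XI = np.kron(X, np.eye(N))

def on_idler(op):
    """Lift an idler operator to the joint space, acting trivially on the signal."""
    return np.kron(np.eye(dim), op)

# Delayed ordering, Eq. (3.5), and non-delayed ordering, Eq. (3.6)
A_delayed = HI @ XI @ HI @ S @ C
A_direct = S @ on_idler(HI) @ on_idler(XI) @ on_idler(HI) @ C
```

For this choice of S the two matrices coincide up to rounding, as `np.allclose(A_delayed, A_direct)` confirms; the check does not depend on the particular values drawn for `pt`.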

3.3.5 The operational equivalence of the delayed and non-delayed experiment in relation to special relativity

To conclude, let us analyse the space-time structure of the delayed quantum eraser. In contrast to the Wheeler delayed choice experiment, in the delayed quantum eraser the two arms of the experiment do not seem to be causally connected. This assessment is not correct, however. To distinguish the patterns related to each detector on the screen, the output of the detectors needs to be combined with the screen data after the measurements of both arms of the experiment have been completed. This subtle, but crucial, point brings this case very close to the previous case. Figure 14 shows how the delayed quantum eraser can be split up into two space-like separated parts, which commute. The insight from Fig. 14 thus physically grounds the equality \(A=A'\) in the language of special relativity. Furthermore, in our view a comparison between Figs. 14 and 3 shows the crucial similarity between the two experiments from a relativistic point of view.

Fig. 14 A schematic depiction of the space-time diagram of a non-delayed quantum eraser. The left side of the figure denotes the standard set-up of the delayed quantum eraser, as in Fig. 7. The right side of the figure places this set-up in a time-ordered sequence. The black dot denotes the single-photon source, from where a photon is sent through two slits towards a nonlinear crystal denoted by the blue rectangle. From there, one photon is sent towards some grey medium, slowing its trajectory. From there it moves towards the screen, after the other photon is registered by the detectors. This other photon is first sent towards the mirrors, denoted by black rectangles, and the beamsplitters, denoted by the open rectangles. This side ends with the detectors, denoted by the open half-circles. The lines indicate the causal relation between the detectors, which are used to colour the eventual outcome on the screen. The dotted lines are used to indicate that the two enclosed regions are space-like separated. Schematic A shows the space-time diagram of the set-up when the detection of the idler photon is not delayed, B shows the space-time diagram of the delayed set-up (colour figure online)

4 Discussion and conclusions

The controversies surrounding delayed choice experiments can only be truly resolved by a ‘forward’ understanding of the experiment, rather than by a ‘backward’ analysis. As we have demonstrated, such a forward analysis is indeed possible, rendering Wheeler’s delayed choice experiment and the delayed quantum eraser no more puzzling than anything else involving superposition and/or entanglement. At any given moment during or after the experiments, any agent involved in the experiment can fully explain the data he or she has access to at that moment.

At no point in the experiments is information from the future or contextual information needed to explain or predict what happens next. To paraphrase Wheeler, in his thought experiment the photon at the first beamsplitter does not need to “know” about the full configuration of the experiment to “decide” between wave-like or particle-like behaviour. In fact, questions such as whether the photon “was” a wave or particle at the various stages of the experiment (the centrepiece in arguments purporting to demonstrate retro-causation) are completely meaningless from an operational point of view and can only lead to pseudo-problems.

Whilst the problems with delayed-choice experiments are often connected to a realist interpretation of the wave-function (as in Ma et al. [14]), a realist following the analysis presented in this paper will not encounter any problems with delayed choice experiments. Rather, the root of the problem seems to be in the use of physical concepts such as ‘wave-particle duality’ or ‘which path information’ as explanatory devices rather than descriptive tools providing heuristic pictures. In the case of Wheeler’s gedanken-experiment, the which-path question becomes meaningful after the random generator has provided an outcome; in the case of the delayed eraser, the which-path question becomes meaningful after the idler photon has passed the beamsplitters. Only after these crucial steps can the presence or absence of interference be argued for on the basis of ‘which path information’. Such an explanation, therefore, involves a certain degree of ‘backwards’ reasoning. It is this ‘backwards’ reasoning that stands at the core of the problems surrounding delayed choice experiments, as it leads to the questioning of past states without a present record. Wheeler himself wrote [20]:

Does this result [the delayed-choice experiment] mean that present choice influences past dynamics, in contravention of every formulation of causality? Or does it mean, calculate pedantically and don’t ask questions? Neither; the lesson presents itself rather as this, that the past has no existence except as it is recorded in the present.Footnote 2

In the experiments discussed in the present paper we have shown that this ‘backwards’ reasoning and the related questioning of past states by the use of heuristic physical concepts can be avoided completely if one adheres to a strictly mathematical analysis. Paraphrasing Zeilinger’s words from [22], in the context of the delayed quantum eraser, the disappearance of interference should not be explained contextually on the basis that “which-path information is still available”, but on the basis of a step-by-step forward analysis of the type presented here.