A Delayed Choice Quantum Eraser Explained by the Transactional Interpretation of Quantum Mechanics

This paper explains the delayed choice quantum eraser of Kim et al. (A delayed choice quantum eraser, 1999) in terms of the transactional interpretation (TI) of quantum mechanics by Cramer (Rev Mod Phys 58:647, 1986, The quantum handshake, entanglement, nonlocality and transactions, 1986). It is kept deliberately mathematically simple to help explain the transactional technique. The emphasis is on a clear understanding of how the instantaneous “collapse” of the wave function due to a measurement at a specific time and place may be reinterpreted as a relativistically well-defined collapse over the entire path of the photon and over the entire transit time from slit to detector. This is made possible by the use of a retarded offer wave, which is thought to travel from the slits (or rather the small region within the parametric crystal where down-conversion takes place) to the detector and an advanced counter wave traveling backward in time from the detector to the slits. The point here is to make clear how simple the transactional picture is and how much more intuitive the collapse of the wave function becomes if viewed in this way. Also, any confusion about possible retro-causal signaling is put to rest. A delayed choice quantum eraser does not require any sort of backward in time communication. This paper makes the point that it is preferable to use the TI over the usual Copenhagen interpretation for a more intuitive understanding of the quantum eraser delayed choice experiment. Both methods give exactly the same end results and can be used interchangeably.


Complementarity, Which Path Information and Quantum Erasers
Feynman, in his famous lectures on physics [1], stated that Young's double slit experiment contains nearly all the mysteries of quantum mechanics, namely, wave-particle duality, particle trajectories, collapse of the wave function and nonlocality. We may see interference, or we may know through which slit the photon passes, but we can never know both at the same time. This is what is commonly referred to as the principle of complementarity. We say two observables are complementary if precise knowledge of one implies that all possible outcomes of measuring the other are equally likely. The fundamental enforcement of complementarity arises from correlations between the detector and the interfering particle that show up in the wave function for the system. It is not, as some undergraduate textbooks would have you believe, a consequence of the uncertainty principle, although the application of the uncertainty principle makes for an easy calculation when the wave function of the system is difficult to write out. There have been many gedanken (German for thought) experiments over the years to show complementarity. The most famous are the Einstein recoiling slit and Feynman's light scattering scheme, both discussed in Feynman's lectures on physics [1], and Wheeler's delayed choice experiment [2]. The TI of Cramer is preferred in this paper over the usual Copenhagen interpretation (CI), for a more intuitive understanding of the quantum eraser delayed choice experiment. An alternative theory, which also claims to present an alternative perspective to the CI and to provide an intuitive understanding of the paradoxes of quantum mechanics, including the quantum eraser and delayed choice, is that by Sohrab [3,4], which we mention for completeness but will not discuss further here.
Of particular interest here is the delayed choice quantum eraser gedanken experiment by Scully and Druhl [5]. This work described a basic quantum eraser experiment and a delayed choice quantum eraser arrangement. The basic quantum eraser experiment is described using two 3-level Λ-type atoms [6] in the place of two slits. See Fig. 1. A laser pulse excites either atom A or B. The excited atom then decays and emits a signal photon. Interference fringes are sought between these signal photons on a screen some distance away. Let the identical 3-level atoms have one upper level a and two lower levels b and c. The laser excites one of the atoms up to the level a, but the atom can de-excite to either state b or c. If both atoms start off in the ground state c, there are two possibilities. In the first case, the excited atom decays and falls back to level c, so the excited atom becomes indistinguishable from the other atom, which was not excited. In this case we would expect to see an interference pattern since there is no which path information.
In the second case, the excited atom drops to level b, which is distinguishable from level c. In this case we have which path information and we would get no interference pattern. That describes the basic quantum eraser. For a delayed choice quantum eraser [5], the 3-level atoms change to 4-level atoms with levels a, b, c, d, with d the ground state. See Fig. 2. Instead of one exciting laser pulse there are two closely spaced pulses, which will both go to the same atom. The first laser pulse excites either atom A or B from the ground state d to the upper level a. The excited atom then spontaneously decays to c, emitting the signal photon. The second laser pulse then excites the atom from level c to level b, which then decays to the ground state with the emission of a lower energy idler photon. Now the atoms are inside a cleverly constructed cavity with a trap door separating them. The cavity is transparent to the signal photons and laser light but strongly reflects the idler photons. There is a detector, capable of detecting the idler photons, only near atom A. The trap door will prevent the idler photon from B being detected. Now we have a choice whether to open the trap door or leave it closed. The signal photon detection is now correlated with the idler photon detection. The experiment has become a delayed choice quantum eraser: whether we see interference or not will depend on whether we leave the trap door open or closed. If the trap door is closed and we detect an idler photon, we know that atom A was excited. If we do not detect a photon, then atom B was excited. Either way we have which path information that will destroy the fringes. If the trap door is open, then we no longer have which path information since either atom could have emitted the idler photon. In principle the decision to leave the trap door open or closed can be made after the signal photons have been detected.
The paradox is, how does the signal photon know which pattern to make, a single slit diffraction pattern or a two-slit interference pattern, if we have not yet decided to leave the trap door open or closed? Englert et al. [7,8] in 1991 constructed a very nice atom interference gedanken experiment that shows the theory in a very straightforward manner, although the experiment would be extremely difficult to perform in practice. Soon afterward, in 1993, a polarization experiment by Wineland's group [9] was the first to demonstrate an actual realization of the Scully-Druhl quantum eraser gedanken experiment. They used mercury ions in a trap as the two "atoms" and observed linearly (π) and circularly (σ) polarized light. Choosing to detect linearly polarized light corresponded to the case in which the ions in the trap were in the same initial and final state. This implies that there was no which path information and so there was interference. Choosing to observe circularly polarized light corresponded to the situation in which the ions were in distinguishable end states after scattering a photon, so which path information was available and hence there was no interference. You could choose to observe interference or not depending on whether you chose to observe linearly or circularly polarized light.
There have been many quantum optics experiments involving two photon entangled states and quantum eraser arrangements to demonstrate the complementarity arguments above. Four of the better ones are [10-13]. One experiment in particular, by Zeilinger's group [14], is worthy of special note. The arm lengths in their apparatus were very long, from 55 m up to 144 km. They point out that there is no possible communication between one photon and the other in the entangled pair because of the space-like separation between them, and they assume no faster-than-light communication is possible.
The most famous real experiment of the delayed choice type is that by Kim et al. [15], using parametric down conversion entangled photons. It has drawn considerably more press than any other experiment of this type and even has a couple of online animations [16]. We choose to present our case for the transactional interpretation (TI) of quantum mechanics using the Kim experiment as our example, but any of the delayed choice quantum erasers would work just as well.

Introduction to the Transactional Interpretation of Quantum Mechanics
The TI of quantum mechanics was proposed by Cramer [17] in a review article in 1986 and a short overview in 1988 [18]. More recently Cramer has written a book [19] which should become available early in 2016. It is a way to view quantum mechanics that is very intuitive and easily accounts for all the well known quantum paradoxes, such as the Einstein-Podolsky-Rosen (EPR) experiment [20] and the which-way detection and quantum eraser experiments [21,22]. Unfortunately, it has garnered little support over the years and has fallen off the radar. It deserves a much broader dissemination, and part of the motivation to publish this paper was to bring Cramer's ideas, and the advanced wave concept, to the attention of the younger generation of physicists, who may not have heard of them before. The advanced wave is a standard solution of the relativistic wave equation and was utilized by such notable physicists as Dirac, Wheeler, Feynman, Davies, Hoyle and his doctoral student Narlikar. The direct particle interaction theory (which uses advanced waves, traveling backward in time) was used by Wheeler, Feynman, Schwinger, Hoyle and Narlikar. The direct particle interaction does away with the idea of a field; the vacuum would then be truly empty, with zero energy, as Feynman believed. Frank Wilczek recounts a conversation with Feynman [23]:
Around 1982, I had a memorable conversation with Feynman at Santa Barbara. Usually, at least with people he didn't know well, Feynman was "on", in performance mode. But after a day of bravura performances he was a little tired and eased up. Alone for a couple of hours, before dinner, we had a wide-ranging discussion about physics. Our conversation inevitably drifted to the most mysterious aspect of our model of the world, both in 1982 and today: the subject of the cosmological constant. (The cosmological constant is, essentially, the energy density of empty space. Anticipating a little, let me just mention that a big puzzle in modern physics is why empty space weighs so little even though there's so much to it.) I asked Feynman, "Doesn't it bother you that gravity seems to ignore all we have learned about the complications of the vacuum?" To which he immediately responded, "I once thought I'd solved that one." Then Feynman became wistful. Ordinarily he would look you right in the eye, and speak slowly but beautifully, in a smooth flow of fully formed sentences or even paragraphs. Now, however, he gazed off into space; he seemed transported for a moment, and said nothing. Gathering himself again, Feynman explained that he had been disappointed with the outcome of his work on quantum electrodynamics. It was a startling thing for him to say, because that brilliant work was what brought Feynman graphs to the world, as well as many of the methods we still use to do difficult calculations in quantum field theory. It was also the work for which he won the Nobel Prize. Feynman told me that when he realized that his theory of photons and electrons is mathematically equivalent to the usual theory, it crushed his deepest hopes. He had hoped that by formulating his theory directly in terms of paths of particles in space-time (Feynman graphs) he would avoid the field concept and construct something essentially new. For a while he thought he had.
Why did he want to get rid of fields? "I had a slogan," he said. Ratcheting up the volume and his Brooklyn accent, he intoned it: "The vacuum doesn't weigh anything [dramatic pause] because there's nothing there!"

Experimental observations show that the vacuum energy density is in fact very close to zero. To calculate the vacuum energy in quantum field theory, we must admit that spacetime is probably not a continuum but rather has a discrete nature at quantum dimensions, and only sum the zero-point energies for vibrational modes having wavelengths larger than the Planck length ($10^{-35}$ m) and less than or equal to the size of the universe (diameter approx. $8.8 \times 10^{26}$ m). This gives a ridiculously large but finite vacuum energy density of about $10^{111}$ J m$^{-3}$ or, in terms of mass density, $10^{94}$ kg m$^{-3}$.
Clearly, this is nowhere near the experimentally observed values: an energy density near zero and a mass density near the critical value of $10^{-26}$ kg m$^{-3}$. The quantum field theory vacuum mass density is about 120 orders of magnitude too large, which is rather embarrassing.
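The arithmetic behind these figures can be checked with a short sketch. It only converts the quoted energy density to a mass density through $\rho = u/c^2$ and compares it with the quoted critical density; the function names are ours, chosen for illustration, and the two input values are simply the round numbers given in the text.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def mass_density(energy_density_j_m3: float) -> float:
    """Convert an energy density (J/m^3) to a mass density (kg/m^3), rho = u / c^2."""
    return energy_density_j_m3 / C**2


def orders_of_magnitude(a: float, b: float) -> float:
    """Return log10(a / b), i.e. how many powers of ten separate a and b."""
    return math.log10(a / b)


u_qft = 1e111         # J/m^3, cutoff vacuum energy density quoted in the text
rho_critical = 1e-26  # kg/m^3, critical density quoted in the text

rho_qft = mass_density(u_qft)
print(f"QFT vacuum mass density ~ 10^{math.log10(rho_qft):.1f} kg/m^3")
print(f"discrepancy ~ {orders_of_magnitude(rho_qft, rho_critical):.0f} orders of magnitude")
```

With these inputs the mass density comes out near $10^{94}$ kg m$^{-3}$ and the mismatch near 120 orders of magnitude, in line with the values stated above.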

Absorber theory and Advanced Waves and Direct Particle Interaction Theory
The idea of advanced waves in classical electrodynamics started with Dirac [24] in 1938 and his derivation of the radiation reaction of a charged accelerated particle. Advanced waves travel backward in time and are a perfect way to allow for action-at-a-distance. A remote particle can interact with a local source particle by absorbing retarded waves from the source in the future and, in response, emitting an advanced wave which travels backward in time and interacts with the source immediately, at the instant the retarded wave was emitted. This is a direct particle interaction and does not require the presence of a field. This direct particle interaction conserves momentum. Dirac assumed an advanced wave in his radiation reaction calculations, but gave no physical explanation as to where it came from. Later Wheeler and Feynman [25] wrote papers in 1945 and 1949 on absorber theory, which was their attempt to give a physical description of the origins of the advanced waves introduced by Dirac. An added motivation was to try to remove the self energy from the electron, but that was not entirely successful, as Wilczek recounts above. The radiation reaction could be accounted for without self interaction, but at the quantum level self-interaction became unavoidable for charge renormalization; electron-positron pairs are still required to shield the infinite negative bare mass. With an upper bound (Rindler horizon caused by an accelerating expansion) and a lower length cutoff (Schwarzschild radius of a particle, about $10^{-45}$ cm), the standard renormalization procedure can be applied to the direct particle interaction approach, which is then no less suitable than conventional field theory, but has no cosmological constant problem. There are no classical divergences if the self-interaction is non-quantized.
Feynman's PhD thesis included the path integral approach to non-relativistic quantum mechanics, which was used to describe how to quantize the direct particle interaction of absorber theory [26]. Paul Davies later generalized these classical results for the relativistic case of absorber theory [27,28].
Hoyle and Narlikar also worked on the relativistic absorber theory [29]. There are now three different models for absorbers which have slightly differing advanced wave behavior: Wheeler-Feynman [25], Csonka [30] and Cramer [17]. These models differ with regard to what exactly happens when there is a less than perfect absorber present. They are discussed in the very readable paperback by Nick Herbert [31]. So far, we have a working theory for classical electrodynamics and now for QED. Hoyle and Narlikar have also generalized Einstein's theory of gravitation by using a direct particle interaction. Their theory reduces to Einstein's general relativity in the limit of a smooth fluid approximation, in the rest frame of the fluid. This has the benefit of completely incorporating Mach's principle as a radiative interaction between masses [32,33]. Cramer spells out the general quantum version of the theory, applicable to all systems and not just electrons [34,35].

Transactional Interpretation
For an interaction to take place between two particles, the emitter and the absorber, Cramer says the emitter must send out an offer wave. This offer wave would be half an advanced and half a retarded wave going out in all directions looking for an absorber, something to interact with. When the retarded offer wave reaches the absorber, that particle sends out a counter wave, also half retarded and half advanced. See Cramer's wiggle diagram, Fig. 3. The advanced counter wave would travel backward in time, along the exact incident path of the original retarded offer wave (it is the complex conjugate of the retarded offer wave), so constructive interference takes place along the path between the particles. In the one spatial dimension drawn in Fig. 3, the advanced counter wave reaches the emitter particle at the exact time when the retarded offer wave was emitted. This enables the advanced wave from the absorber to exactly cancel the advanced wave from the emitter at the location of the emitter. Likewise, the retarded wave from the emitter will cancel the retarded wave from the absorber at the location of the absorber. Only the retarded wave from the emitter and the advanced wave from the absorber along the adjoining path are enhanced by the superposition; they do not cancel out. These waves represent the interaction between the particles. In three spatial dimensions things are a little more complicated. Advanced and retarded waves travel in all directions, not just in the direction of one absorber. Retarded waves carry on into the future and may be absorbed at some later point in time. An advanced wave travels backward in time to the big bang. At this point it is reflected and will move forward in time as an advanced wave identical to, and π out of phase with, the incident advanced wave. This will produce a cancellation at every point along the world-line back to the point of emission of the wave. All advanced waves therefore cancel out [36]. Note that the waves are assumed to travel at speed c, the speed of light in a vacuum, although the advanced wave is traveling backward in time, or with $-t$ [34]. Basically, in quantum terms, the regular wave function is the offer wave (or at least the retarded wave part that does not cancel out), the complex conjugate wave function is the confirmation wave (the advanced part moving from the absorber back in time to the emitter), and together they give a handshake [35], which allows an interaction to take place.

Fig. 3 Cramer's wiggle diagram. The figure shows a plane-wave transaction between an emitter and an absorber particle. The black vertical lines are the world-lines for each particle. Waves from the emitter are solid lines, waves from the absorber are dotted. The retarded waves are red for both emitter and absorber and the advanced waves are blue. Red retarded waves move up toward the right. Blue advanced waves move downward to the left. Note that along the path between the emitter and absorber the waves add constructively, but before the emitter and after the absorber the waves destructively interfere (Color figure online)
Recently Kastner [37] has expounded the virtues of the transactional method with an additional twist allowing for free will. There are many examples of the use of the transactional method in the book and it is well worth a read. In this paper we make no distinction between the original Cramer TI and the Kastner version, the possibilist transactional interpretation (PTI). Kastner's approach [38] is to consider a growing emergent universe in which the future is not set in stone but is actualized from an underlying substratum of quantum possibilities.
Cramer's approach means (from the author's viewpoint) that the future is set: the past, present and future may all coexist and we simply have the illusion of flowing through time. To avoid confusion, we quote Cramer on his own interpretation [39]:

Let me give an example. When you use your cash card at the grocery store to pay for your purchases, the electronic handshake that occurs between the bank and the cash register insures that money is "conserved" and is neither created nor destroyed, but it does not determine what you elected to purchase. The same is true with quantum transactions, which guarantee the conservation laws but do not determine the future. The real difference between Kastner's PTI and my TI is that for her, offer and confirmation waves exist as objects only in some multidimensional Hilbert space. In the TI the waves exist in real 3+1 dimensional space. Hilbert space was invented by theorists prone to abstraction because it was the only way they could imagine that quantum waves could be entangled. The TI explains how they can be entangled, because the multi-particle transactions allow only those subset of the waves that satisfy the conservation laws to become real transactions.
Others have considered a Many-Worlds Interpretation, with every possible event happening along parallel realities in order to maintain free will. Neither Kastner nor Cramer agrees with the many-worlds view [20]. Here, the reader is asked to make up their own mind. This paper is concerned only with the question: does the TI fit the data or not? It is found that all the usual quantum results hold and that the TI is simply an alternative point of view to the CI and its instantaneously collapsing wave function.

The Delayed Choice Quantum Eraser by Kim et al.
First we briefly explain the experiment and the observed results. The experimental arrangement can be seen in Fig. 4. An argon laser ($\lambda_p = 351.1$ nm) is passed through a double slit and illuminates a type II phase matching nonlinear crystal of β-barium borate, BBO (β-BaB$_2$O$_4$). Slit A allows only region A of the crystal to be illuminated and slit B allows only region B of the crystal to be illuminated. Each small region is about 0.3 mm long, which we take to be the slit width $a$. The separation $d$ of the two regions is about 0.7 mm, as specified in the paper [15]. So we may discuss regions A and B of the crystal just as well as the original two slits. Parametric down conversion will occur at both sites, and from the one pump photon will emerge two photons, a signal and an idler. Note that all possible frequencies satisfying $\nu_p = \nu_s + \nu_i$ are created. We are selecting two of the same frequency or, equivalently, twice the pump wavelength, $\lambda_s = 702.2$ nm. The signal and idler photons represent the e-ray and o-ray of the nonlinear crystal.
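As a quick check of the wavelength bookkeeping, the sketch below rewrites the energy conservation condition $\nu_p = \nu_s + \nu_i$ in terms of vacuum wavelengths, $1/\lambda_p = 1/\lambda_s + 1/\lambda_i$. The function name is ours, and the use of vacuum wavelengths throughout is an assumption made for simplicity.

```python
PUMP_NM = 351.1  # pump wavelength quoted in the text, in nm


def idler_wavelength_nm(pump_nm: float, signal_nm: float) -> float:
    """Idler wavelength from 1/lambda_p = 1/lambda_s + 1/lambda_i (vacuum wavelengths)."""
    return 1.0 / (1.0 / pump_nm - 1.0 / signal_nm)


# Degenerate case: signal and idler share the pump energy equally,
# so both have twice the pump wavelength.
degenerate_nm = 2.0 * PUMP_NM
print(f"degenerate signal/idler wavelength: {degenerate_nm:.1f} nm")

# A non-degenerate pair also satisfies the same condition, e.g. a 650 nm signal.
print(f"idler for a 650 nm signal: {idler_wavelength_nm(PUMP_NM, 650.0):.1f} nm")
```

The degenerate case reproduces the 702.2 nm used in the experiment; any other signal wavelength is paired with the idler wavelength that keeps the sum of frequencies fixed.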
These photons are momentum entangled and are created essentially at the same time. The probability for a downconversion event is slight, so we may assume that there is only one entangled pair of photons in the system at any given time. Different wavelengths of signal and idler photons exit the crystal at different angles. The required wavelengths are selected by restricting the exit angle. Usually a small range of wavelengths would be selected. For convenience we track only one wavelength, but we should bear in mind that there will be a small bandwidth of wavelengths which will affect the interference pattern of the signal photons and change the visibility of the fringes accordingly. The bandwidth can also be changed using filters in front of the detectors. The detectors will have a less than perfect efficiency which will also affect the fringe visibility. The efficiency of the detectors was not mentioned in the experiment however, and neither was the effective bandwidth.
The signal photons are sent through a lens of focal length $f$ (not specified in the paper [15]) and then focused onto a screen where they can be detected by detector $D_0$. The detector scans, via stepper motor, along the x-axis to build up a pattern. The lens is used to create the far field condition at the detector, so we expect a Fraunhofer type pattern which is built up over time. The idler photons, from regions A and B of the crystal, are sent in the direction of a Glan-Thompson prism (a wedge mirror is used in Fig. 4 instead) which separates them into different paths. The idler photons from region A hit BSA and are either reflected or transmitted. The reflected photons will be detected by $D_3$. The transmitted photons will be reflected by mirror MA and then either transmitted through the beamsplitter BS to detector $D_2$ or reflected by BS into detector $D_1$. The idler photons from B hit BSB and are either reflected or transmitted. The reflected photons will be detected by $D_4$. The transmitted photons will be reflected by mirror MB and then either transmitted through the beamsplitter BS to detector $D_1$ or reflected by BS into detector $D_2$.
The time of flight from the crystal to the detector $D_0$ for the signal photons is 8 ns shorter than for the idler photons, which go in the direction of the beamsplitters and are eventually detected by detectors $D_3$, $D_4$ or by $D_1$ or $D_2$. The equivalent path length is approximately 2.5 m. We assume that all the detector path lengths, $D_1$-$D_4$, are the same and equal to 2.5 m. This path length will introduce a constant phase shift into each joint detection. It is also assumed that all mirror reflection angles are the same in both paths so that no additional phase shift differences need to be considered. Since all the phase shifts are considered equal, they will cancel out and will not affect the overall interference pattern.

Fig. 4 The idler photons at these detectors no longer carry any which path information. All detectors then go to a coincidence counter. The diagram is meant to illustrate the same arm lengths for the red and blue idler photons; the reflections from the mirrors and beamsplitters are not accurately drawn with correct angles and refraction is not included (Color figure online)

All the detectors are linked to a coincidence counter and the interference patterns are recorded. The intensity pattern recorded at $D_0$ shows no interference when there is a coincidence between $D_0$ and $D_3$ or $D_4$. In these cases, we have which path information, since $D_3$ only records idler photons from slit A and $D_4$ only records idler photons from slit B. Since the signal and idler photons come from the same region of the crystal, we would then know through which path the signal photons came and we expect no interference.
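The relation between the 8 ns delay and the quoted path length can be sketched as follows. We assume free-space propagation at $c$ for the whole extra path, which ignores the optical elements; the function name is ours.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def path_length_m(delay_s: float) -> float:
    """Free-space path length covered by light in the given time."""
    return C * delay_s


extra_path = path_length_m(8e-9)
print(f"extra idler path for an 8 ns delay: {extra_path:.2f} m")
```

Under this assumption the delay corresponds to roughly 2.4 m, consistent with the approximate 2.5 m figure used in the text.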
When the coincidence counts are between $D_0$ and $D_1$ there is an interference pattern. The beamsplitter BS mixes the idler photons from both regions and we have now erased the which path information. There is also an interference pattern when there is a coincidence between $D_0$ and $D_2$, but this pattern differs from the previous one by a phase shift of π. In other words, if one pattern shows a co-sinusoidal interference the other will be sinusoidal. The experiment is considered a delayed choice quantum eraser since the signal photon path length is shorter than the idler photon path length. It would seem that the signal photons are detected first, then we make a selection of which coincidence detections to look at, and depending on that choice we see or do not see interference of the signal photons. The paradox is this: how can you influence the signal photon, basically tell it to interfere or not, by making a choice of detector $D_1$-$D_4$ 8 ns after the signal photon has already been detected by $D_0$? This, however, is the wrong way to think about the problem. If looked at in the correct way there is no paradox.
These observations can easily be explained in terms of the TI of quantum mechanics as follows. A brief account of this experiment is given in the book by Kastner [37]; we give a bit more detail here.

Transactional Interpretation Derivation
Let us start with a few preliminaries. The three beamsplitters in the experiment are all 50:50 lossless beamsplitters. When a photon wavepacket goes through one of these beamsplitters there is no loss so one would expect the probability amplitude of the wave function to remain unaltered.
This means that the amplitude reflection and transmission coefficients, $r$ and $t$, obey $|r|^2 + |t|^2 = 1$ and $r t^* + r^* t = 0$, with $|r| = |t| = 1/\sqrt{2}$ for a 50:50 beamsplitter. We take all the beamsplitters to be identical for convenience. It will be assumed that each optical path length for the idler photons is the same and that any phase changes due to mirror reflections have been compensated for. An offer wave will go out from the slits and get absorbed by a detector. The detector will then send back an advanced wave (backward in time) along the same path as the incident wave to the slits to handshake and confirm the interaction. Only then does the photon actually leave the slit region. The offer wave is a momentum entangled two-photon state (or bi-photon). The possible transactions will depend on the detector configuration which generates the counter wave. We will go through the process step by step.
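The standard conditions for a lossless, symmetric beamsplitter, $|r|^2 + |t|^2 = 1$ and $rt^* + r^*t = 0$, can be checked numerically. The sketch below assumes one common phase convention, $r = i/\sqrt{2}$ and $t = 1/\sqrt{2}$; this particular choice is our assumption for illustration and other conventions differing by overall phases are equally valid.

```python
import math

# Assumed convention for a 50:50 lossless beamsplitter (an illustrative choice):
# reflection picks up a pi/2 phase relative to transmission.
R = 1j / math.sqrt(2)
T = complex(1 / math.sqrt(2))


def is_lossless(r: complex, t: complex, tol: float = 1e-12) -> bool:
    """Check |r|^2 + |t|^2 = 1 and r t* + r* t = 0 within a tolerance."""
    energy = abs(r) ** 2 + abs(t) ** 2
    cross = r * t.conjugate() + r.conjugate() * t
    return abs(energy - 1.0) < tol and abs(cross) < tol


print(is_lossless(R, T))  # pi/2 relative phase between r and t
print(is_lossless(T, T))  # equal real r and t: the cross term does not vanish
```

The second call shows why the relative phase matters: equal real coefficients conserve the single-port intensity sum but fail the cross-term condition.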
The original offer wave from the slits comes from the pump laser beam; we will take this to be
$$\psi = \frac{\alpha}{\sqrt{2}}\left(|A\rangle_p + |B\rangle_p\right),$$
where the subscript p stands for pump. The $\alpha$ is the single slit diffraction pattern, a sinc function of the usual kind. A and B stand for the photon wave functions from the two slits, of plane wave type. Parametric downconversion inside the β-barium borate (BBO) crystal converts each pump photon into a signal and an idler photon. For type I parametric down conversion the signal and idler have the same polarization; for type II the signal and idler polarizations are perpendicular. This is of no importance here, since the signal photons from regions A and B interfere at detector $D_0$ and both idler photons interfere at one of the four detectors $D_1$-$D_4$. The offer wave from the two slits and crystal then becomes
$$\psi = \frac{\alpha}{\sqrt{2}}\left(|A\rangle_s|A\rangle_i + |B\rangle_s|B\rangle_i\right).$$
We select both the signal and idler photons of half the pump frequency by restricting the exit angle from the crystal. Even so, there will be a small spread in frequency, and thus wavelength, which will cause the fringe visibility to be less than perfect. However, we will continue thinking of the photon wave functions as monochromatic plane waves for simplicity. It is easy to generalize the end result for more than one wavelength.
The time dependent correlation function calculation can be found in Appendix 1. This is for comparison with the TI approach taken below. We skip the details of the parametric downconversion process in what follows, but they can be found in [6,40-42], and these results are used in the Appendix 1 calculation. The first reference covers 5 basic quantum experiments and has simple theory accessible to undergraduates [40]. The second reference has more theory but still some experiment, and is geared more toward graduates and researchers [41]. The last two references are a theory paper and a textbook [6,42].
The signal photons are sent to the detector $D_0$. The idler photons are sent to the beamsplitter setup. The path lengths in the experiment are arranged so that the signal photons reach detector $D_0$ before the idler photons reach their final destination. So if the signal photon is detected at position $x$ on the screen, then our offer wave becomes [37]
$$\psi = \frac{\alpha}{\sqrt{2}}\left(\langle x|A\rangle_s\,|A\rangle_i + \langle x|B\rangle_s\,|B\rangle_i\right).$$
A simple Fourier transform of a slit with a constant electric field will give the single slit diffraction amplitude $\alpha$ in the form
$$\alpha = \frac{\sin(k_x a/2)}{k_x a/2},$$
where $a$ is the slit width, $k_x = k \sin\theta$, and the angle $\theta$ is the angular displacement from the center of the slits to the position $x$ on the screen. For the paraxial ray approximation, $\sin\theta \approx x/f$, this would be
$$k_x = \frac{2\pi x}{\lambda f}, \qquad (7)$$
$$\alpha = \frac{\sin(\pi a x/\lambda f)}{\pi a x/\lambda f}, \qquad (8)$$
where $f$ is the focal length of the lens, which is taken to be roughly the slit-screen distance, $\lambda$ is the wavelength of the signal photons, and we have used $k = 2\pi/\lambda$.
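The paraxial single slit amplitude is easy to evaluate numerically. In the sketch below the slit width and wavelength are the values quoted for the experiment, while the focal length `F_LENS` is a hypothetical placeholder (the text notes that $f$ is not specified), and the function names are ours.

```python
import math

A_SLIT = 0.3e-3        # slit width a in m (from the text)
WAVELENGTH = 702.2e-9  # signal wavelength in m (from the text)
F_LENS = 1.0           # focal length in m: hypothetical, not given in the paper


def sinc(u: float) -> float:
    """Unnormalized sinc, sin(u)/u, with the limit value 1 at u = 0."""
    return 1.0 if u == 0.0 else math.sin(u) / u


def k_x(x: float, wavelength: float, f: float) -> float:
    """Paraxial transverse wavenumber, k_x = 2 pi x / (lambda f)."""
    return 2.0 * math.pi * x / (wavelength * f)


def alpha(x: float, a: float, wavelength: float, f: float) -> float:
    """Single slit diffraction amplitude, sinc(k_x a / 2)."""
    return sinc(k_x(x, wavelength, f) * a / 2.0)


x_zero = WAVELENGTH * F_LENS / A_SLIT  # first zero of the envelope
print(f"alpha at x = 0: {alpha(0.0, A_SLIT, WAVELENGTH, F_LENS):.3f}")
print(f"first envelope zero at x = {x_zero * 1e3:.2f} mm (for the assumed f)")
```

The envelope is unity on axis and first vanishes at $x = \lambda f/a$, which scales linearly with whatever focal length is actually used.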
We will now assume that

$$\langle x|A_s\rangle = e^{i k_x d_A}, \qquad \langle x|B_s\rangle = e^{i k_x d_B} \qquad (9)$$

where $d_A$ and $d_B$ are the distances from the crystal regions A and B to the screen at position x. We also assume that the slit separation is given by $d = d_A - d_B$. The offer wave can now be written as Eq. (10). Note that we have now dealt with the signal photons and only have to concern ourselves with the idler photon detection. At this point we can continue with the Cramer interpretation, or take the wave function Eq. (10) as a standard wave function, use spontaneous emission photon wave packets, and expand them in terms of retarded and advanced waves to see clearly the overlap of the two and how the advanced waves retrace the retarded waves in time. This is carried out for Case 1 below in Appendix 2 to show the technique. Three cases follow.

Case 1
Assume the idler photon will be detected at detector $D_1$. The offer wave produced by passing photons through the beamsplitters follows from the idler paths: the $A_i$ idler photon is transmitted through BSA and reflected from BS to reach $D_1$, and the $B_i$ idler photon is transmitted through BSB and transmitted through BS to reach $D_1$. See Fig. 4 for details of the paths. We have assumed that the extra path length in traveling through the beamsplitters is the same for both photons $A_i$ and $B_i$; otherwise we would need additional phase factors to account for the path length difference. The counter wave produced by detector $D_1$ is the complex conjugate wave traveling backward in time towards the slits. The probability that this transaction will occur is then the product of the offer wave and the counter wave. Let the amplitudes be $\langle A_i|B_i\rangle = \eta_1^{1/2} \exp(-i\phi)$ and its complex conjugate $\langle B_i|A_i\rangle = \eta_1^{1/2} \exp(i\phi)$, where $\eta_1$ represents the detector efficiency of $D_1$, which is most likely less than unity. The detector efficiency has been incorporated into the probability amplitude for convenience only. Using our earlier results Eq. (2) for the amplitudes r and t of the lossless beamsplitters and

$$e^{\pm i\pi/2} = \cos(\pi/2) \pm i \sin(\pi/2) = \pm i \qquad (15)$$

we obtain the result Eq. (16). It is more general to leave the result in this form. However, the Kim paper [15] goes on to simplify further, uses $\eta_1 = 1$ for perfect detection, and writes the result in $\cos^2$ form, where α is given by Eq. (8) and $k_x$ is given by Eq. (7). In the last step we used the double angle formula $\cos 2\beta = 2\cos^2\beta - 1$. This is the coincidence result between detector $D_1$ and detector $D_0$, and it shows interference.

Using our result Eq. (16) it is easy to generalize to a small spread of wavelengths (bandwidth $= \Delta\lambda$) by using a computer code to plot the equation and summing the interference patterns for $\lambda$, $\lambda \pm \Delta\lambda$, $\lambda \pm \Delta\lambda/2$ and $\lambda \pm \Delta\lambda/4$. This gives a quite accurate interference pattern which matches the experimental data very well. If you also include the detector efficiency $\eta_1$ then you can match the experimental fringe visibility almost exactly. This is easy to do with a symbolic manipulation code like Mathematica, which also plots the results for you.
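The wavelength summation described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the coincidence pattern is taken in the simplified form $\eta_1 \alpha^2 \cos^2(k_x d/2)$ (the $\cos^2$ form mentioned in the text, not the more general Eq. (16)), the seven wavelengths are weighted equally, the bandwidth is written as `bandwidth`, and all numerical values and function names are hypothetical.

```python
import math


def coincidence_d0_d1(x, wavelength, slit_width, slit_sep, focal_length, eta1=1.0):
    """Assumed pattern eta1 * alpha^2 * cos^2(k_x d / 2) for a single wavelength."""
    k_x = 2.0 * math.pi * x / (wavelength * focal_length)
    u = 0.5 * k_x * slit_width
    alpha = 1.0 if u == 0.0 else math.sin(u) / u
    return eta1 * alpha ** 2 * math.cos(0.5 * k_x * slit_sep) ** 2


def band_averaged_pattern(x, wavelength, bandwidth, **geometry):
    """Equal-weight average over lambda, lambda +/- dl, +/- dl/2 and +/- dl/4."""
    offsets = (0.0, 1.0, -1.0, 0.5, -0.5, 0.25, -0.25)
    values = [
        coincidence_d0_d1(x, wavelength + s * bandwidth, **geometry)
        for s in offsets
    ]
    return sum(values) / len(values)


# Illustrative values only (assumptions for this sketch).
GEOMETRY = dict(slit_width=0.3e-3, slit_sep=0.7e-3, focal_length=1.0)
WAVELENGTH = 700e-9
BANDWIDTH = 7e-9

if __name__ == "__main__":
    # First fringe minimum of the central wavelength: k_x d / 2 = pi / 2.
    x_min = WAVELENGTH * GEOMETRY["focal_length"] / (2.0 * GEOMETRY["slit_sep"])
    print("single wavelength:", coincidence_d0_d1(x_min, WAVELENGTH, **GEOMETRY))
    print("band averaged    :", band_averaged_pattern(x_min, WAVELENGTH, BANDWIDTH, **GEOMETRY))
```

At the first minimum of the central wavelength the single-wavelength pattern vanishes, while the band-averaged pattern stays slightly above zero, which is how a finite bandwidth reduces the fringe visibility.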

Case 2
When the idler photons are detected at $D_2$ the offer wave changes accordingly. Note that the $A_i$ photon is transmitted by both BSA and BS, and the $B_i$ photon is transmitted by BSB but reflected by BS to reach $D_2$. See Fig. 4 for details. The detector produces a counter wave which is the complex conjugate of this offer wave. Using the same manipulations as before, and leaving the detector efficiency as unity, the joint detection probability for coincidence counts between $D_0$ and $D_2$ also shows interference. The factor α is given by Eq. (8). Note that this interference is π out of phase with the interference pattern obtained from the coincidence count between $D_0$ and $D_1$. This is easier to see in the cosine form of the result than in the $\cos^2$ form. In the squared form, if the interference pattern with $D_1$ goes as $\cos^2$ then this pattern goes as $\sin^2$. This is exactly what was observed in the experiment [15].

Case 3
If the idler photon is detected at either $D_3$ or $D_4$, a separate offer wave and counter wave is written down for each detector. The probability of a coincidence count between $D_0$ and $D_3$ shows no interference, only a single slit diffraction pattern. The probability of a coincidence count between $D_0$ and $D_4$ likewise shows no interference. Again, the single slit diffraction amplitude α is given by Eq. (8). This also agrees with the experimental results of Kim et al. [15].
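The three cases can be compared side by side numerically. The sketch below assumes the simplified shapes discussed above, namely $\alpha^2\cos^2(k_x d/2)$ for $D_1$, $\alpha^2\sin^2(k_x d/2)$ for $D_2$, and the bare envelope $\alpha^2$ for $D_3$ and $D_4$; overall normalization constants and detector efficiencies are dropped, and the function name and numerical values are illustrative assumptions.

```python
import math


def coincidence_shapes(x, wavelength, slit_width, slit_sep, focal_length):
    """Return the assumed (unnormalized) coincidence shapes for D1 to D4 at x."""
    k_x = 2.0 * math.pi * x / (wavelength * focal_length)
    u = 0.5 * k_x * slit_width
    alpha = 1.0 if u == 0.0 else math.sin(u) / u
    envelope = alpha ** 2                  # single slit diffraction pattern
    half_phase = 0.5 * k_x * slit_sep      # k_x d / 2
    return {
        "D1": envelope * math.cos(half_phase) ** 2,  # fringes
        "D2": envelope * math.sin(half_phase) ** 2,  # fringes shifted by pi
        "D3": envelope,                              # no fringes
        "D4": envelope,                              # no fringes
    }


# Illustrative values only (assumptions for this sketch).
GEOMETRY = dict(wavelength=700e-9, slit_width=0.3e-3, slit_sep=0.7e-3, focal_length=1.0)

if __name__ == "__main__":
    for x_mm in (0.0, 0.25, 0.5, 0.75, 1.0):
        shapes = coincidence_shapes(x_mm * 1e-3, **GEOMETRY)
        print(x_mm, {name: round(value, 4) for name, value in shapes.items()})
```

Under these assumptions the $D_1$ and $D_2$ shapes add up to the fringe-free envelope at every x, so the interference only appears once the signal counts are sorted by which idler detector fired.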

Discussion
The TI is related to the direct particle interaction theory of Wheeler-Feynman and Hoyle-Narlikar and involves advanced waves as well as the usual retarded waves.
The advanced waves are natural solutions to the relativistic wave equation and are required to conserve momentum in direct particle interactions. This paper has briefly considered the pros and cons of direct particle interactions versus conventional field theory methods. In terms of vacuum energy density, the direct particle approach tells us there is no vacuum field and thus its energy is identically zero, close in fact to the observed value. Quantum field theory tells us that the vacuum energy density is huge and gives a value about 120 orders of magnitude too large. Direct particle or source theory does away with self interaction, and subtracting infinities is only needed for charge renormalization. Charge renormalization follows in the same manner as in the field theory case when you introduce a size cutoff (no point particles) of the Schwarzschild radius of the particle. There is also a size limit to the universe, due to the Rindler horizon for an accelerating expansion of the universe, which prevents a divergent advanced wave integral [43]. Advanced waves have never been detected in practice and this lack of experimental evidence is enough for some to rule them out altogether. It only takes one experimental observation to refute a theory. Cramer and Herbert [44] considered several experimental possibilities of nonlocal quantum signaling (retrocausal signals) involving path entangled systems and in all cases found that the complementarity between two-photon interference and one-photon interference blocks any potential nonlocal signal [45]. The traditional way of thinking about an instantaneous wave function collapse, at a certain time and at a certain place, which is clearly in conflict with relativity, is superseded in the transactional picture. The wave function collapse is among the most confusing aspects of quantum mechanics (as a component of the measurement problem) and is simply resolved using the TI method of Cramer, or the PTI of Kastner.
Indeed the Copenhagen approach actually evades the entire issue by taking the wave function and its collapse as epistemic, that is, a measure of knowledge rather than a physical entity. This approach is observer-dependent; it is subject to the 'Heisenberg Cut', in which there is no physically grounded and non-arbitrary account of what constitutes an 'observer'. In the transactional approach there is no observer-dependence: it is absorbers that provide the missing ingredient that defines when a measurement and the attendant collapse occur.
Advanced waves are natural solutions to relativistic wave equations. In order to use this theory for the nonrelativistic case it is necessary to think of two Schrödinger equations: one Schrödinger equation for the wave function $\psi$ and one for its complex conjugate $\psi^*$, which becomes the advanced wave. This makes sense if we think of the Schrödinger equation as a square root version of the relativistic Klein-Gordon equation.
Furthermore, work by Hogarth [46] and Hoyle and Narlikar (HN) [32,33,47-49] has paved the way to a new version of direct particle interaction gravitational theory, which is fully Machian, incorporates advanced waves and has Einstein's theory as a special case. The HN theory may be quantized as in their book [32,33] using the path integral technique pioneered by Feynman [26].
It is interesting to note that the mass field m(x) in HN theory looks similar to the source field S(x) introduced by Schwinger [50,51]. Wheeler never gave up on the absorber theory, which is a direct particle interaction (action-at-a-distance) theory. It simply was not popular at the time and dropped off the radar. Gerard 't Hooft found a way to renormalize Yang-Mills field theories in a way similar to QED and most physicists took that path. We believe the works of Cramer, Wheeler-Feynman and Hoyle-Narlikar, and Schwinger's source theory, are all direct particle interaction theories. How source theory is related to the Feynman path integrals is explained by Schweber [52]. It should be noted that Schwinger was able to derive the Casimir force using the source field method, in which there are no nontrivial vacuum fields [53,54]. The action-at-a-distance theories are well worth study and may lead to a consistent picture of quantum gravity. Radiation reaction can be dealt with using the half retarded, half advanced absorber picture. Many QED results thought to be related to vacuum fluctuations can in fact be derived by considering source fields instead, including the Lamb shift and particle self energy [53].

Conclusions
The main aim of this paper is to draw attention to the fact that the TI of quantum mechanics by John Cramer is perfectly viable and legitimate, and should be given due consideration by the physics community, which has not been the case thus far. The TI by Cramer [17] gives a simple and intuitive picture for wave function collapse distributed over the entire path of the interacting system (in Kastner's approach, the collapse is what establishes that path). In the case of the Kim experiment [15], the wave function would collapse along the entire path between the slits (or the regions A and B of the down-converting crystal) and the detectors, and it would happen in a way distributed over time, not in an instant. The TI picture rules out the possibility of any backward-in-time signals using quantum delayed choice experiments. In fact it makes clear that the idea is nonsense, since the advanced counter wave from the detector must travel the entire distance back to the slit in order for the photon (from the slit) to make the trip in the first place. The choice is really no longer delayed, since the photon knows where it will end up because of the advanced wave coming backwards in time to confirm the interaction, or handshake, as Cramer puts it. The alternative way of avoiding wave function collapse is to use the correlation functions as in Appendix 1. Those calculations are far more long-winded than the fairly quick and easy calculation in the main paper, and in the opinion of the author the correlation function method masks what is really going on and thus leaves room for misinterpretation.

emission photons and use the results in the Scully and Drühl paper [5]. This would give a sensible answer, but we have given the parametric downconversion theory in detail in what follows.
The quantum mechanical interaction picture Hamiltonian for non-degenerate parametric downconversion in the rotating wave approximation [6] is written in terms of $a_s^\dagger$, $a_i^\dagger$ and $a_p^\dagger$, the creation operators for the signal, idler and pump beams respectively, and $a_s$, $a_i$ and $a_p$, the corresponding annihilation operators. The coupling constant κ depends on the second order susceptibility tensor which mediates the interaction [6]. In non-degenerate operation we find a two mode squeezed state output. In degenerate operation, where the signal and idler frequencies are the same and each half the pump frequency, you would get a single mode squeezed state. In the parametric approximation, the pump beam is treated classically as a coherent state and pump depletion can be neglected. If we allow $\alpha_p$ and θ to be the real amplitude and phase of the pump, the interaction Hamiltonian simplifies accordingly. The equation of motion for the signal annihilation operator is obtained by taking the expectation over the signal vacuum, with $p = \kappa\alpha_p$. The signal creation operator equation of motion becomes $\dot{a}_s^\dagger = i p\, a_i e^{i\theta}$. Similarly, for the idler operators we use the idler vacuum. By differentiating the above equations with respect to time and substituting, we obtain second order equations, from which you can set t = 0 and find solutions for the initial conditions. By substituting back into the original equations, you can easily find the $A_s$, $B_s$ coefficients in terms of the initial conditions for the creation and annihilation operators, and similarly for the idler operators. For θ = π/2 these look like non-degenerate squeezed state transformations [6]. For simplicity we are using type I parametric down conversion and degenerate frequencies.
The frequency of the pump is the sum of the signal and idler frequencies, and the signal and idler frequencies are taken to be the same: $\omega_p = \omega_s + \omega_i$, where $\omega_s = \omega_i$. In type I parametric downconversion the polarizations of the signal and idler are the same. In the experiment [15], the signal photons interfere and the idler photons interfere separately, so it makes no difference that they are from type II parametric down conversion and thus in perpendicular polarization states. We shall also use the same simplifying assumptions as in the previous transactional interpretation method. We assume that the separations of the regions A and B from the detector $D_0$ are very similar, the only difference in path length being the region separation. We further assume that the idler distances from region A or B to the same detector $D_1$-$D_4$ are the same. This brings about a great simplification in that the integrations are over 2 times and not 4. The extra work involved in allowing the signal photons to have two distinct path lengths, and the two idler photons to also have two distinct path lengths to the same detector, does not add to the physics and only complicates the integrations unnecessarily. This is easy to set up but gets messy very quickly in practice.

Joint Detection $D_0$ and $D_1$ Detectors
For the probability of joint detection $R_{0,1}$ from detectors ($D_0$, $D_1$) we set up the integration of Ref. [15], in which : : denotes normal ordering, where all creation operators are to the left of all the annihilation operators. The index i takes values of 1-4 depending on the idler detector $D_1$-$D_4$. Here $t_0$ is the time for the signal photons to go from the crystal to the detector $D_0$ and $t_1$ is the time for the idler photons to get from the crystal to detector $D_1$. We take the signal path length to be $d_A$ or $d_B$ for the two regions and the idler path length to be $x_A$ and $x_B$ from the crystal to detector one. From the experiment, $t_0 < t_1$ by about 8 ns. Shih et al. [15] tell us that the above integral is approximately the same as the integral of $|E^{(+)}(t_0)E^{(+)}(t_1)|^2$. The positive frequency part of the electric signal is $E^{(+)}$, with $E_0$ some constant amplitude. The interference results are usually normalized, so we set $E_0 = 1$ in what follows. We drop all the ω terms ($\omega_p = \omega_s + \omega_i$) since they will all cancel out, and we take $\omega_s = \omega_i$ for simplicity. Actually, if you expand the 4th order correlation function you get 3 such terms; see Collett and Loudon [55]. It turns out that only the first term cancels and the other two terms are non-zero. Collett and Loudon [55] outline a more advanced time integration procedure. We are approximating with two times only, assuming the distances for both signal photons are almost the same and the idler photons have equal path lengths to the same detector. In the signal and idler electric fields for detection at $D_0$ and $D_1$, the first line of each electric field equation is from region A of the crystal, and the second line comes from region B. The α term is the sinc function, or the square root of the single slit diffraction pattern, as defined in the TI section. See Eqs. (6)-(9) in this paper. The expectation values are evaluated in a vacuum.
After some tedious algebra it can be shown that the first term in Eq. (33) gives zero. The only non-zero terms have combinations of $\langle 0|a_s a_s^\dagger|0\rangle$ and $\langle 0|a_i a_i^\dagger|0\rangle$ in them. In evaluating the second order correlation functions in the second term, we have used the lossless beamsplitter results $r t^* + r^* t = 0$ and $|r|^2 + |t|^2 = 1$, and $d = d_A - d_B$ is the slit separation (the distance between regions A and B of the crystal). It is also assumed that $x_A = x_B$, so the idler photons travel the same distance to the same detector $D_1$. The second term in the expansion, with i = 1 for $D_1$, and the third term in the expansion Eq. (33) are evaluated in the same way. Hence, adding term 2, Eq. (36), and term 3, Eq. (38), we find the probability $R_{01}$. The integrals $\frac{1}{T}\int_0^T \cosh^2(p t_0)\,dt_0$ and $\frac{1}{T}\int_0^T \cosh^2(p t_1)\,dt_1$ can be performed and lead to constants so long as $pT > 0$. Clearly the $\cos[k_x d]$ term leads to interference of the signal photons.
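The time averages quoted above have a simple closed form, since $\cosh^2 u = (1 + \cosh 2u)/2$ gives $\frac{1}{T}\int_0^T \cosh^2(pt)\,dt = \frac{1}{2} + \frac{\sinh(2pT)}{4pT}$. The sketch below checks this against a midpoint-rule quadrature; the function names and the values of p and T are illustrative assumptions, not values from the experiment.

```python
import math


def cosh2_time_average(p, T):
    """Closed form of (1/T) * integral_0^T cosh^2(p t) dt, valid for p * T > 0."""
    return 0.5 + math.sinh(2.0 * p * T) / (4.0 * p * T)


def cosh2_time_average_numeric(p, T, steps=20000):
    """Midpoint-rule estimate of the same time average, used as a cross-check."""
    dt = T / steps
    total = sum(math.cosh(p * (n + 0.5) * dt) ** 2 for n in range(steps))
    return total * dt / T


if __name__ == "__main__":
    p, T = 1.0, 2.0  # illustrative values only
    print("closed form:", cosh2_time_average(p, T))
    print("numeric    :", cosh2_time_average_numeric(p, T))
```

For small $pT$ the average tends to 1, so in the weak-pumping limit these integrals only contribute an overall constant to $R_{01}$.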

Joint Detection $D_0$ and $D_2$ Detectors
The joint probability $R_{0,2}$ for detection at ($D_0$, $D_2$) leads to similar interference terms, starting from the corresponding electric fields for detector 2. Since we have chosen to calculate type I, there will be no polarization change and we expect a similar result to that of $R_{0,1}$ above, with the only difference that $t_1 \to t_2$.
We have not worried about any subtle phase changes on reflection here.

Joint Detection $D_0$ and $D_3$ Detectors
The joint probability $R_{0,3}$ for detection of signal and idler photons at ($D_0$, $D_3$) can be calculated using a similar technique, but with starting electric fields written for i = 3. In this case only idler photons from region A can reach detector 3. This implies that the signal photons also came from region A, and no interference results. The new terms 2 and 3 are evaluated as before, and the resulting joint probability $R_{03}$ clearly contains no interference term.
Here we have approximated by omitting the reflection and transmission coefficients r and t. These would lead to a numerical factor and possibly a phase shift, which is not of importance at the moment. (Note: this is to eliminate any confusion between the transmission coefficient and the time t.) The lengths from regions A and B of the crystal to detector 1 are $L_{1A}$ and $L_{1B}$ respectively. For interference we want to find $\psi_1^{*}\psi_1$. It is quite straightforward to multiply this out. For convenience we make the following further simplifying assumptions. It is assumed that the path lengths from the regions A and B of the crystal to detector 1 are almost the same and equal to a length L, which could be a meter or more. It is further assumed that, if there is a path difference from regions A and B of the crystal to detector 1, it is very small, so that the path difference divided by c becomes $\delta t \to 0$.
The following result is then found for $\psi_1^{*}\psi_1$, where $d = d_A - d_B$ as before. The result is symmetric in the retarded and advanced waves. The advanced waves are normally not detectable. The first line shows single slit diffraction terms. These theta squared terms come from the paths $L_{1A}$ or $L_{1B}$ alone, and a factor of 2 has been removed. The interference is clear from the second and third lines of the above equation. This results from a path interference between lengths $L_{1A}$ and $L_{1B}$; in each term both waves are either retarded or advanced. The 4th and 5th lines show an interference between the retarded and advanced waves. The 4th line is actually a mixture of theta functions from paths $L_{1A}$ and $L_{1B}$; the 5th line was originally two terms, one from region A and the other from region B. The full expression is rather long, so both arm lengths from crystal to detector 1 were taken to be approximately the same length L. The value of $2L\omega/c$ can be very large, of order $\sim 10^7$ for lengths L of a meter and frequency $\omega = 3 \times 10^{15}$ rad/s. Interference of the retarded and advanced waves takes place along the entire path length L. An advanced wave returns along the same path as the outgoing retarded wave, but the advanced wave travels in the reverse time direction from detector to slits and thus collapses the wave function along the entire path of the photon. The last term would most likely not be visible, because the large argument of the cosine causes rapid oscillation and washes out the fringes for any variation in ω.
This appears to confirm Cramer's hypothesis that the wave function collapse is not instantaneous, but is distributed in time along the flight path of the photon.
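The order-of-magnitude estimate of $2L\omega/c$ and the washing out of the rapidly oscillating term can be checked with a short calculation. The sketch below is illustrative only: it assumes that the relevant term varies as $\cos(2L\omega/c)$ and that ω is spread uniformly over a small band, for which the average is $\cos(2L\omega_0/c)\,\sin\delta/\delta$ with $\delta = 2L\,\Delta\omega/c$. The function names and the 0.1 percent fractional spread are hypothetical choices.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def round_trip_phase(length, omega):
    """Phase 2 L omega / c accumulated over the path out and back."""
    return 2.0 * length * omega / C


def band_averaged_cosine(length, omega0, half_width):
    """Average of cos(2 L omega / c) for omega uniform in omega0 +/- half_width."""
    phi0 = round_trip_phase(length, omega0)
    delta = round_trip_phase(length, half_width)
    if delta == 0.0:
        return math.cos(phi0)  # no spread: the term keeps its full size
    return math.cos(phi0) * math.sin(delta) / delta


if __name__ == "__main__":
    L, omega0 = 1.0, 3e15  # values quoted in the text
    print("2 L omega / c =", round_trip_phase(L, omega0))
    # Assumed 0.1 percent fractional spread in omega.
    print("band average  =", band_averaged_cosine(L, omega0, 1e-3 * omega0))
```

Even this small assumed spread suppresses the term by more than four orders of magnitude, consistent with the statement that the last term would not be visible.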