Single-slit focusing and its representations

We illustrate the phenomenon of the focusing of a freely propagating rectangular wave packet using three different tools: (1) the time-dependent wave function in position space, (2) the Wigner phase-space approach, and (3) an experiment using laser light.


This article is part of the topical collection "Enlightening the World with the Laser" - Honoring T. W. Hänsch, guest edited by Tilman Esslinger, Nathalie Picqué, and Thomas Udem.
The TWH productions on classical wave optics illustrating, for example, a pinhole kaleidoscope, a Fresnel iris or different diffraction gratings are legendary. They can be found on Dropbox and YouTube at https://dl.dropboxusercontent.com/u/87280051/pinhole%20diffraction%204-15-2014%20720p.mov, https://www.youtube.com/watch?v=llevPEEd4L4 and https://www.youtube.com/watch?v=jzmqeRp_tmk. Over the last decades we have had great fun discussing wave phenomena such as Talbot carpets, Fresnel lenses and the diffraction from a single slit with Theodor W. Hänsch. For this reason we find it appropriate to dedicate to him this article on an elementary example of diffractive focusing on the occasion of his 75th birthday.

Introduction
In July 1816, the civil engineer Augustin-Jean Fresnel published his preliminary results [1] confirming the wave […]
[…] manifests itself in the time-dependent probability density as well as the Gaussian width [5] of the wave packet. For this purpose we derive exact as well as approximate analytical expressions for the time-dependent probability amplitude and density.
In Sect. 4 we verify these predictions reporting on an experiment using laser light diffracted from a single slit. Here we take advantage of the analogy between the paraxial approximation of the Helmholtz equation of classical optics and the time-dependent Schrödinger equation of a free particle. We measure the intensity distributions of the light in the near-field of the slits and obtain the Gaussian width of the intensity field. Moreover, we make contact with the predictions of non-paraxial optics.
Section 5 illuminates this focusing effect from quantum phase space using the Wigner function. In particular, we show that the phenomenon of focusing which reflects itself in a dominant maximum of the probability density on the optical axis follows from radial cuts through the initial Wigner function at different angles with respect to the momentum axis. Moreover, we analyze the rays and envelopes of the Wigner function in more detail.
We conclude in Sect. 6 by summarizing our results and by providing an outlook. Here we allude to the influence of the number of dimensions on the focusing and emphasize the importance of corrections to paraxial optics.
To keep our article self-contained while focusing on the central ideas we have included three appendices. Indeed, "Appendix A" contains the calculations associated with the Gaussian width of our wave packet and "Appendix B" presents a detailed discussion of the Wigner function approach towards diffractive focusing. As an outlook we compare in "Appendix C" the paraxial and non-paraxial results obtained for diffraction by slits and circular apertures.

Diffraction theory
In this section we first provide a historical overview of diffraction and then address the phenomenon of diffractive focusing. Due to their different nature we distinguish in this discussion between light and matter waves. Moreover, we briefly review the concept of diffraction in time.

A brief history
Following the experimental demonstration of the wave nature of light by Thomas Young [9] and the first theory of diffraction by Fresnel [1], the nineteenth century was extremely successful in the investigation of wave phenomena, especially in optics. The unifying electromagnetic theory of James Clerk Maxwell [10] was the culmination of all previous developments in electromagnetism. Gustav Kirchhoff readdressed the diffraction of scalar waves and put it on a rigorous mathematical foundation [11]. Fresnel diffraction now arises as a special case of Kirchhoff diffraction. Arnold Sommerfeld and Lord Rayleigh [4,12,13] improved the Kirchhoff theory by correcting the boundary conditions at the aperture, thereby eliminating the discrepancy between the solutions and the boundary conditions chosen by Kirchhoff. Friedrich Kottler proposed another reason for this discrepancy by showing that the Kirchhoff integral can be interpreted not as a solution of the boundary value problem but as a solution for the "saltus" at the boundary [14,15]. Moreover, he extended the scalar theory to electromagnetic waves [16,17].
During the twentieth century numerous theoretical and experimental contributions to diffraction theory emerged. Julius Stratton and Lan Jen Chu extended the scalar Kirchhoff diffraction theory to vector waves [18], accounting for polarization. Hans Bethe found analytical solutions for the diffraction of electromagnetic waves by an aperture much smaller than the wavelength [19]. His theory and the corrections later introduced by Christoffel Bouwkamp [20] became important because of the invention of the near-field scanning optical microscope (SNOM or NSOM) [21] and the developments related to near-field optics [22][23][24]. In 1998, Thomas Ebbesen and collaborators observed that light transmission through an array of subwavelength apertures drilled in noble metal thin films can largely surpass the value predicted by Bethe [25]. This extraordinary optical transmission depends on the geometry of the array, on the illumination conditions and on the size and shape of the apertures [26,27]. It results from the excitation of surface plasmon modes near the aperture. In plasmonic gratings with narrow slits it may also lead to an attenuation of the transmitted light stronger than that predicted by the Bethe-Bouwkamp theory [28].

Focusing of waves
Focusing of waves by diffraction due to slits or apertures falls into two categories: (1) near-field focusing effects arising mainly in the diffraction of electromagnetic waves, and (2) focusing resulting from diffraction by slits or apertures larger than the wavelength, where the focus is located in the Fresnel zone.
In the first category we include the focusing of light resulting from the confinement of surface plasmons in nanostructured apertures in plasmonic materials [25,26,29]. Frequently scalar and electromagnetic diffraction theories assume the apertures to be located in infinite and perfectly absorbing screens, and thus surface plasmons are ignored. Hence, these theories cannot account for plasmonic modes and their optical effects. To accurately describe the effects produced by the excitation of surface plasmons a full electromagnetic theory using the optical properties of real materials is required.
The focusing of light by apertures smaller than the wavelength has been investigated theoretically several times in the last decades [30,31]. However, this near-field focusing depends on the polarization of light and is restricted to small apertures.
The properties of the focus in laser beams and atomic beams are of interest in microscopy and atom optics. Standard laser beams such as Laguerre-Gaussian or Hermite-Gaussian beams can be strongly focused. The lowest-order Hermite-Gaussian beam, called TEM00, has the highest confinement and is, therefore, preferred in confocal microscopy. Other beams such as Airy and Bessel beams [32,33] have non-diffracting properties.
To increase the field confinement, and thus the resolution of a microscope, novel laser beams and illumination mechanisms have been proposed [22,29,34-36].
In parallel, a similar interest exists in the confinement and squeezing of matter wave packets [37][38][39]. Focusing effects in atomic beams resulting from the interaction with laser fields diffracted by apertures in metallic screens were investigated recently [40]. Indeed, the interaction of matter waves with light fields has been the subject of intensive research in atom and quantum optics [41]. However, this spatial confinement, or focusing, is of a different nature than that arising from diffraction. In the latter, the field confinement is solely determined by the properties of the incoming wave and the aperture. No other optical element is involved.
Self-focusing of light may also arise in nonlinear media [42]. However, we will not discuss this phenomenon in this article, but rather concentrate our analysis on the focusing effects arising from diffraction in free space due to slits larger than the wavelength.
In 2012, the focusing of light waves by a slit larger than the wavelength was experimentally observed [6]. The diffraction pattern is similar to that of a circular aperture of several wavelengths in diameter [43,44]. The main difference between a slit and a circular aperture is the value of the dominant maximum, relative to the intensity of the incident wave. For a circular aperture it reaches 4.0, whereas for a slit it is only 1.8 times the intensity of the incoming wave [6,43].
The diffraction patterns of slits and circular apertures for scalar waves and non-polarized electromagnetic waves can be accurately calculated using the Rayleigh-Sommerfeld diffraction integrals, even in the case of apertures of the size of the wavelength, without using any mathematical approximation. Moreover, analytical solutions for the on-axis field intensity were found for the circular aperture [43,45], and the oscillations of the intensity on-axis were confirmed for electromagnetic waves [44,46].

Diffraction in time
Moshinsky [8] introduced the concept of diffraction in time using matter waves. Remarkably the time evolution of the probability density of a wave packet suddenly released by a shutter is mathematically identical to the intensity pattern behind a semi-infinite plane. This analogy stands out most clearly when we substitute the time coordinate of the wave packet by the corresponding space coordinate in diffraction. Then the solution of the Schrödinger equation for the problem of the Moshinsky shutter is identical to that of the Fresnel diffraction by a semi-infinite plane, and the probability density reaches a maximum of 1.3. Moreover, Moshinsky analyzed later the time-energy uncertainty associated with the shutter arrangement [47] and Godoy investigated the Fresnel and Fraunhofer diffraction in time of initially stationary states [48].
Recently, the diffraction in time of the double-shutter problem was analyzed [5]. An initially confined rectangular wave packet in one dimension is suddenly released and evolves in time. Again, the solution of the corresponding Schrödinger equation has the same form as the Fresnel diffraction of scalar waves by a single slit of infinite length.
However, we emphasize that Fresnel diffraction only holds true in the paraxial approximation of optics. The general solution of the diffraction by a slit is found by solving the Kirchhoff, or the Rayleigh-Sommerfeld diffraction integrals.
The mathematical analysis of the classical diffraction problems makes use of wave functions expressed in real or reciprocal space. This approach is also very common in the investigation of the diffraction of matter waves [8,49-51].
However, since Wigner introduced his famous distribution function [52] an increasing number of publications has used the Wigner phase space representation to study the dynamics of light beams [53][54][55][56][57] and matter waves [58][59][60][61][62][63][64][65]. Other phase space distribution functions related to the Wigner function have also been used in matter wave phenomena. They are interrelated and belong to the Cohen class [66]. In this article we employ both the wave function and the Wigner representations of matter wave packets.
The evolution in time of matter waves with zero angular momentum, so-called s-waves, strongly depends on the number of space dimensions [67]. For instance, in two dimensions, an initial ring-shaped wave packet first contracts reaching a minimum, reducing the radius of the ring, and then monotonically expands. In three dimensions, the radius only increases. This effect is attributed to a quantum anti-centrifugal force [67][68][69]. This example shows that the focusing effect of a free wave packet is a more general phenomenon than that arising from the free time evolution of a one-dimensional rectangular wave packet.
We conclude with a brief reference to the type of boundaries of the slit, or shutter. Most of the classical treatments of diffraction problems define the edges of the slit, or of any other aperture shape, as sharp transitions between a perfectly absorbing surface and a homogeneous fully transmitting medium. For quantum matter waves, the Moshinsky shutter or the sudden release of a rectangular wave packet is also an example of sharp boundaries. The effects arising in the diffraction patterns due to non-sharp boundaries have been investigated recently [5,50,70-73].

Wave function approach
In this section we use the solution of the time-dependent Schrödinger equation, that is the wave function, to show that a freely propagating rectangular wave packet exhibits the phenomenon of focusing. For this purpose we first express the time-dependent wave function in terms of a Fresnel integral, and then derive analytic approximations for the wave function as well as the probability density. To bring out most clearly the focusing effect we finally calculate the Gaussian width [5] of the wave packet and demonstrate that it exhibits a clear minimum at the time of the focusing.

Time evolution
Central to our discussion is the free propagation of a wave packet corresponding to a non-relativistic particle of mass M. The initial wave function
$$\psi(x, 0) \equiv \psi_0(x) \equiv \frac{1}{\sqrt{L}}\,\Theta\!\left(\frac{L}{2} - |x|\right) \tag{1}$$
is of rectangular form with a length L. Here $\Theta$ denotes the Heaviside step function.
With the help of the propagator [74]
$$G(x, t\,|\,y, 0) \equiv \sqrt{\frac{\alpha(t)}{i\pi}}\,\exp\!\left[i\alpha(t)\,(x - y)^2\right]$$
of a free particle connecting the initial coordinate y with x at time t, and the abbreviation
$$\alpha(t) \equiv \frac{M}{2\hbar t}$$
containing the reduced Planck constant ℏ, we find from the Huygens principle of matter waves
$$\psi(x, t) = \int_{-\infty}^{\infty} dy\, G(x, t\,|\,y, 0)\,\psi_0(y)$$
the expression
$$\psi(x, t) = \frac{1}{\sqrt{i\pi L}} \int_{\sqrt{\alpha}\,(x - L/2)}^{\sqrt{\alpha}\,(x + L/2)} d\xi\; e^{i\xi^2} \tag{5}$$
for the time-dependent probability amplitude. Here we have introduced the integration variable $\xi \equiv \alpha^{1/2}(x - y)$.
When we decompose the integral in Eq. 5 into two parts each starting from ξ = 0, the wave function
$$\psi(x, t) = \frac{1}{\sqrt{i\pi L}} \left\{ F\!\left[\sqrt{\alpha}\left(x + \tfrac{L}{2}\right)\right] - F\!\left[\sqrt{\alpha}\left(x - \tfrac{L}{2}\right)\right] \right\}$$
consisting of the difference of two Fresnel integrals
$$F(w) \equiv \int_0^w d\xi\; e^{i\xi^2}$$
is thus determined by the interference of the diffraction patterns originating from two semi-infinite walls located at x = L∕2 and x = −L∕2. The amplitude and phase of each contribution are given by the Fresnel integral F, whose real and imaginary parts
$$C(w) \equiv \int_0^w d\xi\, \cos \xi^2 \quad \text{and} \quad S(w) \equiv \int_0^w d\xi\, \sin \xi^2$$
follow from the Cornu spiral [75] represented in the complex plane.¹
In Fig. 1 we present the probability density |ψ(x, t)|² as a function of space and time. Here and in the remainder of our article we represent the coordinate x in units of L and the time t in units of the characteristic time
$$T \equiv \frac{M L^2}{2\pi\hbar}. \tag{10}$$
The curious inclusion of the factor 2π is motivated by the asymptotic expressions of |ψ|² discussed in the appendices. Moreover, the probability density |ψ|² is always in units of 1∕L.
Whereas on the top of Fig. 1 we show |ψ(x, t)|² in continuous space-time, in the bottom panel we select specific time slices corresponding to (1) short times where the probability distribution oscillates strongly, (2) intermediate times leading to focusing, and (3) longer times representing the ballistic regime.
¹ The Cornu spiral was studied for the first time by Jacques Bernoulli in the context of elastic deformations and Leonhard Euler defined it in more rigorous terms. Alfred Cornu associated this curve with the Fresnel integrals C and S and achieved excellent numerical accuracy. Due to the work of the Italian mathematician Ernesto Cesàro it is also called clothoid.
At t = 0 the probability density starts from its initial rectangular shape and immediately develops two peaks at the edges decorated with fringes. However, after this transitional phase the two peaks disappear and a dominant maximum at the origin x = 0 forms. It is most pronounced at t ≈ 0.342, corresponding to the focus where the width of the wave packet assumes a minimum. Indeed, here the probability density assumes a maximum, which is about a factor 1.8 larger than at t = 0, where it is unity.
After the focus, that is for larger times, the wave packet displays the familiar spreading effect.
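The position and height of the focus quoted above can be checked numerically. The following is a minimal sketch, assuming that the amplitude is the difference of two Fresnel integrals as described in this section and that the characteristic time is T = M L²∕(2πℏ), so that L|ψ|² depends only on x∕L and τ = t∕T. The function names and the Simpson-rule quadrature are illustrative choices, not the authors' numerical method.

```python
import math

def fresnel_cs(u, n=2000):
    """Return (C(u), S(u)) with C(u) = int_0^u cos(pi s^2 / 2) ds and S(u)
    the same with sin, using the composite Simpson rule (n must be even)."""
    h = u / n
    c = 1.0 + math.cos(math.pi * u * u / 2)   # integrand at both end points
    s = 0.0 + math.sin(math.pi * u * u / 2)
    for k in range(1, n):
        w = 4 if k % 2 else 2
        arg = math.pi * (k * h) ** 2 / 2
        c += w * math.cos(arg)
        s += w * math.sin(arg)
    return c * h / 3, s * h / 3

def density(x, tau):
    """L * |psi(x, t)|^2 for x in units of L and tau = t / T.
    Assumption: T = M L^2 / (2 pi hbar)."""
    scale = math.sqrt(2.0 / tau)
    c_hi, s_hi = fresnel_cs(scale * (x + 0.5))
    c_lo, s_lo = fresnel_cs(scale * (x - 0.5))
    return 0.5 * ((c_hi - c_lo) ** 2 + (s_hi - s_lo) ** 2)

# Scan the on-axis density for its dominant maximum.
taus = [0.200 + 0.001 * k for k in range(401)]
tau_focus = max(taus, key=lambda tau: density(0.0, tau))
peak = density(0.0, tau_focus)
print(f"focus at t/T = {tau_focus:.3f}, L|psi|^2 = {peak:.3f}")
```

Under these assumptions the scan should reproduce a maximum close to t ≈ 0.342 with a height of about 1.8 in units of 1∕L.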

Analytic approximations
Next we give approximate but analytical expressions for the time-dependent probability amplitude and probability density. Since our interest is to obtain the behavior at early times, that is before the ballistic expansion occurs, we shall consider small values of t, corresponding to the regime where α(t) ≡ M∕(2ℏt) is large.
With the help of the asymptotic expansion [76] of the Fresnel integral, valid for 1 ≪ a, the expression Eq. 5 for the probability amplitude reduces to an approximate analytic form, from which the probability density, Eq. 13, follows. This expression simplifies further when we neglect terms ∝ t²∕(1 ± 2x∕L)² and takes the form of Eq. 14.
In Fig. 2 we compare and contrast the resulting probability densities at x = 0 as a function of time and find excellent agreement between the numerical result following from the evaluation of the integral of Eq. 5, and the approximations based on Eqs. 13 and 14. We emphasize that our approximations break down for very large values of t, but they succeed in giving the maximum of the distribution corresponding to the focusing effect. We also test in Fig. 3 our approximate but analytic expressions, Eqs. 13 and 14, for the probability density against the exact numerical result given by Eq. 5 at characteristic times, confirming again the focusing effect.
Fig. 1 Time evolution of a rectangular wave packet represented by its probability density |ψ(x, t)|², given by Eq. 5, depicted in continuous space-time (top) and at specific times (bottom) corresponding to a strongly oscillatory behavior, the focus, and the ballistic expansion. To bring out the characteristic features in the early-time evolution we have represented the time axis by a logarithmic scale. For a better comparison the initial wave packet at t = 0, that is for log(t = 0) = −∞, is moved to log(t) = −3.0 since our time axis extends only to this value
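As a hedged sketch of how approximations of this kind arise, assume the amplitude inside the slit is written as $\psi = (i\pi L)^{-1/2}\,[F(a_+) + F(a_-)]$ with $F(a) \equiv \int_0^a d\xi\, e^{i\xi^2}$, $a_\pm \equiv \sqrt{\alpha}\,(L/2 \pm x)$ and $\alpha \equiv M/(2\hbar t)$. The symbols $a_\pm$ are introduced here for illustration only, and the grouping of terms may differ from the one used in Eqs. 13 and 14. One integration by parts of the tail of the Fresnel integral then gives:

```latex
\begin{align}
F(a) &= \int_0^\infty d\xi\, e^{i\xi^2} - \int_a^\infty d\xi\, e^{i\xi^2}
      \simeq \frac{\sqrt{i\pi}}{2} + \frac{e^{ia^2}}{2ia},
      \qquad 1 \ll a, \\
\sqrt{L}\,\psi(x,t) &\simeq 1 + \frac{e^{-3\pi i/4}}{2\sqrt{\pi}}
      \left[ \frac{e^{ia_+^2}}{a_+} + \frac{e^{ia_-^2}}{a_-} \right],
      \qquad |x| < \frac{L}{2}, \\
L\,|\psi(x,t)|^2 &\simeq 1 + \frac{1}{\sqrt{\pi}}
      \left[ \frac{\cos\!\left(a_+^2 - 3\pi/4\right)}{a_+}
           + \frac{\cos\!\left(a_-^2 - 3\pi/4\right)}{a_-} \right]
      + \mathcal{O}\!\left(a_\pm^{-2}\right).
\end{align}
```

On the axis, $a_+ = a_- = a$ with $a^2 = M L^2/(8\hbar t)$, so the correction to the initial value reduces to $(2/\sqrt{\pi}\,a)\cos(a^2 - 3\pi/4)$. It grows with time through the factor $1/a \propto \sqrt{t}$ and oscillates ever more slowly as t increases, which is the qualitative behavior of the on-axis density described above.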

Focusing expressed by the Gaussian width
So far we have analyzed the phenomenon of diffractive focusing of our rectangular wave packet by considering the complete probability density in space and time. We now characterize this effect by the Gaussian measure [5] of the width, Eq. 15, which contains a constant with units of an inverse length. For our rectangular wave packet we obtain the Gaussian width by numerically evaluating Eq. 15 using the integral representation Eq. 5 of the probability amplitude. In Fig. 4 we depict the corresponding curve, normalized to its initial value, for the value 6.0 of this constant; it displays a clear minimum at t ≈ 0.39, thus confirming the focusing effect.
However, we note that the location of the minimum of the Gaussian width deviates slightly from the location of the maximum of |ψ(x = 0, t)|², which occurs at t = 0.342 as indicated in Fig. 2. This deviation is the result of the integration in Eq. 15.

Experimental approach
In the preceding section we have shown that a rectangular matter wave packet first focuses before it spreads. We now describe an experiment to observe the diffraction pattern and, in particular, the focusing arising close to the slit. Here we take advantage of the familiar analogy between the Schrödinger equation
$$i\hbar\,\frac{\partial \psi}{\partial t} = -\frac{\hbar^2}{2M}\,\frac{\partial^2 \psi}{\partial x^2} \tag{16}$$
of a free particle, and the paraxial wave equation
$$i\,\frac{\partial E}{\partial z} = -\frac{1}{2k}\,\frac{\partial^2 E}{\partial x^2} \tag{17}$$
of classical optics. In this situation z denotes the coordinate of propagation and k the wave vector of the electromagnetic wave. Indeed, the two wave equations are identical when we make the substitution
$$\frac{\hbar t}{M} \to \frac{z}{k} = \frac{\lambda z}{2\pi}, \quad \text{that is} \quad \frac{t}{T} = \frac{\lambda z}{L^2}, \tag{18}$$
where λ is the wavelength. In the last step we have recalled the definition, Eq. 10, T ≡ ML²∕(2πℏ) of the characteristic time T. Hence, time in the Schrödinger equation translates into propagation distance z.
As a consequence of the analogy between Eqs. 16 and 17 with the scaling given by Eq. 18, it suffices to perform our experiment with light rather than matter waves.
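To make the scaling concrete, the following minimal sketch converts a dimensionless time into a propagation distance. It assumes the relation t∕T = λz∕L² with T = M L²∕(2πℏ), and it uses the slit width and wavelength quoted in the Setup below; the function name is hypothetical.

```python
def propagation_distance(tau, slit_width, wavelength):
    """Map the dimensionless time tau = t / T onto the optical distance z.
    Assumption: t / T = wavelength * z / slit_width**2."""
    return tau * slit_width ** 2 / wavelength

L_um = 2.44        # slit width in micrometres
lambda_um = 0.488  # laser wavelength in micrometres

# Focus of the matter wave packet at t / T = 0.342, translated into optics.
z_focus = propagation_distance(0.342, L_um, lambda_um)
print(f"predicted paraxial focus at z = {z_focus:.2f} um")  # prints 4.17 um
```

Under this assumption the paraxial focus lies roughly four micrometres above the slit, which can be compared with the measured position reported in the Results.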

Setup
We use a confocal microscope for measuring the light intensity diffracted by a one-dimensional slit milled in an aluminum film.² The slit of length 50 μm and width 2440 nm is illuminated by laser light of wavelength 488 nm as shown in Fig. 5a. A single-mode optical fiber with collimator was used for the illumination. The collimated laser beam has a diameter of approximately 1 mm and is, therefore, much larger than the slit. For this reason we treat the illumination of the slit as a plane wave.
To generate images in different planes above the slit we employ a confocal laser scanning microscope (WITec GmbH). The objective used for light collection was an infinity-corrected Olympus MPlan with 100× magnification and numerical aperture NA = 0.9. The collected light was focused into a multimode optical fiber connected to an avalanche photodiode. The light diffracted by the slit is measured in a rasterized way. Each point corresponds to a pixel of an image generated by scanning in the horizontal, or in the vertical direction, as outlined in Fig. 5a.

Results
In Fig. 5b, c, we present images of the light intensity in the plane of the slit, and perpendicular to the sample, respectively. In the latter case, we scan the confocal microscope in the vertical direction with a minimum scan step of Δz ≈ 50 nm. The pixel size in the horizontal direction corresponds to a displacement of Δx = Δy = 10 μm∕512 ≈ 20 nm. Thus, each pixel is much smaller than the wavelength and the optical resolution, which according to the Rayleigh criterion is Δr ≡ 0.61 × λ∕NA. The horizontal fringes appearing in the light intensity of Fig. 5c are due to the mechanical motion of the microscope when scanning the diffracted light vertically. We note the dominant maximum of the intensity at z ≈ 4.1 μm, which confirms our prediction of the focusing effect.
² The slit was fabricated using focused ion beam (FIB) milling of a 75 nm Al thin film, evaporated at a pressure of approximately 10⁻⁶ mbar on top of a glass substrate of 1 mm thickness.
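The pixel size and the Rayleigh resolution quoted above follow from simple arithmetic, sketched here with the numbers of the setup. The 10 μm scan range is an assumption read off from the expression 10 μm∕512; the variable names are illustrative.

```python
wavelength_nm = 488.0      # laser wavelength
numerical_aperture = 0.9   # NA of the collection objective
scan_range_nm = 10_000.0   # assumed horizontal scan range of 10 um
pixels = 512               # pixels per scan line

pixel_nm = scan_range_nm / pixels
rayleigh_nm = 0.61 * wavelength_nm / numerical_aperture

print(f"pixel size          = {pixel_nm:.1f} nm")     # 19.5 nm
print(f"Rayleigh resolution = {rayleigh_nm:.0f} nm")  # 331 nm
print(f"pixels per resolution element = {rayleigh_nm / pixel_nm:.1f}")  # 16.9
```

Each resolution element is thus sampled by more than a dozen pixels, in line with the statement that the pixel is much smaller than the optical resolution.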
To analyze the phenomenon in a quantitative way we use the experimental intensity distribution of Fig. 5c to obtain the Gaussian width defined by Eq. 15. The so-calculated curve, now displayed in Fig. 4 as a function of the propagation distance z and scaled according to Eq. 18, follows nicely the theoretical prediction. In particular, it displays the characteristic minimum indicating focusing at the same location as the theoretical curve.
The rapid modulation of the experimental curve is a consequence of the horizontal fringes emerging due to the mechanical motion of the microscope as mentioned above, and thus of the measurement technique. An average over these oscillations leads to a smooth curve following the theoretical curve.

Paraxial versus non-paraxial optics
To compare our experimental results to the theoretical predictions of classical optics we have calculated the intensity and phase for a slit of width L = 2.44 μm illuminated by light of wavelength λ = 0.488 μm corresponding to the same ratio L∕λ = 5 as in the experiment. For this purpose we use the Fresnel and the Rayleigh-Sommerfeld diffraction integrals familiar from paraxial and non-paraxial optics, respectively. In particular, we have chosen the Rayleigh-Sommerfeld integral of the first kind (RS-I) [4,12,13]³ and made use of numerical integration and algorithms reported in [77].⁴
³ The integral RS-I is often preferred to the integral of the second kind, RS-II, because it describes more accurately the value of the lobes close to the aperture. However, RS-II predicts locations of the lobes that are identical to those of RS-I.
⁴ Mielenz developed numerical algorithms for the calculation of the diffraction by slits and circular apertures in paraxial and non-paraxial optics [77][78][79][80][81][82][83]. We note, however, that the Mielenz definition of the Lommel functions employed in the calculation of the Fresnel diffraction of a circular aperture is not correct. Correct definitions were provided by Lommel [84], Born and Wolf [85] and Daly et al. [86]. Moreover, we note that the abbreviations of the Rayleigh-Sommerfeld integrals RS-I and RS-II by Mielenz as RS(s) and RS(p) are misleading, since the indices usually refer to s- and p-polarization of vector waves, and the Rayleigh-Sommerfeld theory applies only to scalar waves.
In Fig. 6a, b we depict the intensities following from the Rayleigh-Sommerfeld and Fresnel integrals, respectively. Moreover, in Fig. 7a, b we show the corresponding phases. For a direct comparison with Fig. 5 we have refrained from using normalized coordinates for both axes.
We note three characteristic features: (1) the number of intensity lobes is finite for RS-I, whereas it is infinite for Fresnel, (2) the phase pattern predicted by RS-I evolves almost as a plane wave in the propagation direction, whereas the Fresnel phase shows small oscillations around zero radians in the propagation direction but rapid oscillations in the transverse direction, and (3) the Fresnel and RS-I intensities tend to agree as we move away from the slit. In particular, the focus occurs approximately in the same position for both calculations, as we show later.
We conclude this section by noting that although RS-I is only valid for scalar waves its intensity map still fits well with the optical intensity obtained experimentally.

Wigner function approach
In the preceding sections we have analyzed the focusing of a rectangular wave packet with the help of the time-dependent Schrödinger equation and have confirmed the effect using optical waves. We now consider this phenomenon from a different point of view, that is from phase space taking advantage of the Wigner function. This formalism has the remarkable feature that the time evolution is identical to the classical one, consisting of a shearing of the initial Wigner function. The latter contains the interference nature of quantum mechanics.

Probability density from tomographic cuts
The Wigner function has several unique properties. The ones most relevant for the present discussion are: (1) the time evolution of a free particle follows by a replacement of the phase space variables according to the classical motion, and (2) the corresponding probability densities originate from the integration of the Wigner function over the conjugate variable, that is by an appropriate tomographic cut. We now discuss these features in more detail and derive an expression for the probability density which is different from, but equivalent to, the one discussed in Sect. 3.

A brief introduction
Wigner functions are quasi-probability distributions first introduced by Wigner [52] in the context of quantum correlations in statistical mechanics. They belong to the Cohen class of distributions [66] and are, therefore, related to other quadratic kernel distributions.
There exist extensive applications of the Wigner function, not only in quantum physics, but also in optics and signal processing [57,87]. Indeed, Wigner functions are frequently used to represent quantum systems in phase space [60,62,88,89]. Moreover, they describe the time evolution of wave packets in a similar way as classical phase space distributions in terms of the trajectories of classical particles [63] and have also been used in the study of diffraction in time of wave packets [51,61,90], a topic that has been addressed several times since the seminal article of Moshinsky [8].

Definition and time evolution
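In this section we first define the Wigner function and discuss its time evolution in phase space. We then use its property to provide us with the marginals to compute the time- and space-dependent probability density. Here we take advantage of the fact that the Wigner function at arbitrary times is easy to obtain [63].
Indeed, for a given wave function ψ = ψ(x) we find the corresponding Wigner function from the definition
$$W(x, p) \equiv \frac{1}{2\pi\hbar} \int_{-\infty}^{\infty} dy\; e^{-ipy/\hbar}\, \psi^*\!\left(x - \tfrac{y}{2}\right) \psi\!\left(x + \tfrac{y}{2}\right).$$
For a free particle the Wigner function at time t follows from the initial one, W₀, by the classical shearing
$$W(x, p; t) = W_0\!\left(x - \frac{p}{M}\,t,\; p\right), \tag{20}$$
and the probability density emerges as the marginal
$$|\psi(x, t)|^2 = \int_{-\infty}^{\infty} dp\; W(x, p; t) = \int_{-\infty}^{\infty} dp\; W_0\!\left(x - \frac{p}{M}\,t,\; p\right). \tag{22}$$
This representation is particularly useful in the discussion of the asymptotic behavior of the probability density addressed in more detail in "Appendix B".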

Time evolution: rectangular wave function
We are now in the position to consider the time evolution of our rectangular wave function. For this purpose we first calculate the Wigner function corresponding to this wave, and then analyze the shearing in phase space. Although we mainly study the motion of the complete Wigner function we also concentrate on those contributions in phase space which lead to the focusing.

General case
According to "Appendix B" the Wigner function of the rectangular wave packet, Eq. 1, reads
$$W_0(x, p) = \frac{1}{\pi L p}\, \sin\!\left[\frac{p}{\hbar}\left(L - 2|x|\right)\right] \Theta\!\left(\frac{L}{2} - |x|\right) \tag{23}$$
and is shown in Fig. 8a. Here we identify three characteristic features: (1) since the rectangular wave function is […]
We recognize a dominant positive contribution along the x-axis with a maximum at the origin. The time evolution given by Eq. 20 manifests itself in a shearing of the initial Wigner function W₀ shown in Figs. 8b, c. Indeed, every point (x, p) of the Wigner phase space moves according to the Newton law, that is x → x + pt∕M, while p is a constant of the motion.
Especially the Wigner function of Fig. 8c is interesting since it represents the moment of focusing. Indeed, by this time all negative contributions of W have moved away⁵ from the positive peak centered at the x-axis but now subtract from its wings. This fact stands out more clearly in the position distribution shown in Fig. 8d which can be understood as the result of an integration of W over the momentum variable.
⁵ We emphasize that our interpretation is different from [57], on page 320, which states: "... a peak in the axial intensity is achieved, associated with the vertical alignment of some of the main positive regions of the Wigner function".
In this active approach the Wigner function evolves in time but the axes of phase space which define the tomographic cuts, that is integration over x or p, remain fixed. An alternative formulation follows from Eq. 22. Here the Wigner function remains static, while the line of integration evolves in time according to x + (p∕M)t = 0. In Fig. 9, we show the probability density at x = 0 as obtained from the Wigner function through integration from this passive point of view.
The cut at t = 0 runs along the momentum axis. Here the oscillations in the Wigner function W₀ along p average out and the main contribution to the probability density arises from the positive domain along the x-axis. For small times the lines of integration enclose small angles with respect to the p-axis and the oscillations in W₀ translate into an oscillatory probability density. A dominant maximum occurs when the cut feels the central positive ridge along the x-axis. The density decays for larger times, that is as the cut approaches the x-axis, since there is a decreasing overlap.
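The passive picture can be tested numerically. The sketch below integrates the initial Wigner function of a rectangular wave packet, assumed here in the dimensionless form sin[q(1 − 2|ξ|)]∕(πq) with ξ = x∕L and q = pL∕ℏ, along the line x + (p∕M)t = 0, and compares the result with the on-axis density obtained from Fresnel integrals. The characteristic time T = M L²∕(2πℏ), the function names and the Simpson quadrature are assumptions made for illustration.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def density_from_wigner_cut(tau):
    """L * |psi(0, t)|^2 as the integral of W0 along x + p t / M = 0.
    Units: x in L, p in hbar / L, tau = t / T (assumed T = M L^2 / (2 pi hbar))."""
    s = tau / (2 * math.pi)   # hbar t / (M L^2)
    q_max = 1 / (2 * s)       # beyond this momentum the cut leaves |x| < L/2

    def w0_on_cut(q):
        if q == 0.0:
            return 1 / math.pi                 # limit of sin(q ...)/(pi q)
        return math.sin(q * (1 - 2 * s * q)) / (math.pi * q)

    return 2 * simpson(w0_on_cut, 0.0, q_max)  # W0 is even in p

def density_from_fresnel(tau):
    """Same quantity from Fresnel integrals: 2 [C(u)^2 + S(u)^2], u = 1/sqrt(2 tau)."""
    u = 1 / math.sqrt(2 * tau)
    c = simpson(lambda v: math.cos(math.pi * v * v / 2), 0.0, u)
    s = simpson(lambda v: math.sin(math.pi * v * v / 2), 0.0, u)
    return 2 * (c * c + s * s)

for tau in (0.1, 0.342, 1.0):
    print(tau, density_from_wigner_cut(tau), density_from_fresnel(tau))
```

Both routes should agree to within the quadrature error, which illustrates that the tomographic cut through the static Wigner function carries the same information as the time-dependent wave function.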

Rays and envelopes
According to Eq. 20 each point in Wigner phase space follows its classical trajectory. Thus, the value of W 0 at (x 0 , p 0 ) will move from x 0 to at time t. We now use these rays to show that the minima of the Wigner function generate the regions in space-time where the probability density | (x, t)| 2 assumes small values as exemplified by Fig 10. Thus, they indicate the x values of the minima of | (x, t)| 2 as the system evolves in time. The explicit expression Eq. 23 for the initial Wigner function W 0 tells us that the minima of W 0 are given by the condition for p positive, and for p negative, with n = 0, 1, 2, … We substitute these expressions for p (n) 0 into the motion, Eq. 24, and rearrange the terms, which yield for x 0 positive, and for x 0 negative. The upper signs in Eqs. 27 and 28 are for positive p (n) 0 , that is for a positive slope in the (x, t) plane, and the lower signs for negative p (n) 0 . The envelope of a set of curves F(x, t;x 0 ) = 0, each determined by a parameter x 0 , follows [91] from the requirements We consider the two cases, x 0 > 0 and x 0 < 0 in turn.
For positive $x_0$, the first condition gives an expression for $x_0$ (Eq. 30) which, when substituted into Eq. 27, yields a relation between x and t (Eq. 31). Thus, only negative values of $p_0^{(n)}$ contribute to this envelope and we obtain the space-time trajectories of the minima (Eq. 32).
According to this expression they are parabolas emerging from x = L∕2 with a steepness inversely proportional to (2n + 3∕2). (Footnote 6: For a circular aperture these lines are rectilinear [92].) Hence, the largest steepness corresponds to the case of n = 0, with the parabola crossing the t-axis, that is x = 0, at the time given by Eq. 33, where in the last step we have recalled the definition, Eq. 10, of the characteristic time T.
For negative $x_0$, the first condition in Eq. 29 provides us with a relation for $x_0$ (Eq. 34) which, when substituted into Eq. 28, gives Eq. 35. Since in this case only positive values of $p_0^{(n)}$ contribute to this envelope, we arrive at the space-time trajectory, Eq. 36, corresponding to parabolas emerging from x = −L∕2. They are the mirror image of the ones given by Eq. 32.
In Fig. 10 we present the envelopes, Eqs. 32 and 36, along with the corresponding probability density $|\psi(x, t)|^2$. The envelopes coincide with the minima of the intensity pattern and are harbingers of the maxima and, in particular, of the focus.
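The envelope construction can be checked numerically. The following minimal sketch assumes that the minima of $W_0$ sit where the sine in the initial Wigner function equals −1, that is at $p_0^{(n)} = \pm(2n + 3/2)\pi\hbar/(L - 2|x_0|)$, and that the envelope for $x_0 > 0$ is the parabola $(x - L/2)^2 = (4n + 3)\pi\hbar t/M$. Both formulas are our reconstruction from the text and are not quoted from Eqs. 25 to 32; the function names and the units L = 1, ħ/M = 1 are illustrative choices.

```python
import math

def ray_position(x0, t, n=0, L=1.0, hbar_over_M=1.0):
    """Position at time t of the ray starting at x0 > 0 with the (assumed)
    negative momentum p0 = -(2n + 3/2) * pi * hbar / (L - 2*x0)."""
    c = (2 * n + 1.5) * math.pi * hbar_over_M
    return x0 - c * t / (L - 2.0 * x0)

def envelope(t, n=0, L=1.0, hbar_over_M=1.0):
    """Assumed envelope: the parabola (x - L/2)^2 = (4n + 3) * pi * hbar * t / M."""
    return L / 2.0 - math.sqrt((4 * n + 3) * math.pi * hbar_over_M * t)

def rightmost_ray(t, n=0, L=1.0, samples=20001):
    """Largest x reached at time t by any ray of the family, found on a grid of
    starting points 0 < x0 < L/2. The family touches its envelope there."""
    starts = (0.5 * L * k / samples for k in range(1, samples))
    return max(ray_position(x0, t, n, L) for x0 in starts)

if __name__ == "__main__":
    for t in (0.001, 0.01, 0.02):
        print(t, rightmost_ray(t), envelope(t))
```

Under these assumptions the n = 0 parabola reaches x = 0 at $t = ML^2/(12\pi\hbar)$.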

Summary and outlook: beyond slits
The spreading of a wave packet in the absence of a potential is a well-known phenomenon in quantum physics. However, the opposite effect, that is a focusing of the wave packet, can also be achieved. Usually this requires us to imprint an appropriate position-dependent phase onto the initial wave function. Is it possible to achieve focusing even in the absence of any phase factors, that is for a real-valued initial wave function?
In the present article we have provided such an example in the form of a rectangular wave packet and have illuminated the resulting focusing in position as well as in phase space. Here we have used the time-dependent wave function and the Wigner distribution. Moreover, we have confirmed this phenomenon using laser light diffracted from a slit, building on the analogy between classical Fresnel optics and Schrödinger wave mechanics.
Throughout our article we have concentrated on rectangular wave packets created by a slit, that is by a one-dimensional aperture. However, similar effects appear in the diffraction from a two-dimensional aperture. In "Appendix C" we briefly summarize the literature on this problem, and discuss the similarities and differences between the one- and two-dimensional case using the examples of a slit and a circular aperture.
Figures 11 and 12 bring out most clearly the characteristic features of this dependence on the number of dimensions: (1) for the slit the relative intensity at the focus reaches the value of 1.8, whereas for the circular aperture the maximum is 4.0, (2) the position of the focus in the case of the circular aperture is closer to the opaque screen than for the slit, and (3) the focus originating from a circular aperture is more confined, since its decay towards the far-field region is faster.
This analysis shows that in two dimensions the focusing effect is more pronounced. Moreover, it is possible to optimize this effect by choosing different apertures, either by appropriate apodization, or by creating an optimal wave packet. Unfortunately, this topic goes beyond the scope of the present article and has to be postponed to a future publication.
Acknowledgements Our interest in diffractive focusing dates back many years when it was at the center of a very productive collaboration with Emerson Sadurni. Unfortunately, in the mean time his interests have shifted and he has decided not to join us in this article walking down memory lane. Although we regret his decision we respect it and are most grateful to him for working with us on this topic during his period at Ulm University and for allowing us to use some of the material prepared in those days. We have immensely profited from and enjoyed many discussions with M. Arndt, S. Fu, N. Gaaloul, G. Leuchs, E. M. Rasel, L. Shemer, D. Weisman, E. Wolf, J. Zhou, and M. S. Zubairy. In particular, we thank T. W. Hänsch for a fruitful exchange of emails concerning this topic. We thank Ch. Kranz and collaborators of the Institute of Analytical and Bioanalytical Chemistry, Ulm University, for the preparation of the slits on the thin aluminum film using FIB. This work is supported by DIP, the German-Israeli Project Cooperation. WPS is most grateful to Texas A&M University for a Faculty Fellowship at the Hagler Institute for Advanced Study at the Texas A&M University as well as to the Texas A&M AgriLife Research for its support.

Appendix A: Gaussian width
In this appendix we summarize the evaluation of the time-dependent Gaussian width, defined by Eq. 15, of our rectangular wave packet. To gain some insight into this unusual definition of width we first discuss several of its general properties and then rederive the familiar spreading of a Gaussian wave packet undergoing free expansion employing this measure. We conclude by considering the case of the rectangular wave packet.

A.1: General properties
We start our discussion of the Gaussian width of the probability density P = P(x) with the average by noting that for → 0 the familiar expansion leads us to Hence, in this limit the Gaussian width x 2 reduces to the second moment ⟨x 2 ⟩ of P = P(x).
Whenever P displays oscillations in the position variable x around an average value P that are rapid on the decay length 1∕ of the Gaussian in the definition, Eq. 37, of x 2 the oscillatory part averages out, and the familiar integral relation yields The other extreme occurs when P = P(x) is slowly varying compared to exp[−( x) 2 ]. In this case we can evaluate P at x = 0, factor it out of the integral and perform the integral with the help of Eq. 41. Thus, we arrive at the approximate expression which implies that the dependence of x 2 on an additional parameter, such as time is governed by the dependence of the probability density P at the origin on that parameter. Obviously, the most interesting case emerges when P and exp −( x) 2 vary on approximately the same length scale. This situation is of special interest when P exhibits a dominant maximum at x = 0. Indeed, with the help of the Taylor expansion around this point we find from the identity the Gaussian approximation of P. Here prime denotes differentiation with respect to x.
As a result, we obtain the expression where we have used again the integral relation, Eq. 41.

A.2: Gaussian wave packet
Next we evaluate the Gaussian width for a freely spreading Gaussian wave packet of initial form with  ≡ (2 ∕ ) 1∕4 and a real-valued constant . From the Huygens integral, Eq. 4, together with the propagator, Eq. 2, we obtain with the help of the integral relation the wave function and thus the probability density When we now substitute this expression into the definition, Eq. 37 of the Gaussian measure x 2 of width and recall the identity, Eq. 41, we arrive at the result or We note that for 2 ∕(2 ) ≪ 1 and ( ∕ ) 2 < 1 corresponding to early times this expression reduces to that is to the second moment ⟨x 2 ⟩ of the Gaussian, Eq. 51, as predicted by Eq. 40. Even more interesting is the limit 1 ≪ ( ∕ ) 2 corresponding to large times where Eq. 53 takes the form According to the definition, Eq. 3, of we find ∝ 1∕t and thus x 2 increases in a monotonic way tending towards the constant 1∕ 2 . This time dependence of the Gaussian width is in sharp contrast to the one of the second moment x 2 given by Eq. 54 which increases quadratically with time without a bound.
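The limits discussed in this appendix can be illustrated with a short numerical sketch. It assumes that the Gaussian width has the form $(1 - \langle e^{-\kappa^2 x^2}\rangle)/\kappa^2$, which reproduces the second moment for κ → 0 and saturates at $1/\kappa^2$ for a very broad density; this form and the symbol κ are our assumptions and are not quoted from Eq. 37. The spreading law for the variance of a free Gaussian packet is the textbook one, written in units ħ/M = 1, and all function names are hypothetical.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def normal_density(s):
    """Gaussian probability density with standard deviation s."""
    return lambda x: math.exp(-x * x / (2.0 * s * s)) / (s * math.sqrt(2.0 * math.pi))

def gaussian_width(P, kappa, half_range):
    """Assumed measure (1 - <exp(-kappa^2 x^2)>) / kappa^2 for a density P(x)."""
    avg = simpson(lambda x: math.exp(-(kappa * x) ** 2) * P(x), -half_range, half_range)
    return (1.0 - avg) / kappa ** 2

def gaussian_width_closed_form(s, kappa):
    """Same measure for a Gaussian density, using <exp(-k^2 x^2)> = (1 + 2 k^2 s^2)^(-1/2)."""
    return (1.0 - 1.0 / math.sqrt(1.0 + 2.0 * (kappa * s) ** 2)) / kappa ** 2

def free_variance(t, s0, hbar_over_M=1.0):
    """Variance of a freely spreading Gaussian packet with initial variance s0^2."""
    return s0 ** 2 * (1.0 + (hbar_over_M * t / (2.0 * s0 ** 2)) ** 2)

if __name__ == "__main__":
    kappa, s0 = 0.7, 1.0
    for t in (0.0, 1.0, 10.0, 100.0):
        s = math.sqrt(free_variance(t, s0))
        print(t, free_variance(t, s0), gaussian_width_closed_form(s, kappa))
```

In this sketch the second moment grows without bound, whereas the assumed Gaussian measure grows monotonically and levels off below $1/\kappa^2$, in line with the behavior described above.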

A.3: Rectangular wave packet
Finally we turn to the Gaussian width of the rectangular wave packet. Here we use the approximations developed in Sect. A.1 together with the properties of the approximate expressions, Eqs. 14 and 89, for the probability density. Indeed, for early times Eq. 14 predicts rapid oscillations around the average value 1∕L, because the parameter defined in Eq. 3, which is inversely proportional to t, is large. Hence, we find from Eq. 42 the approximation, Eq. 56. In the other extreme, that is for large times, Fig. 1 shows that P is slowly varying. As a result, we obtain from Eq. 43 the formula, Eq. 57, and the time dependence of the Gaussian width is determined by that of the probability density at the origin. Hence, a maximum in $|\psi(x = 0, t)|^2$ translates into a minimum of the Gaussian width and vice versa. However, in the neighborhood of the focus it is necessary to use the approximation, Eq. 47, which requires $P(0)$ and $P''(0)$. According to Fig. 3 the expression for P, Eq. 14, approximates the exact distribution well. Therefore, it suffices to expand Eq. 14 up to second order in x, which yields Eqs. 58 and 59. In Fig. 13 we compare and contrast the three approximations, Eqs. 56, 57, and 47 together with Eqs. 58 and 59, to the numerically obtained curve based on the Fresnel integral, Eq. 5. We note that Eq. 56 provides us with the correct starting point of the Gaussian width. Moreover, when we use Eq. 89 for $|\psi(x = 0, t)|^2$, the approximation, Eq. 57, describes the long-time behavior well but breaks down for short times. It is slightly off close to the focus. Here, only Eq. 47 together with Eqs. 58 and 59 yields a good fit to the exact curve.

Appendix B: Wigner dynamics of rectangular wave packet
In this appendix we analyze the time evolution of the rectangular wave packet from the point of view of Wigner phase space. For this purpose we first rederive the Wigner function of the initial rectangular wave packet and then obtain by integration over the momentum variable the time-dependent probability density in position space. The resulting integral representation for the probability density at x = 0 allows us to derive analytical approximations in three different time domains: (1) early and intermediate times, (2) around the focus, and (3) very long times. We conclude by presenting an exact expression as well as several approximations for the location of the focus.

B.1: Initial Wigner function
We start by computing the Wigner function, Eq. 60, with the normalization constant $\mathcal{N} \equiv 1/(2\pi\hbar)$ for the square packet given by the wave function, Eq. 61. When we substitute this expression into the definition of $W_0$, Eq. 60, we find an integral over a product of step functions (Eq. 62). The step functions imply the inequalities $-L/2 < x - \xi/2 < L/2$ and $-L/2 < x + \xi/2 < L/2$, which can be expressed as $-L + 2x < \xi < L + 2x$ and $-L - 2x < \xi < L - 2x$, leading us to the relation $-L + 2|x| < \xi < L - 2|x|$. These boundaries establish the limits of integration. Indeed, the integral vanishes for $2|x| - L \geq L - 2|x|$, giving rise to a factor $\Theta(L - 2|x|)$. Hence, the Wigner function corresponding to the rectangular wave packet takes the form of Eq. 63, which after integration reduces to Eq. 64 and is shown in Fig. 9.

B.2: Time-dependent probability density
Next we derive the time-dependent probability density | (x, t)| 2 by integrating the time-dependent Wigner function over p, using the form When we substitute the expression Eq. 64 for W 0 with the appropriate arguments into Eq. 66 we arrive at the integral which reduces with the help of the Heaviside step function to We conclude by introducing the abbreviation and the change of variable ≡ 2y∕L and arrive at the exact expression for the time-dependent probability density.

B.3: Two exact expressions for x = 0
Next we evaluate the integral in Eq. 70 at x = 0 and note that the resulting integrand is symmetric with respect to = 0. As a result, Eq. 70 takes the form We can obtain an alternate form by the change of variables ≡ 4 (1 − ) and covers the interval [0, 1] twice as moves from 0 to 1. Indeed, for 0 < < 1∕2 we have respectively. When we decompose the integral into these two domains we find or Hence, we obtain the alternate exact expression for the time-dependent probability density at x = 0.
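As a consistency check, the momentum integral over the sheared Wigner function at x = 0 can be compared numerically with the Fresnel-integral form of the on-axis density. The sketch below assumes the dimensionless time $\tau \equiv \hbar t/(ML^2)$ and the reconstructed representation $L|\psi(0,t)|^2 = (2/\pi)\int_0^1 d\xi\, \sin[\xi(1-\xi)/(2\tau)]/\xi$, which follows from $W_0(x,p) = \sin[(L-2|x|)p/\hbar]/(\pi L p)$ evaluated along the line x = −(p∕M)t. These symbols and normalizations are our assumptions and may differ from those of Eqs. 69 to 71; the Fresnel form $2[C^2(w) + S^2(w)]$ with $w = 1/\sqrt{4\pi\tau}$ is the standard result for a uniformly illuminated slit.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def density_from_wigner(tau):
    """L * |psi(0, t)|^2 from the momentum integral over the sheared Wigner function."""
    a = 1.0 / (2.0 * tau)

    def integrand(xi):
        y = a * xi * (1.0 - xi)
        # sin(y)/xi rewritten as a*(1 - xi)*sin(y)/y, which is finite at xi = 0 and xi = 1
        return a * (1.0 - xi) * (math.sin(y) / y if y != 0.0 else 1.0)

    return (2.0 / math.pi) * simpson(integrand, 0.0, 1.0)

def density_from_fresnel(tau):
    """L * |psi(0, t)|^2 = 2 * (C(w)^2 + S(w)^2) with w = 1 / sqrt(4 * pi * tau)."""
    w = 1.0 / math.sqrt(4.0 * math.pi * tau)
    C = simpson(lambda u: math.cos(0.5 * math.pi * u * u), 0.0, w)
    S = simpson(lambda u: math.sin(0.5 * math.pi * u * u), 0.0, w)
    return 2.0 * (C * C + S * S)

if __name__ == "__main__":
    for tau in (0.03, 0.05, 0.2, 1.0):
        print(tau, density_from_wigner(tau), density_from_fresnel(tau))
```

The two columns printed by this sketch should agree to within the quadrature error, which illustrates that the phase-space and the position-space descriptions contain the same information.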

B.4: Approximations
Now we analyze the integrals of Eqs. 71 and 75 in three different time regimes. Our main interest is to estimate the probability density around the time of focusing.

B.4.1: Early times
In this domain we use the integral representation, Eq. 71, of | (0, t)| 2 and note that for t → 0, that is for → 0 two dominant terms emerge. Indeed, when we complete the square in the argument of the sine function we find indicating a point of stationary phase at = 1∕2.
Moreover, due to the factor 1∕ in the integrand another contribution arises. These observations suggest to decompose the integral into two parts: one around the origin and one containing the point of stationary phase, that is with Here a is a constant parameter such that 0 < a < 1∕2.
In the first integral, we approximate (1 − ) ≅ and take the limit → 0. This procedure leads us to where we have used the representation of the Dirac delta function.
We evaluate the second integral in Eq. 77 with the help of the method of stationary phase [93] which yields Finally, we combine the two results and arrive at the approximate expression which reproduces the initial value of the wave packet and yields a strongly oscillatory correction with a square root envelope. Indeed, these features stand out most clearly when we express this formula in terms of the characteristic time which yields The simplicity of this asymptotic expression for | | 2 clearly motivates the inclusion of the sometimes mysterious factor 2 in the definition of T. Equation 82 is in perfect agreement with Eq. 58, that is with our expression of Eq. 14 for x = 0. Needless to say, for t → ∞ this approximate formula, originally constructed for small times, breaks down. Indeed, we find leading to unphysical negative probabilities.
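The quality of the early-time approximation, and its failure at long times, can be made explicit with a few lines of code. The sketch assumes that the approximation takes the stationary-phase form $L|\psi(0,t)|^2 \approx 1 + 4\sqrt{2\tau/\pi}\,\sin[1/(8\tau) - \pi/4]$ with $\tau \equiv \hbar t/(ML^2)$; this expression is our reconstruction and is not quoted from Eq. 82. It is compared with the Fresnel-integral form of the on-axis density.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def exact_density(tau):
    """L * |psi(0, t)|^2 from the Fresnel integrals, with w = 1 / sqrt(4 * pi * tau)."""
    w = 1.0 / math.sqrt(4.0 * math.pi * tau)
    C = simpson(lambda u: math.cos(0.5 * math.pi * u * u), 0.0, w)
    S = simpson(lambda u: math.sin(0.5 * math.pi * u * u), 0.0, w)
    return 2.0 * (C * C + S * S)

def early_time_density(tau):
    """Assumed early-time form: initial value 1 plus an oscillation with a
    square-root envelope (reconstruction, see the lead-in)."""
    return 1.0 + 4.0 * math.sqrt(2.0 * tau / math.pi) * math.sin(1.0 / (8.0 * tau) - math.pi / 4.0)

if __name__ == "__main__":
    for tau in (0.002, 0.005, 0.01, 0.05, 1.0):
        print(tau, exact_density(tau), early_time_density(tau))
```

For large τ the sine tends to sin(−π/4) while the envelope keeps growing, so the assumed expression eventually turns negative, which is the unphysical behavior mentioned above.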

B.4.2: Time of maximum and long-time approximation
Next we use the representation of | (x = 0, t)| 2 given by Eq. 75 to obtain an estimate for long times, that is for 1 < , which captures the focusing effect as it manifests itself in a dominant maximum of the probability density. Since in this regime, the function sin( ∕8 ) does not display oscillations in the interval 0 < < 1, we may approximate it by a line secant to the curve sin( ∕8 )∕ at the end points, that is With this approximation and the integral relations and (85)

B.5: Location of the focus
We are now in the position to determine the exact location of the dominant maximum of $|\psi(x = 0, t)|^2$, that is of the focus, by solving a transcendental equation. For this purpose we differentiate the exact expression for $|\psi(x = 0, t)|^2$ in terms of the integral, Eq. 71, with respect to t, which leads us to the equation for the time of the focus. It is interesting that our estimates based on the various approximations for $|\psi(x = 0, t)|^2$ discussed before are very close to this value. Indeed, our early-time approximation, Eq. 82, provides us with a condition leading to the value $\tau_e \approx 0.058$.
Remarkably, our long-time approximation, Eq. 89, which yields a transcendental equation with the solution $\tau_l \equiv 3/(16\pi) \approx 0.059$, is also accurate up to two digits.
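The location of the focus can also be obtained by a direct numerical search for the maximum of the on-axis density. The sketch below is one possible way to do this and is not the procedure used in the article: it evaluates $L|\psi(0,t)|^2 = 2[C^2(w) + S^2(w)]$ with $w = 1/\sqrt{4\pi\tau}$ and applies a golden-section search. The dimensionless time $\tau \equiv \hbar t/(ML^2)$, the search bracket, and all function names are our assumptions.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def on_axis_density(tau):
    """L * |psi(0, t)|^2 from the Fresnel integrals, with w = 1 / sqrt(4 * pi * tau)."""
    w = 1.0 / math.sqrt(4.0 * math.pi * tau)
    C = simpson(lambda u: math.cos(0.5 * math.pi * u * u), 0.0, w)
    S = simpson(lambda u: math.sin(0.5 * math.pi * u * u), 0.0, w)
    return 2.0 * (C * C + S * S)

def golden_section_max(f, lo, hi, tol=1e-6):
    """Maximizer of a function that has a single maximum on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:          # the maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = f(c)
        else:                # the maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

# The bracket is chosen by hand so that it contains only the dominant maximum.
tau_focus = golden_section_max(on_axis_density, 0.03, 0.12)
tau_long = 3.0 / (16.0 * math.pi)   # long-time estimate quoted in the text

if __name__ == "__main__":
    print("numerical focus:", tau_focus, "peak value:", on_axis_density(tau_focus))
    print("long-time estimate:", tau_long)
```

The printed maximizer can be set against the estimates $\tau_e$ and $\tau_l$ quoted above, and the peak value against the relative intensity of 1.8 reported for the slit.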

Appendix C: Circular apertures
Slits are frequently employed in diffraction problems because of the mathematical simplicity of the resulting diffraction integrals and because the physical interpretation is limited to one space dimension only. However, circular apertures are often preferred due to their direct application in microscopy, photography and transmission between different regions of space. Indeed, early optics problems such as the camera obscura [3], the transmission by an aperture smaller than the wave length [19,20,22,31], and the Fourier optics of pupils and their optical transfer functions [94] are but a few examples illustrating this point. In this Appendix, we first briefly review the history of diffraction from a circular aperture, and then discuss the similarities and differences between the diffractive focusing of scalar waves arising from slits and circular apertures.

C.1: Cusps and multiple diffraction
We start by recalling that the Poisson-Arago spot is the intensity map resulting from the illumination of a circular disk. Therefore, the field distribution is the complement of that generated by a circular aperture and can be calculated using the Babinet principle [95][96][97][98].
In 1922, Coulson and Becknell (footnote 7) investigated experimentally a variant of the Poisson-Arago spot [101,102] by measuring the intensity patterns generated by a disk rotated around its diameter and illuminated by a point light source. The geometric shadow of this arrangement is an ellipse and the intensity pattern in the far-field is a diamond-like figure with four cusps being the evolute of the edge of the shadow. It is interesting that a few years earlier, Raman also observed cusps in the diffraction patterns generated by an elliptical aperture [103].
Cusps are examples of catastrophe optics [99]. The interaction of light with a sharp-edged aperture was recently reformulated mathematically [104].
Finally, we turn to Letfullin and collaborators who have analyzed [105][106][107][108][109] theoretically and experimentally the phenomenon of multiple diffractive focusing, which occurs when we apply a second circular aperture of radius $a_2$ to the diffraction pattern generated by a first circular aperture of larger radius $a_1$, placed on the symmetry axis. Provided the distance L between the second and the first apertures is such that the Fresnel number $N_1 \equiv a_1^2/(\lambda L)$ is an odd integer, the second diffraction pattern has a maximum that can reach 10.0 times the illuminating intensity of the first aperture. Based on this observation they proposed a lens for matter waves [110].

C.2: Slit and circular aperture: paraxial versus non-paraxial optics
In Sect. 4 we have studied the diffraction from a slit of width L illuminated by light of wave length λ using the Fresnel and the Rayleigh-Sommerfeld integrals. The evaluation of the RS-I integral requires the numerical integration of Hankel functions [77]. We now briefly summarize an analogous discussion for the case of a circular aperture of radius a = L∕2.
In the Fresnel diffraction from a circular aperture we make use of Lommel functions in the calculation algorithm [77,[84][85][86]. For the non-paraxial calculation, the RS-I and RS-II integrals are evaluated numerically using the GNU Scientific Library (GSL) [111]. The results for the slit and the circular aperture are depicted in Figs. 11 and 12 in Sect. 6.
The fields and the corresponding normalized intensities on-axis based on the Rayleigh-Sommerfeld integrals for a circular aperture of radius a illuminated by a plane wave can be written in closed form [43,45,77]. From these expressions (footnote 8), especially the one for $I_{\mathrm{RS-II}}/I_0$, several properties of the non-paraxial diffraction stand out: the number of maxima, or minima, of each intensity function is finite. Indeed, the maxima follow from the condition that the cosine in the intensity equals −1, or equivalently, $k(\sqrt{a^2 + z^2} - z) = (2m + 1)\pi$, with m = 0, 1, 2, … The minima occur when $k(\sqrt{a^2 + z^2} - z) = 2n\pi$ with n = 1, 2, … The argument of the cosine function never vanishes, except for the trivial case of a = 0, or in the asymptotic limit z → ∞. On the other hand, the intensity at the origin (z = 0) oscillates between 0 and 4.0, depending only on a. In contrast, the intensity $I_{\mathrm{RS-I}}/I_0$ always converges to 1.0.
The representation of the on-axis intensities $I_{\mathrm{RS-I}}$ and $I_{\mathrm{RS-II}}$ using a logarithmic scale for z permits us to find another interesting property of the non-paraxial diffraction: the separation between maxima, or minima, along the log(z)-axis is constant. This property is due to the argument of the cosine function.
We note that Forbes has investigated the scaling properties of the Fresnel diffraction patterns for circular apertures [92], but he did not mention this feature as it arises only in non-paraxial diffraction.
The Fresnel diffraction intensity on-axis of a circular aperture reads with the argument of the cosine-function being inversely proportional to z. Thus, the number of maxima and minima is infinite.
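The on-axis properties listed above can be explored with a small sketch. It uses the closed forms commonly quoted for a plane wave incident on a circular aperture, namely $I_{\mathrm{RS-II}}/I_0 = 2 - 2\cos[k(\sqrt{a^2+z^2} - z)]$, $I_{\mathrm{RS-I}}/I_0 = 1 + z^2/(a^2+z^2) - 2\,(z/\sqrt{a^2+z^2})\cos[k(\sqrt{a^2+z^2} - z)]$, and the Fresnel result $2 - 2\cos[\pi a^2/(\lambda z)]$. We assume here that these agree with the expressions of this appendix; the function names and the example values a = 10 and λ = 1 are illustrative.

```python
import math

def rs2_on_axis(z, a, lam):
    """Normalized on-axis intensity, Rayleigh-Sommerfeld II (assumed closed form)."""
    k = 2.0 * math.pi / lam
    return 2.0 - 2.0 * math.cos(k * (math.hypot(a, z) - z))

def rs1_on_axis(z, a, lam):
    """Normalized on-axis intensity, Rayleigh-Sommerfeld I (assumed closed form)."""
    k = 2.0 * math.pi / lam
    r = math.hypot(a, z)
    return 1.0 + (z / r) ** 2 - 2.0 * (z / r) * math.cos(k * (r - z))

def fresnel_on_axis(z, a, lam):
    """Normalized on-axis intensity in the paraxial (Fresnel) approximation, z > 0."""
    return 2.0 - 2.0 * math.cos(math.pi * a * a / (lam * z))

def rs_maxima(a, lam):
    """All z >= 0 with k*(sqrt(a^2 + z^2) - z) = (2m + 1)*pi, farthest maximum first.
    The path difference d = (2m + 1)*lam/2 cannot exceed a, so the list is finite."""
    positions = []
    m = 0
    while (2 * m + 1) * lam / 2.0 <= a:
        d = (2 * m + 1) * lam / 2.0
        positions.append((a * a - d * d) / (2.0 * d))
        m += 1
    return positions

if __name__ == "__main__":
    a, lam = 10.0, 1.0
    zs = rs_maxima(a, lam)
    print("number of non-paraxial maxima:", len(zs))
    print("farthest maximum:", zs[0], "Fresnel value a^2/lambda:", a * a / lam)
```

In this sketch the farthest non-paraxial maximum lies at $a^2/\lambda - \lambda/4$, slightly closer to the screen than the Fresnel value $a^2/\lambda$, and the number of non-paraxial maxima is finite because the path difference is bounded by a.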
Footnote 8: We emphasize that the z-values of the maxima and the minima coincide for $I_{\mathrm{RS-I}}$ and $I_{\mathrm{RS-II}}$.
We can find the location z m = a∕ of the last maximum from Eq. 103. If we normalize this length by L 2 ∕ we obtain that is the position of the focus.
In the case of the slit no analytic expressions are known for the non-paraxial intensity, but the graphical representations of the RS-I intensities in Fig. 11 show that the number of lobes in the intensity map of a slit of width L = 2a is equal to that for a circular aperture of the same diameter. It depends only on the ratio L∕λ.
However, the maxima are differently distributed along the z-axis, and the dominant lobe containing the focusing maximum is much more elongated than that of the circular aperture.
The Fresnel intensity on-axis is approximated by an expression which shows why the position of the focus is shifted towards larger z values, compared to the circular aperture. After the normalization by $L^2/\lambda$ we obtain the corresponding dimensionless position of the focus. Moreover, the maxima of the Fresnel intensity decay in amplitude, converging to 1.0 for z → 0, unlike in the circular aperture.