1 Introduction

In July 1816, the civil engineer Augustin-Jean Fresnel published his preliminary results [1] confirming the wave theory of light. Three years later he participated with his Mémoire sur la Diffraction de la Lumière in the Grand Prix of the French Academy of Sciences [2]. It was on this occasion that Siméon Poisson predicted that an opaque disc illuminated by parallel light would create a bright spot in the center of its shadow. This phenomenon was experimentally confirmed by François Arago and led to the victory of the wave over the particle theory. In the present article we discuss an effect related to the Poisson spot which is the one-dimensional analogue of the camera obscura [3, 4].

Indeed, we have recently found [5] that a rectangular matter wave packet which undergoes free time evolution according to the Schrödinger equation focuses before it spreads. This phenomenon has been confirmed for light [6], water and surface plasmon waves [7]. In the present article we illustrate this effect in Wigner phase space and verify it using classical light in real space.

Our article is organized as follows: in Sect. 2 we first give a brief history of the diffraction of waves, and then review several focusing effects, especially those associated with the phenomenon of diffraction in time introduced by Moshinsky [8].

We dedicate Sect. 3 to the discussion of the focusing of a rectangular wave packet from the point of view of the time-dependent wave function. In particular, we show this effect manifests itself in the time-dependent probability density as well as the Gaussian width [5] of the wave packet. For this purpose we derive exact as well as approximate analytical expressions for the time-dependent probability amplitude and density.

In Sect. 4 we verify these predictions by reporting on an experiment using laser light diffracted from a single slit. Here we take advantage of the analogy between the paraxial approximation of the Helmholtz equation of classical optics and the time-dependent Schrödinger equation of a free particle. We measure the intensity distributions of the light in the near-field of the slit and obtain the Gaussian width of the intensity field. Moreover, we make contact with the predictions of non-paraxial optics.

Section 5 illuminates this focusing effect from quantum phase space using the Wigner function. In particular, we show that the phenomenon of focusing which reflects itself in a dominant maximum of the probability density on the optical axis follows from radial cuts through the initial Wigner function at different angles with respect to the momentum axis. Moreover, we analyze the rays and envelopes of the Wigner function in more detail.

We conclude in Sect. 6 by summarizing our results and by providing an outlook. Here we allude to the influence of the number of dimensions on the focusing and emphasize the importance of corrections to paraxial optics.

To keep our article self-contained while focusing on the central ideas we have included three appendices. Indeed, “Appendix A” contains the calculations associated with the Gaussian width of our wave packet and “Appendix B” presents a detailed discussion of the Wigner function approach towards diffractive focusing. As an outlook we compare in “Appendix C” the paraxial and non-paraxial results obtained for diffraction by slits and circular apertures.

2 Diffraction theory

In this section we first provide a historical overview of diffraction and then address the phenomenon of diffractive focusing. Due to their different nature we distinguish in this discussion between light and matter waves. Moreover, we briefly review the concept of diffraction in time.

2.1 A brief history

Following the experimental demonstration of the wave nature of light by Thomas Young [9] and the first theory of diffraction by Fresnel [1], the nineteenth century was extremely successful in the investigation of wave phenomena, especially in optics. The unifying electromagnetic theory of James Clerk Maxwell [10] was the culmination of all previous developments on electromagnetism. Gustav Kirchhoff readdressed the diffraction of scalar waves and put it on a rigorous mathematical foundation [11]. Fresnel diffraction now arises as a special case of Kirchhoff diffraction. Arnold Sommerfeld and Lord Rayleigh [4, 12, 13] improved the Kirchhoff theory by correcting the boundary conditions at the aperture, thereby eliminating the discrepancy arising between the solutions and the boundary conditions chosen by Kirchhoff. Friedrich Kottler proposed another reason for this discrepancy by showing that the Kirchhoff integral can be interpreted not as a solution of the boundary value problem but as a solution for the “saltus” at the boundary [14, 15]. Moreover, he extended the scalar theory to electromagnetic waves [16, 17].

During the twentieth century numerous theoretical and experimental contributions to diffraction theory emerged. Julius Stratton and Lan Jen Chu extended the scalar Kirchhoff diffraction theory to vector waves [18] accounting for polarization. Hans Bethe found analytical solutions for the diffraction of electromagnetic waves by an aperture much smaller than the wavelength [19]. His theory and the corrections later introduced by Christoffel Bouwkamp [20] became important because of the invention of the near-field scanning optical microscope (SNOM or NSOM) [21] and the developments related to near-field optics [22–24]. In 1998, Thomas Ebbesen and collaborators observed that light transmission through an array of subwavelength apertures drilled in noble metal thin films can largely surpass the value predicted by Bethe [25]. This extraordinary optical transmission is dependent on the geometry of the array, on the illumination conditions and on the size and shape of the apertures [26, 27]. It results from the excitation of surface plasmon modes near the aperture. In plasmonic gratings with narrow slits it may also lead to an attenuation of the transmitted light stronger than that predicted by the Bethe–Bouwkamp theory [28].

2.2 Focusing of waves

Focusing of waves by diffraction due to slits or apertures falls into two categories: (1) near-field focusing effects arising mainly in the diffraction of electromagnetic waves, and (2) focusing resulting from diffraction by slits or apertures larger than the wavelength, where the focus is located in the Fresnel zone.

In the first category we include the focusing of light resulting from the confinement of surface plasmons in nanostructured apertures in plasmonic materials [25, 26, 29]. Frequently scalar and electromagnetic diffraction theories assume the apertures to be located in infinite and perfectly absorbing screens, and thus surface plasmons are ignored. Hence, these theories cannot account for plasmonic modes and their optical effects. To accurately describe the effects produced by the excitation of surface plasmons a full electromagnetic theory using the optical properties of real materials is required.

The focusing of light by apertures smaller than the wave length has been investigated theoretically several times in the last decades [30, 31]. However, this near-field focusing is dependent on the polarization of light and restricted to small apertures.

The properties of the focus in laser beams and atomic beams are of interest in microscopy and atom optics. Standard laser beams such as Laguerre–Gaussian or Hermite–Gaussian beams can be strongly focused. The lowest-order Hermite–Gaussian beam, called TEM\(_{00}\), has the highest confinement and is, therefore, preferred in confocal microscopy. Other beams such as Airy and Bessel beams [32, 33] have non-diffracting properties.

To increase the field confinement, and thus the resolution of a microscope, novel laser beams and illumination mechanisms have been proposed [22, 29, 34–36].

In parallel, a similar interest exists in the confinement and squeezing of matter wave packets [37–39]. Focusing effects in atomic beams resulting from the interaction with laser fields diffracted by apertures in metallic screens were investigated recently [40]. Indeed, the interaction of matter waves with light fields has been the subject of intensive research in atom and quantum optics [41]. However, this spatial confinement, or focusing, is of a different nature than that of diffraction. In the latter, the field confinement created is solely determined by the properties of the incoming wave and the aperture. No other optical element is involved.

Self-focusing of light may also arise in nonlinear media [42]. However, we will not discuss this phenomenon in this article, but rather concentrate our analysis on the focusing effects arising from diffraction in free space due to slits larger than the wavelength.

In 2012, the focusing of light waves by a slit larger than the wavelength was experimentally observed [6]. The diffraction pattern is similar to that of a circular aperture of several wavelengths in diameter [43, 44]. The main difference between a slit and a circular aperture is the value of the dominant maximum, relative to the intensity of the incident wave. For a circular aperture it reaches 4.0, whereas for a slit it is only 1.8 times stronger than the incoming wave [6, 43].

The diffraction patterns of slits and circular apertures for scalar waves and non-polarized electromagnetic waves can be accurately calculated using the Rayleigh–Sommerfeld diffraction integrals, even in the case of apertures of the size of the wavelength, without using any mathematical approximation. Moreover, analytical solutions for the on-axis field intensity were found for the circular aperture [43, 45], and the oscillations of the intensity on-axis were confirmed for electromagnetic waves [44, 46].

2.3 Diffraction in time

Moshinsky [8] introduced the concept of diffraction in time using matter waves. Remarkably the time evolution of the probability density of a wave packet suddenly released by a shutter is mathematically identical to the intensity pattern behind a semi-infinite plane. This analogy stands out most clearly when we substitute the time coordinate of the wave packet by the corresponding space coordinate in diffraction. Then the solution of the Schrödinger equation for the problem of the Moshinsky shutter is identical to that of the Fresnel diffraction by a semi-infinite plane, and the probability density reaches a maximum of 1.3. Moreover, Moshinsky analyzed later the time–energy uncertainty associated with the shutter arrangement [47] and Godoy investigated the Fresnel and Fraunhofer diffraction in time of initially stationary states [48].

Recently, the diffraction in time of the double-shutter problem was analyzed [5]. An initially confined rectangular wave packet in one dimension is suddenly released and evolves in time. Again, the solution of the corresponding Schrödinger equation has the same form as the Fresnel diffraction of scalar waves by a single slit of infinite length.

However, we emphasize that Fresnel diffraction only holds true in the paraxial approximation of optics. The general solution of the diffraction by a slit is found by solving the Kirchhoff, or the Rayleigh–Sommerfeld diffraction integrals.

The mathematical analysis of the classical diffraction problems makes use of wave functions expressed in real, or reciprocal space. It is also very common in the investigation of the diffraction of matter waves [8, 49–51].

However, since Wigner introduced his famous distribution function [52] an increasing number of publications has used the Wigner phase space representation to study the dynamics of light beams [53–57] and matter waves [58–65]. Other phase space distribution functions related to the Wigner function have also been used to study matter-wave phenomena. They are interrelated and belong to the Cohen class [66]. In this article we employ both the wave function and the Wigner representations of matter wave packets.

The evolution in time of matter waves with zero angular momentum, so-called s-waves, strongly depends on the number of space dimensions [67]. For instance, in two dimensions, an initial ring-shaped wave packet first contracts reaching a minimum, reducing the radius of the ring, and then monotonically expands. In three dimensions, the radius only increases. This effect is attributed to a quantum anti-centrifugal force [67–69]. This example shows that the focusing effect of a free wave packet is a more general phenomenon than that arising from the free time evolution of a one-dimensional rectangular wave packet.

We conclude with a brief reference to the type of boundaries of the slit, or shutter. Most of the classical treatments of diffraction problems define the edges of the slit, or of another aperture shape, as sharp transitions between a perfectly absorbing surface and a homogeneous fully transmitting medium. In quantum matter waves, the Moshinsky shutter or the sudden release of a rectangular wave packet is also an example of sharp boundaries. The effects arising in the diffraction patterns due to non-sharp boundaries have been investigated recently [5, 50, 70–73].

3 Wave function approach

In this section we use the solution of the time-dependent Schrödinger equation, that is the wave function, to show that a freely propagating rectangular wave packet exhibits the phenomenon of focusing. For this purpose we first express the time-dependent wave function in terms of a Fresnel integral, and then derive analytic approximations for the wave function as well as the probability density. To bring out most clearly the focusing effect we finally calculate the Gaussian width [5] of the wave packet and demonstrate that it exhibits a clear minimum at the time of the focusing.

3.1 Time evolution

Central to our discussion is the free propagation of a wave packet corresponding to a non-relativistic particle of mass M. The initial wave function

$$\begin{aligned} \psi _{0}(x) \equiv \psi (x, t=0) \equiv \frac{1}{\sqrt{L}} \varTheta \left( \frac{L}{2}-|x|\right) \end{aligned}$$
(1)

is of rectangular form with a length L. Here \(\varTheta\) denotes the Heaviside step function.

With the help of the propagator [74]

$$\begin{aligned} G(x, t|y, 0) \equiv \sqrt{\frac{\alpha (t)}{i\pi }} {\rm e}^{i\alpha (t)(x - y)^{2}} \end{aligned}$$
(2)

of a free particle connecting the initial coordinate y with x at time t, and the abbreviation

$$\begin{aligned} \alpha (t) \equiv \frac{M}{2\hbar t} \end{aligned}$$
(3)

containing the reduced Planck constant \(\hbar\), we find from the Huygens principle of matter waves

$$\begin{aligned} \psi (x,t) = \int _{-\infty }^{\infty } {\rm d}y\, G(x, t|y, 0) \psi _{0}(y) \end{aligned}$$
(4)

the expression

$$\begin{aligned} \psi (x,t) = \sqrt{\frac{1}{i\pi L}} \int _{\sqrt{\alpha (t)}(x -L/2)}^{\sqrt{\alpha (t)}(x + L/2)} {\rm d}\xi \, {\rm e}^{i \xi ^{2}} , \end{aligned}$$
(5)

for the time-dependent probability amplitude. Here we have introduced the integration variable \(\xi \equiv \alpha ^{1/2}(x - y)\).

When we decompose the integral in Eq. 5 into two parts each starting from \(\xi =0\), the wave function

$$\begin{aligned} \psi (x,t)&= \sqrt{\frac{1}{2 i L}} \left\{ F\left[ \sqrt{\alpha (t)}(x + L/2)\right] \right. \nonumber \\&\quad \left. - F\left[ \sqrt{\alpha (t)}(x - L/2)\right] \right\} \end{aligned}$$
(6)

consisting of the difference of two Fresnel integrals

$$\begin{aligned} F(w) = \sqrt{\frac{2}{\pi }} \int _{0}^{w} {\rm d}\xi \, {\rm e}^{i \xi ^{2}} , \end{aligned}$$
(7)

is thus determined by the interference of the diffraction patterns originating from two semi-infinite walls located at \(x=L/2\) and \(x=-L/2\). The amplitude and phase of each contribution are given by the Fresnel integral F, whose real and imaginary parts

$$\begin{aligned} C(w) \equiv \sqrt{\frac{2}{\pi }}\int _{0}^{w} {\rm d}\xi \, \cos \xi ^2 \end{aligned}$$
(8)

and

$$\begin{aligned} S(w) \equiv \sqrt{\frac{2}{\pi }}\int _{0}^{w} {\rm d}\xi \, \sin \xi ^2 \end{aligned}$$
(9)

follow from the Cornu spiral [75] represented in the complex plane (footnote 1).
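The integral representation of Eq. 5 is straightforward to evaluate numerically. The following is a minimal sketch, not the procedure used for the figures: it assumes the scaled units introduced in connection with Eq. 10 (x in units of L, t in units of T), for which \(\sqrt{\alpha (t)}\,L = \sqrt{\pi T/t}\), and it replaces the tabulated Fresnel integrals by a plain Simpson quadrature. The names `fresnel_exp` and `density` are hypothetical helpers introduced only for this illustration.

```python
import cmath
import math


def fresnel_exp(w):
    """Simpson estimate of the integral of exp(i xi^2) from 0 to w (w may be negative)."""
    # Assumed resolution: step of about 1/(8|w|), i.e. 0.25 rad of phase per step.
    n = 2 * max(100, int(4.0 * w * w))  # even number of sub-intervals
    h = w / n
    total = 1.0 + cmath.exp(1j * w * w)  # end-point contributions
    for k in range(1, n):
        xi = k * h
        total += (4.0 if k % 2 else 2.0) * cmath.exp(1j * xi * xi)
    return total * h / 3.0


def density(x, t):
    """L |psi(x, t)|^2 from Eq. 5, with x in units of L and t in units of T."""
    root_alpha = math.sqrt(math.pi / t)  # sqrt(alpha(t)) * L in scaled units
    upper = fresnel_exp(root_alpha * (x + 0.5))
    lower = fresnel_exp(root_alpha * (x - 0.5))
    return abs(upper - lower) ** 2 / math.pi


if __name__ == "__main__":
    for x in (0.0, 0.25, 0.5, 1.0):
        print(f"x = {x:4.2f} L   L|psi|^2 = {density(x, 0.342):.3f}")
```

Because the integrand is even in \(\xi\), the helper is odd in its argument, so the difference of the two calls reproduces the structure of Eq. 6 up to the normalization of F.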

Fig. 1

Time evolution of a rectangular wave packet represented by its probability density \(|\psi (x,t)|^{2}\), given by Eq. 5, depicted in continuous space-time (top) and at specific times (bottom) corresponding to a strongly oscillatory behavior, the focus, and the ballistic expansion. To bring out the characteristic features in the early-time evolution we have represented the time axis by a logarithmic scale. For a better comparison the initial wave packet at \(t = 0\), that is for \(\log (t = 0) = -\infty\) is moved to \(\log (t) = -3.0\) since our time axis extends only to this value

In Fig. 1 we present the probability density \(|\psi (x,t)|^{2}\) as a function of space and time. Here and in the remainder of our article we represent the coordinate x in units of L and the time t in units of the characteristic time

$$\begin{aligned} T \equiv \frac{M L^{2}}{2\pi \hbar } \equiv \frac{M L^{2}}{h} \end{aligned}$$
(10)

The curious inclusion of the factor \(2\pi\) is motivated by the asymptotic expressions of \(|\psi |^{2}\) discussed in the appendices. Moreover, the probability density \(|\psi |^{2}\) is always in units of 1 / L.

Whereas on the top of Fig. 1 we show \(|\psi (x,t)|^{2}\) in continuous space-time, in the bottom panel we select specific time slices corresponding to (1) short times where the probability distribution oscillates strongly, (2) intermediate times leading to focusing, and (3) longer times representing the ballistic regime.

At \(t = 0\) the probability density starts from its initial rectangular shape and immediately develops two peaks at the edges decorated with fringes. However, after this transitional phase the two peaks disappear and a dominant maximum at the origin \(x = 0\) forms. It is most pronounced at \(t \approx 0.342\) corresponding to the focus when the width of the wave packet assumes a minimum. Indeed, here the probability density assumes a maximum, which is about a factor 1.8 larger than at \(t = 0\) where it is unity. After the focus, that is for larger times, the wave packet displays the familiar spreading effect.
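The location and height of the focus quoted above can be checked with a short scan of the on-axis density. The sketch below assumes scaled units (t in units of T, density in units of 1/L), in which Eq. 5 at \(x = 0\) reduces to \(L|\psi (0,t)|^{2} = (4/\pi )\,|\int _{0}^{a} {\rm d}\xi \, {\rm e}^{i\xi ^{2}}|^{2}\) with \(a = \sqrt{\pi T/t}/2\). The scan window, the grid spacing and the helper names are arbitrary choices made for this illustration.

```python
import cmath
import math


def fresnel_exp(w):
    """Simpson estimate of the integral of exp(i xi^2) from 0 to w."""
    n = 2 * max(100, int(4.0 * w * w))  # even number of sub-intervals
    h = w / n
    total = 1.0 + cmath.exp(1j * w * w)
    for k in range(1, n):
        xi = k * h
        total += (4.0 if k % 2 else 2.0) * cmath.exp(1j * xi * xi)
    return total * h / 3.0


def on_axis_density(t):
    """L |psi(0, t)|^2 from Eq. 5; t is measured in units of T."""
    a = 0.5 * math.sqrt(math.pi / t)  # upper limit sqrt(alpha) L / 2
    return 4.0 * abs(fresnel_exp(a)) ** 2 / math.pi


# Assumed scan window 0.1 <= t <= 1.0 with a spacing of 0.001.
times = [0.100 + 0.001 * k for k in range(901)]
values = [on_axis_density(t) for t in times]
peak = max(values)
t_focus = times[values.index(peak)]
print(f"focus at t = {t_focus:.3f} T with L|psi|^2 = {peak:.3f}")
```

The window deliberately excludes the strongly oscillatory regime at very early times, where the on-axis density has further, weaker local maxima.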

3.2 Analytic approximations

Next we give approximate but analytical expressions for the time-dependent probability amplitude and probability density. Since our interest is to obtain the behavior at early times, that is before the ballistic expansion occurs, we shall consider small values of t, corresponding to the regime where \(\alpha (t)\) is large.

With the help of the asymptotic expansion [76]

$$\begin{aligned} \int _{0}^{a} {\rm d}\xi \, {\rm e}^{i \xi ^{2}} \cong \frac{\sqrt{i\pi }}{2} + \frac{{\rm e}^{ia^{2}}}{2i a} \end{aligned}$$
(11)

valid for \(1 \ll a\) the expression Eq. 5 for the probability amplitude reduces to

$$\begin{aligned} \psi (x,t)&\cong \frac{1}{\sqrt{i \pi L}} \left[ \sqrt{i \pi } + \frac{{\rm e}^{i\alpha (t)(L/2-x)^2}}{2i\sqrt{\alpha (t)}(L/2-x)} \right. \nonumber \\&\quad + \left. \frac{{\rm e}^{i\alpha (t)(L/2+x)^2}}{2i\sqrt{\alpha (t)}(L/2+x)} \right] , \end{aligned}$$
(12)

and the probability density reads

$$\begin{aligned} |\psi (x,t)|^2&\cong \frac{1}{L\pi } \left| \left[ \sqrt{i\pi } + \frac{{\rm e}^{i\alpha (t)(L/2-x)^2}}{2i\sqrt{\alpha (t)}(L/2-x)} \right. \right. \nonumber \\&\quad + \left. \left. \frac{{\rm e}^{i\alpha (t)(L/2+x)^2}}{2i\sqrt{\alpha (t)}(L/2+x)} \right] \right| ^{2} . \end{aligned}$$
(13)

This expression simplifies further when we neglect terms \(\mathcal {O}\left[ t^{2}/(1 \pm 2x/L)^{2}\right]\) and takes the form

$$\begin{aligned} |\psi (x,t)|^2&\cong \frac{1}{L} \left[ 1 + \frac{ \sin \left[ \alpha (t)(L/2-x)^2 -\pi /4 \right] }{\sqrt{ \pi \alpha (t)}(L/2-x)} \right. \nonumber \\&\quad + \left. \frac{\sin \left[ \alpha (t)(L/2+x)^2 - \pi /4 \right] }{\sqrt{ \pi \alpha (t)}(L/2+x)} \right] . \end{aligned}$$
(14)

In Fig. 2 we compare and contrast the resulting probability densities at \(x = 0\) as a function of time and find excellent agreement between the numerical result following from the evaluation of the integral of Eq. 5, and the approximations based on Eqs. 13 and 14. We emphasize that our approximations break down for very large values of t, but they succeed in giving the maximum of the distribution corresponding to the focusing effect.

We also test in Fig. 3 our approximate but analytic expressions, Eqs. 13 and 14, for the probability density against the exact numerical result given by Eq. 5 at characteristic times confirming again the focusing effect.
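As a rough cross-check of this comparison, the sketch below evaluates the exact on-axis density from Eq. 5 next to the approximation of Eq. 14 at \(x = 0\). It assumes scaled units (t in units of T), in which \(\alpha (t) L^{2} = \pi T/t\) and Eq. 14 reduces to \(L|\psi (0,t)|^{2} \cong 1 + (4/\pi )\sqrt{t/T}\, \sin [\pi T/(4t) - \pi /4]\). The function names and the selection of times are illustrative only.

```python
import cmath
import math


def fresnel_exp(w):
    """Simpson estimate of the integral of exp(i xi^2) from 0 to w."""
    n = 2 * max(100, int(4.0 * w * w))  # even number of sub-intervals
    h = w / n
    total = 1.0 + cmath.exp(1j * w * w)
    for k in range(1, n):
        xi = k * h
        total += (4.0 if k % 2 else 2.0) * cmath.exp(1j * xi * xi)
    return total * h / 3.0


def exact_on_axis(t):
    """L |psi(0, t)|^2 from the integral of Eq. 5 (t in units of T)."""
    a = 0.5 * math.sqrt(math.pi / t)
    return 4.0 * abs(fresnel_exp(a)) ** 2 / math.pi


def approx_on_axis(t):
    """Eq. 14 at x = 0 in scaled units; both edge terms coincide on the axis."""
    phase = math.pi / (4.0 * t) - math.pi / 4.0
    return 1.0 + 4.0 * math.sqrt(t) / math.pi * math.sin(phase)


if __name__ == "__main__":
    for t in (0.02, 0.05, 0.1, 0.2, 0.342, 1.0, 5.0):
        print(f"t = {t:5.3f} T   exact = {exact_on_axis(t):7.4f}"
              f"   Eq. 14 = {approx_on_axis(t):7.4f}")
```

For long times the approximate expression even turns negative, which is one way to see the breakdown of the expansion underlying Eq. 11 once its argument is no longer large.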

Fig. 2

Comparison between the exact numerical (red) and two approximate analytical expressions (blue and green) for the probability density \(|\psi (x=0,t)|^{2}\) as a function of time. Our curves are based on Eqs. 5, 13 and 14, respectively. The oscillations near \(t = 0\) are well approximated both in amplitude and phase by the blue and green curves. A maximum occurs for \(t \approx 0.342\). Both approximations fail for large times

Fig. 3

Comparison between the exact numerical position-dependent probability density (red) given by Eq. 5 with the approximate formulae (blue and green) represented by Eqs. 13 and 14, respectively. In the four panels the axes cover identical domains

3.3 Focusing expressed by the Gaussian width

So far we have analyzed the phenomenon of diffractive focusing of our rectangular wave packet by considering the complete probability density in space and time. We now characterize this effect by the Gaussian measure [5]

$$\begin{aligned} \delta x^{2}(t) \equiv \frac{1}{\kappa ^2}\left[ 1- \int _{-\infty }^{\infty } {\rm d}x\, {\rm e}^{-(\kappa x)^2} |\psi (x,t)|^2 \right] \end{aligned}$$
(15)

discussed in more detail in “Appendix A”. Here \(\kappa\) is a constant with units of an inverse length.

For our rectangular wave packet we obtain the Gaussian width by numerically evaluating Eq. 15 using the integral representation Eq. 5 of the probability amplitude. In Fig. 4 we depict the corresponding curve normalized to its initial value \(\delta x^{2}(0)\) for \(\kappa = 6.0\) which displays a clear minimum at \(t \approx 0.39\), thus confirming the focusing effect.

However, we note that the location of the minimum of \(\delta x^{2} = \delta x^{2}(t)\) deviates slightly from the location of the maximum of \(|\psi (x=0, t)|^{2}\) which occurs at \(t = 0.342\) as indicated in Fig. 2. This deviation is the result of the integration in Eq. 15.
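The Gaussian measure of Eq. 15 can be estimated with two nested quadratures. The sketch below is an illustration under stated assumptions rather than the procedure behind Fig. 4: x and t are taken in units of L and T, \(\kappa = 6.0\) is interpreted in units of 1/L, the x-integration is truncated at \(|x| = 0.75\,L\) where the Gaussian weight is negligible for this \(\kappa\), and the scan over t uses a coarse grid. All function names are hypothetical.

```python
import cmath
import math


def fresnel_exp(w):
    """Simpson estimate of the integral of exp(i xi^2) from 0 to w."""
    n = 2 * max(100, int(4.0 * w * w))  # even number of sub-intervals
    h = w / n
    total = 1.0 + cmath.exp(1j * w * w)
    for k in range(1, n):
        xi = k * h
        total += (4.0 if k % 2 else 2.0) * cmath.exp(1j * xi * xi)
    return total * h / 3.0


def density(x, t):
    """L |psi(x, t)|^2 from Eq. 5 (x in units of L, t in units of T)."""
    root_alpha = math.sqrt(math.pi / t)
    span = fresnel_exp(root_alpha * (x + 0.5)) - fresnel_exp(root_alpha * (x - 0.5))
    return abs(span) ** 2 / math.pi


def gaussian_width(t, kappa=6.0, x_max=0.75, steps=60):
    """delta x^2(t) of Eq. 15 in units of L^2, using the trapezoidal rule in x."""
    h = x_max / steps
    integral = 0.0
    for j in range(steps + 1):  # integrand is even in x: integrate x >= 0, then double
        x = j * h
        weight = 0.5 if j in (0, steps) else 1.0
        integral += weight * math.exp(-(kappa * x) ** 2) * density(x, t)
    return (1.0 - 2.0 * h * integral) / kappa ** 2


def initial_width(kappa=6.0):
    """delta x^2(0) for the rectangular packet of Eq. 1, evaluated in closed form."""
    return (1.0 - math.sqrt(math.pi) / kappa * math.erf(0.5 * kappa)) / kappa ** 2


times = [0.10 + 0.02 * k for k in range(46)]  # assumed scan window 0.1 <= t <= 1.0
ratios = [gaussian_width(t) / initial_width() for t in times]
t_min = times[ratios.index(min(ratios))]
print(f"minimum of the normalized width near t = {t_min:.2f} T")
```

With the coarse grid chosen here the minimum can only be located to within the grid spacing of 0.02 T; a finer grid or a larger truncation radius is needed for smaller values of \(\kappa\).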

4 Experimental approach

In the preceding section we have shown that a rectangular matter wave packet first focuses before it spreads. We now describe an experiment to observe the diffraction pattern and, in particular, the focusing arising close to the slit. Here we take advantage of the familiar analogy between the Schrödinger equation

$$\begin{aligned} i \hbar \frac{\partial \psi }{\partial t} = -\frac{\hbar ^{2}}{2 M}\,\frac{\partial ^{2} \psi }{\partial x^{2}} \end{aligned}$$
(16)

of a free particle, and the paraxial wave equation

$$\begin{aligned} 2i k \frac{\partial \psi }{\partial z} = -\frac{\partial ^{2} \psi }{\partial x^{2}} \end{aligned}$$
(17)

of classical optics. In this situation z denotes the coordinate of propagation and k the wave vector of the electromagnetic wave.

Indeed, the two wave equations are identical when we make the substitution

$$\begin{aligned} z \equiv \frac{\hbar k}{M} t = \frac{2\pi \hbar }{M L^{2}}\frac{L}{\lambda } L t = \frac{L}{\lambda } L \frac{t}{T} \end{aligned}$$
(18)

where \(\lambda\) is the wave length. In the last step we have recalled the definition, Eq. 10, of the characteristic time T. Hence, time in the Schrödinger equation translates into propagation distance z.

As a consequence of the analogy between Eqs. 16 and 17 with the scaling given by Eq. 18, it suffices to perform our experiment with light rather than matter waves.

4.1 Setup

We use a confocal microscope for measuring the light intensity diffracted by a one-dimensional slit milled in an aluminum film (footnote 2). The slit of length 50 \(\upmu\)m and width 2440 nm is illuminated by laser light of wavelength 488 nm as shown in Fig. 5a. A single mode optical fiber with collimator was used for the illumination. The collimated laser beam has a diameter of approximately 1 mm and is, therefore, much larger than the slit. For this reason we assume the illumination of the slit as a plane wave.

To generate images in different planes above the slit we employ a confocal laser scanning microscope (WITec GmbH). The objective used for light collection was an infinity-corrected Olympus MPlan with 100\(\times\) magnification and numerical aperture NA = 0.9. The collected light was focused into a multimode optical fiber connected to an avalanche photodiode. The light diffracted by the slit is measured in a rasterized way. Each point corresponds to a pixel of an image generated by scanning in the horizontal, or in the vertical direction, as outlined in Fig. 5a.

4.2 Results

In Fig. 5b, c, we present images of the light intensity in the plane of the slit, and perpendicular to the sample, respectively. In the latter case, we scan the confocal microscope in the vertical direction with a minimum scan step of \(\Delta z \approx 50\) nm. The pixel size in the horizontal direction corresponds to a displacement of \(\Delta x = \Delta y = 10 \ \upmu {\rm m} / 512 \approx 20\) nm. Thus, each pixel is much smaller than the wavelength and the optical resolution, which according to the Rayleigh criterion is \(\Delta r \equiv 0.61\times \lambda / {\rm NA} \approx 331\) nm. The horizontal fringes appearing in the light intensity of Fig. 5c are due to the mechanical motion of the microscope when scanning vertically the diffracted light. We note the dominant maximum of the intensity at \(z \approx 4.1\) \(\upmu\)m which confirms our prediction of the focusing effect.
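The numbers entering this comparison follow from simple arithmetic, collected in the sketch below. It uses only the values quoted in the text (slit width, wavelength, numerical aperture, scan range and pixel count) together with the focus time \(t \approx 0.342\,T\) of Sect. 3 and the scaling of Eq. 18. The variable names are chosen for this illustration.

```python
# Experimental parameters quoted in Sects. 4.1 and 4.2.
slit_width_um = 2.44        # L
wavelength_um = 0.488       # lambda
numerical_aperture = 0.9    # NA of the collection objective
scan_range_um = 10.0        # lateral scan range
pixels = 512                # pixels per scan line
t_focus = 0.342             # focus time in units of T (Sect. 3.1)

# Lateral pixel size and Rayleigh resolution, both in nanometres.
pixel_nm = scan_range_um * 1.0e3 / pixels
rayleigh_nm = 0.61 * wavelength_um * 1.0e3 / numerical_aperture

# Eq. 18: z = (L / lambda) * L * (t / T) maps the focus time to a distance.
z_focus_um = (slit_width_um / wavelength_um) * slit_width_um * t_focus

print(f"pixel size         = {pixel_nm:.1f} nm")
print(f"Rayleigh limit     = {rayleigh_nm:.0f} nm")
print(f"paraxial focus at z = {z_focus_um:.2f} um")
```

The paraxial estimate of roughly 4.2 \(\upmu\)m lies close to the measured position \(z \approx 4.1\) \(\upmu\)m of the dominant maximum.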

To analyze the phenomenon in a quantitative way we use the experimental intensity distribution of Fig. 5c to obtain the Gaussian width \(\delta x^{2}\) defined by Eq. 15. The so-calculated curve, now displayed in Fig. 4 as a function of the propagation distance z and scaled according to Eq. 18, follows nicely the theoretical prediction. In particular, it displays the characteristic minimum indicating focusing at the same location as the theoretical curve.

The rapid modulation of the experimental curve is a consequence of the horizontal fringes emerging due to the mechanical motion of the microscope as mentioned above, and thus of the measurement technique. An average over these oscillations leads to a smooth curve following the theoretical curve.

Fig. 4

Comparison between the numerical and experimental Gaussian width \(\delta x^{2}(t)/\delta x^{2}(0)\) of a focusing rectangular wave packet. To have the same domain in the abscissa the z-coordinate of the experimental data (blue curve) was scaled using \(L = 2.44\) \(\upmu\)m. The time coordinate t of the theoretical evaluation (green curve) based on Eq. 5 was scaled according to Eq. 18 with \(\lambda = 0.2\) in units of L. For both curves we have chosen \(\kappa = 6.0\)

Fig. 5

Experimental verification of diffractive focusing from a single slit: setup (a) based on a confocal microscope viewing a section of the slit, and light intensity measured in the \(x-y\) plane of the slit (b), and in the \(x-z\) plane perpendicular to the substrate and slit (c)

4.3 Paraxial versus non-paraxial optics

To compare our experimental results to the theoretical predictions of classical optics we have calculated the intensity and phase for a slit of width \(L = 2.44\) \(\upmu\)m illuminated by light of wave length \(\lambda = 0.488\) \(\upmu\)m corresponding to the same ratio \(L/\lambda = 5\) as in the experiment. For this purpose we use the Fresnel and the Rayleigh–Sommerfeld diffraction integrals familiar from paraxial and non-paraxial optics, respectively. In particular, we have chosen the Rayleigh–Sommerfeld integral of the first kind (RS-I) [4, 12, 13] (footnote 3) and made use of numerical integration and algorithms reported in [77] (footnote 4).

In Fig. 6a, b we depict the intensities following from the Rayleigh–Sommerfeld and Fresnel integrals, respectively. Moreover, in Fig. 7a, b we show the corresponding phases. For a direct comparison with Fig. 5 we have refrained from using normalized coordinates for both axes.

We note three characteristic features: (1) the number of intensity lobes is finite for RS-I, whereas it is infinite for Fresnel. (2) The phase pattern predicted by RS-I evolves almost as a plane wave in the propagation direction. In contrast, the Fresnel phase shows small oscillations around zero radians in the propagation direction, but rapid oscillations in the transverse direction, and (3) the Fresnel and RS-I intensity tend to agree as we move away from the slit. In particular, the focus occurs approximately in the same position for both calculations, as we show later.

We conclude this section by noting that although RS-I is only valid for scalar waves its intensity map still fits well with the optical intensity obtained experimentally.

Fig. 6

Comparison of the intensities of a plane wave of wave length \(\lambda = 0.488\) \(\upmu\)m diffracted by a slit of width \(L = 2.44\) \(\upmu\)m calculated either from the Rayleigh–Sommerfeld integral RS-I (a), or the Fresnel integral (b). In both cases focusing occurs at \(z \approx 4\) \(\upmu\)m, where the intensity reaches the value of 1.8. However, in contrast to the Fresnel diffraction the number of lobes for the RS-I diffraction is finite

Fig. 7

Comparison of the phase patterns of a plane wave of wavelength \(\lambda = 0.488\) \(\upmu\)m diffracted by a slit of width \(L = 2.44\) \(\upmu\)m calculated either from the Rayleigh–Sommerfeld integral RS-I (a), or the Fresnel integral (b). The wavefronts for the non-paraxial case evolve in the forward direction almost parallel with small perturbations. However, the paraxial phase shows a much more complicated behavior with small fluctuations around zero radians on the optical axis and rapid oscillations in the lateral direction close to the slit

5 Wigner function approach

In the preceding sections we have analyzed the focusing of a rectangular wave packet with the help of the time-dependent Schrödinger equation and have confirmed the effect using optical waves. We now consider this phenomenon from a different point of view, that is from phase space taking advantage of the Wigner function. This formalism has the remarkable feature that the time evolution is identical to the classical one, consisting of a shearing of the initial Wigner function. The latter contains the interference nature of quantum mechanics.

5.1 Probability density from tomographic cuts

The Wigner function has several unique properties. The ones most relevant for the present discussion are: (1) the time evolution of a free particle follows by a replacement of the phase space variables according to the classical motion, and (2) the corresponding probability densities originate from the integration of the Wigner function over the conjugate variable, that is by an appropriate tomographic cut. We now discuss these features in more detail and derive an expression for the probability density which is different from, but equivalent to the one discussed in Sect. 3.

5.1.1 A brief introduction

Wigner functions are quasi-probability distributions first introduced by Wigner [52] in the context of quantum correlations in statistical mechanics. They belong to the Cohen class of distributions [66] and are, therefore, related to other quadratic kernel distributions.

There exist extensive applications of the Wigner function, not only in quantum physics, but also in optics and signal processing [57, 87]. Indeed, Wigner functions are frequently used to represent quantum systems in phase space [60, 62, 88, 89]. Moreover, they describe the time evolution of wave packets in a similar way as classical phase space distributions in terms of the trajectories of classical particles [63] and have also been used in the study of diffraction in time of wave packets [51, 61, 90], a topic that has been addressed several times since the seminal article of Moshinsky [8].

5.1.2 Definition and time evolution

In this section we first define the Wigner function and discuss its time evolution in phase space. We then exploit its marginals to compute the time- and space-dependent probability density. Here we take advantage of the fact that the Wigner function at arbitrary times is easy to obtain [63].

Indeed, for a given wave function \(\psi = \psi (x)\) we find the corresponding Wigner function from the definition

$$\begin{aligned} W(x,p) \equiv \mathcal {N} \int _{-\infty }^{\infty } {\rm d}\xi \, {\rm e}^{i p \xi / \hbar } \psi \left( x-{\textstyle {1\over 2}}\xi \right) \psi ^{*}\left( x+{\textstyle {1\over 2}}\xi \right) \end{aligned}$$
(19)

with the normalization factor \(\mathcal {N} \equiv 1/(2 \pi \hbar )\), and a free particle undergoing time evolution is described in phase space by the time-dependent Wigner function

$$\begin{aligned} W(x,p;t) = W_0 \left( x-\frac{pt}{M}, p\right) . \end{aligned}$$
(20)

Here \(W_{0} \equiv W(x,p; t = 0)\) denotes the Wigner function of the initial wave function \(\psi _{0} \equiv \psi (x,t=0)\).

The probability density

$$\begin{aligned} |\psi (x,t)|^2 = \int _{-\infty }^{\infty } {\rm d}p\, W_{0} \left( x - \frac{pt}{M}, p\right) \end{aligned}$$
(21)

at a given point in space and time is obtained by integration over p which after the change of variables \(y \equiv x - (p/M)t\) reads

$$\begin{aligned} |\psi (x,t)|^2 = \frac{M}{t} \int _{-\infty }^{\infty } {\rm d}y\, W_0 \left( y, M(x-y)/t \right) . \end{aligned}$$
(22)

This representation is particularly useful in the discussion of the asymptotic behavior of the probability density addressed in more detail in “Appendix B”.
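The change of variables that leads from Eq. 21 to Eq. 22 can be written out explicitly. The following is a short sketch of this step, assuming \(t > 0\):

$$\begin{aligned} y \equiv x - \frac{p}{M}t \quad \Rightarrow \quad p = \frac{M(x-y)}{t}, \qquad {\rm d}p = -\frac{M}{t}\,{\rm d}y . \end{aligned}$$

For \(t > 0\) the limits \(p \rightarrow \pm \infty\) correspond to \(y \rightarrow \mp \infty\). Restoring the orientation of the integration interval absorbs the minus sign and leaves the prefactor M/t of Eq. 22.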

5.2 Time evolution: rectangular wave function

We are now in the position to consider the time evolution of our rectangular wave function. For this purpose we first calculate the Wigner function corresponding to this wave, and then analyze the shearing in phase space. Although we mainly study the motion of the complete Wigner function we also concentrate on those contributions in phase space which lead to the focusing.

5.2.1 General case

According to “Appendix B” the Wigner function of the rectangular wave packet, Eq. 1, reads

$$\begin{aligned} W_0(x,p) = \frac{1}{\pi L} \varTheta \left( \frac{L}{2} - |x| \right) \frac{1}{p}\sin \left[ \frac{p}{\hbar }(L - 2|x|)\right] \end{aligned}$$
(23)

and is shown in Fig. 8a. Here we identify three characteristic features: (1) Since the rectangular wave function is restricted to the interval \(|x| \le L/2\), the Wigner function is also limited to this domain in phase space. (2) The Wigner function assumes positive as well as negative values. (3) We recognize a dominant positive contribution along the x-axis with a maximum at the origin.
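These three features can be checked numerically. The following minimal sketch evaluates Eq. 23 directly; the units \(\hbar = L = 1\) and the function name `wigner_rect` are assumptions made for illustration only and do not appear in the original analysis.

```python
import math

HBAR = 1.0  # assumed units: hbar = 1
L = 1.0     # assumed initial width of the rectangular packet


def wigner_rect(x, p, L=L, hbar=HBAR):
    """Initial Wigner function of Eq. 23 for a rectangular packet of width L."""
    a = L - 2.0 * abs(x)      # argument of the Heaviside step, times two
    if a <= 0.0:              # feature (1): zero outside |x| < L/2
        return 0.0
    if p == 0.0:              # limit sin(p a / hbar) / p -> a / hbar
        return a / (math.pi * L * hbar)
    return math.sin(p * a / hbar) / (math.pi * L * p)


# Feature (3): the maximum sits at the origin of phase space
w_origin = wigner_rect(0.0, 0.0)

# Feature (2): negative values, e.g. at x = 0 where the sine equals -1
p_neg = 1.5 * math.pi * HBAR / L
w_neg = wigner_rect(0.0, p_neg)
```

In these units the value at the origin is \(1/(\pi \hbar )\), and the first negative extremum along the p-axis takes the value \(-2/(3\pi ^{2}\hbar )\).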

The time evolution given by Eq. 20 manifests itself in a shearing of the initial Wigner function \(W_{0}\), as shown in Figs. 8b, c. Indeed, every point \((x, p)\) of Wigner phase space moves according to Newton's law, that is \(x \rightarrow x + p t/M\), while p is a constant of the motion.

The Wigner function of Fig. 8c is especially interesting since it represents the moment of focusing. Indeed, by this time all negative contributions of W have moved away from the positive peak centered at the x-axis but now subtract from its wings. This fact stands out more clearly in the position distribution shown in Fig. 8d, which results from an integration of W over the momentum variable.

Fig. 8

Time evolution of the Wigner function corresponding to a rectangular wave packet, represented by three density plots illustrating the shearing in phase space, together with the associated probability densities in position space. The Wigner function of the initially rectangular wave packet is depicted in a at \(t = 0\). The distribution shears with time, as shown in b at time \(t=0.0628\) and in c at the moment of focusing, \(t = 0.342\). At the bottom, in d, we display the corresponding probability densities in the coordinate x obtained by integration over the conjugate variable p. The momentum p is expressed in units of \(\hbar /L\) as suggested by the argument of the sine function in the expression, Eq. 23, for the Wigner function

In this active approach the Wigner function evolves in time, but the axes of phase space which define the tomographic cuts, that is the integration over x or p, remain fixed. An alternative formulation follows from Eq. 22. Here the Wigner function remains static, while the line of integration evolves in time. For the observation point at the origin this line is given by \(x + (p/M)t = 0\). In Fig. 9 we show the probability density at \(x = 0\) as obtained from the Wigner function through integration from this passive point of view.

The cut at \(t = 0\) runs along the momentum axis. Here the oscillations in the Wigner function \(W_{0}\) along p average out and the main contribution to the probability density arises from the positive domain along the x-axis. For small times the lines of integration enclose small angles with respect to the p-axis and the oscillations in \(W_{0}\) translate into an oscillatory probability density. A dominant maximum occurs when the cut feels the central positive ridge along the x-axis. The density decays for larger times, that is as the cut approaches the x-axis, since there is a decreasing overlap.
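The passive picture lends itself to a direct numerical check. The sketch below integrates the static Wigner function of Eq. 23 along the rotating line, following Eq. 22 at \(x = 0\), and scans the time for the dominant maximum. The units \(\hbar = M = L = 1\), the helper names, and the Simpson quadrature are assumptions chosen for illustration. For comparison we also evaluate the standard Fresnel-integral form of the on-axis density of a uniformly illuminated slit, \((2/L)[C^{2}(w) + S^{2}(w)]\) with \(w = (L/2)\sqrt{M/(\pi \hbar t)}\), which is not derived in this section and is used here only as an independent reference.

```python
import math

# Assumed units: hbar = M = L = 1, so that T = M L^2 / (2 pi hbar) = 1/(2 pi)
HBAR, M, L = 1.0, 1.0, 1.0
T = M * L**2 / (2.0 * math.pi * HBAR)


def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3.0


def density_from_cut(t):
    """|psi(0,t)|^2 from Eq. 22: integrate W_0 along the line p = -M y / t."""
    kappa = M / (HBAR * t)

    def integrand(y):
        if y == 0.0:
            return kappa * L  # limit of sin(kappa*y*(L-2y))/y for y -> 0
        return math.sin(kappa * y * (L - 2.0 * y)) / y

    # The integrand is even in y: integrate over [0, L/2] and double
    return 2.0 / (math.pi * L) * simpson(integrand, 0.0, L / 2.0)


def density_from_fresnel(t):
    """Reference value from Fresnel integrals (assumed comparison formula)."""
    w = 0.5 * L * math.sqrt(M / (math.pi * HBAR * t))
    c = simpson(lambda u: math.cos(0.5 * math.pi * u * u), 0.0, w)
    s = simpson(lambda u: math.sin(0.5 * math.pi * u * u), 0.0, w)
    return 2.0 / L * (c * c + s * s)


# Scan t/T and locate the dominant maximum of the on-axis density
taus = [0.05 + 0.002 * k for k in range(300)]
dens = [density_from_cut(tau * T) for tau in taus]
k_max = max(range(len(taus)), key=lambda k: dens[k])
tau_focus, peak = taus[k_max], dens[k_max]
```

Within these assumptions the scan places the maximum close to \(t \approx 0.342\,T\), where T is the characteristic time \(ML^{2}/(2\pi \hbar )\), with a peak density of about 1.8/L, and both routes to \(|\psi (0,t)|^{2}\) agree to within the quadrature error.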

Fig. 9

Diffractive focusing of a rectangular wave packet explained from Wigner phase space. At the center we depict the Wigner function of the initial rectangular wave packet as a surface on top of the \((x, p)\) plane. The blue curve projected onto the surface of a cylinder represents the probability density \(|\psi (0, t)|^2\), and is the result of an integration along the red lines \(x + p t/M = 0\) at different times corresponding to different angles with respect to the \(x\)-axis. Clearly, the focusing occurs for a line sweeping only positive values of the Wigner function at the approximate time \(t \approx 0.342\)

5.2.2 Rays and envelopes

According to Eq. 20 each point in Wigner phase space follows its classical trajectory. Thus, the value of \(W_{0}\) at \((x_{0},p_{0})\) will move from \(x_{0}\) to

$$\begin{aligned} x = x_{0} + \frac{p_{0}}{M} t \end{aligned}$$
(24)

at time t. We now use these rays to show that the minima of the Wigner function generate the regions in space-time where the probability density \(|\psi (x,t)|^{2}\) assumes small values, as exemplified by Fig. 10. Thus, they indicate the x values of the minima of \(|\psi (x,t)|^{2}\) as the system evolves in time.

The explicit expression Eq. 23 for the initial Wigner function \(W_{0}\) tells us that the minima of \(W_{0}\) are given by the condition

$$\begin{aligned} \frac{p_{0}^{(n)}}{\hbar }(L-2|x_{0}|) = \pi \left( 2n + \frac{3}{2}\right) \end{aligned}$$
(25)

for p positive, and

$$\begin{aligned} \frac{p_{0}^{(n)}}{\hbar }(L - 2|x_{0}|) = -\pi \left( 2n + \frac{3}{2}\right) \end{aligned}$$
(26)

for p negative, with \(n=0,1,2, \dots\).

We substitute these expressions for \(p_{0}^{(n)}\) into the motion, Eq. 24, and rearrange the terms, which yields

$$\begin{aligned} (x-x_{0})(L-2x_{0})\mp \left( 2n + \frac{3}{2}\right) \frac{\pi \hbar }{M}t = 0 \end{aligned}$$
(27)

for \(x_{0}\) positive, and

$$\begin{aligned} (x-x_{0})(L+2x_{0})\mp \left( 2n + \frac{3}{2}\right) \frac{\pi \hbar }{M}t = 0 \end{aligned}$$
(28)

for \(x_{0}\) negative. The upper signs in Eqs. 27 and 28 are for positive \(p_{0}^{(n)}\), that is for a positive slope in the \((x, t)\) plane, and the lower signs for negative \(p_{0}^{(n)}\).

The envelope of a set of curves \(F(x,t; x_{0}) = 0\), each determined by a parameter \(x_{0}\), follows [91] from the requirements

$$\begin{aligned} \frac{\partial F}{\partial x_{0}} = 0 \quad \text {and} \quad F = 0 . \end{aligned}$$
(29)

We consider the two cases, \(x_{0}>0\) and \(x_{0}<0\) in turn.

For positive \(x_{0}\), the first condition gives

$$\begin{aligned} x_{0}=\frac{L+2x}{4} , \end{aligned}$$
(30)

which, when substituted into Eq. 27, yields the relation

$$\begin{aligned} -\frac{1}{2}\left( \frac{L}{2} - x\right) ^{2} = \pm \left( 2n + \frac{3}{2}\right) \frac{\pi \hbar }{M}t . \end{aligned}$$
(31)
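The intermediate steps behind Eqs. 30 and 31 can be spelled out as follows, with F denoting the left-hand side of Eq. 27:

$$\begin{aligned} \frac{\partial F}{\partial x_{0}}&= -(L-2x_{0}) - 2(x-x_{0}) = 4x_{0} - L - 2x = 0 , \\ (x-x_{0})(L-2x_{0})\Big |_{x_{0}=(L+2x)/4}&= \frac{2x-L}{4}\cdot \frac{L-2x}{2} = -\frac{1}{2}\left( \frac{L}{2}-x\right) ^{2} . \end{aligned}$$

Since the left-hand side of Eq. 31 is non-positive and \(t > 0\), only the lower sign can be satisfied.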

Thus, only negative values of \(p_{0}^{(n)}\) contribute to this envelope and we obtain the space-time trajectories

$$\begin{aligned} t = \frac{(L/2-x)^{2}}{2n+3/2}\,\frac{M}{2\pi \hbar } \end{aligned}$$
(32)

of the minima.

According to this expression they are parabolas emerging from \(x = L/2\) with a steepness inversely proportional to \((2n + 3/2)\). Hence, the largest steepness corresponds to the case of \(n = 0\) with the parabola crossing the t-axis, that is \(x = 0\), at the time

$$\begin{aligned} t = \frac{1}{6}\frac{ML^{2}}{2\pi \hbar } = \frac{1}{6} T , \end{aligned}$$
(33)

where in the last step we have recalled the definition, Eq. 10, of the characteristic time T.
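The envelope of Eq. 32 and the crossing time of Eq. 33 are easily checked numerically. The following sketch, in assumed units \(\hbar = M = L = 1\) and with hypothetical helper names, evaluates the parabola for \(n = 0\) and confirms that the ray of Eq. 24 which starts at the point \(x_{0}\) of Eq. 30 with negative momentum indeed reaches the envelope.

```python
import math

HBAR, M, L = 1.0, 1.0, 1.0                 # assumed units
T = M * L**2 / (2.0 * math.pi * HBAR)      # characteristic time, cf. Eq. 33


def envelope_time(x, n):
    """Eq. 32: time at which the n-th envelope emerging from x = L/2 reaches x."""
    return (L / 2.0 - x) ** 2 / (2 * n + 1.5) * M / (2.0 * math.pi * HBAR)


def ray_position(x0, n, t):
    """Eq. 24 for a ray starting at x0 > 0 on the n-th minimum with p0 < 0."""
    p0 = -math.pi * HBAR * (2 * n + 1.5) / (L - 2.0 * x0)  # Eq. 26
    return x0 + p0 * t / M


# The n = 0 parabola crosses x = 0 at one sixth of the characteristic time
t_cross = envelope_time(0.0, 0)

# The ray touching the envelope at x starts at x0 = (L + 2x)/4, Eq. 30
x = 0.1
x0 = (L + 2.0 * x) / 4.0
x_ray = ray_position(x0, 0, envelope_time(x, 0))
```

Here `t_cross` reproduces T/6, and `x_ray` returns the chosen point x, which illustrates that each point of the envelope is reached by exactly the ray selected by Eq. 30.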

For negative \(x_{0}\), the first condition in Eq. 29 provides us with the relation

$$\begin{aligned} x_{0}=-\frac{L - 2x}{4} , \end{aligned}$$
(34)

which, when substituted into Eq. 28, gives

$$\begin{aligned} \frac{1}{2}\left( \frac{L}{2} + x\right) ^{2} = \pm \left( 2n + \frac{3}{2}\right) \frac{\pi \hbar }{M}t . \end{aligned}$$
(35)

Since in this case only positive values of \(p_{0}^{(n)}\) contribute to this envelope we arrive at the space-time trajectory

$$\begin{aligned} t = \frac{(L/2 + x)^{2}}{2n+3/2}\,\frac{M}{2\pi \hbar } \end{aligned}$$
(36)

corresponding to parabolas emerging from \(x = - L/2\). They are the mirror image of the ones given by Eq. 32.

In Fig. 10 we present the envelopes, Eqs. 32 and 36, along with the corresponding probability density \(|\psi (x,t)|^{2}\). The envelopes closely follow the minima of the intensity pattern and are harbingers of the maxima and, in particular, of the focus.

Fig. 10

Probability density \(|\psi (x,t)|^{2}\) represented in space-time and overlaid with the envelopes (gray lines) of the diffraction in time created by a rectangular wave packet of initial width \(L = 1\). The time t has been scaled using the characteristic time T. Here the envelopes given by Eqs. 32 and 36 approximate well the motion of the minima of \(|\psi (x,t)|^{2}\)

6 Summary and outlook: beyond slits

The spreading of a wave packet in the absence of a potential is a well-known phenomenon in quantum physics. However, the opposite effect, that is a focusing of the wave packet, can also be achieved. In this case we have to imprint an appropriate position-dependent phase onto the initial wave function. Is it possible to achieve focusing even in the absence of any phase factors, that is for a real-valued initial wave function?

In the present article we have provided such an example in the form of a rectangular wave packet and have illuminated the resulting focusing in position as well as in phase space. Here we have used the time-dependent wave function and the Wigner distribution. Moreover, we have confirmed this phenomenon using laser light diffracted from a slit, building on the analogy between classical Fresnel optics and Schrödinger wave mechanics.

Throughout our article we have concentrated on rectangular wave packets created by a slit, that is by a one-dimensional aperture. However, similar effects appear in the diffraction from a two-dimensional aperture. In “Appendix C” we briefly summarize the literature on this problem, and discuss the similarities and differences between the one- and two-dimensional case using the examples of a slit and a circular aperture.

Figures 11 and 12 bring out most clearly the characteristic features of this dependence on the number of dimensions: (1) for the slit the relative intensity at the focus reaches the value of 1.8, whereas for the circular aperture the maximum is 4.0, (2) the position of the focus in the case of the circular aperture is closer to the opaque screen than for the slit, and (3) the focus originating from a circular aperture is more confined, since its decay towards the far-field region is faster.

This analysis shows that in two dimensions the focusing effect is more pronounced. Moreover, it is possible to optimize this effect by choosing different apertures, either by appropriate apodization, or by creating an optimal wave packet. Unfortunately, this topic goes beyond the scope of the present article and has to be postponed to a future publication.

Fig. 11

Rayleigh–Sommerfeld intensity pattern for a slit of width \(L = 5\lambda\) and illumination wavelength \(\lambda = 0.488\) \(\upmu\)m (a), together with a comparison between the on-axis intensities predicted by the Rayleigh–Sommerfeld and Fresnel integrals (b). In contrast to Figs. 5, 6 and 7 the horizontal axes are normalized

Fig. 12

Rayleigh–Sommerfeld intensity pattern for a circular aperture of diameter \(L = 2 a = 5\lambda\) and illumination wavelength \(\lambda = 0.488\) \(\upmu\)m (a), together with a comparison between the on-axis intensities predicted by the Rayleigh–Sommerfeld and Fresnel integrals (b). Again the axes are normalized