1 Fundamentals of Fluorescence Microscopy

This section presents the basics of fluorescence microscopy. Starting from the intensity distribution within the focal spot of an objective lens, we discuss image formation and derive the classical formulas for the resolution limit. Furthermore, we introduce the principle of confocal detection.

1.1 Vectorial Diffraction Theory and Intensity Distribution Within the Focal Spot

The complex electric vector field \(\mathscr {E}(\mathbf {r})\) in the focal region of an optical system can be expressed in terms of a modified Huygens-Fresnel principle as the coherent superposition of secondary plane waves at the exit pupil [2, 3]

$$\begin{aligned} \mathscr {E}(\mathbf {r}) = -\frac{ik}{2\pi } \iint \limits _{\varOmega } \sqrt{R_1 R_2} \mathbf {E}_p(\mathbf {s}) e^{ik\mathbf {s}\cdot \mathbf {r}} \mathrm {d}\varOmega , \end{aligned}$$

where \(\varOmega \) is the solid angle, \(\mathbf {E}_p\) is the complex amplitude of the secondary plane waves, \(R_1\) and \(R_2\) are the principal radii of curvature of the wavefront at the exit pupil and \(\mathbf {s}\) is the unit vector in the respective direction of propagation, see Fig. 1.1a. The wave number k is given by \(k = 2\pi \, n /\lambda _0\), with n being the refractive index and \(\lambda _0\) the vacuum wavelength. Note that the geometric focus is at position \(\mathbf {r} = (0,0,0)\).

Fig. 1.1

Secondary plane waves as well as strength and polarization of a focused wavefront. a A secondary plane wave can be defined for each point G of a wave front W leaving the pupil of an optical imaging system. The wave front of the secondary plane wave is tangential to W in G. b When focusing through a lens, the projection of the electric vector field onto the meridional plane changes its direction of polarization (vectors \(\mathbf {g}_0\) and \(\mathbf {g}_p\)). The part of the vector field orthogonal to the meridional plane retains its polarization direction (vectors \(\mathbf {g}_0^*\) and \(\mathbf {g}_p^*\)). Due to the transition from a plane to a spherical wavefront, the strength of the electric field within associated surface segments (\(dS_0\) and \(dS_p\)) changes such that the energy passing through the surface elements remains constant

In order to derive \(\mathscr {E}(\mathbf {r})\) for an aplanatic imaging system (i.e. one that is axially stigmatic and obeys the sine condition), such as the objective lens of a microscope, \(\mathbf {E}_p(\mathbf {s})\), \(R_1\) and \(R_2\) have to be determined [4].

The typical scenario when focusing with an infinity-corrected lens is shown in Fig. 1.1b. Without loss of generality, we first assume that the light at the entrance pupil of the objective is linearly polarized in the x-direction

$$\begin{aligned} \mathbf {E}_0(x_0,y_0) = E_0 \mathbf {e}_x e^{iP(x_0,y_0)}, \end{aligned}$$

where \(E_0\) is the (real-valued) amplitude of the electric field, \(\mathbf {e}_x\) is the unit vector in x-direction and P is the pupil function which encodes the phase distribution of the electric field [5]. For arbitrary polarization states, \(\mathscr {E}(\mathbf {r})\) can then be calculated as the coherent superposition of the solutions of \(\mathscr {E}(\mathbf {r})\) for the corresponding linearly polarized vector fields at the entrance pupil. The case of unpolarized light is obtained by averaging over all possible polarization states.

According to Fig. 1.1b, the surface segment at the entrance pupil

$$\begin{aligned} \mathrm {d}S_0 = r_0 \mathrm {d}\phi _0 \mathrm {d}r_0, \end{aligned}$$

will be transformed by the objective lens to a surface segment at the exit pupil

$$\begin{aligned} \mathrm {d}S_p = f^2 \sin \theta _s \mathrm {d}\phi _s \mathrm {d}\theta _s, \end{aligned}$$

where \(r_0\) is the distance of the surface segment from the optical axis and f is the focal length of the lens. As the lens obeys the sine condition

$$\begin{aligned} r_0 = f \sin \theta _s \end{aligned}$$

and the intensity law of geometrical optics [6]

$$\begin{aligned} E_0^2\mathrm {d}S_0 = E_p^2\mathrm {d}S_p \end{aligned}$$

has to be fulfilled, the amplitude of the electric field at the exit pupil is given by

$$\begin{aligned} E_p = \sqrt{\frac{r_0}{f \sin \theta _s} \frac{\mathrm {d}r_0}{f \mathrm {d}\theta _s}} E_0 = \sqrt{\cos \theta _s} E_0. \end{aligned}$$

To determine the polarization of the electric field at the exit pupil, it is advisable to introduce two unit vectors for each light ray passing through the objective lens

$$\begin{aligned} \mathbf {g}_0 = \begin{pmatrix} \cos \phi _0 \\ \sin \phi _0 \\ 0 \end{pmatrix} \text {and } \mathbf {g}_p = \begin{pmatrix} \cos \theta _s \cos \phi _s \\ \cos \theta _s \sin \phi _s \\ \sin \theta _s \end{pmatrix} \end{aligned}$$

in the corresponding meridional plane and two unit vectors

$$\begin{aligned} \mathbf {g}_0^* = \begin{pmatrix} -\sin \phi _0 \\ \cos \phi _0 \\ 0 \end{pmatrix} \text {and } \mathbf {g}_p^* = \begin{pmatrix} -\sin \phi _s \\ \cos \phi _s \\ 0 \end{pmatrix} \end{aligned}$$

which are orthogonal to the meridional plane, Fig. 1.1b. Note that \(\phi _0 = \phi _s\). When the light rays pass through the lens, the portions of their electric fields originally pointing in the \(\mathbf {g}_0\) directions are re-polarized in the \(\mathbf {g}_p\) directions, whereas the portions pointing in the \(\mathbf {g}_0^*\) directions do not change their polarization. Hence, using (1.2), (1.7), (1.8) and (1.9), we can write for the amplitude of the secondary plane waves

$$\begin{aligned} \mathbf {E}_p = \sqrt{\cos \theta _s} \left( (\mathbf {E}_0 \cdot \mathbf {g}_0)\mathbf {g}_p + (\mathbf {E}_0 \cdot \mathbf {g}_0^*)\mathbf {g}_p^* \right) , \end{aligned}$$

which after expansion results in

$$\begin{aligned} \mathbf {E}_p = \sqrt{\cos \theta _s} E_0 \begin{pmatrix} \cos \theta _s + (1-\cos \theta _s)\sin ^2\phi _s \\ (\cos \theta _s-1) \cos \phi _s \sin \phi _s \\ \sin \theta _s \cos \phi _s \end{pmatrix} e^{iP(x_0,y_0)}. \end{aligned}$$

Note that the influence of the lens on the phase of the electric field has already been fully accounted for in the presented geometry and that \(x_0 = f\sin (\theta _s)\cos (\phi _s)\) and \(y_0 = f\sin (\theta _s)\sin (\phi _s)\). The electric field components in all three directions are depicted in Fig. 1.2. Note that the maximum strength of \(\mathbf {E}_{p,z}\) is about half of the maximum strength of \(\mathbf {E}_{p,x}\) and that \(\mathbf {E}_{p,y}\) is again a factor of three lower. While \(\mathbf {E}_{p,x}\) points in the same direction over the entire aperture, both the y- and the z-component change their signs. As we will see later, this has a direct influence on the distribution of the polarization within the focus. In particular, the y- and z-components interfere destructively on the optical axis (\(x=y=0\)).
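The re-polarization step can be checked numerically. The following minimal Python sketch builds \(\mathbf {E}_p\) from the projections onto \(\mathbf {g}_0\) and \(\mathbf {g}_0^*\) and compares it with the expanded expression for x-polarized input. The function names `exit_pupil_field` and `expanded_field` are illustrative choices, and the phase factor \(e^{iP}\) is omitted (i.e. \(P = 0\) is assumed).

```python
import math


def dot(a, b):
    """Scalar product of two 3-vectors given as tuples."""
    return sum(p * q for p, q in zip(a, b))


def exit_pupil_field(theta_s, phi_s, e0=1.0):
    """E_p from the projection onto the meridional and orthogonal unit vectors."""
    g0 = (math.cos(phi_s), math.sin(phi_s), 0.0)
    gp = (math.cos(theta_s) * math.cos(phi_s),
          math.cos(theta_s) * math.sin(phi_s),
          math.sin(theta_s))
    g0_star = (-math.sin(phi_s), math.cos(phi_s), 0.0)
    gp_star = g0_star  # the orthogonal part keeps its polarization
    e_in = (e0, 0.0, 0.0)  # x-polarized field at the entrance pupil
    scale = math.sqrt(math.cos(theta_s))  # apodization of an aplanatic lens
    a, b = dot(e_in, g0), dot(e_in, g0_star)
    return tuple(scale * (a * p + b * q) for p, q in zip(gp, gp_star))


def expanded_field(theta_s, phi_s, e0=1.0):
    """Closed-form components of E_p after expansion."""
    c, s = math.cos(theta_s), math.sin(theta_s)
    scale = math.sqrt(c) * e0
    return (scale * (c + (1.0 - c) * math.sin(phi_s) ** 2),
            scale * (c - 1.0) * math.cos(phi_s) * math.sin(phi_s),
            scale * s * math.cos(phi_s))


if __name__ == "__main__":
    print(exit_pupil_field(math.radians(60), math.radians(30)))
    print(expanded_field(math.radians(60), math.radians(30)))
```

Both routes should agree to rounding error for any pair of angles, and the length of the resulting vector equals \(\sqrt{\cos \theta _s}\,E_0\), because the lens only rotates the meridional part of the field.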

Fig. 1.2

Electric field components pointing in x-, y- and z-direction for an initially x-polarized planar wave front after passing a lens

If we assume that the entrance pupil of the objective is homogeneously illuminated and use the fact that \(R_1 = R_2 = f\) holds for the principal radii in (1.1), we obtain for the electric vector field in the focal region

$$\begin{aligned} \begin{aligned} \mathscr {E}_x(\mathbf {r})&= -\frac{iA}{\pi } \int \limits _{0}^{\alpha } \!\! \int \limits _{0}^{2\pi } \!\! \sqrt{\cos \theta _s} \sin \theta _s \left\{ \cos \theta _s + (1-\cos \theta _s)\sin ^2\phi _s \right\} e^{i(k\mathbf {r}\cdot \mathbf {s} + P(\theta _s,\phi _s))} \mathrm {d}\phi _s \mathrm {d}\theta _s\\ \mathscr {E}_y(\mathbf {r})&= +\frac{iA}{\pi } \int \limits _{0}^{\alpha } \!\! \int \limits _{0}^{2\pi } \!\! \sqrt{\cos \theta _s} \sin \theta _s \left\{ (1-\cos \theta _s) \cos \phi _s \sin \phi _s \right\} e^{i(k\mathbf {r}\cdot \mathbf {s} + P(\theta _s,\phi _s))} \mathrm {d}\phi _s \mathrm {d}\theta _s\\ \mathscr {E}_z(\mathbf {r})&= +\frac{iA}{\pi } \int \limits _{0}^{\alpha } \!\! \int \limits _{0}^{2\pi } \!\! \sqrt{\cos \theta _s} \sin \theta _s \left\{ \sin \theta _s \cos \phi _s \right\} e^{i(k\mathbf {r}\cdot \mathbf {s} + P(\theta _s,\phi _s))} \mathrm {d}\phi _s \mathrm {d}\theta _s \end{aligned} \end{aligned}$$


where

$$\begin{aligned} A = \frac{kfE_0}{2} \end{aligned}$$

is a constant and \(\alpha \) is the semi-aperture angle of the objective lens. Please note that \(\theta _s\) is defined in negative z-direction, therefore the sign of the \(\mathscr {E}_z\) component has to be changed. Usually, the numerical aperture

$$\begin{aligned} \text {NA} = n\cdot \sin (\alpha ) \end{aligned}$$

is specified instead of \(\alpha \) to indicate the aperture angle of an objective lens, in which n is the refractive index of the immersion medium.

For an incident plane wave, \(P(\theta _s,\phi _s)\) is constant (e.g. 0). The integration with respect to \(\phi _s\) can then be carried out analytically, and \(\mathscr {E}(\mathbf {r})\) takes the form [4]

$$\begin{aligned} \begin{aligned} \mathscr {E}_x(\mathbf {r})&= -iA(I_0+I_2\cos 2\phi _r)\\ \mathscr {E}_y(\mathbf {r})&= -iAI_2\sin 2\phi _r\\ \mathscr {E}_z(\mathbf {r})&= -2AI_1\cos \phi _r, \end{aligned} \end{aligned}$$

where the field is expressed in spherical coordinates \(\mathbf {r} = (r, \theta _r, \phi _r)\) and the diffraction integrals are defined as

$$\begin{aligned} \begin{aligned} I_0(\mathbf {r})&= \int \limits _{0}^{\alpha } \sqrt{\cos \theta _s} \sin \theta _s (1+\cos \theta _s) J_0(kr\sin \theta _r\sin \theta _s) e^{ikr\cos \theta _r\cos \theta _s} \mathrm {d}\theta _s \\ I_1(\mathbf {r})&= \int \limits _{0}^{\alpha } \sqrt{\cos \theta _s} \sin ^2\theta _s J_1(kr\sin \theta _r\sin \theta _s) e^{ikr\cos \theta _r\cos \theta _s} \mathrm {d}\theta _s \\ I_2(\mathbf {r})&= \int \limits _{0}^{\alpha } \sqrt{\cos \theta _s} \sin \theta _s (1-\cos \theta _s) J_2(kr\sin \theta _r\sin \theta _s) e^{ikr\cos \theta _r\cos \theta _s} \mathrm {d}\theta _s , \end{aligned} \end{aligned}$$

and \(J_n\) are the Bessel functions of the first kind and order n. The overall intensity in the vicinity of the focal spot is given by

$$\begin{aligned} I(\mathbf {r}) = |\mathscr {E}_x(\mathbf {r})|^2 + |\mathscr {E}_y(\mathbf {r})|^2 + |\mathscr {E}_z(\mathbf {r})|^2. \end{aligned}$$

The contributions of the electric fields of individual polarization directions to the intensity in the focal plane as well as the overall intensity are shown in Fig. 1.3a. It can be clearly seen that the symmetry of the polarization direction distribution on the exit pupil is transferred to the focal plane. As a consequence, the intensity distribution is not rotationally symmetric. For example, the focal spot is narrower in the direction orthogonal to the polarization direction of the incident field. However, the focus can be made symmetrical by the use of circularly polarized light, Fig. 1.3b. In many cases it is not necessary to know the intensity distribution in the focus down to the last detail. It is then useful to approximate it by a Gaussian function with a corresponding full width at half maximum (FWHM). As you can see in Fig. 1.4, this approximation is reasonably good in the central area.
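To make the diffraction integrals more tangible, the following Python sketch evaluates \(I_0\), \(I_1\) and \(I_2\) by simple quadrature and combines them into the intensity for x-polarized illumination. It is a rough illustration rather than an optimized implementation: it uses only the standard library, sets \(A = 1\), computes the Bessel functions from their integral representation, and borrows the parameters quoted for the simulations in this text (NA 1.4, \(n = 1.518\), \(\lambda _0 = 640\) nm). All function names and step counts are assumptions made for this example.

```python
import cmath
import math


def bessel_j(order, x, samples=64):
    """J_n(x) = (1/pi) * int_0^pi cos(n*t - x*sin(t)) dt, trapezoidal rule."""
    h = math.pi / samples
    total = 0.5 * (1.0 + math.cos(order * math.pi))  # end points t = 0, pi
    for j in range(1, samples):
        t = j * h
        total += math.cos(order * t - x * math.sin(t))
    return total * h / math.pi


def diffraction_integrals(rho, z, k, alpha, steps=200):
    """I_0, I_1, I_2 for rho = r*sin(theta_r) and z = r*cos(theta_r)."""
    h = alpha / steps
    i0 = i1 = i2 = 0j
    for j in range(steps):
        theta = (j + 0.5) * h  # midpoint rule in theta_s
        s, c = math.sin(theta), math.cos(theta)
        weight = math.sqrt(c) * s * cmath.exp(1j * k * z * c) * h
        arg = k * rho * s
        i0 += weight * (1.0 + c) * bessel_j(0, arg)
        i1 += weight * s * bessel_j(1, arg)
        i2 += weight * (1.0 - c) * bessel_j(2, arg)
    return i0, i1, i2


def focal_intensity(x, y, z, k, alpha, amplitude=1.0):
    """Sum of the absolute squares of the three field components."""
    rho, phi = math.hypot(x, y), math.atan2(y, x)
    i0, i1, i2 = diffraction_integrals(rho, z, k, alpha)
    e_x = -1j * amplitude * (i0 + i2 * math.cos(2.0 * phi))
    e_y = -1j * amplitude * i2 * math.sin(2.0 * phi)
    e_z = -2.0 * amplitude * i1 * math.cos(phi)
    return abs(e_x) ** 2 + abs(e_y) ** 2 + abs(e_z) ** 2


NA, N_OIL, WAVELENGTH_NM = 1.4, 1.518, 640.0
ALPHA = math.asin(NA / N_OIL)            # semi-aperture angle
K = 2.0 * math.pi * N_OIL / WAVELENGTH_NM  # wave number in 1/nm

if __name__ == "__main__":
    peak = focal_intensity(0.0, 0.0, 0.0, K, ALPHA)
    for label, point in (("x", (150.0, 0.0, 0.0)), ("y", (0.0, 150.0, 0.0))):
        print(label, focal_intensity(*point, K, ALPHA) / peak)
```

Evaluated 150 nm away from the focus, the relative intensity along x (the polarization direction) should come out larger than along y, which reflects the elongated focal spot described above.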

Fig. 1.3

Contributions of the electric fields of individual polarization directions and overall intensity in the focal plane for a linearly and b circularly polarized light in the entrance pupil of the objective lens. Calculations were performed for an NA 1.4 oil immersion objective lens (\(\lambda = 640\) nm, \(n = 1.518\)). Scale bar 250 nm

Fig. 1.4

Simulated intensity distribution and Gaussian approximation. The intensity distributions in the x-y (left) and x-z (right) plane through the geometric focus a and the corresponding intensity profiles along the x- (left) and the z-direction (right) b show the good agreement of the Gaussian approximation in the central area. Calculations were performed for an NA 1.4 oil immersion objective lens (\(\lambda = 640\) nm, \(n = 1.518\)) and circularly polarized light. Scale bars 250 nm

1.2 Incoherent Image Formation

Far-field fluorescence microscopy has proven to be a powerful and versatile tool in the life sciences and beyond [7,8,9,10,11]. Since it allows the interior of sufficiently translucent samples to be imaged non-invasively in three dimensions, it is well suited for imaging biological samples, even under living conditions [12, 13]. Further, tagging of target proteins or epitopes with fluorescent markers, e.g. by immunolabeling with organic fluorophores or by expression of fluorescent fusion proteins, lends an exceptional molecular specificity to the method [14,15,16]. In order to understand the implementation of a fluorescence microscope, it is instructive to first consider the fluorescence process on the molecular level.

Figure 1.5a illustrates the relevant molecular energy levels and transitions within the singlet system in a Jablonski diagram. Here, \(S_0\) denotes the electronic ground state and \(S_1\) the first excited electronic state. Please note that higher excited states as well as the triplet states are neglected here because they are not necessary for a basic understanding. The thick lines indicate the lowest vibrational energy level, whereas the thin lines indicate levels with higher vibrational energy. At ambient temperatures, a molecule typically resides in the lowest vibrational level of \(S_0\) according to the Boltzmann distribution [17]. By absorption of a photon of suitable energy, the molecule can be excited to higher vibrational levels of \(S_1\). From there, it relaxes non-radiatively to the lowest vibrational level of \(S_1\), which typically takes place within one picosecond or less [18]. A fluorescence photon is emitted when the molecule spontaneously returns to one of the vibrational levels of \(S_0\). This transition may also occur non-radiatively via internal conversion. However, for the fluorescent molecules described here, this process is of minor importance. The time which the molecule spends in \(S_1\) is known as the fluorescence lifetime and depends on the molecule itself and on its environment. It is typically several nanoseconds and therefore three to four orders of magnitude longer than the characteristic time for vibrational relaxation [18]. The cycle is completed by vibrational relaxation back to the lowest vibrational level of \(S_0\).

Fig. 1.5

Principle of fluorescence microscopy. a Jablonski diagram of a fluorescent molecule. By absorption of a photon the molecule can be excited from the electronic ground state, \(S_0\), into any vibrational level of \(S_1\). Fast nonradiative relaxation into the lowest level of \(S_1\) takes place within a picosecond. The molecule can return into any vibrational levels of \(S_0\) by the spontaneous emission of a photon (fluorescence). From there it relaxes non-radiatively into its lowest vibrational level. b Absorption (blue) and emission (green) spectrum of a fluorescent molecule. The hatched areas indicate the transmission range of the respective bandpass filters. c In a typical experimental implementation the excitation light is focused into the back aperture of the objective lens in order to generate a homogeneous light distribution within the sample. All fluorophores in the sample are equally excited (right inset). The fluorescence signal is collected by the objective lens, separated from the excitation light by a dichroic mirror and a detection bandpass (BP), and imaged onto an area detector such as a CCD camera. The inset on the left depicts the image on the camera (green). Note that the positions of the fluorophores are indicated only for illustration purposes

For the design of a fluorescence microscope, two consequences of the described excitation and emission process are of immediate relevance. First, due to the extended spectrum of vibrational levels, photons within a range of energies may excite the molecule. Likewise, fluorescence photons have a spectrum of energies. Second, due to the dissipation of energy by the vibrational relaxation after excitation, the emitted photon’s energy is always lower than that of the absorbed photon. Figure 1.5b illustrates these aspects in terms of wavelength instead of energy, and shows the absorption and the emission spectrum of a typical fluorescent molecule. Since the photon wavelength scales inversely with the photon energy, the emission spectrum is at longer wavelengths than the absorption spectrum. This red-shift is called Stokes shift [17] and can be harnessed in the implementation of a fluorescence microscope.
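The inverse relation between photon energy and wavelength can be made concrete with a few lines of Python. The sketch below is only an illustration: the two wavelengths (640 nm for excitation, 680 nm for emission) are example values, and the helper name `photon_energy_ev` is an assumption of this example.

```python
HC_EV_NM = 1239.841984  # Planck constant times speed of light in eV nm


def photon_energy_ev(wavelength_nm):
    """Photon energy E = h*c / lambda in electron volts."""
    return HC_EV_NM / wavelength_nm


if __name__ == "__main__":
    e_abs = photon_energy_ev(640.0)  # absorbed photon (example value)
    e_em = photon_energy_ev(680.0)   # emitted photon (example value)
    print(f"absorbed: {e_abs:.3f} eV, emitted: {e_em:.3f} eV")
    print(f"energy dissipated per cycle: {e_abs - e_em:.3f} eV")
```

For these example wavelengths, roughly 0.11 eV per excitation cycle is dissipated, which corresponds to the red-shift between the two spectra.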

Figure 1.5c shows a simple epi-illumination design of such a microscope. A broadband light source, e.g. a metal-halide lamp or a light-emitting diode, is spectrally filtered by a bandpass filter (BP), such that the selected wavelength range lies within the absorption spectrum of the respective fluorescent molecule. This excitation light is reflected by a dichroic mirror through the objective lens into the sample. In a typical implementation it is focused into the back aperture of the objective lens in order to illuminate the sample over an extended area (wide-field illumination). All fluorescent molecules inside the sample are equally excited and their fluorescence is collected by the same objective lens. Since the fluorescence is red-shifted with respect to the excitation light, it is transmitted by the dichroic mirror and thus efficiently separated from the excitation light. After passing through a bandpass filter, which blocks residual excitation light and suppresses unwanted room light outside of the desired spectral detection window, the fluorescence is imaged by a tube lens onto a camera. The transmission ranges of the excitation and detection BP are illustrated by hatched regions in Fig. 1.5b.

The fluorescence light emitted in the sample plane is imaged to the detection plane by lenses and is therefore subject to diffraction. The image of a fluorescent molecule, which can be seen as a point emitter of electromagnetic radiation, is therefore not a point, but is spread out into an extended intensity distribution (cf. Sect. 1.1.1). The projection of this pattern into the sample is called the point spread function (PSF) and it is an important characteristic of a microscope, as it determines its resolution capability (cf. Sect. 1.1.3). Since fluorescence emission is a spontaneous process, light emitted by different sources has no constant phase relation. The light is incoherent and thus the image of several point sources (e.g. a fluorophore distribution in the sample) is the sum of the individual, overlapping PSFs. In mathematical terms, the image \(I_{\text {image}}\), which is back-projected into the sample plane, is the convolution of the true object O and the PSF h:

$$\begin{aligned} I_{\text {image}}\left( \mathbf {r} \right) = O\left( \mathbf {r} \right) * h \left( \mathbf {r} \right) \end{aligned}$$
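As a minimal illustration of this convolution, the following Python sketch images two point emitters on a one-dimensional pixel grid. Several simplifications are assumed: the PSF is approximated by a normalized Gaussian with an FWHM of 250 nm, the pixel size is 10 nm, and the problem is reduced to one dimension. The function names are hypothetical.

```python
import math

STEP_NM = 10.0  # pixel size of the one-dimensional grid (assumed)


def gaussian_psf(fwhm_nm, half_width_px):
    """Sampled Gaussian PSF, normalized so that its samples sum to one."""
    sigma = fwhm_nm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    kernel = [math.exp(-0.5 * (i * STEP_NM / sigma) ** 2)
              for i in range(-half_width_px, half_width_px + 1)]
    total = sum(kernel)
    return [value / total for value in kernel]


def convolve(obj, psf):
    """Discrete convolution: every emitter contributes one shifted PSF copy."""
    half = len(psf) // 2
    image = [0.0] * len(obj)
    for i, brightness in enumerate(obj):
        if brightness == 0.0:
            continue
        for j, h in enumerate(psf):
            idx = i + j - half
            if 0 <= idx < len(image):
                image[idx] += brightness * h
    return image


def image_of_two_emitters(separation_px, size=101, fwhm_nm=250.0):
    """Image of two equally bright emitters placed around the grid centre."""
    obj = [0.0] * size
    centre = size // 2
    obj[centre - separation_px // 2] = 1.0
    obj[centre + separation_px // 2] = 1.0
    return convolve(obj, gaussian_psf(fwhm_nm, 30))


if __name__ == "__main__":
    for separation in (20, 40):  # 200 nm and 400 nm
        image = image_of_two_emitters(separation)
        print(separation * STEP_NM, "nm:", round(image[50] / max(image), 3))
```

With these assumed numbers, two emitters 400 nm apart produce a clear dip between two maxima, whereas at 200 nm the two PSFs merge into a single maximum halfway between the emitters.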

1.3 Classical Resolution Limit

The resolution of an optical system, e.g. a microscope, describes its ability to distinguish two objects. Accordingly, the spatial resolution of a microscope is given by the minimum distance between two structures at which their images can still be discerned.

The Abbe Limit

By investigating line gratings, Ernst Abbe discovered in 1873 that the lateral resolution of a light microscope solely depends on the wavelength of the light and the numerical aperture of the objective lens used [19]. In order to resolve two adjacent lines they have to be separated by at least:

$$\begin{aligned} d_{\text {min}} = \frac{\lambda _0}{2\,\text {NA}}, \end{aligned}$$

with \(\lambda _0\) being the vacuum wavelength of the light used. This fundamental limit is often referred to as the diffraction limit. However, Abbe’s considerations do not allow conclusions on light-emitting objects or the axial resolution of the microscope.

The Rayleigh Criterion

As already described in Sect. 1.1.1, the image of a point is not a point but a blurred spot. If we assume an incident plane wave and neglect the change of the polarization direction of the electric field caused by the focusing process, (1.12) simplifies to

$$\begin{aligned} \mathscr {E}(\mathbf {r}) = \mathscr {E}_x(\mathbf {r}) = -\frac{iA}{\pi } \int \limits _{0}^{\alpha } \!\! \int \limits _{0}^{2\pi } \!\! \sqrt{\cos \theta _s} \sin \theta _s e^{i(k\mathbf {r}\cdot \mathbf {s})} \mathrm {d}\phi _s \mathrm {d}\theta _s. \end{aligned}$$

In this case the integration with respect to \(\phi _s\) can be readily performed and we derive (compare (1.16))

$$\begin{aligned} \mathscr {E}(\mathbf {r}) = -2iA \int \limits _{0}^{\alpha } \sqrt{\cos \theta _s} \sin \theta _s J_0(kr\sin \theta _r\sin \theta _s) e^{ikr\cos \theta _r\cos \theta _s} \mathrm {d}\theta _s. \end{aligned}$$

Note that \(r\sin (\theta _r)\) is the distance of \(\mathbf {r}\) from the optical axis and \(r\cos (\theta _r)\) corresponds to the z-coordinate of \(\mathbf {r}\). If we concentrate on the focal plane (\(z=0\)) or the optical axis (\(x=y=0\)) and assume that the aperture angle is relatively small (paraxial approximation), the integration with respect to \(\theta _s\) can also be performed and we obtain, up to constant amplitude and phase factors,

$$\begin{aligned} \begin{aligned} \mathscr {E}(x,y,0)&= -iA \frac{J_1(k \sqrt{x^2+y^2} \sin {\alpha })}{k \sqrt{x^2+y^2} \sin {\alpha }} \\ \mathscr {E}(0,0,z)&= -iA \frac{\sin \left( \frac{k}{4} z \sin ^2{\alpha }\right) }{\frac{k}{4} z \sin ^2{\alpha }}. \end{aligned} \end{aligned}$$

Fig. 1.6

Images of two point emitters with different distances and corresponding intensity profiles along the dashed white lines. a For large separations both emitters can be easily identified. b According to the Rayleigh criterion, the minimum distance to resolve both emitters is reached, when the maximum of one emitter coincides with the minimum of the other. c Emitters that are closer than this distance cannot be resolved in the image

The absolute square of \(\mathscr {E}(x,y,0)\) results in the so-called Airy pattern in the focal plane. Figure 1.6 shows the images of two point-like emitters for three different distances and the corresponding intensity profiles along the white dashed lines. Dashed blue and yellow curves indicate the profiles for the individual emitters, whereas the red lines show the profile when both emitters are radiating at the same time. When the distance of both emitters is sufficiently large, they can easily be identified as individual emitters. According to the Rayleigh criterion, two spatially separated point sources can be discerned, when the maximum of the diffraction pattern of one point emitter coincides with the first minimum of the other [6, 20]. This case is illustrated in Fig. 1.6b. Note that the Rayleigh criterion only holds true for incoherently radiating sources. When the distance between the emitters gets smaller they cannot be distinguished any more, Fig. 1.6c.

The distance between the main maximum and the first minimum of the Airy pattern is often used as a measure for the lateral resolution and is given by:

$$\begin{aligned} d_{\text {x,y}} = 0.61 \, \frac{\lambda _{0}}{\text {NA}}. \end{aligned}$$

Likewise, the resolution in axial direction (z) can be defined as the distance between the main maximum and the first minimum in z-direction as:

$$\begin{aligned} d_{\text {z}} = 2.00 \, \frac{n \, \lambda _{0}}{\left( \text {NA} \right) ^{2}}. \end{aligned}$$
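The two prefactors follow directly from the first zeros of the paraxial field expressions, which the following Python sketch reproduces. It locates the first zero of \(J_1\) by bisection (the Bessel function is evaluated from its integral representation, using only the standard library) and uses the first zero of \(\sin (u)/u\) at \(u = \pi \). The helper names and the bracketing interval are choices made for this example.

```python
import math


def bessel_j1(x, samples=64):
    """J_1(x) = (1/pi) * int_0^pi cos(t - x*sin(t)) dt, trapezoidal rule."""
    h = math.pi / samples
    total = 0.5 * (1.0 + math.cos(math.pi))  # end points t = 0 and t = pi
    for j in range(1, samples):
        t = j * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi


def first_zero_of_j1(lo=3.0, hi=4.5, iterations=60):
    """Bisection; J_1 changes sign exactly once between 3.0 and 4.5."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if bessel_j1(lo) * bessel_j1(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Lateral: k * rho * sin(alpha) = v0 with k = 2*pi*n/lambda_0 and NA = n*sin(alpha)
LATERAL_COEFF = first_zero_of_j1() / (2.0 * math.pi)
# Axial: (k/4) * z * sin(alpha)^2 = pi  ->  z = 2 * n * lambda_0 / NA^2
AXIAL_COEFF = 4.0 * math.pi / (2.0 * math.pi)


def rayleigh_resolution(wavelength_nm, na, n):
    """Lateral and axial Rayleigh distances in nm."""
    return (LATERAL_COEFF * wavelength_nm / na,
            AXIAL_COEFF * n * wavelength_nm / na ** 2)


if __name__ == "__main__":
    print(round(LATERAL_COEFF, 4), AXIAL_COEFF)
    print(rayleigh_resolution(640.0, 1.4, 1.518))
```

For the NA 1.4 oil-immersion lens at 640 nm used in the simulations of this text, this gives roughly 279 nm laterally and 991 nm axially. Note that the paraxial formulas are only an approximation at such a high aperture.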

The Full Width at Half Maximum Criterion

As you can see from Fig. 1.6b, two points with a distance corresponding to the Rayleigh criterion can still be resolved if the signal-to-noise ratio is sufficiently high. This is no longer the case when their distance corresponds to the FWHM of the Airy pattern, Fig. 1.6c. If the resolution is defined in this way, we get

$$\begin{aligned} d_{\text {x,y}} = 0.51 \, \frac{\lambda _{0}}{\text {NA}} \text { and} \end{aligned}$$
$$\begin{aligned} d_{\text {z}} = 1.77\, \frac{\, n \, \lambda _{0}}{\left( \text {NA} \right) ^{2}}. \end{aligned}$$

Thus, according to this criterion, the resolution of a microscope is limited to approximately half the wavelength in the lateral direction and twice the wavelength in the axial direction. The FWHM definition of the resolution is particularly well suited if the PSF is approximated by a Gaussian function which, as is well known, has no minima. In this case the resolution can be expressed either by its standard deviation, \(\sigma \), or by its full width at half maximum (\(\text {FWHM} = 2\sqrt{2\ln 2}\, \sigma \)).
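A short Python sketch shows how the FWHM values translate into a Gaussian PSF model. The function names are illustrative, and the example numbers (NA 1.4, \(n = 1.518\), \(\lambda _0 = 640\) nm) are the simulation parameters quoted elsewhere in this text.

```python
import math


def fwhm_resolution(wavelength_nm, na, n):
    """Lateral and axial FWHM resolution in nm from the two formulas above."""
    return 0.51 * wavelength_nm / na, 1.77 * n * wavelength_nm / na ** 2


def fwhm_to_sigma(fwhm):
    """Standard deviation of a Gaussian with the given FWHM."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_psf(r, fwhm):
    """Gaussian PSF model with unit peak value."""
    sigma = fwhm_to_sigma(fwhm)
    return math.exp(-r * r / (2.0 * sigma * sigma))


if __name__ == "__main__":
    d_xy, d_z = fwhm_resolution(640.0, 1.4, 1.518)
    print(f"lateral FWHM {d_xy:.1f} nm, sigma {fwhm_to_sigma(d_xy):.1f} nm")
    print(f"axial FWHM {d_z:.1f} nm, sigma {fwhm_to_sigma(d_z):.1f} nm")
    print(gaussian_psf(d_xy / 2.0, d_xy))  # half maximum by construction
```

By construction, the Gaussian drops to half of its peak value at a distance of FWHM/2 from the centre.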

1.4 Confocal Microscopy

Wide-field fluorescence microscopy offers the possibility to image the entire field of view of the objective lens at once. However, wide-field illumination also has a decisive disadvantage as it not only excites dye molecules in the focal plane, but simultaneously in the entire sample volume. The light emitted by axially distant molecules is detected in addition to the signal from the fluorophores in the focal plane and generates a bright background in the image. This makes it difficult to acquire high quality data, especially in axially extended samples.

In confocal microscopy [21], the axial extent of the sample region from which the signal impinges on the detector can be narrowed down. For excitation, a point-like light source is imaged into the sample plane. Since conventional fluorescent lamps are spatially extended, their light has to be focused onto a pinhole and afterwards collimated with a lens. The invention of lasers as bright and intense point-like light sources rendered the use of an excitation pinhole obsolete and quickly led to the first confocal laser scanning microscopes [22, 23].

The main components of a typical confocal fluorescence microscope are illustrated in Fig. 1.7. A collimated excitation beam is focused into the sample by an objective lens and generates a diffraction-limited excitation PSF, \(h_{\text {ex}}\left( \mathbf {r}\right) \). The inset on the right side in Fig. 1.7a depicts the focal plane and the excitation spot. Fluorescent markers within \(h_{\text {ex}}\left( \mathbf {r}\right) \) will be excited with a probability proportional to the light intensity and can therefore emit fluorescence. In order to acquire an image, either the sample must be scanned through the focus, or vice versa.

Fig. 1.7

Principle of a confocal microscope. a In a typical experimental implementation a collimated excitation laser is focused into the sample by the objective lens, creating a diffraction-limited excitation spot. Fluorophores within this spot will be excited with a probability proportional to the excitation light intensity. The fluorescence signal is collected by the objective lens, separated from the excitation light by a dichroic mirror and a detection BP, and imaged onto a detector. A confocal pinhole in front of the detector ensures that only fluorescence from a certain region is detected. The inset on the right depicts the excitation spot and the accordingly excited fluorophores in the sample plane, while the inset on the left depicts the image of the excited molecules in the pinhole plane. The gray circle indicates the detection pinhole. Only fluorescence within the circle is detected. b Additionally, fluorescence from planes distant to the focal plane is blocked by the detection pinhole. This allows the signal from thin optical sections to be analyzed. Note that the positions of the fluorophores are indicated only for illustration purposes

Each excited fluorophore can emit fluorescence, which is collected by the objective lens, separated from the excitation light by a dichroic mirror and a detection BP and imaged onto a pinhole. The pinhole ensures that only fluorescence from the direct vicinity of the geometrical focal point is detected. As the light path is invertible this can be interpreted as imaging the pinhole into the focal plane. This image is called the detection PSF, \(h_{\text {det}}\left( \mathbf {r}\right) \), and describes the probability to detect a photon emitted at position \(\mathbf {r}\). The gray circle in the left inset in Fig. 1.7a indicates the pinhole. Only fluorescence originating from inside this circle is detected with high probability. Note that each fluorescent molecule is imaged diffraction-limited.

Another advantage of the detection pinhole is illustrated in Fig. 1.7b. The fluorescence from axially distant planes (with respect to the focal plane) is blocked by the detection pinhole. Therefore, the key feature of the confocal microscope, unlike conventional microscopes, is that it efficiently (and sharply) images only those regions of a volume sample that lie within a thin section around the focal plane of the microscope. In other words, it is able to reject (effectively attenuate) light from out-of-focus regions of the sample [24,25,26,27,28].

The PSF of the confocal microscope is given by the probability that a fluorophore is excited multiplied with the probability that its fluorescence is detected:

$$\begin{aligned} h_{\text {conf}}\left( \mathbf {r} \right) = h_{\text {ex}}\left( \mathbf {r} \right) \cdot h_{\text {det}}\left( \mathbf {r} \right) . \end{aligned}$$

In the theoretical limit of an infinitesimally small detection pinhole and identical wavelengths for illumination \(\lambda _{\text {ex}}\) and detection \(\lambda _{\text {det}}\), a confocal microscope improves the resolution by a factor of \(\sqrt{2}\) [24].
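The factor of \(\sqrt{2}\) can be illustrated with the Gaussian approximation of the PSF. The Python sketch below multiplies two identical Gaussian profiles and measures the FWHM of the product numerically. The Gaussian shape, the FWHM of 250 nm and the function names are assumptions of this example. An ideal point-like pinhole is implied, so that \(h_{\text {det}}\) is simply the diffraction-limited detection spot.

```python
import math


def gaussian(r, fwhm):
    """Gaussian profile with unit peak value and the given FWHM."""
    return math.exp(-4.0 * math.log(2.0) * r * r / fwhm ** 2)


def h_conf(r, fwhm_ex, fwhm_det):
    """Confocal PSF as the product of excitation and detection PSF."""
    return gaussian(r, fwhm_ex) * gaussian(r, fwhm_det)


def numeric_fwhm(profile, r_max=2000.0, iterations=60):
    """FWHM of a profile that decreases monotonically from its peak at r = 0."""
    half = 0.5 * profile(0.0)
    lo, hi = 0.0, r_max
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if profile(mid) > half:
            lo = mid
        else:
            hi = mid
    return 2.0 * 0.5 * (lo + hi)


if __name__ == "__main__":
    fwhm = 250.0  # nm, same value for excitation and detection (assumed)
    confocal = numeric_fwhm(lambda r: h_conf(r, fwhm, fwhm))
    print(f"confocal FWHM {confocal:.1f} nm, improvement {fwhm / confocal:.4f}")
```

For different excitation and detection widths the same routine can be reused. In the Gaussian model the inverse squared widths add, so the narrower of the two PSFs dominates the result.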

The influence of the size of the detection pinhole on the lateral (black line) and axial (blue line) resolution, as well as on the detected signal (red line) is shown in Fig. 1.8. The graphs are obtained by calculating and analyzing images of a point emitter imaged with an oil-immersion objective lens (NA = 1.4, n = 1.518, \(\lambda _{\text {ex}} = 640\) nm, \(\lambda _{\text {det}} = 680\) nm) using (1.17) and varying pinhole diameters. The pinhole diameter is measured in Airy units (AU), with one AU corresponding to the diameter of the Airy disc in the focal plane (\(1 \text {AU} = 1.22 \, \lambda / \text {NA}\)). It is clearly visible that the best resolution in all directions is achieved with an infinitesimally small pinhole. With increasing pinhole size, the resolution of the confocal microscope deteriorates. The detected signal, however, grows with increasing pinhole diameter [29]. For experimental purposes, a finite pinhole size is necessary to collect sufficient signal. Often a pinhole size in the range of 1 AU is chosen as a tradeoff between collected signal and resolution. Even though the resolution increase in the lateral direction is almost negligible in this regime, the advantage of optical sectioning remains.

Fig. 1.8

Influence of the pinhole diameter on the lateral (black) and the axial (blue) resolution as well as the detected signal (red). Calculations are performed for an NA 1.4 oil-immersion objective lens (\(\lambda _{\text {ex}} = 640\) nm, \(\lambda _{\text {det}} = 680\) nm, \(n = 1.518\)). Increasing the pinhole size increases the detected signal, but also lowers the achievable resolution. Often pinholes with a size of 1 AU are utilized in a confocal microscope, as at this size sufficient signal is collected while the optical sectioning capability is mainly maintained

For a circular detection pinhole, the pinhole function is given by

$$\begin{aligned} p\left( \mathbf {r} \right) = p\left( x,y,z=0 \right) = {\left\{ \begin{array}{ll} 1 &{} \text {for } \sqrt{x^2+y^2} \le p_0 \\ 0 &{} \text {otherwise} \end{array}\right. }, \end{aligned}$$

with \(p_0\) being the pinhole radius. The real detection PSF, \(h_{\text {det, real}} \left( \mathbf {r} \right) \), is then given by the convolution of \(h_{\text {det}} \left( \mathbf {r} \right) \) with the pinhole function:

$$\begin{aligned} h_{\text {det, real}} \left( \mathbf {r} \right) = h_{\text {det}} \left( \mathbf {r} \right) * p\left( \mathbf {r} \right) . \end{aligned}$$
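The effect of the pinhole convolution can be sketched numerically in the focal plane. The following Python example sums shifted copies of a Gaussian model of \(h_{\text {det}}\) over all grid points inside the circular pinhole. Several assumptions go into it: the Gaussian shape with an FWHM of \(0.51\,\lambda _{\text {det}}/\text {NA}\), the 10 nm integration grid, the restriction to \(z = 0\), and a pinhole radius \(p_0\) given in sample-plane units (i.e. already divided by the magnification). The function names are hypothetical.

```python
import math


def gaussian_2d(x, y, fwhm):
    """Gaussian model of the ideal detection PSF in the focal plane."""
    return math.exp(-4.0 * math.log(2.0) * (x * x + y * y) / fwhm ** 2)


def h_det_real(x, y, p0, fwhm, step=10.0):
    """(h_det * p)(x, y): sum of h_det(x-u, y-v) over the pinhole disc."""
    n = int(p0 // step)
    total = 0.0
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            u, v = i * step, j * step
            if math.hypot(u, v) <= p0:  # p(u, v) = 1 inside the pinhole
                total += gaussian_2d(x - u, y - v, fwhm) * step * step
    return total


NA, LAMBDA_DET_NM = 1.4, 680.0
FWHM_DET = 0.51 * LAMBDA_DET_NM / NA   # nm, Gaussian model of h_det
AIRY_UNIT = 1.22 * LAMBDA_DET_NM / NA  # nm, pinhole diameter of 1 AU


def relative_width(p0, r=150.0):
    """Value of the real detection PSF at distance r relative to its peak."""
    return (h_det_real(r, 0.0, p0, FWHM_DET)
            / h_det_real(0.0, 0.0, p0, FWHM_DET))


if __name__ == "__main__":
    for diameter_au in (0.25, 0.5, 1.0):
        p0 = 0.5 * diameter_au * AIRY_UNIT  # radius from diameter
        signal = h_det_real(0.0, 0.0, p0, FWHM_DET)
        print(diameter_au, round(relative_width(p0), 3), round(signal))
```

In this simple model a larger pinhole collects more signal from the focal point, but it also flattens and broadens the detection PSF, which is the qualitative tradeoff described above.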

2 Fundamentals of STED Microscopy

For a long time, the resolution of a microscope was considered to be limited by diffraction. But during the last decades, physico-optical methods that circumvent the diffraction barrier have emerged in far-field fluorescence microscopy [30]. These new super-resolution microscopy methods, in short 'nanoscopy', were recognized with the Nobel Prize in Chemistry in 2014 and allow a resolution improvement of at least one order of magnitude. The first method of this kind was stimulated emission depletion (STED) microscopy, proposed in 1994 by Hell and Wichmann [31] and demonstrated by Klar and Hell in 1999 [32].

Ever since their advent, super-resolution microscopy techniques have been versatile tools for non-invasive investigations of structures. STED microscopes offer, for example, the possibility to measure intracellular structures in fixed [33, 34] and living cells [35, 36] with, in principle, unlimited resolution [31]. A lateral resolution of 15 nm was demonstrated by imaging single fluorescent molecules [37], and resolutions of 5.8 nm [38] and 2.4 nm [39] have been demonstrated on single nitrogen vacancy centres in diamond. Furthermore, STED microscopes have been used to measure, e.g., colloidal structures [40] and block copolymers [41, 42], and the underlying principle has been used for STED lithography [43, 44]. As STED microscopy is a cutting-edge technology, new, improved acquisition schemes are continuously being developed and integrated (e.g. RESCue-STED [45] or DyMIN [46]).

2.1 Basic Idea

A fundamental breakthrough in the achievable resolution of light microscopes was realized when fluorescent markers were not only considered as contrast agents, but the molecular transitions of the markers were additionally used to specifically switch on and off the ability of a subset of markers to fluoresce. In this way, the fluorescence from markers within a diffraction-limited spot can be temporally separated and thus be read out sequentially.

As detailed above, confocal microscopy employs a targeted readout scheme. The diffraction-limited focus is scanned through the sample and the detected fluorescence is computationally assigned to the known position, thereby generating an image pixel by pixel. In this mode, increasing the resolution is synonymous with decreasing the spatial extent of the region from which the fluorescence is detected. STED microscopy realizes this by employing the process of stimulated emission to actively switch off fluorescent markers by forcing them to the electronic ground state \(S_0\) without emission of a fluorescence photon (Fig. 1.9a). This can be achieved by overlapping the excitation spot with a spatially extended intensity distribution, \(I\left( \mathbf {r}, t\right) \), featuring at least one zero-intensity region, since off-switching requires \(I\left( \mathbf {r}, t\right) > 0\) and is absent for \(I\left( \mathbf {r}, t\right) =0\). If the STED focus has a ring shape (doughnut shape) with a central intensity zero, molecules at its rim are switched off, while molecules in the center are not. This results in a spatial narrowing of the fluorescent spot, whose extent then defines the resolution of the microscope. The resolution, which can in theory become arbitrarily fine, depends not only on the applied STED intensity, but also on the photophysical properties of the fluorophores. A detailed discussion of the photophysics of dye molecules is presented in Sect. 1.2.2.

Fig. 1.9
figure 9

Principle of a STED microscope. a Jablonski diagram of a fluorescent molecule. In addition to the processes of excitation and spontaneous emission, stimulated emission is now used to switch off excited molecules in a targeted way. b Absorption and emission spectrum of a fluorescent molecule. The depletion laser is shifted to the far right of the emission spectrum of the fluorophore. c In comparison to the confocal microscope, an additional depletion laser is now superimposed with the excitation beam. A helical phase mask imprints a phase retardation from 0 to \(2\pi \) onto the STED beam, which, when imaged into the sample plane, creates a doughnut-shaped depletion pattern. The right inset shows the overlap of excitation and STED beam in the focal plane. Wherever the STED intensity is sufficiently high, excited fluorophores are driven into their off-state. Therefore, fluorescence is only emitted from sample regions where the STED intensity is negligible. This fluorescence is separated from the laser light and imaged onto a point detector. Most STED microscopes utilize pulsed lasers for excitation and depletion. The central inset illustrates that a time delay between the excitation and STED pulses is needed for an effective suppression of the fluorescence

The key components of a STED microscope are illustrated in Fig. 1.9c. The setup is based on a confocal microscope (cf. Fig. 1.7). Additionally, a STED laser, whose wavelength is at the red end of the fluorescence spectrum (cf. Fig. 1.9b), e.g. \(\lambda _{\text {em}}^{\text {max}} = 654\) nm and \(\lambda _{\text {STED}} = 775\) nm for Abberior STAR 635P, is phase-modulated and superimposed with the excitation laser. Further details on how to shape the STED beam are given in Sect. 1.2.3. The emitted fluorescence is spectrally separated from the laser beams and detected by a point detector (e.g. a single photon counting module).

The right inset in Fig. 1.9c depicts the overlap of the excitation and STED beams in the sample plane. Only fluorescent molecules in the central region of the depletion pattern are allowed to remain in the excited state and can therefore emit fluorescence and contribute to the detected signal. The inset on the left side depicts the image plane with the gray circle indicating the detection pinhole. Usually, pulsed lasers are used as light sources for excitation and depletion in STED microscopy. The central inset indicates that a temporal delay between the excitation and depletion pulses is needed for an efficient fluorescence suppression (cf. Sect. 1.2.2). Additionally, the helical phase mask is depicted, which creates the doughnut-shaped depletion pattern by imprinting a phase retardation from 0 to \(2\pi \) onto the STED beam.

2.2 Basic Photophysics of Dye Molecules

As described in the previous section, the key principle of STED microscopy is the inhibition of fluorescence emission by stimulated emission. The efficiency of this fluorescence depletion is a crucial parameter for the performance of a STED microscope and it depends on the interplay of the excitation light and the STED light with the fluorescent molecules. In the following, this will be discussed in detail with special attention to the timing between excitation and STED light and to the required STED power. Following [47], rate equations for the population of electronic states of fluorescent molecules will be formulated and their implications will be discussed.

Fig. 1.10
figure 10

Four level system of a fluorophore and laser pulse timing used in numeric calculations. a A fluorescent molecule can be described as a four level system with \(S_0\) and \(S_\text {0,vib}\) being the lowest and a higher vibrational level of the electronic ground state, respectively. \(S_1\) and \(S_\text {1,vib}\) are the corresponding levels in the first excited state. \(N_1\) to \(N_4\) denote the respective population probabilities. The straight arrows indicate the excitation (blue) with its rate constant \(k_\text {ex}\), fluorescence emission (green, \(k_\text {fl}\)) and stimulated emission (red, \(k_\text {STED}\)), while the wiggly arrows denote vibrational relaxation (\(k_\text {vib2}\) and \(k_\text {vib4}\)). b Gaussian-shaped excitation pulse (blue) and STED pulse (red) which exhibit their maximum intensities at time points \(t_\text {0,ex}\) and \(t_\text {0,STED}\). The delay between the pulses is \(\varDelta t\)

In the context of STED microscopy, a fluorophore can be modelled as a simple four level system, in which photo-bleaching, intermediate dark states and radiation-less decay from \(S_1\) to \(S_0\) are neglected (cf. Fig. 1.10a). Note that in comparison to Fig. 1.9a, the spectrum of higher vibrational levels is merged to one level each and transition rates k and population probabilities N have been introduced. Specifically, \(N_1\) and \(N_3\) correspond to the population of the lowest vibrational level of \(S_0\) and \(S_1\), respectively. \(N_4\) and \(N_2\) represent the population of the higher vibrational states \(S_\text {1,vib}\) and \(S_\text {0,vib}\) after excitation and fluorescence emission, respectively. Since \(N_i, i=1,2,3,4\) are probabilities, \(\sum _{i}N_i=1\).

The temporal evolution of the population probabilities \(N_1\) to \(N_4\) can be described by a set of coupled rate equations:

$$\begin{aligned} \begin{aligned} \frac{\partial N_1(t)}{\partial t}&= k_\text {ex} \left[ N_4(t) - N_1(t)\right] + k_\text {vib2}N_2(t)\\ \frac{\partial N_2(t)}{\partial t}&= k_\text {fl} N_3(t) - k_\text {STED} \left[ N_2(t) - N_3(t)\right] - k_\text {vib2}N_2(t)\\ \frac{\partial N_3(t)}{\partial t}&= -k_\text {fl} N_3(t) + k_\text {STED} \left[ N_2(t) - N_3(t)\right] + k_\text {vib4}N_4(t)\\ \frac{\partial N_4(t)}{\partial t}&= -k_\text {ex} \left[ N_4(t) - N_1(t)\right] - k_\text {vib4}N_4(t) \end{aligned} \end{aligned}$$

Here, \(k_\text {fl}\), \(k_\text {vib2}\) and \(k_\text {vib4}\) are the rate constants for fluorescence decay from \(S_1\) and vibrational decay from \(S_\text {0,vib}\) and \(S_\text {1,vib}\), respectively. The rates for these spontaneous processes are given by the inverse of the lifetimes of the starting states, with \(k_{\text {fl}}^{-1}=\tau _{\text {fl}}\) in the range of several nanoseconds and \(k_{\text {vib}}^{-1}=\tau _{\text {vib}}\) on the order of one picosecond or less [18]. Note that excitation from \(S_0\) to \(S_\text {1,vib}\) by the STED light has been neglected.
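As a quick consistency check, which follows directly from (1.30) and requires no additional assumption, summing the four rate equations shows that every gain term of one level appears as a loss term of another level. The total population is therefore conserved, in line with \(\sum _{i}N_i=1\):

$$\begin{aligned} \frac{\partial }{\partial t}\sum _{i=1}^{4}N_i(t) = \left( k_\text {ex} - k_\text {ex}\right) \left[ N_4(t) - N_1(t)\right] + \left( k_\text {STED} - k_\text {STED}\right) \left[ N_2(t) - N_3(t)\right] + \left( k_\text {fl} - k_\text {fl}\right) N_3(t) + \left( k_\text {vib2} - k_\text {vib2}\right) N_2(t) + \left( k_\text {vib4} - k_\text {vib4}\right) N_4(t) = 0 . \end{aligned}$$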

The rate constants for excitation \(k_\text {ex}\) and stimulated emission \(k_\text {STED}\), however, depend on the intensity of the excitation and the STED light. They are given by the product of the molecular cross-section \(\sigma \) for the respective transition and the light intensity I divided by the photon energy \(hc/\lambda _0\):

$$\begin{aligned} k = \frac{\sigma I}{hc/\lambda _0} \end{aligned}$$

with the Planck constant h, the speed of light c and the vacuum wavelength \(\lambda _0\). Please note that in order to make the notation easier to read, the indices \(_\text {ex}\) and \(_\text {STED}\) are omitted here and in the following and are introduced again later.

When considering the third line in (1.30), it becomes obvious that for an efficient depletion of fluorescence, the depopulation of \(S_{1}\) by stimulated emission must not only dominate over the spontaneous fluorescence emission, but also over the refilling of \(S_1\) from \(S_\text {1,vib}\) caused by vibrational relaxation after excitation. This suggests that a pulsed scheme, which has already been implied in Fig. 1.9c, is beneficial [48]. An excitation pulse is followed by a STED pulse. This separates excitation and stimulated emission temporally, such that \(S_1\) is not refilled during fluorescence depletion. Further, pulsed lasers typically provide a high peak intensity, while the average laser power and thus the light dose in the sample is kept rather low.

For modelling the pulsed scheme, the intensity-dependent rate constants \(k_\text {ex}\) and \(k_\text {STED}\) in the rate equations (1.30) need to be formulated time-dependently. For this, the laser pulses are assumed to have a Gaussian shape in time (cf. Fig. 1.10b) and the time-dependent intensity I(t) is

$$\begin{aligned} I(t)=J\frac{hc}{\lambda _0}\sqrt{\frac{4\ln 2}{\pi \tau ^2}}e^{\frac{-4\ln 2(t-t_0)^2}{\tau ^2}} \end{aligned}$$

with the photon fluence per pulse J (measured in number of photons per area per pulse), the temporal FWHM \(\tau \) and pulse center position \(t_0\).
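The prefactor in (1.32) can be verified by integrating over time. Using the standard Gaussian integral \(\int _{-\infty }^{\infty } e^{-b u^2}\,\mathrm {d}u = \sqrt{\pi /b}\) with \(b = 4\ln 2/\tau ^2\), one finds

$$\begin{aligned} \int _{-\infty }^{\infty } I(t)\,\mathrm {d}t = J\frac{hc}{\lambda _0}\sqrt{\frac{4\ln 2}{\pi \tau ^2}}\sqrt{\frac{\pi \tau ^2}{4\ln 2}} = J\frac{hc}{\lambda _0} , \end{aligned}$$

i.e. the energy fluence per pulse equals the photon fluence J times the photon energy, independent of the pulse length \(\tau \).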

Usually, in the experiment, the fluence per pulse in the focal plane cannot be measured directly. Instead, the laser power P is readily accessible, which is why the fluence J will now be expressed in terms of power P. The total number of photons per laser pulse n is given by

$$\begin{aligned} n = \frac{P}{r_{\text {rep}}hc / \lambda _0} \end{aligned}$$

with the repetition rate of the laser pulses \(r_{\text {rep}}\) and the photon energy in the denominator. The distribution of photon fluences in the focal plane \(J(x,y)\) is then given by

$$\begin{aligned} J(x,y) = n h(x,y) \end{aligned}$$

with the focal probability distribution of a single photon h. Please note that in contrast to the previous notation, here the PSF h is not interpreted as an intensity distribution, but as the probability for a photon to be found at a certain position. Therefore, h is normalized such that \(\iint _{-\infty }^{\infty } h (x,y) dx dy = 1\).

Combining (1.31), (1.32), (1.33) and (1.34) gives the time and position dependent rate constant

$$\begin{aligned} k(x,y,t)=\sigma \frac{P}{r_{\text {rep}}hc / \lambda _0}\sqrt{\frac{4\ln 2}{\pi \tau ^2}}e^{ \frac{-4\ln 2(t-t_0)^2}{\tau ^2}}h(x,y) \end{aligned}$$

with a molecule-dependent, a laser-light-dependent and a microscope-dependent part. For practical reasons, we simplify this expression further by approximating the PSF with a Gaussian function with FWHM \(d_{x,y} \simeq \frac{\lambda _0}{2\text {NA}}\) (see Sect. 1.1.3 and (1.25))

$$\begin{aligned} h(x,y) \simeq \frac{4\ln 2}{\pi d_{x,y}^2}e^{-\frac{4\ln {2}\left( x^2+y^2\right) }{d_{x,y}^2}} \end{aligned}$$

and evaluate it at the geometric focus position

$$\begin{aligned} k(0,0,t)_i \simeq 3.32 \, \sigma _i\frac{P_i}{r_{\text {rep}}hc \lambda _{i} \tau _i}e^{\frac{-4\ln 2(t-t_{0,i})^2}{\tau _i^2}}\text {NA}^2 \end{aligned}$$

where \(i \in \{\text {ex, STED}\}\). Note that here the indices \(_\text {ex}\) and \(_\text {STED}\) are introduced again. This equation depends on experimental parameters, which are easy to obtain, either by direct measurements or by consulting data sheets. Substituting this expression into the rate equations (1.30), we obtain the means to analyze the time-dependent state population of a fluorescent molecule in the pulsed STED scheme. A quantity of particular interest is the overall emitted fluorescence

$$\begin{aligned} F = \int _{0}^{\infty }k_\text {fl}N_3(t)dt \end{aligned}$$

and its dependence on experimental parameters, since the STED microscope’s performance is directly influenced by the efficiency of fluorescence depletion.
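The numerical prefactor in (1.37) can be reproduced with a few lines of code. The sketch below multiplies the peak of the normalized Gaussian PSF (1.36) for \(d_{x,y} = \lambda /(2\text {NA})\) by the peak of the normalized temporal pulse shape and then evaluates the peak rate constants. All quantities are in SI units, so that the cross-sections quoted in cm\(^2\) in the text have to be multiplied by \(10^{-4}\); the example values are the simulation parameters given in the text, and the function names are illustrative only.

```python
import math

H = 6.62607015e-34  # Planck constant in J s
C = 2.99792458e8    # speed of light in m/s

def prefactor():
    """Numerical constant of (1.37), about 3.32."""
    ln2 = math.log(2.0)
    spatial = 16.0 * ln2 / math.pi             # h(0,0) * lambda^2 / NA^2 for d = lambda / (2 NA)
    temporal = math.sqrt(4.0 * ln2 / math.pi)  # temporal peak value times tau
    return spatial * temporal

def peak_rate(sigma, power, rep_rate, wavelength, tau, na):
    """Peak value of k(0,0,t) in 1/s; all arguments in SI units."""
    return prefactor() * sigma * power * na ** 2 / (rep_rate * H * C * wavelength * tau)

# Simulation parameters from the text, converted to SI units.
k_ex_peak = peak_rate(4.6e-20, 10e-6, 20e6, 640e-9, 50e-12, 1.4)
k_sted_peak = peak_rate(4.6e-21, 0.25e-3, 20e6, 775e-9, 800e-12, 1.4)
```

For these parameters the peak STED rate comes out at roughly \(3\cdot 10^{9}\) s\(^{-1}\), about one order of magnitude above \(k_\text {fl} = 1/(3.3\,\text {ns})\).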

Influence of Laser Parameters on Fluorescence Depletion

For successful STED imaging in a pulsed scheme, it is particularly important to consider the influences of the relative timing between the laser pulses and the STED laser power on the efficiency of fluorescence depletion, since these two parameters need to be routinely set by the microscopist. Therefore, the overall emitted fluorescence (1.38) is simulated by numerically solving the rate equations (1.30). The rate constants \(k_\text {ex}\) and \(k_\text {STED}\) are assumed to be time-dependent and are analyzed at position (x,y) = (0,0) according to (1.37). From an experimental point of view, this corresponds to measuring the fluorescence from a very small bead which is located in the very center of the superimposed focal spots of the excitation and the (not spatially shaped) STED light.

For the simulations, fluorophore parameters are set to mimic a typical STED fluorophore: \(\tau _{\text {fl}}=~3.3\) ns, \(\tau _{\text {vib2}} = \tau _{\text {vib4}} = 1\) ps, \(\sigma _{\text {ex}} = 4.6\cdot 10^{-16}\) cm\(^2\), \(\sigma _{\text {STED}}= 4.6\cdot 10^{-17}\) cm\(^2\). Note that effects due to the polarization and the orientation of the molecular transition dipole are neglected. The NA of the objective lens is assumed to be 1.4. The laser wavelengths are set to \(\lambda _{\text {ex}}~=~640\) nm and \(\lambda _{\text {STED}}~=~775\) nm, which are typical for STED imaging of red fluorophores, and the laser repetition rate is assumed to be \(r_{\text {rep}} = 20\) MHz.

The question of suitable laser pulse lengths deserves a short comment: While a very short excitation pulse in the range of a picosecond is beneficial, because fluorescence decay during excitation can be neglected in this case, there is a clear constraint on the shortest feasible pulse length of the STED laser. The rate for stimulated emission from \(S_1\) to \(S_\text {0,vib}\) is equal to the rate for re-excitation from \(S_\text {0,vib}\) back to \(S_1\). Therefore, at best, an equal population of both states can be achieved, unless there is sufficient time for vibrational relaxation from \(S_\text {0,vib}\) to \(S_0\). Only due to this drain of \(S_\text {0,vib}\) can the state \(S_{1}\) be efficiently depleted. The STED pulse length should therefore be much longer than the vibrational lifetime [48]. On the other hand, it should be shorter than the fluorescence lifetime, since STED photons arriving after the molecule has already fluoresced do not have any effect and are therefore wasted. Considering these aspects as well as specifications of commercially available laser systems, excitation and STED pulse lengths are set to \(\tau _{\text {ex}} = 50\) ps and \(\tau _{\text {STED}}= 800\) ps.

Fig. 1.11
figure 11

Influence of pulse delay \(\varDelta t\) a and STED power \(P_\text {STED}\) b on the relative fluorescence. a The relative fluorescence shows a pronounced minimum at \(\varDelta t = 440\) ps. Calculations were performed for \(P_\text {STED} = 0.25\) mW. b For optimized pulse delay \(\varDelta t = 440\) ps, the relative fluorescence drops to half at \(P_\text {STED} = 0.09\) mW, which is indicated by the red line. All other parameters are as mentioned in the main text

The results of the simulations are shown in Fig. 1.11. It illustrates the relative fluorescence \(\eta \), i.e. the fluorescence remaining when STED light is applied relative to the case without any STED light. In Fig. 1.11a, the STED power is \(P_\text {STED} = 0.25\) mW; \(P_\text {ex} = 10\,\mu \)W is chosen such that no saturation effects occur during excitation.

In Fig. 1.11a the relative timing \(\varDelta t = t_\text {0,STED} - t_\text {0,ex}\) of the excitation and STED pulse is varied. This so-called pulse delay spans a range from −1.5 ns to 12.5 ns, where a positive value corresponds to the situation where the STED pulse peak reaches the sample after the peak of the excitation pulse. If \(\varDelta t\) is too short, the STED efficiency is low, either because STED photons reach the sample even before the molecules have been excited or because the molecules have not yet vibrationally relaxed to the lowest level of \(S_1\). This effect accounts for the steep slope on the left-hand side, whose gradient is determined by \(\tau _\text {STED}\). If, however, the pulse delay is too long and the STED pulse reaches the sample too late, some of the molecules will have already fluoresced. The gradient of the right slope therefore depends on \(\tau _\text {fl}\). At the optimal time delay the relative fluorescence is minimal, which is the case for \(\varDelta t = 440\) ps in this example.
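The simulation described above can be sketched in a few lines. The text does not state which numerical solver was used; the sketch below assumes an implicit Euler scheme, chosen only because the rate equations (1.30) are stiff (picosecond vibrational relaxation next to nanosecond fluorescence decay). The time step, the length of the time window, the position of the excitation pulse within that window and all function names are assumptions made for illustration. Parameters are the ones given in the text, converted to SI units.

```python
import math

H, C = 6.62607015e-34, 2.99792458e8      # Planck constant (J s), speed of light (m/s)
K_FL, K_VIB = 1.0 / 3.3e-9, 1.0 / 1e-12  # fluorescence and vibrational rates (1/s)
TAU_EX, TAU_STED = 50e-12, 800e-12       # pulse FWHM (s)

def peak_rate(sigma, power, wavelength, tau, na=1.4, rep_rate=20e6):
    """Peak of k(0,0,t) according to (1.37), SI units."""
    return 3.32 * sigma * power * na ** 2 / (rep_rate * H * C * wavelength * tau)

def gauss(t, t0, tau):
    """Gaussian pulse envelope with FWHM tau and peak value 1."""
    return math.exp(-4.0 * math.log(2.0) * (t - t0) ** 2 / tau ** 2)

def solve(m, b):
    """Solve m x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [row[:] + [b[i]] for i, row in enumerate(m)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (a[r][n] - sum(a[r][c] * x[c] for c in range(r + 1, n))) / a[r][r]
    return x

def emitted_fluorescence(p_sted, delay, p_ex=10e-6, dt=2e-12, t_end=20e-9):
    """Integrate (1.30) with implicit Euler; return F from (1.38) and the final populations."""
    k_ex0 = peak_rate(4.6e-20, p_ex, 640e-9, TAU_EX)
    k_st0 = peak_rate(4.6e-21, p_sted, 775e-9, TAU_STED)
    t0_ex = 2e-9                  # excitation peak, placed well inside the time window
    pop = [1.0, 0.0, 0.0, 0.0]    # N1..N4, the molecule starts in S0
    fluo = 0.0
    for i in range(1, int(round(t_end / dt)) + 1):
        t = i * dt
        kx = k_ex0 * gauss(t, t0_ex, TAU_EX)
        ks = k_st0 * gauss(t, t0_ex + delay, TAU_STED)
        # Implicit Euler step: (1 - dt * A(t)) N(t) = N(t - dt), with A taken from (1.30)
        m = [[1 + dt * kx, -dt * K_VIB, 0.0, -dt * kx],
             [0.0, 1 + dt * (ks + K_VIB), -dt * (K_FL + ks), 0.0],
             [0.0, -dt * ks, 1 + dt * (K_FL + ks), -dt * K_VIB],
             [-dt * kx, 0.0, 0.0, 1 + dt * (kx + K_VIB)]]
        pop = solve(m, pop)
        fluo += K_FL * pop[2] * dt
    return fluo, pop

f_ref, _ = emitted_fluorescence(0.0, 440e-12)            # no STED light
f_sted, pop_end = emitted_fluorescence(0.25e-3, 440e-12)
eta = f_sted / f_ref                                      # relative fluorescence
```

Calling `emitted_fluorescence` for a list of delays or STED powers yields curves of the kind shown in Fig. 1.11; since the scheme is only first-order accurate and the time window is finite, the numbers should be read as qualitative rather than as a reproduction of the figure.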

Figure 1.11b shows the relative fluorescence at optimal time delay as a function of \(P_{\text {STED}}\). The STED power at which the fluorescence drops to half is called saturation power \(P_\text {sat}\). The shape of the curve has strong similarity to an exponential decay and, indeed, if re-excitation of the dye by the STED light is neglected (simple two-level system), \(\eta \) is given by [49]

$$\begin{aligned} \eta = e^{-\sigma _{\text {STED}}J_{\text {STED}}}. \end{aligned}$$

Substituting \(J_{\text {STED}}\) using (1.34) and (1.33) gives the focal shape of the fluorescence suppression induced by the applied STED light

$$\begin{aligned} \eta (x,y) = e^{-\sigma _{\text {STED}}\frac{P_{\text {STED}}}{r_{\text {rep}}hc / \lambda _{\text {STED}}}h_{\text {STED}}(x,y)} \end{aligned}$$

again with a dye dependent, a laser light dependent and a microscope dependent part.

Evaluating the relative fluorescence \(\eta \) at the center of the PSF and setting it to 1/2

$$\begin{aligned} \eta (0,0) = e^{-\sigma _{\text {STED}}\frac{P_{\text {STED}}}{r_{\text {rep}}hc / \lambda _{\text {STED}}} h_\text {cal}(0,0)} \overset{!}{=} 1/2 \end{aligned}$$

results in an analytical expression for the saturation power \(P_{\text {sat}}\)

$$\begin{aligned} P_{\text {sat}}= \frac{\ln 2 /\sigma _{\text {STED}}}{h_\text {cal}(0,0)} r_{\text {rep}}hc / \lambda _{\text {STED}}. \end{aligned}$$

Note that a calibration PSF \(h_\text {cal}\) is introduced here in order to make the expression also applicable to more elaborate shapes of \(h_\text {STED}\), e.g. those exhibiting a central intensity zero. In practice, the thus defined \(P_\text {sat}\) is the power of the STED light which is needed to suppress the fluorescence at the center of a Gaussian-shaped STED PSF by half. It depends on the optical properties of the microscope, photophysical properties of the dye and parameters of the STED laser and allows the relative fluorescence (cf. (1.40)) to be written in a particularly simple form

$$\begin{aligned} \eta (x,y) = e^{-\ln {2} \zeta \frac{h_{\text {STED}}(x,y)}{h_{\text {cal}}(0,0)}} , \end{aligned}$$

with the saturation factor \(\zeta \) defined as \(P_{\text {STED}}/ P_{\text {sat}}\).
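For orientation, (1.42) and (1.43) can be evaluated numerically. The sketch below assumes that the calibration PSF is the normalized Gaussian (1.36) with FWHM \(\lambda _{\text {STED}}/(2\text {NA})\) and uses the simulation parameters from above in SI units; the function names are illustrative only.

```python
import math

H, C = 6.62607015e-34, 2.99792458e8  # Planck constant (J s), speed of light (m/s)

def h_cal_peak(wavelength, na):
    """Peak of the normalized Gaussian PSF (1.36) with FWHM lambda / (2 NA), in 1/m^2."""
    d = wavelength / (2.0 * na)
    return 4.0 * math.log(2.0) / (math.pi * d ** 2)

def saturation_power(sigma, rep_rate, wavelength, na):
    """P_sat from (1.42) in W (two-level model without re-excitation)."""
    photon_energy = H * C / wavelength
    return math.log(2.0) / sigma / h_cal_peak(wavelength, na) * rep_rate * photon_energy

def relative_fluorescence(zeta, h_ratio):
    """eta from (1.43); h_ratio = h_STED(x,y) / h_cal(0,0)."""
    return math.exp(-math.log(2.0) * zeta * h_ratio)

p_sat = saturation_power(4.6e-21, 20e6, 775e-9, 1.4)  # in W
zeta = 0.25e-3 / p_sat                                # saturation factor for P_STED = 0.25 mW
```

Under these assumptions \(P_\text {sat}\) is roughly 0.07 mW, which is of the same order as the 0.09 mW obtained from the four-level simulation in Fig. 1.11b. The two values need not coincide, since the analytical expression neglects the finite STED pulse length and the fluorescence emitted during the pulse.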

2.3 Shaping the STED Beam

As already mentioned in Sect. 1.2.1, STED nanoscopy is based on the idea of limiting the ability of molecules to fluoresce in the immediate vicinity of the geometric focus. Since the fluorescence is inhibited by the process of stimulated emission, it is necessary to shape the intensity distribution of the STED light such that it is zero in the geometric focus. In order to utilize the maximum power of the available STED light, the phase distribution of the electric field at the entrance pupil of the objective lens \(P(x_0,y_0)\) has to be designed such that the contributions of all secondary plane waves \(\mathbf {E}_p\) interfere destructively at the focus.

Central Retardation

Among the simplest ways to generate a central zero intensity is the generation of a phase delay of \(\pi \) in a central circular region of the aperture [50]

$$\begin{aligned} P(x_0,y_0) = {\left\{ \begin{array}{ll} \pi &{} \sqrt{x_0^2 + y_0^2} \le r_0 \\ 0 &{} \, \text {elsewhere}. \end{array}\right. } \end{aligned}$$

If the effect of focusing on the polarization direction of \(\mathbf {E}_p\) is neglected, \(r_0\) is exactly given by the radius of the entrance pupil divided by \(\sqrt{2}\), i.e. the retarded central disc and the surrounding annulus have equal areas. For the usually utilized high NA lenses, however, it is slightly smaller (compare (1.11) and Fig. 1.2). As shown in Fig. 1.12a, the polarization contributions of the secondary plane waves in the direction of the original polarization cancel each other out. The orthogonal polarization directions also interfere destructively, since the phase mask does not change the rotational symmetry of the corresponding field components on the exit aperture (compare Figs. 1.2 and 1.3). The intensity distribution generated by illuminating the phase mask with circularly polarized light is shown in Fig. 1.12b. Since the phase of secondary plane waves originating from the central region of the aperture changes significantly faster as a function of z than that of secondary plane waves from the boundary region, spots of high constructive interference occur above and below the focal plane. Therefore, this phase mask is usually used to increase the resolution in the axial direction.

Fig. 1.12
figure 12

Central phase retardation. a A phase delay of the central region of the aperture by \(\pi \) (see inset in the top right corner) causes the x-components of the secondary plane waves of the central and outer regions to cancel each other out. The illustration shows \(\mathbf {E}_p\) for two opposing points in the inner and outer region when illuminated with x-polarized light. b Strength of the lateral and axial electric field components and overall intensity in the vicinity of the focal spot for illumination with circularly polarized light. Calculations were performed for an NA 1.4 oil immersion objective lens (\(\lambda = 775\) nm, \(n = 1.518\)). Scale bars 250 nm

Helical Retardation

Another way to create a depletion pattern is to apply a helical phase retardation to the STED beam [51]

$$\begin{aligned} P(x_0,y_0) = \phi \end{aligned}$$

where \(\phi \) is the angle between the vector \((x_0,y_0)\) and the x-axis. The operating principle of this phase mask is based on the same effect which ensures that, when focusing a plane x-polarized wavefront, the y- and z-components of the electric field vanish on the optical axis (compare Figs. 1.2 and 1.3). Since two mirror-symmetrical points with respect to the optical axis always exhibit a phase difference of \(\pi \), their x- and y-components of the electric field cancel each other out at the geometric focus (Fig. 1.13a). As a side effect, the z-components of \(\mathbf {E}_p\) for these points face in the same direction, which means that they interfere constructively at the focal spot. This can be avoided by using circularly polarized light. For the originally x-polarized part of the illuminating field, the effect still exists, but now the z-components of the originally y-polarized part for two points which are rotated by \(\phi = 90^\circ \) with respect to the originally considered points face in the opposite direction. This causes the z-components of the electric field of the two point pairs to cancel each other out (Fig. 1.13a). Note that this effect is only achieved if the handedness of the circular polarization matches the rotation direction of the helical phase mask. If this is not the case, the z-components add up instead of cancelling and the field distribution has a maximum z-component in the geometrical focus. The intensity distribution for the correct handedness of the illuminating light field is depicted in Fig. 1.13b and forms a hollow cylinder around the optical axis. It has been shown that helical phase retardation generates the optimal inhibition pattern for isotropic resolution enhancement in the focal plane [51].

Fig. 1.13
figure 13

Helical phase retardation. a A helical phase retardation (see inset in the top right corner) causes the lateral field components of the secondary plane waves of opposing points to cancel each other out. The illustration shows \(\mathbf {E}_p\) for two opposing points when illuminated with x-polarized (red) and y-polarized (blue) light. The two point pairs are rotated by \(90^\circ \) with respect to each other. The phase delay between the x-polarized and y-polarized light was set to \(\pi / 2\). b Strength of the lateral and axial electric field components and overall intensity in the vicinity of the focal spot for illumination with circularly polarized light. Calculations were performed for an NA 1.4 oil immersion objective lens (\(\lambda = 775\) nm, \(n = 1.518\)). Scale bars 250 nm

2.4 Resolution

In this section the effective PSF of a STED microscope is derived. It describes the volume in which fluorescence is still allowed, and whose spatial extent is a measure for the resolution. As an example, a 2D STED microscope utilizing a helical phase mask is considered.

We assume that the excitation and STED light is applied as temporally separated pulses with a pulse duration much shorter than the fluorescence lifetime. Photo-bleaching, intermediate dark-states or re-excitation of the dye by the STED light are neglected (simple two-level model) and dye molecules are assumed to rotate fast enough to average the orientation of their molecular transition dipole relative to the polarization of the excitation and STED light. Under these conditions, the effective PSF of the STED microscope \(h_{\text {eff}}\) is the product of the excitation PSF and the remaining fluorescence in the presence of the STED light [49]

$$\begin{aligned} h_{\text {eff}}(x,y) = h_\text {ex}(x,y)\eta (x,y) . \end{aligned}$$

According to Sect. 1.1.3, the excitation PSF can well be approximated by a symmetrical 2D Gaussian peak in the focal plane with a FWHM of \(d_\text {x,y} \simeq \frac{\lambda _0}{2\text {NA}}\)

$$\begin{aligned} h_\text {ex}(x,y) \propto e^{-\frac{4\ln {2}\left( x^2+y^2\right) }{d_\text {x,y}^2}} \end{aligned}$$

where a normalization constant has been neglected.

Fig. 1.14
figure 14

Pattern steepness and resolution in the case of helical a and central b phase retardation. Top: Focal intensity distribution in the x-y-plane and the x-z-plane, respectively, through the geometric focus. Center: The intensity profiles (black) along the dotted white lines can be well fitted with a parabola (red) in the vicinity of the minimum in both cases. Bottom: With the fitted pattern steepnesses and the indicated FWHM of the excitation PSF in the respective direction, the lateral and axial resolution can be calculated according to (1.50). Calculations were performed for a 1.4 NA oil-immersion objective lens (\(n = 1.518\)), \(\lambda _\text {ex} = 640\) nm and \(\lambda _\text {STED} = 775\) nm

For sufficiently large saturation factors, the FWHM of the effective central spot of the STED microscope is much smaller than the wavelengths used and only the shape of the STED intensity distribution in the vicinity of the focal spot determines the shape of the central spot. In this region, the focal distribution \(h_{\text {STED}}\), which is generated via helical phase retardation (cf. Sect. 1.2.3), can be well approximated by a 2D parabola [52]

$$\begin{aligned} \frac{h_{\text {STED}}(x,y)}{h_{\text {cal}}(0,0)}\simeq 4a (x^2+y^2) . \end{aligned}$$

Here, \(h_\text {cal}(0,0)\) is the calibration factor already known from Sect. 1.2.2 and a is the so-called pattern steepness, which is proportional to the curvature of \(h_\text {STED}\) in the geometrical focus. Figure 1.14a shows the 2D STED intensity distribution (top) (cf. Fig. 1.13b) and the good agreement of the parabolic fit (center). Please note that the definition of the pattern steepness differs from a prior definition. Here, it is normalized to \(h_\text {cal}\left( 0,0\right) \), while Harke et al. normalized a to the maximal intensity of \(h_\text {STED}\left( x,y\right) \) in the focal plane [52].

Combining (1.46), (1.47), (1.48) with (1.43) from Sect. 1.2.2 for \(\eta (x,y)\) gives a relatively simple expression for the effective STED PSF \(h_\text {eff}\), which represents a 2D Gaussian peak shape

$$\begin{aligned} h_{\text {eff}}(x,y) = e^{-4\ln {2}\left( x^2+y^2\right) \left( d_\text {x,y}^{-2}+a \zeta \right) } . \end{aligned}$$

Its FWHM along the lateral direction is

$$\begin{aligned} d_{\text {STED}} = \frac{d_\text {x,y}}{\sqrt{1+d_\text {x,y}^2a\zeta }} . \end{aligned}$$

For sufficiently large saturation factors, the attainable resolution of the STED microscope is only governed by the product of the pattern steepness and the saturation factor

$$\begin{aligned} d_{\text {STED}} \simeq \frac{1}{\sqrt{a\zeta }}. \end{aligned}$$

The dependence of the lateral STED resolution on the saturation factor \(\zeta \) according to (1.50) is shown in Fig. 1.14a (bottom). In this example, which was calculated for an NA 1.4 oil immersion objective lens (\(n = 1.518\), \(\lambda _\text {ex} = 640\) nm, \(\lambda _\text {STED} = 775\) nm), a resolution of 50 nm, which corresponds to a resolution improvement of factor 5, is achieved for a saturation factor \(\zeta \simeq 28\).
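The resolution formula (1.50) is easily evaluated numerically. Since the text does not quote the fitted pattern steepness a, the sketch below infers it by inverting (1.50) for the quoted example (50 nm at \(\zeta \simeq 28\)); this inferred value is based on rounded numbers and is therefore only illustrative, as is the extrapolation to a larger saturation factor. The function names are hypothetical.

```python
import math

def sted_fwhm(d_conf, steepness, zeta):
    """FWHM of the effective STED PSF according to (1.50)."""
    return d_conf / math.sqrt(1.0 + d_conf ** 2 * steepness * zeta)

def steepness_from_resolution(d_conf, d_sted, zeta):
    """Invert (1.50) for the pattern steepness a."""
    return ((d_conf / d_sted) ** 2 - 1.0) / (d_conf ** 2 * zeta)

d_conf = 640.0 / (2.0 * 1.4)                       # excitation FWHM in nm, about 229 nm
a = steepness_from_resolution(d_conf, 50.0, 28.0)  # in 1/nm^2, inferred, not quoted in the text
d_100 = sted_fwhm(d_conf, a, 100.0)                # extrapolation to zeta = 100
d_limit = 1.0 / math.sqrt(a * 100.0)               # large-zeta approximation (1.51)
```

For \(\zeta = 100\) both expressions give about 27 nm and differ by less than one percent, which illustrates that the excitation FWHM becomes irrelevant once \(d_\text {x,y}^2 a \zeta \gg 1\).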

The resolution formula is not limited to the 2D STED pattern considered here, but is applicable whenever a parabolic fit can be reasonably applied in the vicinity of the zero intensity spot. This specifically also applies to the STED pattern which is usually used for axial resolution increase (cf. Sect. 1.2.3 and Fig. 1.12b). Figure 1.14b shows the good agreement of the fit to the corresponding focal intensity distribution along the axial direction and presents the attainable axial resolution. Again, an oil immersion objective lens with NA = 1.4 was assumed (\(\lambda _\text {ex} = 640\) nm, \(\lambda _\text {STED} = 775\) nm). Because of the larger FWHM and the smaller pattern steepness, a saturation factor of \(\zeta = 28\) yields a resolution of only 103 nm in this case. Still, this corresponds to a resolution increase by a factor of 6.

For imaging three-dimensional structures, a resolution increase in all three dimensions is often desired. This can be achieved by an incoherent superposition of both STED patterns. It was shown that a distribution of the total available power of \(30\%\) in the 2D and \(70\%\) in the axial pattern is favorable in terms of focal volume size and axial resolution [40].

3 Imaging Examples

STED microscopy has become an indispensable tool in the life sciences, as it non-invasively uncovers details that are hidden to conventional light microscopes. By now, nanoscopy has been successfully applied to various fields such as immunology, signaling, virology, bacteriology and cancer biology [53]. In particular, the ability to label different types of proteins simultaneously and to record their relative spatial distribution at super-resolution offers important insight into protein co-localization and interaction. To demonstrate the current performance of STED microscopy, selected examples of cell imaging are presented in the following.

Fig. 1.15
figure 15

Data are courtesy of Abberior Instruments, Germany

Confocal and RESCue STED images of nuclear pore complex subunits (NUP98, red) and Golgi apparatus (GM130, blue) in Vero cells. Samples were prepared by indirect immunolabeling using Abberior STAR RED and Abberior STAR ORANGE. Acquisition was performed using an Abberior Instruments Facility Line STED microscope. Shown is a maximum projection of a raw image stack. The inset shows that RESCue STED microscopy can resolve the ring-like organization of the nuclear pore complex proteins (as highlighted by the dotted white circles). Please note that the diameter of individual NUP98 rings is \(\sim \)70 nm. For better visualization, the STED image in the inset has been smoothed.

Example 1: Golgi Apparatus and Nuclear Pore Complex

Figure 1.15 shows the complexly structured Golgi apparatus (blue) in a Vero cell. This cell organelle is known to be a collection and dispatch station for protein products from the endoplasmic reticulum. It synthesizes and modifies elements of the plasma membrane and generates primary lysosomes. Next to the Golgi apparatus, the easily recognizable oval-shaped cell nucleus is visible in red. More precisely, the image depicts the nuclear pore complex, a part of the nuclear envelope surrounding the cell nucleus that allows transport across the envelope.

For confocal and STED imaging, the proteins GM130 (Golgi apparatus) and NUP98 (nuclear pore complex) were immunolabeled with primary antibodies targeting the respective proteins and with dye-labeled secondary antibodies (Abberior STAR ORANGE, Abberior STAR RED) binding to the primary antibodies. The structures are evidently resolved in much more detail in the RESCue STED image. In particular, the ring-like arrangement of the nuclear pore complex proteins can be discerned (cf. dotted circles in the inset of Fig. 1.15).

Fig. 1.16
figure 16

Data are courtesy of Abberior Instruments, Germany

Confocal and STED images of spectrin periodicity in primary rat hippocampal neurons. Please note the characteristic \(\sim \)190 nm beta II spectrin periodicity along distal axons (red, blue), which is only visible in the STED image. Labeled structures: beta II spectrin (red, Abberior STAR 635P), actin (blue, Abberior STAR 580). Acquisition was performed using an Abberior Instruments STEDYCON microscope. Shown are raw data.

Example 2: Nanoscopy of Neurons

Neurons are highly specialized cells that form the basic building blocks of the nervous system and transmit information throughout the body. STED nanoscopy revealed that short actin filaments in neuronal axons, dendrites and spine necks are bridged by spectrin tetramers to form a \(\sim \)190 nm periodic structure [53].

An exemplary measurement of the actin (blue) and beta II spectrin (red) distribution in the axons of a primary rat hippocampal neuron is shown in Fig. 1.16. While the confocal image provides only little information on the co-localization, the characteristic periodicity of both the beta II spectrin and the actin is easily seen in the STED image. The inset emphasizes previously concealed details that are clearly visible in the STED image.
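
Why the periodicity is hidden in the confocal image can be illustrated with a simple estimate. The sketch below assumes a Gaussian point spread function and a purely sinusoidal structure with a period \(p\) of 190 nm, for which the modulation contrast remaining after imaging is \(\exp (-2\pi ^2\sigma ^2/p^2)\) with \(\sigma = \text {FWHM}/(2\sqrt{2\ln 2})\). The FWHM values of 250 nm (confocal) and 50 nm (STED) are illustrative assumptions, not specifications of the instrument used for Fig. 1.16, and the function name is hypothetical.

```python
import math


def gaussian_contrast(fwhm, period):
    """Relative modulation contrast of a sinusoid with the given period
    after convolution with a normalized Gaussian PSF of the given FWHM."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-2.0 * (math.pi * sigma / period) ** 2)


period = 190.0  # nm, spectrin periodicity quoted in the text

# Assumed PSF widths in nm (illustrative only)
for label, fwhm in (("confocal", 250.0), ("STED", 50.0)):
    c = gaussian_contrast(fwhm, period)
    print(f"{label:8s} FWHM = {fwhm:5.1f} nm -> remaining contrast = {c:.4f}")
```

Under these assumptions, the confocal spot suppresses the 190 nm modulation almost completely, whereas most of the contrast survives at the assumed STED resolution.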

Fig. 1.17
figure 17

Data are courtesy of Abberior Instruments, Germany

Confocal and STED images of mitochondrial protein (Tom20, red) and the mitochondrial genome (dsDNA, green) in Vero cells. Samples were prepared by indirect immunolabeling using Abberior STAR RED and Abberior STAR ORANGE. Acquisition was performed using an Abberior Instruments Facility Line STED microscope. Shown are raw data.

Example 3: Nanoscopy of Mitochondria

Although mitochondria are best known for their role as the ‘power houses’ of the cell, they are also key players in executing apoptosis, a tightly regulated suicide program in eukaryotic cells [53]. Moreover, damage to mitochondria and their subsequent dysfunction are known to be important factors in several human diseases. Because mitochondria have a diameter of only approximately 300–500 nm in cultured mammalian cells, their structure is not accessible to conventional light microscopy.

STED nanoscopy revealed that Tom20, a membrane-spanning receptor protein of the translocase of the outer membrane complex, is found in clusters on the surfaces of mitochondria [53]. Super-resolution studies also showed that the nucleoids in mitochondria have a diameter of 70–110 nm and allowed conclusions on the number of copies of mitochondrial DNA (mtDNA) per nucleoid [53].

Figure 1.17 depicts an image of the mitochondria in a Vero cell. The Tom20 proteins are shown in red and the mtDNA in green. Again, the clustering and co-localization of both labeled structures are evident in the STED image, whereas only few conclusions can be drawn from the confocal image.