3.1 Experimental Setup

As outlined in the previous chapter, existing technologies for surface profilometry show certain drawbacks in terms of resolution, dynamic measurement range, three-dimensional measurement capabilities and speed. The following chapter introduces a novel approach to surface profilometry which aims to provide solutions to the problems named. The basic setup for all experiments is centered around a two-beam interferometer of the Michelson type, Fig. 3.1.

A white-light source (WLS) emits a collimated beam which is typically split in a 50:50 ratio by a cube beamsplitter (BS). The reference arm typically contains a dispersive element (DE) with the thickness \(t_{DE}\). After transmission through the DE, the light is reflected off a reference mirror (REF), which causes it to pass through the DE a second time before it returns to the beamsplitter. In the sample arm, the light is reflected off a sample surface before it is recombined with the reference arm light by the beamsplitter. The optical path difference of both arms, \(\delta \), is fixed, as any change in \(\delta \) is usually measured as a change in the sample's surface profile (Footnote 1). The recombined signal is detected by a grating spectrometer. As the dispersive element plays a significant role in gathering relevant information on the topography of a sample, the term dispersion-encoded low-coherence interferometry (DE-LCI) is used throughout the work for this approach.

Figure 3.1

Principle of a profilometry setup with WLS—white light source, BS—beamsplitter, DE—dispersive element (with thickness \(t_{DE}\)), REF—reference mirror (translatable in one dimension for adjustment purposes), SMP—sample with a surface profile and SPEC—spectrometer

3.2 Measurement Range and Resolution

In general, the mathematical model of a two-beam interferometer applies in the case of the setup described in Fig. 3.1 according to [249]

$$\begin{aligned} E_{out} = E_0 \cdot e^{i(\omega t \pm \varphi _0)} = E_0 \cdot e^{i2\pi (ft \pm \frac{\delta n}{\lambda })}, \end{aligned}$$
(3.1)

where the resulting electric field \(E_{out}\) is composed of the initial electric field \(E_{0}\) and an oscillating term defined by the angular optical frequency \(\omega \) or the optical frequency f, the time t and the phase \(\varphi _0\), which in turn consists of the optical path difference \(\delta \), the refractive index of the surrounding medium n and the wavelength \(\lambda \). In an experimental setup, only the time-averaged signal of the electric field can be detected, so the intensity I is the quantity of interest. It can be formulated as

$$\begin{aligned}&E_{out}^2 = \frac{c \cdot \varepsilon _0}{2} \left( |E_1|^2 + |E_2|^2 + 2 \cdot E_1 \cdot E_2 \right) \end{aligned}$$
(3.2)
$$\begin{aligned}&I = \frac{c \cdot \varepsilon _0}{2T}\int {\overline{E}_{out}^2 dt}, \end{aligned}$$
(3.3)

which is composed of the velocity of light c, the vacuum permittivity constant \(\varepsilon _0\) as well as the two electric field components of the interferometer arms \(E_{1}\) and \(E_{2}\) and is integrated over a given time. As the relative change between the two arms is of interest when measuring surface heights, the phase term holds the relevant information. Therefore, the intensity \(I(\lambda )\) can be written as the spectral dependency of the phase with

$$\begin{aligned}&I(\lambda ) = I_0 \cdot (1 + \gamma (\lambda ) \cdot \cos \varphi _0) \end{aligned}$$
(3.4)
$$\begin{aligned}&\varphi _0 = \frac{2\pi \cdot \delta \cdot n_{air}}{\lambda }, \end{aligned}$$
(3.5)

where \(I_0\) is the initial spectral intensity and \(\gamma (\lambda )\) is the spectral contrast of the interference fringes. The phase \(\varphi _0\) depends on the optical path difference (OPD) between both arms, denoted \(\delta \), the wavelength \(\lambda \) and the refractive index of air \(n_{air}\), which was assumed to equal one for simplification. These equations hold provided that the interferometer is dispersion-free. In this general case, the phase is inversely proportional to the wavelength. Through the introduction of a dispersive medium in one arm of the interferometer, Fig. 3.1, the phase term \(\varphi \) extends to [246, 249]

$$\begin{aligned} \varphi = 2\pi \frac{\left[ n^{DE}(\lambda )-1\right] t_{DE}-\delta }{\lambda }. \end{aligned}$$
(3.6)
Figure 3.2

Simulation of a typical spectral interference signal \(I(\lambda )\) from a single point having the OPD \(\delta \) with an equalization wavelength \(\lambda _{eq}\) = 600 nm plotted in black and the corresponding phase signal \(\varphi (\lambda )\) plotted in red

Assuming that pure material dispersion is the only effective mechanism (Footnote 2), the transformation of the interference signal depends on the wavelength-dependent refractive index \(n^{DE}(\lambda )\) and the material’s thickness \(t_{DE}\). The fringe frequency tends to a minimum at the so-called equalization wavelength \(\lambda _{eq}\), which depends on \(\delta \), Fig. 3.2. When the interferometer is used as a profilometer, every height change in the sample’s surface changes the OPD and thus leads to a different equalization wavelength. The equalization wavelength can be calculated analytically as the wavelength at which the derivative of the phase signal with respect to the wavelength vanishes

$$\begin{aligned} \left( \frac{\partial \varphi }{\partial \lambda }\right) _{\lambda _{eq}} = 0 = 2\pi \frac{\left[ 1-n_g^{DE}(\lambda _{eq})\right] t_{DE} + \delta }{\lambda ^2}\end{aligned}$$
(3.7)
$$\begin{aligned} \text {with } n_g^{DE}(\lambda ) = n^{DE}(\lambda ) - \lambda \cdot \frac{\text {d}{n}}{\text {d}{\lambda }}, \end{aligned}$$
(3.8)
Figure 3.3

Simulation of the basic dispersion-related dependencies of the interferometric signal with a) dispersion-dependent behavior where (I) shows the signal with a DE of N-BK7 having a \(t_{DE}\) = 1 mm, (II) \(t_{DE}\) = 2 mm and (III) \(t_{DE}\) = 4 mm as well as b) the dependency regarding the OPD for an N-BK7 DE with \(t_{DE}\) = 4 mm where (I) \(\lambda _{eq}\) = 0.5 \(\upmu \)m, (II) \(\lambda _{eq}\) = 0.6 \(\upmu \)m, (III) \(\lambda _{eq}\) = 0.7 \(\upmu \)m and (IV) representing the slope of \(\delta \) which is a function of the refractive index of the dispersive element \(n^{DE}\)

where \(n^{DE}_{g}(\lambda )\) is the group refractive index of the dispersive element. It can be concluded from this equation that the dispersive element and the path difference \(\delta \) have the most significant influence on the signal. In this setting, the DE defines the phase slope of the signal and furthermore, the axial measurement range where the equalization wavelength can be observed, Fig. 3.3 a). With increasing thickness of the DE of the same \(n^{DE}\), the phase gradient around the equalization wavelength increases which results in a higher frequency intensity signal around \(\lambda _{eq}\). In practical terms, this signal modification is responsible for the effective axial measurement range. This measurement range \(\Delta z(\lambda )\) can be estimated within a spectral range of interest

$$\begin{aligned} \Delta z(\lambda ) = \left[ n_g^{DE}(\lambda ) -1\right] \cdot t_{DE}. \end{aligned}$$
(3.9)

This characteristic of the DE defines the response of \(\lambda _{eq}\) due to a change of the optical path difference, Fig. 3.3 b) which therefore can be used as a measure for surface profile changes. During the design of a DE-LCI setup, the detector size and its resolution as well as its dynamic range in the desired spectral range set the initial boundaries for the axial measurement range and resolution. The characteristics of the DE enable fine tuning of the axial measurement range and resolution even after the initial design of the system.
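
Eqs. (3.7) to (3.9) can be evaluated numerically. The following Python sketch is not part of the original work. It assumes the catalog Sellmeier coefficients of N-BK7 and reads the usable height range as the span of \(\left[ n_g^{DE}-1\right] \cdot t_{DE}\) between the two ends of the spectral window; the function names are illustrative.

```python
import math

# Sellmeier coefficients of N-BK7 (wavelength in micrometers); assumed catalog values.
B = (1.03961212, 0.231792344, 1.01046945)
C = (0.00600069867, 0.0200179144, 103.560653)

def refractive_index(lam_um):
    """Phase refractive index n(lambda) from the Sellmeier equation."""
    l2 = lam_um ** 2
    return math.sqrt(1.0 + sum(b * l2 / (l2 - c) for b, c in zip(B, C)))

def group_index(lam_um):
    """Group index n_g = n - lambda * dn/dlambda, Eq. (3.8), using the analytic derivative."""
    l2 = lam_um ** 2
    n = refractive_index(lam_um)
    # d(n^2)/dlambda = sum(-2 * B * C * lambda / (lambda^2 - C)^2)
    dn2 = sum(-2.0 * b * c * lam_um / (l2 - c) ** 2 for b, c in zip(B, C))
    return n - lam_um * dn2 / (2.0 * n)

def opd_at_equalization(lam_eq_um, t_de_um):
    """OPD delta that places the equalization wavelength at lam_eq, from Eq. (3.7)."""
    return (group_index(lam_eq_um) - 1.0) * t_de_um

def height_range(lam_min_um, lam_max_um, t_de_um):
    """Span of delta covered while lam_eq sweeps the spectral window (assumed reading of Eq. (3.9))."""
    return abs(opd_at_equalization(lam_min_um, t_de_um)
               - opd_at_equalization(lam_max_um, t_de_um))

# Height range in micrometers for three DE thicknesses over (360 - 860) nm.
ranges_um = {t_mm: height_range(0.36, 0.86, t_mm * 1000.0) for t_mm in (1, 2, 4)}
```

Under these assumptions the range scales linearly with \(t_{DE}\), in line with the behavior described for Fig. 3.3 a).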

Table 3.1 Spectrometric system properties for initial simulations regarding measurement range and resolution of the designed DE-LCI

In a system like this, the axial measurement range and resolution are determined by the spectral probing range \(\Delta \lambda \), the center wavelength \(\lambda _c\), the dispersive element and the configuration of the detecting spectrometer. In order to evaluate the possible axial measurement range as well as the corresponding resolution, the interplay of the detector properties with the material properties of the DE has been studied. For this purpose, simulations for a system with a defined set of parameters have been performed, Tab. 3.1. As stated before, the center wavelength in relation to the refractive index of the DE and the spectral range have a characteristic influence on the axial height measurement range, which typically follows a Sellmeier-like slope. This behavior can be used to manipulate the measurement range in a pre-designed setup. According to Eq. (3.9), the measurement range \(\Delta z(\lambda )\) = \(\Delta z\) can be calculated over a given spectral range \(\Delta \lambda \) as a function of the group refractive index and the thickness of the DE, Fig. 3.4 a). Because the dispersion is Sellmeier-like in most cases, the measurement range increases especially towards shorter wavelengths. With the aid of different dispersive characteristics, e.g. from glasses, polymers or thin films, this behavior can be controlled over a wide range while the setup is kept constant, Fig. 3.4 b).

Figure 3.4

a) Simulation of measurable height ranges in dependence of the spectral range of the setup \(\Delta z\) for different thicknesses \(t_{DE}\) of N-BK7 and b) simulated measurement ranges for different materials with a \(t_{DE}\) = 1 mm

Light sources such as SLDs or swept-source lasers are commonly used in other LCI approaches. With regard to the system design of a DE-LCI, these light sources are considered suboptimal as their spectral bandwidth is usually about (80 – 120) nm with center wavelengths in the range of (800 – 1300) nm [117, 118, 120]. The results of the simulation, Fig. 3.4 a), show that such a spectral range with a center wavelength at 1050 nm would significantly limit the height measurement range. In fact, DE-LCI benefits from broadband light sources such as supercontinuum or laser-driven plasma light sources, which provide spectral power output from (300 – 2000) nm [250].

With regard to the DE, a spectrally defined resolution parameter, \(r_{DE}\), can be calculated

$$\begin{aligned} r_{DE} = \frac{\Delta z}{\Delta \lambda }. \end{aligned}$$
(3.10)

Clearly, the resolution can change due to dispersion as a function of the center wavelength even while the nominal spectral range is constant, because the center wavelength influences \(\Delta z\), see also Eq. (3.9). As the detection of spectral interference is usually performed with a spectrometer, its specific resolution properties have to be considered as well. Therefore, the resolution of the system, \(r_{sys}\), can be calculated with

$$\begin{aligned} r_{sys} = r_{DE} \cdot \Delta r_{spec}. \end{aligned}$$
(3.11)

Due to the construction of, e.g., a grating spectrometer, its resolution \(\Delta r_{spec}\) is defined by the combination of the slit size, the grating constant, the detector size and its specific pixel size. Based on these relations, a set of typical values for the measurement range and axial resolution was simulated for three different dispersive elements, Tab. 3.2. From this simulation, it can be deduced that the axial height resolution is tied to the measurement range and scales linearly with the thickness of the dispersive element. Additionally, it becomes clear that the dispersive element holds a large potential to tune both measurement range and resolution in a setup where the remaining components are already selected and fixed, as shown in Fig. 3.4.

Table 3.2 Calculated system properties regarding measurement range and resolution of the designed dispersion-encoded low-coherence interferometer for investigations in the spectral range of \(\Delta \lambda \) = (360 – 860) nm

In order to evaluate the simulations, a temporally controlled experiment was carried out with the setup shown in Fig. 3.1. In contrast to classical temporal LCI, the experiment was designed to be controlled temporally but detected spectrally. For this purpose, a spectrometer according to the specifications of Tab. 3.1 (AvaSpec-ULS3648 VB, Avantes BV, Apeldoorn, The Netherlands) was used to capture interference data. The plane mirror in the reference arm acted as a sample and was translated to different OPDs in order to emulate height changes of a sample. The sample arm, which was also equipped with a plane mirror, was kept at a constant position. Aluminum-coated mirrors with a flatness of \(\lambda \)/20 (EO Partno. 34360; Edmund Optics Inc., Barrington, NJ, USA), having a scratch-dig of 20–10, served as sample and reference reflectors. In the experiment, a dispersive element of \(t_{DE}\) = 2 mm (EO Partno. 49121; Edmund Optics Inc., Barrington, NJ, USA) was used. The height emulation was performed with a piezo-driven precision stage (SLC 2412, SmarAct GmbH, Germany). As a common reference, the setup was adjusted to an equalization wavelength of \(\lambda _{eq}\) = 500 nm. Subsequently, the translation stage was used to move the reference mirror in steps of \(\Delta \delta \) = 0.5, 1 and 2 \(\upmu \)m along a maximum measurement range of \(\Delta z\) = 60 \(\upmu \)m. The spectral interference at the spectrometer was recorded and the equalization wavelength was determined for every emulated height step. Based on \(\lambda _{eq}\), the measured height step, \(z_{meas}\), was determined relative to the previous position with Eq. (3.9), Fig. 3.5.

The results show that the emulated height steps could be measured with some statistical deviations. Specifically, heights of \(z_{meas}\) = (0.497 ± 0.098), (0.998 ± 0.106) and (1.997 ± 0.108) \(\upmu \)m could be measured for the nominal emulated steps of \(\Delta \delta \) = 0.5, 1 and 2 \(\upmu \)m respectively. The detected deviations were only slightly larger than the calculated resolution of 0.097 \(\upmu \)m for the dispersive element of \(t_{DE}\) = 2 mm, Tab. 3.2. In reference to the work of Ruiz et al. [118], the dynamic range (DR) can be calculated as the quotient of the measurement range \(\Delta z\) and the resolution \(r_{sys}\). The simulations as well as the experiments show that the DR of about 1667 is constant for the different measurement ranges since it is limited by the hardware. Due to the limitations in available detector sizes and the trade-offs between measurement range and resolution, other analysis schemes have to be considered in order to enable a higher dynamic range.
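
The statement that the DR is fixed by the hardware follows directly from Eqs. (3.10) and (3.11), since \(\Delta z\) cancels in the quotient. The short sketch below illustrates this. The spectrometer resolution of 0.3 nm is an assumption inferred from DR ≈ 1667 over a 500 nm window, not a value quoted from Tab. 3.1, and the measurement ranges are illustrative.

```python
def system_resolution(dz_um, dlam_nm, dr_spec_nm):
    """Eqs. (3.10) and (3.11): r_sys = (dz / dlam) * dr_spec, returned in micrometers."""
    r_de = dz_um / dlam_nm          # height change per nm of spectrum
    return r_de * dr_spec_nm

def dynamic_range(dz_um, dlam_nm, dr_spec_nm):
    """Quotient of measurement range and system resolution."""
    return dz_um / system_resolution(dz_um, dlam_nm, dr_spec_nm)

DLAM_NM = 500.0      # spectral window (360 - 860) nm
DR_SPEC_NM = 0.3     # assumed spectrometer resolution

# Illustrative measurement ranges in micrometers.
results = {dz: (system_resolution(dz, DLAM_NM, DR_SPEC_NM),
                dynamic_range(dz, DLAM_NM, DR_SPEC_NM))
           for dz in (80.0, 160.0, 320.0)}
```

A range of 160 \(\upmu \)m gives a resolution of about 0.096 \(\upmu \)m in this sketch, close to the 0.097 \(\upmu \)m quoted for \(t_{DE}\) = 2 mm, while the DR stays at \(\Delta \lambda \) divided by the spectrometer resolution for every range.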

Figure 3.5

Results for the measured emulated height steps \(z_{meas}\) for the targeted steps of \(\Delta \delta \) = 0.5, 1 and 2 \(\upmu \)m respectively over a given travel range of a piezo stage in the reference arm

Figure 3.6

Simulated intensity change in relation to a height change \(\Delta \delta \) for different thicknesses of the DE \(t_{DE}\) of N-BK7 with a) oscillations visible over a \(\Delta \delta \)-range of 1 \(\upmu \)m and b) reduced region of interest for \(\Delta \delta \) where the slope is nearly linear and the maximum difference between different DEs is about 0.012 %

It can be noted that a change in \(\delta \) affects not only the equalization wavelength but also the signal amplitude at \(\lambda _{eq}\). In a more advanced analysis scheme, this behavior is the primary component of a two-part process to determine height profiles with high dynamic range. When analyzing the intensity behavior of the signal at the equalization wavelength in a significantly smaller \(\delta \)-range such as \(10 \cdot r_{sys}\approx \) 1 \(\upmu \)m, an oscillatory behavior becomes visible, Fig. 3.6 a). This simulation reveals that the differences in the height-dependent intensity behavior are negligibly small for the different dispersive elements. By decreasing the \(\delta \)-range of interest even further towards the hardware resolution for \(t_{DE}\) = 2 mm, \(\Delta \delta \) = 97 nm, the intensity dependency becomes nearly linear and unambiguous, Fig. 3.6 b). This effect can be exploited in a more advanced analysis scheme where the first step consists of the determination of the equalization wavelength as a rough measure for height changes. During a second step, a fit of the intensity amplitude around this point can contribute to a fine-scaled analysis with resolutions beyond the hardware limit. Furthermore, the behavior can be considered independent of the thickness of the DE and hence of the measurement range. The simulation revealed that the intensity difference between the signals of \(t_{DE}\) = 1 mm and \(t_{DE}\) = 4 mm is about 0.012 % at maximum. Once the analysis depends not only on the wavelength accuracy but also on the measured intensity signal, the noise of this signal determines the height resolution.

In the setup presented within this work, Fig. 3.1, the initial spectrum, \(S_{i}\), is affected by four main noise components: those of the light source, \(P_{LS}\), the camera's chip, \(P_{cam}\), the camera's amplifier circuit, \(P_{amp}\), and the A/D converter, \(P_{A/D}\), which all contribute to the measured signal, \(S_{o}\), Fig. 3.7. The impact of the light source as well as of the camera sensor together with its amplification is highly influenced by experimental conditions such as integration time and gain of the camera. On a more detailed level, the four noise components represent more fundamental noise sources. The photon flux of the light source is the main source of statistical variation, which leads to photon noise of this component. While receiving this fluctuating source of energy, the camera sensor converts incoming photons into electron-hole pairs with a given quantum efficiency, which in itself is a statistical process known as photo-electron noise. The camera is also a source of photo-current noise, which arises when electron-hole pairs are converted to pulses of electric current. In addition to the aforementioned noise sources, the process depends on the area of the detector, its quantum efficiency and the integration time [251]. If the incoming photons are Poisson distributed, this noise source is known as shot noise. Additionally, the amplification of the photon-induced electrical current as well as the quantization of the detector current in A/D conversion are subject to statistical variation. Furthermore, thermal variations will cause deviations and random signal contributions in the electronic circuitry of the sensor, the amplification and the A/D conversion, which are also known as receiver circuit noise.

Figure 3.7

Noise-affected components of the setup with the initial spectrum \(S_i\) which is manipulated by the noise sources \(P_{LS}\), \(P_{cam}\), \(P_{amp}\) and \(P_{A/D}\), resulting in the measured spectrum \(S_{o}\)

It should also be noted that thermal influences will affect the setup as a whole and lead to geometrical deformations of critical components. Due to the slow rate of change of these fluctuations and the expected short integration times of the detector, it can be assumed that they have a negligible influence on the signals. In determining a combined noise level of a typical measurement situation, these influences will be captured during a noise characterization measurement.

In order to quantify the combined influence, an experiment was conducted to characterize the induced noise. For this purpose, the spectral intensity response of the system with a mirror in the reference arm (Thorlabs PF10-03-P01) and a typical sample surface (silicon height standard Simetrics VS, Simetrics GmbH, Germany) was recorded. The recording was performed at gain levels ranging from 0 to 12 dB in steps of 3 dB and with five equally spaced integration times per gain level. The integration time was used to control the relative magnitude of the intensity signal in steps of 20 % starting from 100 % (Footnote 3). The recorded spectral intensity was spline interpolated along the recorded spectral range in order to obtain the moving average of the data, Fig. 3.8 a). Subsequently, the spline-interpolated data was subtracted from the raw, gray-valued data. A normalized signal was computed to obtain the relative spectral noise, \(\Delta I\), Fig. 3.8 b). For one particular spatial position, a constant fluctuation over the complete spectral range is visible. The analysis of the distribution of this data reveals that it can be modeled using a Gaussian function which fits the data set with a coefficient of determination of \(R_{s}^2\) = 0.979, Fig. 3.9 a). By analyzing the \(\sigma \) of the Gaussian distribution, the averaged intensity noise was found to be \(\Delta I\) = ± 0.75 % (21.25 dB). To gain better insight into the spatial dependency of the noise, the distribution of noise for a number of points along the spatial axis was evaluated. The coefficient of determination \(R_{s}^2\) for a Gaussian function describing the noise at the individual point was calculated, Fig. 3.9 b). It can be seen that no spatial dependency is present. For this reason, the spatial domain was not investigated in further detail and all data was integrated along this domain.
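
The noise evaluation described above (smooth baseline, subtraction, normalization, width of the residual distribution) can be sketched on synthetic data. This is an illustration only: a centered moving average stands in for the spline interpolation, the spectrum is simulated with an injected relative noise of 0.75 %, and the sample standard deviation replaces the Gaussian fit. All names and parameters are assumptions.

```python
import math
import random

def moving_average(values, half_width):
    """Centered moving average; returns {index: mean} for samples with a full window."""
    w = 2 * half_width + 1
    running = sum(values[:w])
    out = {half_width: running / w}
    for i in range(half_width + 1, len(values) - half_width):
        running += values[i + half_width] - values[i - half_width - 1]
        out[i] = running / w
    return out

def relative_noise(values, half_width=10):
    """Relative residuals (raw - baseline) / baseline for all fully covered samples."""
    baseline = moving_average(values, half_width)
    return [(values[i] - b) / b for i, b in baseline.items()]

def sample_std(samples):
    mean = sum(samples) / len(samples)
    return math.sqrt(sum((s - mean) ** 2 for s in samples) / (len(samples) - 1))

def noise_in_db(rel_sigma):
    """Relative noise expressed in dB, 10 * log10(1 / sigma)."""
    return 10.0 * math.log10(1.0 / rel_sigma)

# Synthetic gray-value spectrum with a slowly varying envelope and 0.75 % relative noise.
random.seed(1)
n_pix = 3000
clean = [2000.0 * (1.0 + 0.3 * math.sin(2.0 * math.pi * i / n_pix)) for i in range(n_pix)]
noisy = [c * (1.0 + random.gauss(0.0, 0.0075)) for c in clean]

sigma_est = sample_std(relative_noise(noisy))   # close to the injected 0.0075
```

The estimate is slightly smaller than the injected value because each sample is also part of its own baseline window. With this dB convention, a relative noise of 0.75 % corresponds to the 21.25 dB quoted in the text.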

Figure 3.8

Analysis of the noise of the setup with a) raw measured spectral intensity in gray-values (abbrev. gw) and spline interpolation as well as b) relative spectral noise \(\Delta I\) with respect to the spectral range

Figure 3.9

Analysis of the statistical behavior of the captured intensity noise with a) distribution of the measured, relative intensity noise of all positions in the x-dimension for a gain of 0 dB and a relative signal magnitude of 100 % with a Gaussian fit of the same having a mean \(R_{s}^2\) = 0.979 where \(\Delta I\) = 0.75 % (21.25 dB) could be measured as the averaged intensity noise and b) spatially-dependent plot of coefficient of determination R\(^{2}\) for Gaussian fits of the relative noise distribution

In order to estimate the impact of this noise on the determination of the height of a surface, an analysis of the initial interferometric equation Eq. (3.4) was performed. The equation was solved for the path length difference \(\delta \),

$$\begin{aligned} \delta = \left( n^{DE}(\lambda )-1\right) t_{DE} - \frac{\lambda }{2\pi } \cdot \cos ^{-1}\left[ \frac{I}{I_0} - 1\right] . \end{aligned}$$
(3.12)

The derivative of the path length difference \(\delta \) with respect to the intensity variation dI is given by

$$\begin{aligned} \frac{d \delta }{dI} = \frac{d}{dI}\left[ \left( n^{DE}(\lambda )-1\right) t_{DE} \right] - \frac{\lambda }{2\pi } \cdot \frac{d}{dI}\left[ \cos ^{-1}\left( \frac{I}{I_0} - 1\right) \right] \end{aligned}$$
(3.13)

and the path length uncertainty \(\Delta \delta \) is

$$\begin{aligned}&\Delta \delta (I,\lambda ) = \frac{\lambda }{2\pi } \frac{1}{\sqrt{1 - \frac{(I - I_0)^2}{I_0^2}}} \cdot \Delta I. \end{aligned}$$
(3.14)
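
The step from Eq. (3.13) to Eq. (3.14) can be made explicit. The first term in Eq. (3.13) does not depend on I and vanishes, while the derivative of the inverse cosine gives

$$\begin{aligned} \frac{d}{dI}\left[ \cos ^{-1}\left( \frac{I}{I_0} - 1\right) \right] = -\frac{1}{I_0} \cdot \frac{1}{\sqrt{1 - \frac{(I - I_0)^2}{I_0^2}}}, \qquad \left| \frac{d \delta }{dI}\right| = \frac{\lambda }{2\pi } \cdot \frac{1}{I_0\sqrt{1 - \frac{(I - I_0)^2}{I_0^2}}}. \end{aligned}$$

Multiplying by an absolute intensity fluctuation and expressing that fluctuation relative to \(I_0\) reproduces Eq. (3.14). This reading assumes that \(\Delta I\) in Eq. (3.14) denotes the relative intensity noise, which is consistent with the percentage values quoted for \(\Delta I\).
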
Figure 3.10

Estimation of the resolution based on the measured intensity noise with a) three-dimensional dependency related to the measured spectral and intensity range according to Eq. (3.14) as well as b) detailed plot at an equalization wavelength of \(\lambda _{eq}\) = 562 nm

This relation describes the limits of the described profilometry approach, which depend on the spectral range as well as on the relative intensity, Fig. 3.10 a). As Eq. (3.14) describes, the resolution depends linearly on the wavelength. It was determined that it has a slope of 0.0012 nm per nm at \(I_0\) = 0.5 arb. units. Furthermore, the resolution strongly depends on the probing intensity in a more complicated relation, Fig. 3.10 b). In order to quantify this dependency, the intensity range in which the resolution value increases by < 10 % over its minimal value was calculated. It was found that this corresponds to an intensity range of I = 0.3 – 0.7 arb. units. For practical purposes, this means that the intensity should be kept within this range and the equalization wavelength should be chosen as low as possible in order to achieve high resolutions. This can be ensured by adjusting the path length difference of the interferometer.
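
The numbers quoted for Eq. (3.14) can be reproduced with a few lines of Python. The sketch assumes that \(\Delta I\) enters as relative noise (0.0075 for 0.75 %) and that the minimum uncertainty occurs at I = \(I_0\); the function names are illustrative.

```python
import math

def path_uncertainty_nm(lam_nm, intensity, i0, rel_noise):
    """Eq. (3.14): path length uncertainty in nm for a relative intensity noise rel_noise."""
    u = (intensity - i0) / i0
    return lam_nm / (2.0 * math.pi) / math.sqrt(1.0 - u * u) * rel_noise

def intensity_window(i0, max_increase=0.10):
    """Intensity interval in which the uncertainty stays within (1 + max_increase) of its minimum."""
    u_max = math.sqrt(1.0 - 1.0 / (1.0 + max_increase) ** 2)
    return i0 * (1.0 - u_max), i0 * (1.0 + u_max)

slope_nm_per_nm = path_uncertainty_nm(1.0, 0.5, 0.5, 0.0075)   # about 0.0012
at_562_nm = path_uncertainty_nm(562.0, 0.5, 0.5, 0.0075)       # about 0.67 nm
i_low, i_high = intensity_window(0.5)                          # about 0.29 and 0.71
```

Under these assumptions the sketch gives the slope of about 0.0012 nm per nm, the single-point value of about 0.67 nm at 562 nm, and an intensity window close to the quoted 0.3 – 0.7 arb. units.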

In all experiments, the setup was adjusted to \(\lambda _{eq}\) = 562 nm. As the spectrometer's detection range started at 447 nm, the chosen equalization wavelength enabled the acquisition of enough data for fitting in proximity of \(\lambda _{eq}\). This helped to minimize imaging distortions which are typically present close to the border of the detector. Using the values for \(I_{0}\) and \(\lambda _{eq}\) as well as the intensity noise \(\Delta I\) in Eq. (3.14), a typical resolution of \(\Delta \delta (\lambda _{eq})\) = 0.67 ± 0.05 nm was calculated. The calculated detection limit according to Eq. (3.14) is valid for the analysis at one spectral position. The data analysis of the presented experiments utilized the fit of spectra in a region-of-interest (ROI) around \(\lambda _{eq}\), which is explained in more detail in subsection 3.3.3. Specifically, the measured intensity data was fitted with Eqs. (3.4) and (3.6), where the thickness of the dispersive element \(t_{DE}\) and its refractive index \(n^{DE}(\lambda )\) were assumed to be known and \(\delta (x,y)\) as well as \(\gamma (\lambda )\) were approximated. In order to account for the use of the ROI in fitting, the single-point detection limit \(\Delta \delta (\lambda _{eq})\) was used to calculate the resolution of the fitted data, \(r_{fit}\), with the aid of an RMS approach as

$$\begin{aligned} r_{fit} = \sqrt{\frac{\Delta \delta (\lambda _{eq})^2}{n_f}}, \end{aligned}$$
(3.15)

where \(n_f\) is the number of spectral data points used for fitting. Within this work, \(n_f\) = 530 spectral data points were used so that the theoretical resolution was estimated as \(r_{fit}\) = 0.029 nm. With a measurement range of \(\Delta z\) = 79.91 \(\upmu \)m and the above calculated minimal resolution, the dynamic range is \({2.75 \times 10^6}\). This value is significantly higher than the initial estimate of DR = 1667, which was based solely on the evaluation of \(\lambda _{eq}\).
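
The arithmetic behind Eq. (3.15) and the resulting dynamic range can be checked as follows. The input values are the ones quoted in the text; the function name is illustrative.

```python
import math

def fit_resolution_nm(single_point_nm, n_points):
    """Eq. (3.15): RMS-type reduction of the single-point detection limit."""
    return math.sqrt(single_point_nm ** 2 / n_points)

r_fit_nm = fit_resolution_nm(0.67, 530)     # about 0.029 nm
dz_nm = 79.91e3                             # measurement range of 79.91 um in nm
dynamic_range = dz_nm / r_fit_nm            # about 2.7e6
```

With the unrounded \(r_{fit}\) the quotient is slightly below the quoted \({2.75 \times 10^6}\), which corresponds to using the rounded value of 0.029 nm.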

As this value is based on the model expressed through Eq. (3.14), the experimentally achievable DR might additionally be limited by other influences such as thermal fluctuations or the data processing routines which are not included in the model. Utilizing this analyzing scheme, the limitation on the axial resolution imposed by the thickness of the dispersive element was minimized.

3.3 Signal Formation and Analysis

As shown in the previous examinations, signal analysis in dispersion-encoded low-coherence profilometry needs to be based on a combined evaluation of the equalization wavelength \(\lambda _{eq}\) and the amplitude at this wavelength in order to achieve sufficient resolution. This holds true especially for comparatively large measurement ranges. Different approaches to analyze the information in recorded spectra have been developed and assessed within this work.

3.3.1 Fitting of Oscillating Data

Conventional fitting approaches such as the Levenberg-Marquardt algorithm typically converge fast for periodic signals with constant phase. In the case where the phase of the data varies over one dimension, as in DE-LCI, fits can converge fast as well but most likely on a local minimum rather than on the global minimum, Fig. 3.11 a). In this simulation, the actual OPD was \(\delta _{sim}\) = 535.233 \(\upmu \)m. Spectra were calculated in a brute-force fashion according to the model described with Eqs. (3.4) and (3.6), together with the respective sum of squared errors \(\sigma _{sse}\), in a range of \(\delta _{rng}\) = (534 – 537) \(\upmu \)m. It can be seen that the \(\sigma _{sse}\) value oscillates and shows two distinct local minima at 534.3 and 536.1 \(\upmu \)m apart from the global minimum. An approximation with the Levenberg-Marquardt method using an initial guess of \(\delta _{guess}\) = 536 \(\upmu \)m converged in < 10 iterations to a value of \(\delta _{fit}\) = 536.118 \(\upmu \)m, which falls on one of the local minima. It is clearly visible that the resulting spectrum is significantly different from the one which was actually present, Fig. 3.11 b). One way to circumvent this problem is to compute spectra within a very large range of possible values for the fit parameters. As this approach can be time and memory consuming, one has to consider strategies like downsampling of the measured data or the introduction of more advanced fitting approaches [252–254]. Here, Monte-Carlo-based methods can be used in order to guess different start parameters and perform fitting in the parameter range with the highest likelihood of convergence on the global minimum [255]. Another strategy is to approximate spectral data in multiple stages, with coarse variations of the fit parameters to determine the area of interest and finer variations to converge exactly on the final parameters. Either way, it is desirable to determine the start value \(\delta _{guess}\) with high precision. It should be emphasized that, given a precise determination of good start values, any established fitting approach such as Levenberg-Marquardt can be used instead of the brute-force approach.
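
The local-minimum problem can be reproduced with a small simulation. The sketch below is not the implementation used in this work: it assumes an N-BK7 element of 1 mm described by catalog Sellmeier coefficients, full fringe contrast, a noise-free spectrum, and it replaces the Levenberg-Marquardt fit by a simple downhill walk on the sampled error curve as a stand-in for a local optimizer.

```python
import math

# Assumed catalog Sellmeier coefficients of N-BK7 (wavelength in micrometers).
B = (1.03961212, 0.231792344, 1.01046945)
C = (0.00600069867, 0.0200179144, 103.560653)

def refractive_index(lam_um):
    l2 = lam_um ** 2
    return math.sqrt(1.0 + sum(b * l2 / (l2 - c) for b, c in zip(B, C)))

def make_model(lams_um, t_de_um):
    """Return spectrum(delta) following Eqs. (3.4) and (3.6) with I_0 = 1 and gamma = 1."""
    a = [2.0 * math.pi * (refractive_index(l) - 1.0) * t_de_um / l for l in lams_um]
    b = [2.0 * math.pi / l for l in lams_um]
    def spectrum(delta_um):
        return [1.0 + math.cos(ai - bi * delta_um) for ai, bi in zip(a, b)]
    return spectrum

def sse(measured, simulated):
    return sum((m - s) ** 2 for m, s in zip(measured, simulated))

def walk_downhill(errors, start):
    """Move to the lowest neighboring sample until none is lower (local optimizer stand-in)."""
    i = start
    while True:
        candidates = [j for j in (i, i - 1, i + 1) if 0 <= j < len(errors)]
        nxt = min(candidates, key=lambda j: errors[j])
        if nxt == i:
            return i
        i = nxt

lams = [0.45 + 0.001 * k for k in range(301)]          # (450 - 750) nm
spectrum = make_model(lams, 1000.0)                    # t_DE = 1 mm
delta_true = 540.0                                     # um, lam_eq near 600 nm
measured = spectrum(delta_true)

grid = [delta_true - 1.5 + 0.005 * k for k in range(601)]
errors = [sse(measured, spectrum(d)) for d in grid]

delta_global = grid[min(range(len(grid)), key=lambda j: errors[j])]
delta_local = grid[walk_downhill(errors, 450)]         # start 0.75 um above the true value
```

In this sketch the brute-force scan returns the true OPD, whereas the downhill walk started 0.75 \(\upmu \)m away stops in a neighboring minimum, which mirrors the behavior described for Fig. 3.11.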

Figure 3.11

Evaluation of classical Levenberg-Marquardt-based fitting approach versus brute force calculation for dispersion-encoded interferometric data with a) error progression of spectra using different path differences \(\delta \) showing several local and a global minima where fitting is likely to fall on a local minimum if initial fit parameters are not carefully chosen and b) resulting spectrum of the fit routine which converged on a value of \(\delta _{fit}\) = 536.118 \(\upmu \)m for an initial guess of \(\delta _{guess}\) = 536 \(\upmu \)m in relation to the simulated spectrum

3.3.2 Frequency Analysis

In order to evaluate \(\delta _{guess}\) from measured data, the equalization wavelength \(\lambda _{eq}\) can be used in conjunction with Eq. (3.9) which follows from \(\frac{\partial \varphi }{\partial \lambda } = 0\), Eq. (3.7)

$$\begin{aligned} \delta _{guess} = \left[ n_g^{DE}(\lambda _{eq}) -1\right] \cdot t_{DE}, \end{aligned}$$
(3.16)

where the group refractive index of the dispersive element at the equalization wavelength, \(n_g^{DE}(\lambda _{eq})\), and its thickness \(t_{DE}\) are used to calculate the initial estimate for the path length difference \(\delta _{guess}\). However, this approach relies on the correct estimation of \(\lambda _{eq}\) from measured data. Although this method describes a theoretical way to gather the necessary data, its implementation for measured data is problematic. In order to analyze the phase signal, a \(\cos ^{-1}\)-operation can be performed, which results in wrapped data ranging in \([-\pi ,\pi ]\). To avoid phase wrapping, the data processing established within this work relied on a frequency-based analysis. The analysis of periodic signals is typically performed by frequency analysis approaches such as Fourier transforms, e.g. the FFT. This is the standard method in e.g. FD-OCT, where interference at different axial positions can be separated as different frequency components from the spectral domain signal [105]. As a number of publications demonstrated, dispersion in the experimental setup or in samples, which might be composed of different materials, leads to measurement uncertainties. For this reason, a variety of dispersion compensation approaches have been researched [256–258].

Figure 3.12

a) Principle of the STFT where a window of the width \(\Delta w\) is continuously slid over the signal in the direction \(d_s\) to form spectral slices \(S^\prime _n\) with an overlap of \(\Delta o\) and b) resulting spectrogram of the stacked, Fourier-transformed slices \(\mathfrak {F}(S_n^\prime )\)

In contrast to approaches performing dispersion compensation of the gathered spectral data, DE-LCI aims to make use of the dispersion within the system. In order to gain information on the phase and its wavelength-dependent slope, a Fourier-based analysis was performed on small slices \(S_n^\prime \) of the signal. Known as windowed or short-time Fourier transform (STFT), this method assumes that the phase is constant in a small slice of the signal, [259]. It composes a spectrogram by sliding an observation window with a fixed length and overlap in steps over the signal, Fig. 3.12 a). By changing the window shape, the window width \(\Delta w\) as well as the overlap between subsequent windows \(\Delta o\), the resolution with regard to the phase minimum can be controlled. In consequence, a spectrogram of the stacked and Fourier-transformed slices \(\mathfrak {F}(S^\prime _n)\) is used to analyze the phase minimum and the resulting equalization wavelength \(\lambda _{eq}\), Fig. 3.12 b). The approach can also be used in analyzing signals composed of multiple frequencies from different axial positions. In the context of DE-LCI, the approximation of the equalization wavelength \(\lambda _{eq}\) was used to calculate an initial value of \(\delta _{guess}\) in the interferometer in order to enable the fit to converge quickly on the global minimum.
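The windowed analysis can be illustrated with a minimal sketch. It is not the processing code of this work: the signal is synthetic with a quadratic phase in sample-index units (an assumption that stands in for the dispersion-encoded phase), the helper names `dft_magnitudes`, `stft_dominant_bins` and `estimate_stationary_index` are hypothetical, and a naive DFT in plain Python replaces an FFT library to keep the example free of dependencies. The local fringe frequency is lowest where the phase is stationary, so the centers of the slices with the lowest dominant frequency bin indicate the position that corresponds to \(\lambda _{eq}\).

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitudes of the non-negative frequency bins (naive O(n^2) DFT)."""
    n = len(samples)
    return [abs(sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(samples)))
            for k in range(n // 2 + 1)]

def stft_dominant_bins(signal, width, overlap):
    """Slide a Hann window over the signal; return (slice center, dominant bin) pairs."""
    step = width - overlap
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * i / (width - 1)) for i in range(width)]
    result = []
    for start in range(0, len(signal) - width + 1, step):
        chunk = signal[start:start + width]
        mean = sum(chunk) / width
        # Remove the offset of the slice before windowing, then transform it.
        mags = dft_magnitudes([(c - mean) * h for c, h in zip(chunk, hann)])
        dominant = max(range(1, len(mags)), key=mags.__getitem__)  # DC bin is skipped
        result.append((start + (width - 1) / 2, dominant))
    return result

def estimate_stationary_index(signal, width=64, overlap=48):
    """Mean center of all slices that share the lowest dominant frequency bin."""
    bins = stft_dominant_bins(signal, width, overlap)
    lowest = min(b for _, b in bins)
    centers = [c for c, b in bins if b == lowest]
    return sum(centers) / len(centers)

# Synthetic interferogram: phase is stationary at sample K_EQ (assumed values).
N_SAMPLES, K_EQ, CHIRP = 1024, 600, 1e-3
signal = [1 + math.cos(CHIRP * (k - K_EQ) ** 2) for k in range(N_SAMPLES)]
k_est = estimate_stationary_index(signal)
print(f"estimated stationary sample: {k_est:.1f} (true value {K_EQ})")
```

A wider window sharpens the frequency estimate of each slice but blurs the position of the phase minimum, which is the trade-off controlled by \(\Delta w\) and \(\Delta o\). The estimate is only meant as a starting value, in the same sense as \(\delta _{guess}\).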

3.3.3 Two-Stage Fitting

As described in subsection 3.3.1, the usage of conventional fitting routines on oscillatory data with varying phase can lead to problems. For this reason, the fitting routine developed here was constructed as a two-step process. Using a range of \(\Delta \delta _1\) = ± 1 \(\upmu \)m with a step size of 2 nm centered around the previously calculated \(\delta _{guess}\), a set of simulated spectra based on Eqs. (3.4) and (3.6) was calculated in a brute-force fashion. The determination of the error sum of squares (SSE) of these calculated spectra with respect to the measured spectrum enabled the estimation of a more precise value for the path length difference \(\delta _1\) at the minimum of the SSE curve, Fig. 3.13. The calculated \(\delta _1\) was used in a second iteration of the routine to calculate another set of spectra with a finer spacing in \(\Delta \delta _2\) = ± 140 nm with steps of 0.02 nm (see Footnote 4). Likewise, the SSE of the calculated spectra was evaluated with respect to the measured spectrum. The minimum SSE indicates the path length difference \(\delta _2\) which can be used to compute the height at a point of the sample, see also Fig. 3.14 b).

Figure 3.13
figure 13

Visualization of the simplified two-step fitting process based on SSE determination for simulated data sets in two ranges (\(\Delta \delta _1\) and \(\Delta \delta _2\)) for the path length difference \(\delta \)

The described method was chosen instead of other established fitting algorithms to ensure the convergence on the global minimum rather than a local minimum which can be the case due to the oscillating nature of the data. The iterative fitting approach bears further potential for optimizations regarding the processing time.
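The two-stage search can be sketched as follows. This is a simplified illustration under stated assumptions, not the routine used in this work: only the path length difference is varied (the intensity is fixed to 1), the phase is written in the form of Eq. (3.21), the refractive index of N-BK7 is taken from the Sellmeier coefficients of the public Schott datasheet, and the spectral window of 500 – 620 nm with 1 nm sampling as well as all function names are hypothetical. The measured spectrum is replaced by a noise-free simulated one.

```python
import math

# Sellmeier coefficients of Schott N-BK7 (wavelength in µm), public datasheet values.
SELLMEIER_B = (1.03961212, 0.231792344, 1.01046945)
SELLMEIER_C = (0.00600069867, 0.0200179144, 103.560653)
T_DE_UM = 2000.0  # thickness of the dispersive element in µm

def n_bk7(lam_um):
    lam_sq = lam_um ** 2
    return math.sqrt(1 + sum(b * lam_sq / (lam_sq - c)
                             for b, c in zip(SELLMEIER_B, SELLMEIER_C)))

def group_index(lam_um, h=1e-4):
    """n_g = n - lambda * dn/dlambda, with a central-difference derivative."""
    slope = (n_bk7(lam_um + h) - n_bk7(lam_um - h)) / (2 * h)
    return n_bk7(lam_um) - lam_um * slope

def sse(delta_um, wavelengths, paths, measured):
    """Error sum of squares between a simulated and the measured spectrum."""
    total = 0.0
    for lam, path, value in zip(wavelengths, paths, measured):
        model = 1.0 + math.cos(2 * math.pi * (path - delta_um) / lam)
        total += (model - value) ** 2
    return total

def grid_search(center, half_range, step, wavelengths, paths, measured):
    """Brute-force search for the candidate with minimum SSE."""
    n_steps = round(half_range / step)
    candidates = (center + i * step for i in range(-n_steps, n_steps + 1))
    return min(candidates, key=lambda d: sse(d, wavelengths, paths, measured))

def two_stage_fit(delta_guess, wavelengths, paths, measured):
    # Stage 1: +/- 1 µm in 2 nm steps; stage 2: +/- 140 nm in 0.02 nm steps.
    delta_1 = grid_search(delta_guess, 1.0, 2e-3, wavelengths, paths, measured)
    delta_2 = grid_search(delta_1, 0.14, 2e-5, wavelengths, paths, measured)
    return delta_1, delta_2

wavelengths = [0.500 + 0.001 * i for i in range(121)]          # µm
paths = [(n_bk7(lam) - 1) * T_DE_UM for lam in wavelengths]    # (n - 1) * t_DE
delta_true = (group_index(0.562) - 1) * T_DE_UM                # Eq. (3.16) at 562 nm
measured = [1.0 + math.cos(2 * math.pi * (p - delta_true) / lam)
            for lam, p in zip(wavelengths, paths)]
# Start from a deliberately wrong initial estimate (offset chosen arbitrarily).
delta_1, delta_2 = two_stage_fit(delta_true - 0.313713, wavelengths, paths, measured)
print(f"deviation from the true OPD: {abs(delta_2 - delta_true) * 1e3:.3f} nm")
```

The exhaustive search evaluates roughly 15 000 candidate spectra per measured spectrum, which is why restricting the fit to a spectral window and optimizing the iteration scheme both reduce the processing time.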

Instead of fitting the whole spectrum, the fit was carried out in a region of interest (ROI) in order to reduce processing time, as only about 25 % of the gathered data had to be processed. The processing time of a whole profile with this spectral ROI was about 2 seconds (see Footnote 5). The ROI was selected as a fixed set of 530 data points distributed symmetrically around the equalization wavelength. The size of the ROI was determined in preliminary experiments in order to include at least one spectral modulation to each side of \(\lambda _{eq}\) which is dependent on the used DE. It is not expected that the size of the ROI influences the resolution of the setup.

Figure 3.14
figure 14

Example of the two-stage fitting routine with a) plot of a measured intensity signal at \(\lambda _{eq}\) with a selection of fit curves where the OPD is separated by \(\Delta \delta \) = 5 nm for each iteration and b) corresponding error sum of squares (SSE) for the different \(\Delta \delta \) and an interpolating curve plotted in black where the arrows indicate a magnified plot of the area in close proximity of the minimum SSE

As described above, the fit of the measured data in close proximity of the equalization wavelength significantly enhances the resolution, see Eq. (3.14). In order to prove the resolution limit experimentally, a sample data set was evaluated regarding its regression error towards simulated data sets within a range of path length differences \(\Delta \delta \), Fig. 3.14 a). Because the regression minimum can be searched with an SSE approach, the best fitting data set can be used to determine the path length difference as a basis for the height calculation. This method assumes that the calculated minimum value of regression \(SSE_{min}\) is the center of a confidence interval which has a variance \(\sigma ^2\). The variance can be computed using the number of fitted parameters \(m_f\) = 3 and the number of points in the fit interval \(n_f\) = 530 with, [260] p. 287

$$\begin{aligned} \sigma ^2 = \frac{SSE_{min}}{(n_f-m_f)} = {4.89 \times 10^{-6}}. \end{aligned}$$
(3.17)

By using the slope calculated in Fig. 3.14 b), the variance can be used to determine the interval of path length difference with \(\Delta \delta \) = ± 0.27 nm. Using typically N = 10 consecutive measurements, a mean height profile deviation can be calculated with \(\Delta \delta \) and Eq. (3.15) to \(r_{fit}^{exp}\) = 0.085 nm. The intensity of the data at the equalization wavelength influences the resolution as shown in Fig. 3.10. Taking this into account, a further analysis can be done where the data set of Fig. 3.14 a) showed a relative intensity of I = 0.97 arb. units. From the theoretical calculation of Eq. (3.14), Fig. 3.10 it can be deduced that this intensity corresponds to a theoretical resolution of \(r_{fit}\) = 0.087 nm. This is well aligned with the expected experimental value. Further experimental evaluation in the spatial domain was performed in subsection 3.4.
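As a plausibility check of the quoted numbers, and assuming that Eq. (3.15) scales the interval of the path length difference with \(1/\sqrt{N}\) (the equation is not restated in this subsection, so this reading is an assumption), the values can be reproduced approximately:

$$\begin{aligned} SSE_{min}&= \sigma ^2 \cdot (n_f - m_f) = 4.89 \times 10^{-6} \cdot (530 - 3) \approx 2.58 \times 10^{-3}, \\ \frac{\Delta \delta }{\sqrt{N}}&= \frac{0.27\ \mathrm{nm}}{\sqrt{10}} \approx 0.085\ \mathrm{nm}. \end{aligned}$$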

3.3.4 Error Estimation of the Data Processing

Besides the previously discussed error influences due to the optical setup, the implementation of the analysis algorithm introduces further noise. In order to quantify its influence, a simulation and subsequent analysis were conducted. For this purpose, a simulated height profile with a nominal height of \(z_{nom}\) = 0 was constructed. It was represented by 500 spectra per profile and 10 repetitions. Based on a typical equalization wavelength of \(\lambda _{eq}\) = 562 nm, a set of spectra in close proximity to \(\lambda _{eq}\) was calculated having relative intensities of \(I_{0}(\lambda =\lambda _{eq})\) = 0 – 1 arb. units with \(\Delta I\) = 0.05 arb. units, Fig. 3.15. The dispersive element was assumed to be of N-BK7 with \(t_{DE}\) = 2 mm within this simulation. Every calculated spectrum was subsequently superimposed with white Gaussian noise having an SNR of \(SNR_{simu}\) = 20 – 45 dB in steps of \(\Delta \)SNR = 5 dB. In total, this resulted in 630,000 spectra processed by the analysis algorithm.
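The size of the simulated data set and the addition of noise at a prescribed SNR can be sketched as follows. The sketch assumes that the SNR is defined as the ratio of the mean signal power (including its offset) to the noise variance, which is not specified in the text, and the function names as well as the test signal are hypothetical.

```python
import math
import random

def add_white_gaussian_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian noise so that signal power / noise power matches snr_db."""
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_sigma = math.sqrt(signal_power / 10 ** (snr_db / 10))
    return [s + rng.gauss(0.0, noise_sigma) for s in signal]

def measured_snr_db(clean, noisy):
    """SNR recovered from a clean signal and its noisy copy."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)

# Size of the simulated data set described in the text.
intensities = [round(0.05 * i, 2) for i in range(21)]   # 0 to 1 in steps of 0.05
snr_levels = list(range(20, 50, 5))                     # 20 to 45 dB in steps of 5 dB
total_spectra = 500 * 10 * len(intensities) * len(snr_levels)

rng = random.Random(0)                                  # fixed seed for repeatability
clean = [0.5 * (1 + math.cos(0.01 * k)) for k in range(20000)]
noisy = add_white_gaussian_noise(clean, 25.0, rng)
print(total_spectra, round(measured_snr_db(clean, noisy), 2))
```

With 21 intensity levels and 6 noise levels, 500 spectra per profile and 10 repetitions give the stated total of 630,000 spectra.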

The analysis was concentrated on two important features, the averaged height \(z_{mean}\) as well as its corresponding standard deviation over the length of the 500 spectra per data set \(\Delta z_{min}\) as an indicator for the resolution. By analyzing the mean height and its standard deviation for the spectra at \(I_{0}\) = 0.5 arb. units with respect to the added noise levels, some initial insight can be drawn, Fig. 3.16 a). As expected, the mean height is close to zero while the standard deviation is dependent on the noise level. A mean value of \(z_{mean}\) = \({-2.9\times 10^{-15}}\) nm across all noise levels was obtained. On a closer look, the standard deviation of the height measurement, which can be regarded as a measure for the resolution, can be fitted by an inverse logarithmic model, which corresponds well with the logarithmic scale of the noise levels. Using this fitted equation, the influence of the used algorithms could be estimated.

Figure 3.15
figure 15

Example of the simulated data set for the characterization of the influence due to noise from the data processing routines where the curves are separated by \(\Delta I\) = 0.05 in a range of \(I_{0}\) = 0 – 1

Figure 3.16
figure 16

Depiction of the analyzed simulated data in relation to the different noise levels at a relative intensity of \(I_{0}\) = 0.5 arb. units with a) mean height data of averaged profiles \(z_{mean}\) and related standard deviation \(\Delta z_{min}\) where the nominal value was \(z_{nom}\) = 0 as well as the slope of \(\Delta z_{min}\) with a fitted inverse logarithmic relationship with respect to the added noise; b) Plot of the measured SNR from the power of \(z_{mean}\) over the length of 500 analyzed spectra per data set with respect to the simulated noise levels displayed on top of a plot of the effective, noise-dependent influence of the algorithm on the resolution \(\Delta z^{alg}_{min}\)

It is known from the analysis of the spectral intensity signal of the light source, Fig. 3.8, that the typical noise is 21.25 dB. By evaluating the fitted curve of the standard deviation of \(z_{mean}\), a value of \(\Delta z_{min}(21.25\ \mathrm{dB})\) = 0.16 nm can be found. As this value differs significantly from the calculated minimal resolution for this noise level of \(r_{fit}\) = 0.029 nm, a further investigation was performed. The power P of the averaged height over the range of S = 500 spectra was used to calculate the signal-to-noise ratio \(SNR_{meas}\) of the height profile z using

$$\begin{aligned} P = \frac{1}{S} \cdot \sum _{s=1}^{S}|z_s|^2\end{aligned}$$
(3.18)
$$\begin{aligned} SNR_{meas} = 10 \cdot \log _{10}{P}, \end{aligned}$$
(3.19)

where it was compared to the initially simulated noise \(SNR_{simu}\), Fig. 3.16 b). It can be seen that the relation follows a linear slope as expected, but a general offset of about 5 dB is present. Under consideration of this offset, the resolution for the typical noise of the system in an experimental situation was re-calculated as \(\Delta z_{min}(21.25\ \mathrm{dB})\) = 0.088 nm. As this value takes the complete processing of data into account, it is supposed to be the expected minimal resolution of the system at this noise level. A subtraction of the noise-related resolution according to Eq. (3.14) from the resolution calculated in this simulation (see bottom plot of Fig. 3.16 a)) results in the effective influence of the algorithm on the resolution \(\Delta z^{alg}_{min}\), bottom plot of Fig. 3.16 b). Evidently, it depends on the effective noise of the system.
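Eqs. (3.18) and (3.19) translate directly into a few lines of code. The sketch below follows the equations as written, with the height profile given in nm; the function names and the example profile are hypothetical.

```python
import math

def height_power(z):
    """Mean power of the height profile over S spectra, Eq. (3.18)."""
    return sum(abs(v) ** 2 for v in z) / len(z)

def snr_meas_db(z):
    """Eq. (3.19) as written in the text: ten times the decadic logarithm of P."""
    return 10 * math.log10(height_power(z))

# Hypothetical profile of S = 500 heights alternating by +/- 0.1 nm around zero.
profile_nm = [0.1, -0.1] * 250
print(round(snr_meas_db(profile_nm), 3))
```

Comparing this value with the simulated noise level for each data set gives the linear relation and the offset discussed above.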

According to Eq. (3.14), the resolution \(\Delta z_{min}\) of the DE-LCI setup is also dependent on the relative intensity at the equalization wavelength \(\Delta I\). Consequently, the height measurement, the standard deviation and the SSE were analyzed in relation to the relative intensity. In correspondence to the finding of Fig. 3.16 b), this investigation was performed at a noise level of 25 dB, Fig. 3.17 a). From the height measurement it can be derived that the measured values are normally distributed around the nominal value \(z_{nom}\) = 0. More interestingly, the values for the resolution with respect to the relative intensity can be fitted using the already derived relationship from Eq. (3.14). This fit supports the finding that the algorithm has a significant influence on the resolution which prevents it from achieving a minimal theoretical resolution of \(r_{fit}\) = 0.029 nm.

Figure 3.17
figure 17

a) Plot of the measured resolution \(\Delta z_{min}\) with respect to the relative intensity at the equalization wavelength \(\Delta I\) at a simulated noise level of 25 dB and b) three-dimensional representation of the influence of simulated noise and change in relative intensity on the height resolution

As a result of the simulations, the noise behavior of the data processing routines can be plotted against all three influences, the relative intensity, the original SNR level as well as the standard deviation of the measured average height, Fig. 3.17 b). From this plot, the described trends can be analyzed by direct comparison. The mean standard deviation for the height estimation follows an inverse logarithmic relationship with an increase in SNR. Simultaneously, the expected relationship to the relative intensity \(\Delta I\) according to Eq. (3.14) is visible.

In typical measurement scenarios, the characterization of height profiles is desired in one lateral dimension or even in an areal fashion. As other approaches have shown, gathering line profile data either by scanning the beam or by moving the sample relative to the beam is common. In either case, the accuracy of the moving parts has an influence on the accuracy of measurement. In order to evaluate this influence, an experiment with a plane mirror on a translation stage (Z812B, Thorlabs Inc., USA) was conducted, Fig. 3.18 a). The translation stage was moved continuously over a range of 2.5 mm during the experiment along the x-axis. Simultaneously, data was collected with the interferometer. As the same mirrors with high flatness were used on the translation stage as well as reference mirror, the result should ideally show no height differences between both arms, Fig. 3.18 b). It is visible in the data that the translation stage introduces a significant amount of height differences to the measurements. Over a scanning range of 2.5 mm the stage oscillates in z-direction up to ± 1.5 \(\upmu \)m. In order to achieve a resolution in the nanometer range, such deviations have to be diminished or calibrated.

Figure 3.18
figure 18

a) Principle of a setup for the characterization of axial influences from lateral movement where a TLS—translation stage moves a SM—sample mirror in the x-dimension. The sample arm beam is recombined with the beam from the RM—reference mirror using a BS—beamsplitter in order to characterize the errors introduced as height changes in the z-dimension interferometrically and b) Result of the translation stage evaluation showing the introduced, oscillating height error over a scanning range in the x-dimension

Figure 3.19
figure 19

a) Experimental setup with WLS—white light source, BS—beamsplitter, DE—dispersive element (having the thickness \(t_{DE}\) and the refractive index \(n^{DE}(\lambda )\)), REF—reference mirror, SMP—sample profile including the points \(z_1(x_1,y_1)\) and \(z_2(x_2,y_2)\) which are imaged with a given magnification M (typically M = 1.3 or 4) by the L1—imaging configuration, relayed by a FM—folding mirror onto the slit of the IMSPEC—imaging spectrometer as magnified points \(z_1'(x_1,y_1)\) and \(z_2'(x_2,y_2)\) and a detailed view of the same in b) with SPT—measurement spot, SLT—slit, L2—collimating lens, GRT—grating, L3—imaging optics which is used to realize an internal magnification (refer also to the appendix in the Electronic Supplementary Material (ESM)) and CAM—camera where the spectral information for every point on the line in x-dimension is recorded

3.4 Two-Dimensional Approach and Characterization

The estimations of section 3.2 demonstrated the capability to perform high-resolution profilometry with the dispersion-encoded low-coherence approach. But they also demonstrated the scanning mechanism’s influence on deviations of the result. In order to avoid these and other unwanted deviations from e.g. thermal or vibrational influences, an imaging approach was developed to gather two-dimensional data without scanning, Fig. 3.19 a). In this configuration the reference arm is composed of an element with known dispersion (here Schott N-BK7, \(t_{DE}\) = 2000 \(\upmu \)m) and a plane mirror. The sample arm holds a sample with a varying height profile along the x-y plane noted with z(x,y). The recombined collimated light from sample and reference arm is imaged on the slit of an imaging spectrometer. Within this spectrometer the light is spectrally decomposed and imaged onto the two-dimensional complementary metal-oxide-semiconductor (CMOS) array of a camera, Fig. 3.19 b). In contrast to a single-line detector of a standard spectrometer, this configuration enables the detection of spectra at every point on a line in the x-dimension of the measurement spot. The information is only selected from one position in the y-direction which means that the acquisition of height profiles along a single line at once becomes possible. Following this approach, the recorded signal enables the detection of spectral interferograms along a line of the x-dimension which can be described analogously to Eqs. (3.4) and (3.6) with

$$\begin{aligned}&I(x,\lambda ) = I_0(\lambda ) \cdot \left[ 1 + \cos \varphi (x,\lambda ) \right] \end{aligned}$$
(3.20)
$$\begin{aligned}&\text {with } \varphi (x,\lambda ) = 2\pi \, \frac{\left[ n^{DE}(\lambda ) - 1 \right] t_{DE} -\delta (x)}{\lambda }, \end{aligned}$$
(3.21)

where \(I_0(\lambda )\) is the initial spectral intensity before the beamsplitter and \(\varphi (x,\lambda )\) is the absolute phase of the signal at every point in the x-dimension which is dependent on the OPD between both arms denoted with \(\delta (x)\) and the wavelength \(\lambda \). The signal detected by the camera of the imaging spectrometer is composed of stacked spectral interferograms where the resolution of the x-dimension is dependent on the magnification M of the interferometer and the spectrometer construction, see Fig. 3.19. Typically, magnifications of M = 1.3 and 4 were used within this work. For all investigations within this work, an imaging spectrometer with the following parameters was designed and built, Tab. 3.3. A detailed calculation including ray-tracing and optimization of optical components for the designed imaging spectrometer as well as of the calibration methods can be found in the appendix in the Electronic Supplementary Material (ESM).
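A stacked spectral interferogram according to Eqs. (3.20) and (3.21) can be simulated with a short sketch. Several assumptions are made: the spectral intensity \(I_0\) is constant, the refractive index of N-BK7 is computed from the Sellmeier coefficients of the public Schott datasheet, the path length difference \(\delta (x)\) is passed in directly (the conversion from sample height to \(\delta \) is not part of the sketch), and the spectral range, the OPD values and the function names are hypothetical.

```python
import math

# Sellmeier coefficients of Schott N-BK7 (wavelength in µm), public datasheet values.
SELLMEIER_B = (1.03961212, 0.231792344, 1.01046945)
SELLMEIER_C = (0.00600069867, 0.0200179144, 103.560653)
T_DE_UM = 2000.0  # thickness of the dispersive element in µm

def n_bk7(lam_um):
    lam_sq = lam_um ** 2
    return math.sqrt(1 + sum(b * lam_sq / (lam_sq - c)
                             for b, c in zip(SELLMEIER_B, SELLMEIER_C)))

def spectral_interferogram(delta_um, wavelengths_um, i0=1.0):
    """I(x, lambda) per Eqs. (3.20) and (3.21); one row per lateral position x."""
    rows = []
    for delta in delta_um:
        rows.append([
            i0 * (1 + math.cos(2 * math.pi * ((n_bk7(lam) - 1) * T_DE_UM - delta) / lam))
            for lam in wavelengths_um
        ])
    return rows

wavelengths = [0.450 + 0.001 * i for i in range(301)]   # 450 to 750 nm, assumed range
delta_base = 1080.0                                      # µm, assumed working point
# Hypothetical profile: the OPD changes by 0.2 µm in the middle of the line.
delta_x = [delta_base if x < 50 else delta_base + 0.2 for x in range(100)]
image = spectral_interferogram(delta_x, wavelengths)
print(len(image), len(image[0]))
```

Each row of `image` corresponds to one lateral position on the slit, so a change of \(\delta (x)\) along the line appears as a shift of the fringe pattern between rows.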

Table 3.3 Calculated parameters and components for the designed imaging spectrometer.

3.4.1 Height Standard Evaluation

Additionally, experiments have been conducted to evaluate the accuracy of the developed system for the determination of small height steps. For this purpose, a Si-based step standard (VS 0.10, Simetrics GmbH, Germany) was examined, Fig. 3.20. The results of this examination revealed a good ability to resolve nm-sized height steps with a measured height of (101.8 ± 0.1) nm which is in good agreement with the nominal value of (100 ± 7) nm quoted by the manufacturer. The corresponding RMS error with regard to the nominal value was 1.1 nm. The roughness, Ra, of the Si surface was measured as 0.8 nm which scales with a factor of about 8.7 to the roughness measure Rt = 7.0 nm, [261]. This is within the range of 6 – 10 nm quoted by the manufacturer, [262]. The recorded and measured profiles show bat-wing effects at the sharp edges, [263]. The measurement error increases in the regions of these effects due to diffraction and deflections. Deviations of up to 20 nm occur, marked with red ellipses in Fig. 3.20 b) and c). These deviations were attributed to diffraction effects visible as additional intensity modulation in the spectral interference raw data, Fig. 3.20 b). For calibration purposes, the oscillations of the diffraction can be modeled as Fourier filtering by the aperture of the capturing optical system, [263]. In relation to the simulated raw data, Fig. 3.20 a), it was visible that not only diffraction occurs, but other distortions as well. In case of a flat, properly aligned sample, the spatial distribution of the maxima and minima is parallel to the x-axis of the plot, see Fig. 3.20 a). It can be seen in the actual measured data that this was not the case, Fig. 3.20 b). This was the result of a slight tilt of the sample (about \(0.11\,\)) in relation to the sample arm. It was corrected during post-processing of the final measured profiles assuming a linear tilt.

Figure 3.20
figure 20

Results of the measurement of a (100 ± 7) nm nominal height Si-standard with a) simulated spectral interference signal over a spectral range of 333 nm and lateral dimension of 450 \(\upmu \)m with the equalization wavelength \(\lambda _{eq}\) marked and b) corresponding measured spectral interference data with visible intensity modulations due to diffraction (marked with red ellipses) and c) calculated mean height profile from the raw data with diffraction-induced bat-wing effects at the sharp edges (marked with red ellipses) having a measured height of (101.8 ± 0.1) nm

On the same standard, a single edge as well as a series of steps having a pitch of 250 \(\upmu \)m were studied, Fig. 3.21. For these two sample positions, heights of (104.35 ± 0.11) nm and (99.88 ± 0.11) nm were measured, respectively. It is clearly visible that both of these features show the same diffraction effect, which confirms that it is independent of the sample position but dependent on the feature size and slope. Fig. 3.21 a) indicates that the effect has a length of influence of about \(l_i\) = 50 \(\upmu \)m into the profile. When measuring structures with lateral feature sizes smaller than \(2 \cdot l_i\), information of these structures can be obscured. It can be seen in the measured profiles of multiple successive steps with a width of only 125 \(\upmu \)m that an evaluation of e.g. roughness is influenced by this effect, see inset Fig. 3.21 b).

Figure 3.21
figure 21

Plot of measured structures of the Si-based height standard showing diffraction effects with a) single edge having a mean height of (104.35 ± 0.11) nm and b) series of steps with a mean height of (99.88 ± 0.11) nm and diffraction effects of up to 50 \(\upmu \)m lateral size from each edge which influence the step width significantly, see inset

Figure 3.22
figure 22

a) Depiction of the calculation of repeatability as the standard deviation \(\sigma _z(x)\) of multiple profiles \(z_i(x)\) according to Eq. (3.22) as well as the resolution \(\Delta z_{min}\) as the standard deviation of the feature height \(h_i\) according to Eq. (3.23) and b) plot of the spatially resolved repeatability of the nm-sized height standard of Fig. 3.20 c) where the impact of diffraction is visible at \(x =\) 150 – 210 \(\upmu \)m and \(x =\) 300 – 350 \(\upmu \)m with an inset to visualize the magnitude of \(\sigma _z(x)\) between \(x =\) 50 – 150 \(\upmu \)m

3.4.2 Repeatability and Resolution Characterization

The error of the system can be analyzed by utilizing two measures where one is the repeatability, defined by the standard deviation \(\sigma _z(x)\) of multiple profiles \(z_i(x)\) gathered in a short time frame. The second measure is the resolution, calculated as the standard deviation \(\Delta z_{min}\) of a feature such as height \(h_i\), Fig. 3.22 a). In order to analyze the repeatability, the structure presented in Fig. 3.20 c) was measured N = 10 times in a row without any other delay than the acquisition and data transfer time. The analysis of the standard deviation of the profiles with respect to their mean, \(\overline{z(x)}\), allows conclusions to be drawn on the repeatability,

$$\begin{aligned} \sigma _z(x) = \sqrt{\frac{1}{N-1} \sum _{i=1}^{N}\left( z_i(x) - \overline{z(x)} \right) ^2 }. \end{aligned}$$
(3.22)

It can be noticed by analyzing the standard deviation in relation to the lateral dimension \(\sigma _z(x)\) that the error significantly increased due to the sharp edges and the diffraction effects, Fig. 3.22 b). In order to estimate the repeatability of the setup in this configuration, the standard deviation was evaluated without the data points that were affected by diffraction (\(x =\) 150 – 210 \(\upmu \)m and \(x =\) 300 – 350 \(\upmu \)m). A mean value of \(\overline{\sigma _z(x)}\) = 0.13 nm was calculated. The value of \(\overline{\sigma _z}\) = 0.13 nm, measured on a low scattering nm-height standard, is expected to be the lower limit of the setup, as sections of the sample with disturbing influences were excluded from the calculation. In order to characterize the repeatability of the setup further, an experiment was designed where the path length difference was altered with a translatable sample, Fig. 3.23 a). The designed sample was a flat piece of a Si-wafer (15 × 15 mm\(^{2}\)) diced and mounted on a piezo stack (\(\Delta \) = 1.8 \(\upmu \)m per 100 V, PA3JEW, Thorlabs Inc., USA) which was attached to a bulk glass substrate. During the experiment, the voltage of the piezo was discretely controlled (power supply QL355TP, Aim and Thurlby Thandar Instruments, United Kingdom) and monitored (digital multimeter DM3068, RIGOL Technology Co., Ltd, China) to adjust the path length difference in defined steps of \(\Delta \delta \) = 0.2 nm. For every adjusted step, N = 10 consecutive measurements were taken, Fig. 3.23 b). The data for the three pictured positions demonstrates that the setup is capable of resolving steps of 0.2 nm as the standard deviations hardly overlap. The mean standard deviation of every position was calculated as \(\overline{\sigma (\Delta \delta =0)}\) = 0.11 nm, \(\overline{\sigma (\Delta \delta =0.2)}\) = 0.13 nm, \(\overline{\sigma (\Delta \delta =0.4)}\) = 0.10 nm. These values correspond well with the measurements on the Si height standard shown previously and demonstrate the limit of the current setup with regard to stability.
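The repeatability evaluation of Eq. (3.22), including the exclusion of the regions affected by diffraction, can be sketched as follows. The profiles are replaced by hypothetical Gaussian random data with an assumed spread of 0.13 nm, the lateral sampling of 1 \(\upmu \)m is an assumption, and the function names are not taken from this work.

```python
import random
import statistics

# Lateral regions excluded in the text because of diffraction (in µm).
EXCLUDED_UM = ((150.0, 210.0), (300.0, 350.0))

def repeatability(profiles):
    """sigma_z(x) of Eq. (3.22): sample standard deviation over N profiles at each x."""
    return [statistics.stdev(column) for column in zip(*profiles)]

def mean_repeatability(x_um, sigma, excluded=EXCLUDED_UM):
    """Mean of sigma_z(x) over all positions outside the excluded regions."""
    kept = [s for x, s in zip(x_um, sigma)
            if not any(lo <= x <= hi for lo, hi in excluded)]
    return statistics.fmean(kept)

rng = random.Random(1)                       # fixed seed for repeatability
x_um = [float(i) for i in range(451)]        # assumed 1 µm lateral sampling
# N = 10 hypothetical flat profiles in nm with Gaussian noise.
profiles = [[rng.gauss(0.0, 0.13) for _ in x_um] for _ in range(10)]
sigma_z = repeatability(profiles)
print(round(mean_repeatability(x_um, sigma_z), 3))
```

`statistics.stdev` uses the \(N-1\) normalization of Eq. (3.22), so no additional correction is needed.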

Figure 3.23
figure 23

a) Depiction of the modified setup to characterize the repeatability where the incoming electric field \(E_{in}\) is split by a BS—cube beamsplitter, while 50 % of the light passes the DE—dispersive element (N-BK7, \(t_{DE}\) = 2 mm) and gets reflected on the REF—reference mirror. In the second arm, light gets reflected off the designed sample which was a piece of a Si—silicon wafer attached to a PZO—piezo stack (\(\Delta \) = 1.8 \(\upmu \)m per 100 V, PA3JEW, Thorlabs Inc., USA) which was mounted on a bulk glass substrate. The voltage of the PZO was remotely controlled in order to change the path length difference in steps of \(\Delta \delta \) = 0.2 nm and b) plot of the recorded averaged surface profiles along the x-dimension and the corresponding standard deviation shaded around the slopes for three different positions of the piezo stack

The gathered data was further analyzed using a two-sample Student’s t-test in order to statistically evaluate if the measured averaged slopes are significantly different from each other, [260]. Mean values of \(\hat{t}_{12}\) = 3.282 and \(\hat{t}_{23}\) = 3.619 were calculated between the profiles with \(\Delta \delta \) = 0 / \(\Delta \delta \) = 0.2 nm and \(\Delta \delta \) = 0.2 nm / \(\Delta \delta \) = 0.4 nm respectively. In comparison with the tabulated value of \(t_{t}\) = 3.250 (with n = 9 at 99 % probability) it was found that \(\hat{t} > t_{t}\) which rejects the null hypothesis. Consequently, the averaged profiles are significantly different in this experiment. Furthermore, this test can be used to estimate the minimal repeatability for the case where \(\hat{t}\) = \(t_{t}\). It was found to be \(\Delta \delta _{min}\) = 0.12 nm for a 95 % probability and \(\Delta \delta _{min}\) = 0.18 nm for a 99 % probability.
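The test statistic can be illustrated with a short sketch. The text does not state which variant of the two-sample test was used, so the pooled-variance form for equal variances is an assumption, as is the relation used in `minimal_difference`, which follows from that form for two groups of equal size and equal spread. The measurement values are hypothetical random data, and only the tabulated value of 3.250 is taken from the text.

```python
import math
import random
import statistics

T_TABLE_99 = 3.250  # tabulated value quoted in the text (n = 9, 99 % probability)

def two_sample_t(a, b):
    """Absolute pooled-variance two-sample Student's t statistic."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / (n_a + n_b - 2)
    diff = abs(statistics.fmean(a) - statistics.fmean(b))
    return diff / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

def minimal_difference(t_critical, sigma, n):
    """Difference of means at which t equals t_critical (equal sigma and group size n)."""
    return t_critical * sigma * math.sqrt(2 / n)

rng = random.Random(2)                                  # fixed seed for repeatability
# Hypothetical heights in nm at two piezo positions, N = 10 measurements each.
position_1 = [rng.gauss(0.0, 0.12) for _ in range(10)]
position_2 = [rng.gauss(0.2, 0.12) for _ in range(10)]
t_hat = two_sample_t(position_1, position_2)
print(f"t = {t_hat:.3f}, significant at 99 %: {t_hat > T_TABLE_99}")
print(f"minimal difference for sigma = 0.12 nm: {minimal_difference(T_TABLE_99, 0.12, 10):.3f} nm")
```

Because the spread entering `minimal_difference` is an assumed value, the printed result is only indicative and is not meant to reproduce the values of \(\Delta \delta _{min}\) stated above.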

The creation and measurement of samples which are close to the proposed resolution of the setup is complex. For the calibration of AFM instruments, height step samples exist which have heights in the size of one atomic layer of silicon, [264, 265]. However, the steps on these samples usually have widths below 1 \(\upmu \)m which are not resolvable with the presented approach. While the repeatability is a measure for temporal fluctuations that occur from one measurement to the other, the ability to resolve structures along the spatial domain (here denoted as the x-coordinate) is independent from these fluctuations. A measure for the resolution can be found in the standard deviation \(\Delta z_{min}\) of feature sizes such as the height of structures \(h_i\) relative to the mean height of multiple measured features, \(\overline{h}\). It can be assumed that in-between short time frames of the acquisition time of single data sets the sample does not change,

$$\begin{aligned} \Delta z_{min} = \sqrt{\frac{1}{N-1} \sum _{i=1}^{N}\left( h_i - \overline{h} \right) ^2 }. \end{aligned}$$
(3.23)

In case of the nm-sized Si-height standard, the height was measured as the difference between the two base levels, \(x_1 =\) 100 – 150 \(\upmu \)m and \(x_2 =\) 350 – 400 \(\upmu \)m, and the top plateau of the step at \(x_3 =\) 225 – 275 \(\upmu \)m. The quadratic mean of \(\Delta z_{min}\) for 20 measured heights, and therefore the resolution, was found to be 0.1 nm. The standard deviation of the feature size represents a cumulative measure for the resolution which includes influences of the optical setup, the electronics, the calibration routines and the data processing alike. During the data analysis of the presented results it became obvious that the difference between the calculated minimal resolution of 0.088 nm, see bottom plot of Fig. 3.16 a), and the minimal measured resolution of 0.1 nm is partly due to data processing routines. As the recorded profiles usually were tilted by a minor degree, an appropriate tilt correction was performed based on the linear fit of every captured surface profile. Although the tilt correction was optimized, a minor influence on the standard deviation cannot be excluded.
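The height evaluation with tilt correction and the resolution according to Eq. (3.23) can be sketched as follows. The regions \(x_1\), \(x_2\) and \(x_3\) are taken from the text. Fitting the line to the base regions only is an assumption of this sketch (the text only states that a linear fit of every profile was used), and the synthetic profile, the list of heights and the function names are hypothetical.

```python
import statistics

BASE_REGIONS_UM = ((100.0, 150.0), (350.0, 400.0))   # x_1 and x_2
TOP_REGION_UM = ((225.0, 275.0),)                    # x_3

def select(x_um, z_nm, regions):
    """Return the (x, z) pairs that lie inside any of the given regions."""
    return [(x, z) for x, z in zip(x_um, z_nm)
            if any(lo <= x <= hi for lo, hi in regions)]

def remove_tilt(x_um, z_nm):
    """Subtract a least-squares line fitted to the base regions."""
    points = select(x_um, z_nm, BASE_REGIONS_UM)
    mean_x = statistics.fmean([x for x, _ in points])
    mean_z = statistics.fmean([z for _, z in points])
    slope = (sum((x - mean_x) * (z - mean_z) for x, z in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    return [z - (mean_z + slope * (x - mean_x)) for x, z in zip(x_um, z_nm)]

def step_height(x_um, z_nm):
    """Mean of the top plateau minus mean of both base levels."""
    top = statistics.fmean([z for _, z in select(x_um, z_nm, TOP_REGION_UM)])
    base = statistics.fmean([z for _, z in select(x_um, z_nm, BASE_REGIONS_UM)])
    return top - base

def resolution(heights_nm):
    """Standard deviation of repeated feature heights, Eq. (3.23)."""
    return statistics.stdev(heights_nm)

# Hypothetical tilted profile in nm with a 100 nm step between 200 and 300 µm.
x_um = [float(i) for i in range(451)]
z_nm = [5.0 + 0.01 * x + (100.0 if 200.0 <= x <= 300.0 else 0.0) for x in x_um]
height = step_height(x_um, remove_tilt(x_um, z_nm))
print(round(height, 6), round(resolution([100.0, 100.2, 99.8]), 6))
```

Fitting the line outside the step avoids a bias of the slope for steps that are not centered in the profile, which is one way a tilt correction can affect the standard deviation of the evaluated heights.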

Figure 3.24
figure 24

a) Depiction of the processing chain to fabricate a sample with a step of 3 nm nominal height where (I) polyimide tape is placed as mask on a Si-substrate and a layer of ITO is sputtered on the substrate having the desired thickness. (II) The removal of the tape finishes the step formation. (III) In order to generate a sample with a uniformly reflecting surface, a layer of 40 nm Ti is sputtered onto the sample which maintains the step. b) Plot of the averaged height profile of the step with \(h_{stp}\) = 3.45 ± 0.19 nm using the DE-LCI setup

As a supplement, a step with a nominal height of \(h_{nom}\) = 3 nm was created using standard semiconductor procedures, Fig. 3.24 a). A silicon substrate was prepared with polyimide tape and sputtered with ITO of about 3 nm thickness. Afterwards, the tape was removed and a second sputtering step using titanium was performed. Due to the initial application of tape, the sample has a defined height difference which is maintained after subsequent generation of the titanium layer. This layer was applied in order to generate a uniform, highly reflective surface and prevent any thin-film interferences which a sole layer of ITO would have caused.

The created sample was analyzed using the setup similar to the measurements on the height standard where N = 10 consecutive measurements were taken and averaged, Fig. 3.24 b). Using the same scheme as before, the height of the step was determined as \(h_{stp}\) = 3.45 ± 0.19 nm. The result fortifies the claim that the proposed DE-LCI method is capable of measuring surface profiles with sub-nm resolution. In particular, this is supported by the repeatability of \(\overline{\sigma }\) = 0.12 nm and the resolution of the height measurement \(\Delta z_{min}\) = 0.19 nm.

On the basis of the calculated resolution, the DR of the setup was calculated with the measurement range of \(\Delta z\) = 79.91 \(\upmu \)m and the resolution of \(\Delta z_{min}\) = 0.1 nm as DR = \({7.99\times 10^{5}}\). Compared to the latest findings of other areal profilometer approaches such as of Reichold et al., [119], the achieved dynamic range is about 5.8 times higher.
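Written out with the values given above, the dynamic range follows as the ratio of the measurement range to the resolution:

$$\begin{aligned} DR = \frac{\Delta z}{\Delta z_{min}} = \frac{79.91 \times 10^{-6}\ \mathrm{m}}{0.1 \times 10^{-9}\ \mathrm{m}} \approx 7.99 \times 10^{5}. \end{aligned}$$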

3.4.3 Edge Effects

The occurrence of edge effects on sharp edges is caused by different sources. While diffraction and scattering play an important role, filtering effects of the aperture, shadowing from the sample’s steep slopes as well as interferometric mixing of components of the different height levels due to the lateral resolution are also relevant. A rough estimation of the influence of the aperture revealed that its contribution is negligible, such that it is assumed that diffraction due to the spatial coherence properties of the light source is dominant. As these influences are mixed with an unknown ratio, no single model can be used to filter the signal appropriately. For this reason, a deconvolution according to the Wiener approach was found to be suitable, [266], Fig. 3.25 a). This approach typically assumes that besides the measured profile U, a response function of the system H is known. Based on these functions, a deconvolution for the ideal filtered profile \(Z^\prime \) can be performed. Using this notation, a measured profile is the convolution of the ideal profile with the response function (see Footnote 6)

$$\begin{aligned} U = H \cdot Z. \end{aligned}$$
(3.24)

Following Wiener's idea, a wild-card function G acting on the measured signal is utilized to minimize the error between the ideal but unknown profile Z and the deconvoluted signal \(Z^\prime \)

$$\begin{aligned} Z^\prime = G \cdot U \end{aligned}$$
(3.25)
$$\begin{aligned} \min (Z^\prime - Z). \end{aligned}$$
(3.26)
Figure 3.25
figure 25

Depiction of the Wiener-based deconvolution routine with a) components of the process where U represents the measured signal (see Fig. 3.21 a), Z is the ideal profile and H is the estimated impulse response function as well as b) comparison of the measured profile U with the deconvoluted profile \(Z^\prime \)

As the response function is known in the typical use case of the Wiener deconvolution, the wild-card function can be constructed from H

$$\begin{aligned} G = \frac{H^*}{|H|^2 + \left( \frac{1}{SNR}\right) }. \end{aligned}$$
(3.27)

In the case investigated here, a modification has to be performed as H is unknown but Z is known. Hence, in order to perform the deconvolution, a wild-card function Q has to be introduced based on Z

$$\begin{aligned} Q = \frac{Z^*}{|Z|^2 + \left( \frac{1}{SNR}\right) } \end{aligned}$$
(3.28)

which can be used to compute the response function

$$\begin{aligned} H = Q \cdot U. \end{aligned}$$
(3.29)

With the aid of this response function, the deconvolution of the original measured signal U can finally be performed using Eqs. (3.27) and (3.25), resulting in the filtered signal \(Z^\prime \). This computation was performed on the measured data of a single silicon edge, originally presented in Fig. 3.21 a), Fig. 3.25 b). The comparison shows that the edge effects are filtered well. The outlined procedure can be made part of a calibration routine when measuring similar structures repeatedly.
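The two-stage procedure of Eqs. (3.27) to (3.29) can be sketched numerically. The example is a minimal illustration under stated assumptions, not the implementation used in this work: U, Z and H are treated as discrete Fourier transforms of the sampled profiles, the SNR is a single scalar, and the test data (a 100 nm plateau blurred by a Gaussian response) are synthetic. All function names are hypothetical.

```python
import numpy as np


def wiener_kernel(spectrum, snr):
    """Return S* / (|S|^2 + 1/SNR), the common form of Eqs. (3.27) and (3.28)."""
    return np.conj(spectrum) / (np.abs(spectrum) ** 2 + 1.0 / snr)


def estimate_response(u_measured, z_ideal, snr):
    """Eq. (3.29): H = Q * U, with Q built from the known ideal profile Z."""
    U = np.fft.fft(u_measured)
    Z = np.fft.fft(z_ideal)
    return wiener_kernel(Z, snr) * U


def deconvolve(u_measured, H, snr):
    """Eqs. (3.27) and (3.25): Z' = G * U, with G built from the response H."""
    U = np.fft.fft(u_measured)
    return np.real(np.fft.ifft(wiener_kernel(H, snr) * U))


def rms(a):
    return float(np.sqrt(np.mean(a ** 2)))


# Synthetic calibration data (assumed): a 100 nm plateau blurred by a Gaussian response.
n = 256
z_ideal = np.zeros(n)
z_ideal[64:192] = 100.0
lag = np.minimum(np.arange(n), n - np.arange(n))  # circular distance to sample 0
h_true = np.exp(-0.5 * (lag / 3.0) ** 2)
h_true /= h_true.sum()
u_measured = np.real(np.fft.ifft(np.fft.fft(h_true) * np.fft.fft(z_ideal)))

snr = 1e3  # assumed scalar signal-to-noise ratio
H_est = estimate_response(u_measured, z_ideal, snr)
z_filtered = deconvolve(u_measured, H_est, snr)

print(f"RMS deviation from ideal: measured {rms(u_measured - z_ideal):.2f} nm, "
      f"filtered {rms(z_filtered - z_ideal):.2f} nm")
```

In a calibration routine, `estimate_response` would be run once on a structure with a known ideal profile, and the stored response would then be passed to `deconvolve` for later measurements of similar structures.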

Additionally, the dependency of the edge effects on the light source used was evaluated. For this purpose, a supercontinuum white-light source (SC) (with \(\Delta \lambda \) = 380 – 1100 nm) and a laser-driven plasma light source (LDP) (with \(\Delta \lambda \) = 200 – 1100 nm) were compared. Due to the limited coherence of the broadband light sources, the interference contrast of the wavelength-integrated signal forms a considerable envelope function which can be detected by using a spectrometer. This data is the basis for the calculation of the coherence length. The coherence length \(l_c\) can be determined exactly as the integral of the area under the normalized degree of coherence. Usually, an approximation can be found by estimating the width of the interferogram at an intensity of , [105]. The respective coherence lengths were determined as \(l_c\) = 1.81 \(\upmu \)m for the LDP light source and \(l_c\) = 1.58 \(\upmu \)m for the SC light source. Subsequently, the same step standard of 100 nm nominal height was measured with both light sources. While the measured height was comparable for both light sources, the behavior at the edge of the standard differed, Fig. 3.26. The analysis reveals that a much steeper slope can be measured using the LDP source. A linear approximation results in a slope of 0.94  for the SC source and a slope of 1.82  for the LDP source. Furthermore, it can be seen that the formation of edge effects is stronger in the measurements with the SC light source. As the coherence length is very similar for both light sources, a possible explanation is that the spatial coherence of the SC light source is higher. Due to its operation principle, highly spatially coherent light of a microchip laser is used to generate a broadened spectrum. In contrast, the LDP light source generates its broad spectrum from a process that is random both temporally and spatially.

Figure 3.26
figure 26

Result of the measured slopes using a SC versus a LDP light source

Further knowledge of the precise spatial coherence properties of the used measurement light source as well as of possible sample geometries can help to develop appropriate filter models for the correction methods demonstrated above. The spatial coherence properties of the light source and resulting effects such as diffraction and scattering are the main limiting factors for measurements requiring high lateral resolutions.

3.4.4 Roughness Evaluation

As surface quality and roughness in particular can be essential for the function of technical products and components, their in-line assessment is of high interest, [21, 82, 267–269]. Various norms and guidelines exist for a number of established measurement technologies such as tactile profilometers, confocal microscopes and AFM, [17–19]. Most commonly, quantification methods based on the distribution of heights like the averaged roughness Ra and the root-mean-square roughness Rq are determined.

As the demonstrated DE-LCI approach is capable of capturing precise height profiles over a large lateral measurement range of several hundred micrometers, an application for roughness evaluation of samples is possible. For the initial qualification, a PTB-traceable surface roughness standard (KNT 20170/3 superfine, Halle GmbH, Germany) was analyzed, Fig. 3.27. According to the specifications of PTB, surface profiles were captured on the slices A, B and C in segments of (I), (II) and (III) in each slice. Finally, the results of the roughness measurement of all segments were averaged. According to ISO 4288-1996, surface quality is assessed by the application of appropriate processing steps and filters, [17]. Following these processing steps, the captured surface profiles were form corrected (\(\Lambda _s\)-filtering according to ISO 3274-1996 [270]) and referenced to their respective mean values in order to generate the so-called primary profile, Fig. 3.28 a). Based on this profile, a Gaussian filter was applied to calculate the waviness and the roughness of the primary profile, Fig. 3.28 b) and c) respectively. Here, the application of the low-pass filter results in the waviness profile, while the application of the high-pass filter results in the roughness profile. The cut-off wavelength \(\Lambda _c\) is determined in correspondence with the evaluation length of the profile which should be approximately 5 \(\cdot \Lambda _c\). In case of the roughness standard, the captured evaluation length was \(L_c\) = 1.25 mm as the filter length was \(\Lambda _c\) = 250 \(\upmu \)m. In order to exclude possible deviations on the roughness calculation that are due to filtering effects, the actual measured length was expanded equally by  \(\cdot \Lambda _c\) at the beginning and the end of the profile. By applying this methodology to the data captured from the roughness standard, values of Ra = (21.15 ± 0.8) nm and Rq = (26.58 ± 1.0) nm were calculated. 
The value of Ra is within the specifications given by the PTB calibration which measured a mean value of Ra = (22.4 ± 0.5) nm with an uncertainty of ± 5 %. As a second measure of comparison, a roughness evaluation using a confocal microscope was performed. Using the same probing scheme highlighted in Fig. 3.27, data was captured in stitching mode as the 50x magnification objective was needed to achieve sufficient axial resolution and is restricted to a lateral measurement range of 220 \(\upmu \)m. The captured data was filtered for noise using a \(\Lambda _c\) = 8 \(\upmu \)m Gaussian low-pass filter and analyzed in the same way as the interferometric data. The roughness parameters were calculated with Ra = (21.42 ± 0.6) nm and Rq = (26.81 ± 0.7) nm. These values support the interferometric values within the respective errors.
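The separation into waviness and roughness followed by the calculation of Ra and Rq can be sketched as follows. This is a minimal illustration, not the evaluation code of this work: it assumes an equidistantly sampled, already form-corrected profile, uses the standard Gaussian weighting function with the constant \(\alpha = \sqrt{\ln 2/\pi }\), and discards a user-chosen margin at both ends to suppress filter end effects. The margin of one cut-off length and the synthetic two-sine profile are assumptions made for the example only.

```python
import numpy as np

ALPHA = np.sqrt(np.log(2.0) / np.pi)  # constant of the Gaussian profile filter, about 0.4697


def gaussian_lowpass(profile, dx, cutoff):
    """Waviness: convolution with a Gaussian weighting function of cut-off `cutoff`."""
    half = int(round(cutoff / dx))  # kernel support of +/- one cut-off length
    x = np.arange(-half, half + 1) * dx
    weights = np.exp(-np.pi * (x / (ALPHA * cutoff)) ** 2)
    weights /= weights.sum()
    return np.convolve(profile, weights, mode="same")


def roughness_parameters(profile, dx, cutoff, margin):
    """Return (Ra, Rq) of the high-pass filtered profile, ignoring `margin` at both ends."""
    primary = profile - np.mean(profile)          # reference to the mean value
    waviness = gaussian_lowpass(primary, dx, cutoff)
    roughness = primary - waviness                # high-pass = primary minus low-pass
    k = int(round(margin / dx))
    core = roughness[k:roughness.size - k]
    ra = float(np.mean(np.abs(core)))
    rq = float(np.sqrt(np.mean(core ** 2)))
    return ra, rq


# Synthetic profile (assumed): 50 nm waviness with 1000 um period plus
# 30 nm roughness with 25 um period, sampled every 1 um.
dx = 1.0          # um
cutoff = 250.0    # um, as used for the roughness standard
x = np.arange(0.0, 1250.0 + 2 * cutoff, dx)
profile = 50.0 * np.sin(2 * np.pi * x / 1000.0) + 30.0 * np.sin(2 * np.pi * x / 25.0)

ra, rq = roughness_parameters(profile, dx, cutoff, margin=cutoff)
print(f"Ra = {ra:.2f} nm, Rq = {rq:.2f} nm")
```

For a pure sine of amplitude A, Ra = 2A/π and Rq = A/√2, so the example should return values close to 19 nm and 21 nm, with a small residual from the waviness component that the filter does not fully remove.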

Figure 3.27
figure 27

Schema of the measured PTB-traceable surface roughness standard where measurements are taken in the slices A, B, and C with segments of (I), (II) and (III) all having the same evaluation length of \(L_c\) = 5 \(\cdot \Lambda _c\)

Figure 3.28
figure 28

Result of the measurements on a PTB-traceable surface roughness standard with a) primary, form corrected profile, b) waviness profile (Gaussian low-pass filtration of a)) and c) roughness profile (Gaussian high-pass filtration of a)) both using a cut-off wavelength of \(\Lambda _c\) = 250 \(\upmu \)m

It is known that roughness parameters which are based on amplitude values of the height distribution, such as Ra and Rq, are not directly comparable between different measurement techniques, [267]. This is particularly due to the different spatial bandwidth limitations of the individual techniques, [271]. To enhance comparability, alternative methods to determine the RMS roughness were established. The most commonly used ones are based on the integral of either the auto-correlation function (ACF) or the power-spectral-density function (PSDf) of a profile, [272–274]. In order to compare optically measured surfaces with tactilely measured ones, data from the same standard sample was captured using both methods, Fig. 3.29 a). It can be seen from the roughness profiles that both data sets show similar amplitudes. However, the data from the tactile measurement shows significant noise, which was measured as ± 2.6 nm. This noise can be attributed to the typical noise of a tactile measurement system consisting of mechanical and electronic components, [275]. With the expected roughness levels of about 20 nm, this noise has an influence on the roughness measurements. For the comparison of both methods, the respective auto-correlation functions were determined using a Fourier-based approach, Fig. 3.29 b). The acquired data was subsequently fitted using a Gaussian approximation in order to perform further analysis, [276]. A fundamental analysis is the determination of the correlation length \(\tau _x\). Most commonly, it is measured as the distance x where the approximated ACF reaches a value of , [277]. While the analysis of the tactile measurement yielded a value of \(\tau _x\) = 11.37 \(\upmu \)m, the nine analyzed measurements of the interferometric evaluation resulted in a value of \(\tau _x\) = (11.69 ± 1.0) \(\upmu \)m.
Obviously, the measurements of the different technical approaches show a high similarity within the standard deviation of the measurement. Furthermore, the amplitude of the fitted ACF can be interpreted as the squared RMS roughness of the measured data. The comparison of these properties reveals the influence of noise on the tactile measurement. A value of \(\sigma _{rms}\) = 27.6 nm was calculated for the tactile measurement, while a mean value of \(\sigma _{rms}\) = (21.46 ± 1.8) nm was calculated for the interferometric measurements. Taking the noise of the tactile measurements into account, both measurements are very similar within the respective error bars.
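The Fourier-based ACF evaluation with a Gaussian fit can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the original evaluation: the ACF is estimated via the Wiener-Khinchin theorem with zero padding, the Gaussian model is \(A \cdot e^{-(x/\tau )^2}\) and is fitted log-linearly to the central lobe only, and the correlation length is read at a level of 1/e, which is a common convention chosen here because the threshold is not restated in this section. The rough test profile is synthetic (smoothed white noise) and all names are hypothetical.

```python
import numpy as np


def autocorrelation(profile):
    """Biased ACF estimate via the Wiener-Khinchin theorem (zero-padded FFT)."""
    z = np.asarray(profile, dtype=float)
    z = z - z.mean()
    n = z.size
    spectrum = np.fft.rfft(z, 2 * n)  # zero padding avoids circular wrap-around
    return np.fft.irfft(np.abs(spectrum) ** 2)[:n] / n


def fit_gaussian_acf(acf, dx, fraction=0.2):
    """Fit A * exp(-(x / tau)^2) to the central lobe; return (tau, sigma_rms)."""
    below = np.flatnonzero(acf <= fraction * acf[0])
    stop = int(below[0]) if below.size else acf.size
    if stop < 3:
        raise ValueError("central lobe too narrow for a fit")
    x = np.arange(stop) * dx
    # ln(ACF) = ln(A) - x^2 / tau^2 is linear in x^2.
    slope, intercept = np.polyfit(x ** 2, np.log(acf[:stop]), 1)
    if slope >= 0:
        raise ValueError("ACF does not decay; Gaussian model not applicable")
    tau = float(np.sqrt(-1.0 / slope))
    sigma_rms = float(np.sqrt(np.exp(intercept)))  # fitted amplitude A = sigma_rms^2
    return tau, sigma_rms


def correlation_length(tau, level=np.exp(-1.0)):
    """Lag at which the fitted Gaussian drops to `level` (assumed 1/e by default)."""
    return float(tau * np.sqrt(-np.log(level)))


# Synthetic rough profile (assumed): white noise smoothed with a Gaussian kernel.
rng = np.random.default_rng(0)
dx = 0.5  # um per sample
kernel = np.exp(-0.5 * (np.arange(-50, 51) / 10.0) ** 2)
profile = np.convolve(rng.standard_normal(8192), kernel, mode="same")

tau, sigma_rms = fit_gaussian_acf(autocorrelation(profile), dx)
print(f"correlation length: {correlation_length(tau):.2f} um, sigma_rms: {sigma_rms:.2f}")
```

The fitted amplitude is returned as its square root, in line with the interpretation of the ACF amplitude as the squared RMS roughness.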

Figure 3.29
figure 29

Results of the comparative roughness evaluation with a) roughness profiles of the PTB-calibrated height standard from an interferometric and tactile measurement as well as b) calculated auto-correlation functions and appropriate fits using a Gaussian approximation

In addition to measurements on a standard, the characterization of an industry-relevant configuration, in particular an aluminum mirror coating on a float glass substrate (Layertec GmbH, Mellingen, Germany), was performed. The coating was applied using magnetron sputtering on one half of the circular substrate for evaluation purposes, Fig. 3.30. Using the averaged data of three measurements on different positions along the coating edge with ten measurements at every position, a mean height of \(z_{AL}\) = (99.47 ± 0.12) nm was measured. In comparison with the data analyzed for the height standard, the edge of the mirror was less steep, which resulted in significantly lower edge effects. Furthermore, the evaluation of sub-nm roughness differences as part of a production-accompanying characterization is of interest. For this purpose, data from the float glass substrate as well as from the aluminum-coated part of the mirror was analyzed with the described roughness methodology, Fig. 3.31. The separation of roughness (III) and waviness (II) from the primary form-corrected profile (I) reveals a distinct difference. While the waviness of both profiles is of the same order of magnitude, the roughness of the aluminum-coated part of the sample is larger. The mathematical analysis resulted in values of Ra = (0.27 ± 0.01) nm and Rq = (0.35 ± 0.01) nm for the substrate area. The roughness of the aluminum-coated part was Ra = (0.38 ± 0.02) nm and Rq = (0.47 ± 0.02) nm. These values correspond well with Ra = 0.31 nm for the substrate and Ra = 0.40 nm for the coating as quoted by the manufacturer.

Figure 3.30
figure 30

Plot of a measured aluminum mirror edge on a float glass substrate with height profile as mean value of ten measurements on one position along the edge having a height of \(h_{AL}\) = (99.47 ± 0.12) nm

In contrast to other technologies such as AFM, scanning white-light interferometric microscopy and confocal microscopy, DE-LCI is able to capture data for roughness evaluation on a large lateral measurement range of a few mm in one single data acquisition. The lack of the necessity to scan a sample eliminates problems of stitching, vibration and speed.

3.4.5 High-Dynamic Range Measurements

In order to measure the performance of the setup with a high dynamic range, where the measurement range is > 10 \(\upmu \)m while the achievable height resolution should still be in the nm-range, a precision-turned height standard (EN14-3, PTB, Germany) was examined. The standard provides grooves of defined heights with steps of 1, 5 and 20 \(\upmu \)m which were subject to a series of measurements, Fig. 3.32. The recorded data includes measured steps of (971.26 ± 0.31), (4951.40 ± 0.28) and (19924.00 ± 0.36) nm. This results in an overall averaged RMS error of 26.9 nm with regard to a measurement on a tactile profilometer. According to the calibration certificate of the standard, these values are within the quoted uncertainty for the nominal height steps of ± 33 nm. Furthermore, the high axial resolution leads to the ability to capture roughness data in the nm-range on all height steps, see inset in Fig. 3.32 a). An RMS value of Rq = 26.7 nm was calculated. Some edge effects and noise occur on the slopes of the steps. In the current optical design, which uses an imaging system with NA = 0.06, a large lateral measurement range could be covered, but data on the slopes with a 70\({^\circ }\) angle could not be gathered reliably. Depending on the application, the setup can be optimized to increase the sensitivity on these parts of the sample. Reference measurements with a tactile profilometer confirmed these heights but emphasized the fact that the transitions between the different levels are formed by segments of 70\(^{\circ }\), Fig. 3.32 b). In contrast, the tactile profilometer is able to gather a much higher number of data points in these areas while having a lower overall resolution.

Figure 3.31
figure 31

Result of roughness measurements on a) an aluminum mirror surface and b) a float glass substrate with (I) primary form corrected profile, (II) waviness profile (Gaussian low-pass filtration of (I)) and (III) roughness profile (Gaussian high-pass filtration of (I)) using a cut-off wavelength of \(\Lambda _c\) = 25 \(\upmu \)m (mean of ten measurements)

Additionally, the repeatability according to Eq. (3.22) was analyzed by investigating N = 10 consecutive measurements of the profile. The sample showed a slightly increased averaged standard deviation of \(\overline{\sigma _z}\) = 0.52 nm with respect to the measurements on a low-scattering silicon sample, Fig. 3.22 b). This can be attributed to influences of noise due to diffraction, scattering and other effects affecting the measured repeatability. In a comparative manner to the analysis of the silicon height standard, see subsection 3.4.2, the heights on the \(\upmu \)m-sized standard, Fig. 3.32 a), were evaluated for 10 measurements as the difference of the two closest base levels (z = 0) to the particular step. From the data, quadratic means of \(\Delta z_{min}\) for the three steps with nominal heights of 1, 5 and 20 \(\upmu \)m were calculated as \(\Delta z_{min1}\) = 0.31 nm, \(\Delta z_{min5}\) = 0.28 nm and \(\Delta z_{min20}\) = 0.36 nm respectively. The result shows that the resolution is not dependent on the size of the measured step.
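The averaged standard deviation used as a repeatability measure can be illustrated with a short sketch. Eq. (3.22) is not restated in this section, so the example assumes that the sample standard deviation (normalization with N − 1) is taken per lateral position over the N consecutive profiles and then averaged over all positions. The function name and the toy data are hypothetical.

```python
from statistics import fmean, stdev


def mean_repeatability(profiles):
    """Average over all lateral positions of the per-position standard deviation.

    `profiles` holds N consecutive height profiles of equal length.
    """
    if len(profiles) < 2:
        raise ValueError("at least two consecutive profiles are required")
    per_position = [stdev(column) for column in zip(*profiles)]
    return fmean(per_position)


# Toy data (assumed): three consecutive profiles with four lateral positions, in nm.
profiles = [
    [0.0, 1.0, 2.0, 3.0],
    [0.0, 1.0, 2.0, 5.0],
    [0.0, 1.0, 2.0, 4.0],
]
sigma_bar = mean_repeatability(profiles)
print(f"averaged standard deviation: {sigma_bar:.2f} nm")
```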

In contrast to the silicon standard, the \(\upmu \)m-sized height standard has a significantly higher roughness which leads to scattering. In consequence, the measurements on this sample were affected by noise which led to a number of outlier data points. The implemented post-processing routines took these outliers into account and corrected them. The outlier correction was performed by detecting rising edges in the first derivative of the profile with respect to the x-coordinate and then correcting the difference between the outlier and the mean value of the five previous data points. In the analysis of the step heights and their standard deviations, it was found that the outlier correction scheme influences the profile on a sub-nm level. As outliers occur at different spatial positions in consecutive measurements, a higher standard deviation was measured compared with measurements where fewer outliers occurred, as on the silicon height standard.
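The outlier correction described above can be sketched as follows. The text does not specify the detection threshold or how genuine step edges are kept apart from outliers, so this minimal example makes two assumptions: the threshold is a user-supplied parameter, and a rising edge only counts as an outlier if the following sample returns to the mean of the five previous points. The function name and the toy profile are hypothetical.

```python
from statistics import fmean


def correct_outliers(profile, threshold, window=5):
    """Replace single-sample spikes by the mean of the `window` previous points."""
    corrected = [float(v) for v in profile]
    for i in range(window, len(corrected) - 1):
        baseline = fmean(corrected[i - window:i])
        # Rising edge in the first derivative (height difference per sample).
        rising = corrected[i] - corrected[i - 1] > threshold
        # Assumption: a real step stays high, an outlier falls back to the baseline.
        falls_back = abs(corrected[i + 1] - baseline) <= threshold
        if rising and falls_back:
            # Remove the difference between the outlier and the baseline.
            corrected[i] = baseline
    return corrected


# Toy profile in nm (assumed): one 50 nm spike followed by a genuine 1000 nm step.
profile = [0.0] * 10 + [50.0] + [0.0] * 5 + [1000.0] * 10
cleaned = correct_outliers(profile, threshold=20.0)
print(cleaned[10], cleaned[16])
```

In this toy profile the spike at index 10 is removed, while the step starting at index 16 is preserved because the profile stays at the new level.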

Figure 3.32
figure 32

a) Plot of an averaged line profile where recorded depths of (971.26 ± 0.31), (4951.40 ± 0.28) and (19924.00 ± 0.36) nm could be measured with a mean RMS error of 26.9 nm with respect to a measurement on a tactile profilometer while having the ability to capture roughness information, which is shown in the inset where a value of Rq = 26.7 nm was calculated, as well as b) overlay of a dataset from the same sample taken with a tactile profilometer which shows significantly better capabilities to capture data on steep edges

3.4.6 Dual-Channel Approach

As surface profile evaluation is crucial at different industrial processing steps, the tomographic evaluation of structures is also of interest. For this purpose, the DE-LCI approach was adapted to the NIR to perform tomographic examinations of Si-based structures, Fig. 3.33 a). The setup was designed to work in a dual-channel configuration where a broadband light source illuminates the sample and a dichroic mirror at 800 nm separates the recombined light for the two analysis channels. In contrast to conventional DE-LCI, the light of the NIR spectral range holds information from inside the sample while light of the VIS spectral range holds only surface information. This holds for samples such as silicon, which are transmissive in the NIR range but not in the VIS. Consequently, an appropriate imaging spectrometer was calculated and designed for a spectral range of \(\Delta \lambda \) = (1133 – 1251) nm based on an indium gallium arsenide (InGaAs) camera (Bobcat 640, Xenics Ltd., Belgium). A detailed description of assumptions, parameters and components is given in the appendix in the Electronic Supplementary Material (ESM).

Figure 3.33
figure 33

a) Schema of the extended optical setup for dual-channel interferometry with WLS—white-light source, BS—beam splitter, SMP—sample having a thickness \(t_{smp}\) and a refractive index \(n^{smp}(\lambda )\) where \(z_1(x_1,y_1)\) and \(z_2(x_2,y_2)\) are two points, one on the surface, one on the back side, REF—fixed reference mirror, DE—dispersive element with the thickness \(t_{DE}\) and \(n^{DE}(\lambda )\), L1—lens to image the sample with a given magnification M (typically M = 1.3 or 4), HP—high-pass filter @800 nm to reflect the VIS part of the spectrum where VISSPEC—VIS imaging spectrometer detects the surface information of the magnified point \(z_1'(x_1,y_1)\) and FM—folding mirror relays the NIR part of the spectrum to NIRSPEC—NIR imaging spectrometer which detects the depth information of the magnified point \(z_2'(x_2,y_2)\) as well as b) simulation of possible measurement ranges of materials suitable as dispersive elements for NIR investigations

In order to generate spectral power densities in the spectral range of (1000 – 1400) nm, an amplified supercontinuum light source (ASC) was used, [250]. The source utilizes an Yb\(^{3+}\) doped photonic crystal fiber as medium for non-linear spectral broadening and as gain medium within a fiber amplifier configuration, Fig. 3.34 a). The configuration utilizes a pump diode laser (\(\lambda _{pmp}\) = 976 nm) to core pump the fiber while a passively Q-switched microchip laser (\(\lambda _{seed}\) = 1064 nm) with pulse durations of 1.3 ns and a variable repetition rate of up to 20 kHz is used to seed the system. The amplification enables several pulse peak power in the desired spectral range and beyond, Fig. 3.34 b). The high pulse peak power as well as the short pulse duration make the light source interesting for dynamic measurements, e.g. to observe MEMS movements. Stroboscopic illumination can be envisioned to achieve high penetration depths and high temporal resolutionFootnote 7.

Figure 3.34
figure 34

a) Schema of the optical setup for the generation of amplified supercontinuum (ASC) in the NIR range with MCL—microchip laser, PLD—pump laser diode (\(\lambda _{pmp}\) = 976 nm), OI—optical isolator, M—mirror, NM—notch mirror, PCFYb—Yb-doped PCF fiber as well as b) the optical spectral pulse peak power density of the light source (both adapted from [250])

Measurement range and resolution in NIR evaluation

With its dispersion characteristics, N-BK7 is a very suitable material to be used for DE-LCI in the VIS where typical measurement ranges of 79.91 \(\upmu \)m are achieved (\(t_{DE}\) = 2 mm). Due to the lower spatial resolution of cameras used for imaging spectrometers in the NIR, the covered spectral range is usually small. Furthermore, most materials have a rather flat \(n(\lambda )\) slope in this spectral region. Both factors limit the possible axial measurement range. Using N-BK7 with \(t_{DE}\) = 2 mm in the range of \(\Delta \lambda \) = (1133 – 1251) nm yields an axial measurement range of \(\Delta z\) = 1.12 \(\upmu \)m. One possibility to increase the range is to use the sample itself as dispersive material if its thickness is known. Under the assumption that a sample is a silicon wafer with a thickness of 100 \(\upmu \)m, the resulting range would be 7.03 \(\upmu \)m. Depending on the application, this range can be suitable to detect buried marks in a wafer. Similar measurement ranges are achievable by replacing N-BK7 with a higher refractive glass. The usage of FK51A would enable a range of \(\Delta z\) = 1.8 \(\upmu \)m while SF11 would lead to \(\Delta z\) = 5.8 \(\upmu \)m assuming a thickness of 2 mm. Significantly higher measurement ranges, which are comparable to the VIS approach, can only be achieved with non-glass materials, Fig. 3.33 b). As already discussed, Si can be used to extend the range as it can reach \(\Delta z\) = 146.04 \(\upmu \)m for \(t_{DE}\) = 2 mm. Even higher ranges can be observed with materials like gallium arsenide (GaAs) where \(\Delta z\) = 238.6 \(\upmu \)m. However, it has to be noted that both Si and GaAs are non-transmissive in the visible spectral range. Therefore, these materials are not suitable in a dual-channel approach where it is desired to simultaneously gather data of both VIS and NIR channels.
A possible measurement mode would incorporate a highly dispersive material which is non-transmissive in the VIS as a reference mirror. In this way, a glass DE can be used for measurements in the VIS while the reference mirror acts as a DE for the NIR investigations. Alternatively, a transmission mode operation can be implemented with a material which is transmissive in the VIS and NIR range. A material which meets this requirement is zinc selenide (ZnSe). It has high optical transmission starting at 550 nm. In the NIR, a measurement range of \(\Delta z\) = 47.88 \(\upmu \)m can be achieved by the application of a DE with \(t_{DE}\) = 2 mmFootnote 8. Consequently, this leads to a maximum achievable measurement range in the VIS of about \(\Delta z_{VIS}\) = 605.84 \(\upmu \)m, if the DE is used in transmission mode. All presented measurements in this work have been performed using a ZnSe element (\(t_{DE}\) = 2 mm) in transmission mode. Furthermore, it has to be noted that the measurement range only describes the ability to detect surface height changes on one particular surface. In a tomographic measurement, the separation of the equalization wavelengths for multiple surfaces is also important. A simulation showed that a sample has to have a minimum thickness of 60 \(\upmu \)m of silicon in order to capture the equalization wavelengths in both spectral channels separately for the particular setup described here.

While the equations derived in section 3.2 for the estimation of the profile height are still applicable to the surface information, the back-reflected data from structures within the sample follow a different relation. In this configuration, the phase takes a slightly modified form of Eq. (3.20), with an additional component for the sample's refractive index \(n^{smp}(\lambda )\) = \(n^{smp}\) and its thickness \(t_{smp}\), where \(n^{smp}\) is assumed to be a known quantity

$$\begin{aligned} \varphi = 2\pi \frac{ \left[ \left( n^{DE} -1 \right) \cdot t_{DE} - \left( n^{smp} -1 \right) \cdot t_{smp} - \delta \right] }{\lambda } . \end{aligned}$$
(3.30)

Consequently, all data captured from the inside of a sample is scaled by the depth-dependent refractive index which has to be corrected, if height information should be obtained. In order to perform this, blind dispersion compensation techniques known from OCT can be utilized, [256–258].

Analogous to the measurements and calculations performed previously, see section 3.2, the resolution limit of the NIR approach was characterized. A measurement of the noise yielded an average value of \(\Delta I\) = 16.0 dB which was found to be normally distributed along the spatial and spectral dimension of the NIR imaging spectrometer. The measured noise was used to calculate the single point resolution limit \(\Delta \delta \) = 4.73 nm. This calculation was based on Eq. (3.14) where the wavelength range was \(\Delta \lambda \) = (1133 – 1251) nm while the equalization wavelength was \(\lambda _{eq}\) = 1189 nm and the relative normalized intensity at this point was \(I_0\) = 0.5 arb. units. Under the assumption that n = 300 points were used for fitting, a resolution of the NIR system of \(r_{fit}\) = 0.27 nm was calculated, referring also to Eq. (3.15). By extrapolating the estimated influence of the algorithm for the measured noise \(\Delta I\), see subsection 3.3.4, an influence of 0.22 nm can be computed. This leads to an expected resolution of 0.49 nm for this experiment in the NIR spectral range. In relation to the measurement range of \(\Delta z\) = 95.76 \(\upmu \)m, which can be achieved by using a dispersive element of ZnSe with \(t_{DE}\) = 2 mm, a dynamic range of DR = \({1.95\times 10^{5}}\) was calculated.
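The resolution budget of the NIR channel can be re-traced numerically. Eq. (3.15) is not restated in this section; the sketch assumes that it reduces to dividing the single point resolution by the square root of the number of fitted points, which reproduces the quoted value of 0.27 nm. The variable names are illustrative only.

```python
import math

single_point_nm = 4.73     # single point resolution limit from Eq. (3.14)
n_fit = 300                # number of points used for fitting
algorithm_nm = 0.22        # extrapolated influence of the algorithm
range_nm = 95.76e3         # measurement range with a 2 mm ZnSe element

# Assumption: Eq. (3.15) scales the single point limit with 1 / sqrt(n).
r_fit_nm = single_point_nm / math.sqrt(n_fit)

# Expected resolution, rounded to two decimals as quoted in the text.
resolution_nm = round(r_fit_nm + algorithm_nm, 2)

dr_nir = range_nm / resolution_nm

print(f"r_fit = {r_fit_nm:.2f} nm, resolution = {resolution_nm:.2f} nm, DR = {dr_nir:.2e}")
```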

Results of tomographic profilometry

Using the above described setup and configuration of the DE, an experiment for the tomographic imaging of a thinned wafer was conducted to capture the surface profile as well as the backside profile of the sample in a simultaneous measurement, Fig. 3.35. The sample was a Si wafer with a partially coated area which was thinned previously. It was mounted on a thin glass substrate. The measurement shows the stepped surface profile where the coating height was determined as 18.70 ± 1.42 \(\upmu \)m. By analyzing the interferometric signal in the NIR channel and correcting the measured OPD with the refractive index of silicon, the wafer thickness was determined as 84.71 ± 0.38 \(\upmu \)m. The edge of the coating can be identified in the VIS as well as in the NIR signal with some significant noise (x = 110 – 140 \(\upmu \)m). The refractive index of the coating introduced a dispersion-induced deviation in the NIR measurement. This deviation was not corrected in this measurement as the material composition of the coating was unknown. It is also visible that the VIS signal shows significant overall noise. This is due to the relatively small spectral power density of the used ASC source in combination with the low optical transmission of ZnSe in this spectral range. This led to a reduced SNR of 16.5 dB for the VIS measurement compared to the experiments presented before. Future developments will account for this and provide methods to increase the SNR. Furthermore, an automatic dispersion correction for tomographic data according to known approaches will be developed.

Figure 3.35
figure 35

Result of the profilometry measurement using a dual-channel approach where the VIS channel shows the surface profile on top of a thinned Si wafer with a coated area and a tomographic profile from the backside of the wafer using the NIR channel of the setup

3.5 Areal Measurement Approaches

In order to gather areal information, different approaches were developed. While two methods were developed only theoretically, one approach was implemented, characterized and tested.

3.5.1 Translation-Based Areal Information

In this implemented approach, information was obtained by successively translating the lens L1 in order to image different parts of the sample on the slit, as already depicted in Fig. 3.1. In consequence, a stack of two-dimensional line profiles was gathered and analyzed in order to obtain three-dimensional data. As the imaging lens was placed after the point where the light of both interferometric arms was recombined, and the sample was not moved between measurements, negative influences on the measurements were kept to a minimum.

Figure 3.36
figure 36

a) Simulated stack of two-dimensional spectra gathered by the translation of the imaging lens L1 in the y-direction to capture 3D information and b) plot of the three-dimensional surface of a precision-turned groove standard (Gaussian filter applied to reduce edge effects for display purposes) with measured depths of (971.26 ± 0.31), (4951.40 ± 0.28) and (19924.00 ± 0.36) nm

Three-dimensional information of the precision-turned height standard used for the high-dynamic range evaluation, Fig. 3.32 a), was gathered in steps of 25 \(\upmu \)m along the y-direction and with a rather small magnification to enable a lateral measurement range in the x-direction of 1.5 mm, Fig. 3.36 b). As noted before, see subsection 3.4.5, the high axial resolution leads to the ability to capture nanometer-sized roughness data on all height steps while maintaining a large axial measurement range of 79.91 \(\upmu \)m. Furthermore, steps of (971.26 ± 0.31), (4951.40 ± 0.28) and (19924.00 ± 0.36) nm were measured over an area of 1500 x 250 \(\upmu \)m\(^{2}\) without the need for stitching, which distinguishes the approach clearly from other techniques such as confocal microscopy. In the current optical design, which utilizes an imaging system with an NA of 0.06, a large lateral measurement range could be covered while it was not possible to gather data on the slopes with a 70\({^\circ }\) angle reliably. This is due to the comparatively low lateral resolution of 5 \(\upmu \)m. Depending on the application, the setup can be optimized to increase the sensitivity on these parts of the sample. The results of this sample also highlight the capability of the approach to decouple the axial resolution from the lateral measurement range as nm features can be detected while measuring over a range of 1.5 mm.

Figure 3.37
figure 37

Plot of the measurements on a PTB-traceable height standard using a confocal microscope with a) three-dimensional representation of the sample, b) profile plot in the middle of the data set with inset highlighting a typical stitching error with the segments (I) and (II) having different distributions of noise n and Gaussian fits

For comparison, the height standard was also analyzed using a confocal microscope (Smartproof 5, Carl Zeiss Microscopy GmbH, Göttingen, Germany), Fig. 3.37 a). It has to be noted that capturing a profile of the same length as the DE-LCI measurement relied on the stitching of multiple images, as a magnification of 20x with a lateral field of view of 450 x 450 \(\upmu \text {m}^{2}\) was necessary to achieve a comparable resolution. This approach is not only time-consuming (approx. 10 minutes per areal image) but also prone to errors, as stitching inconsistencies occur, see inset Fig. 3.37 b). These errors on the nm-scale distort the representation of the surface topography in the stitched regions, which in turn can affect quantitative analysis. The roughness distribution in these areas is no longer Gaussian, which makes the data unusable, e.g., for roughness evaluation, Fig. 3.37 c) and d). Furthermore, the RMS value of the profile data was significantly larger than for the other methods. This is usually addressed during roughness evaluation by filtering the signal with an appropriate low-pass filter, known as micro-roughness filtering, [270].
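The stitching effort can be estimated from the numbers given above. The sketch below counts the images needed to cover the 1.5 mm profile with a 450 \(\upmu \)m field of view and multiplies by the approximate acquisition time per image. The overlap between neighboring images is not stated in the text and enters here as a hypothetical parameter; the function name is likewise illustrative.

```python
import math


def tiles_needed(length_um, fov_um, overlap_fraction=0.0):
    """Images needed to cover a profile of given length.

    overlap_fraction is a hypothetical parameter (share of the field of view
    that neighboring images have in common); it is not given in the text.
    """
    if length_um <= fov_um:
        return 1
    advance_um = fov_um * (1.0 - overlap_fraction)
    return 1 + math.ceil((length_um - fov_um) / advance_um)


profile_length_um = 1500.0   # same length as the DE-LCI measurement
field_of_view_um = 450.0     # lateral field of view at 20x
minutes_per_image = 10       # approximate value quoted in the text

n_tiles = tiles_needed(profile_length_um, field_of_view_um)
total_minutes = n_tiles * minutes_per_image

print("images to stitch:", n_tiles)
print("approx. acquisition time in minutes:", total_minutes)
```

Under these assumptions, a single profile already requires several images and stitching seams, each of which is a potential location for the inconsistencies shown in the inset.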

Figure 3.38
figure 38

Result of the measured echelle grating with a) DE-LCI measurement in Littrow configuration with five consecutive steps with a mean height of (9.66 ± 0.40) \(\upmu \)m and an inset which shows the nm-fine structure of the first step at the position y = 100 \(\upmu \)m and b) SEM image that captures the coarse structure of the individual step edges, which is the reason for the noise in the DE-LCI data

Further three-dimensional evaluation was performed by analyzing a commercially available echelle grating in the Littrow configuration, Fig. 3.38 a). A total of five facets of the grating could be imaged in a lateral range of 100 x 200 \(\upmu \text {m}^{2}\), having a mean height of (9.66 ± 0.40) \(\upmu \)m. The data on the edges of the steps is notably noisy. An SEM scan of the grating was performed to examine individual steps as a reference, Fig. 3.38 b). This image shows that each edge has a very coarse structure with a size of about 6 \(\upmu \)m. This structure leads to very low SNRs during the DE-LCI measurements, so that an area of about 6.75 \(\upmu \)m is obstructed on each side while only about 5.5 \(\upmu \)m of each plateau is visible. Apart from this, the center of the plateaus could be resolved clearly with sub-nm surface structures, see inset Fig. 3.38 a).
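As a plausibility check, the visible and obstructed widths can be compared with the facet pitch implied by the imaged range. This is a minimal sketch under two assumptions that are not stated explicitly in the text: that "on each side" refers to both sides of one plateau, and that the five facets are distributed along the 100 \(\upmu \)m dimension of the imaged area.

```python
# Values quoted in the text
visible_plateau_um = 5.5       # resolved part of each plateau
obstructed_per_side_um = 6.75  # noisy region on each side of a plateau
n_facets = 5                   # facets imaged
imaged_width_um = 100.0        # assumed to be the dimension across the facets

# Facet width reconstructed from the visible and obstructed parts
facet_width_um = visible_plateau_um + 2 * obstructed_per_side_um

# Nominal pitch implied by the number of facets in the imaged width
nominal_pitch_um = imaged_width_um / n_facets

# Share of each facet that yields usable height data
visible_fraction = visible_plateau_um / facet_width_um

print("reconstructed facet width in um:", facet_width_um)
print("nominal pitch in um:", nominal_pitch_um)
print(f"visible fraction of a facet: {visible_fraction:.2f}")
```

Under these assumptions, the reconstructed facet width and the nominal pitch agree to within about one micrometer, and less than a third of each facet contributes usable height data.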

Figure 3.39
figure 39

a) Schematic representation of a modified imaging spectrometer for three-dimensional encoding with SPT—measurement spot which is spatially segmented by an LA1—lens array onto the SLT—multi-slit arrangement. A second LA2—lens array images the slits onto the grating, which spectrally decomposes the light of every facet of LA2, while individual LEn—imaging elements image the components onto the CAM—camera, and b) simulation of a signal on the CAM where the axial information on the profile height z is spectrally encoded in the camera's x-dimension. Individual spectral slices \(S_n(\lambda )\) encode information on one lateral dimension \(x_n\), while the combination of these slices holds information on the second lateral dimension y

Figure 3.40
figure 40

Schematic of a modified imaging spectrometer for three-dimensional encoding with \(LS_n\)—low-coherent light sources which are coupled to the interferometer by a spatial combiner. A BS—beamsplitter delivers appropriate beams to the SMP—sample and the REF—reference mirror, where the light is also manipulated by the DE—dispersive element. Finally, the SPT—measurement spot, which consists of spatially separated spectral ranges \(\Delta \lambda _n\), is imaged onto and analyzed by the IMSPEC—imaging spectrometer

3.5.2 Alternative Spectral Encoding for Areal Measurements

In order to gather full areal surface profile data, and hence three-dimensional information, without any need for scanning, two alternative approaches have been developed.

Multi-slit approach

In this approach, the measurement spot is spatially expanded and the single slit is substituted with a set of parallel slits in order to make use of a large area of the grating in the imaging spectrometer, Fig. 3.39. This arrangement allows the decomposition of the measured spot both spatially and spectrally. In consequence, several spatial parts of the measurement spot can be analyzed in the same fashion as described in Section 3.4 while the individual spectral slices are stacked in the y-dimension of the spectrometer, Fig. 3.39 b).

In a practical realization, the multi-slit approach can be implemented by using imaging fibers in a linear arrangement in order to simplify the setup and to avoid diffraction effects from tightly spaced slits. The approach holds the potential for measurements with high axial resolution, while the lateral resolution in both dimensions depends on the physical size of the camera used.
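The demultiplexing of a camera frame in the multi-slit approach can be sketched as follows. The example assumes that every slit occupies a band of equal height on the camera and that the bands are stacked without gaps along the camera's y-dimension; the actual layout depends on the optical design. The function name and the toy frame are hypothetical.

```python
def split_into_slit_bands(frame, n_slits):
    """Split a camera frame into one band of rows per slit.

    frame   : list of rows, each row a list of intensities along the
              spectral (x) dimension of the camera
    n_slits : number of slits, assumed to map to equally tall bands
    """
    n_rows = len(frame)
    if n_slits <= 0 or n_rows % n_slits != 0:
        raise ValueError("rows must divide evenly among the slits")
    rows_per_band = n_rows // n_slits
    return [frame[i * rows_per_band:(i + 1) * rows_per_band]
            for i in range(n_slits)]


# Toy frame: 6 rows, 4 spectral pixels, every pixel holds its row index
toy_frame = [[row] * 4 for row in range(6)]
bands = split_into_slit_bands(toy_frame, n_slits=3)

for n, band in enumerate(bands):
    # Each band is a stack of spectra belonging to one slit
    print(f"slit {n}: {len(band)} spectra of {len(band[0])} pixels")
```

Each band can then be evaluated like a conventional two-dimensional spectrum stack, with the band index providing the additional lateral coordinate.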

Spatial combiner approach

In a further approach, the encoding of a second lateral dimension is performed in the spectral domain. For this purpose, the measurement spot is composed of different spatial components which occupy individual, discrete spectral ranges. These spectral slices, which are formed with low-coherent light sources, are used to illuminate discrete regions of the sample, Fig. 3.40. The data evaluation in this approach is similar to the conventional, two-dimensional analysis.

The x-dimension of the camera acquires spectral information while the y-dimension stores information on one lateral dimension. However, the data of the individual spectral slices with ranges of \(\Delta \lambda _1\) to \(\Delta \lambda _i\) corresponds to the height at the respective lateral position \(x_n\), so that each spectral slice has to be analyzed separately. The resolution in the axial as well as in the lateral domain is controlled by the size of the spectral slices, the detector size and the imaging magnification of the measurement spot.
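The separate treatment of the spectral slices can be illustrated with a short sketch. It splits one camera row into sub-spectra according to a list of wavelength ranges, so that each sub-spectrum can be passed on to the height evaluation for its lateral position \(x_n\). The wavelength axis, the four ranges and the function name are hypothetical values chosen for illustration only; they do not describe the actual light sources.

```python
def split_by_spectral_range(wavelengths_nm, intensities, ranges_nm):
    """Return one (wavelengths, intensities) pair per spectral slice.

    ranges_nm is a list of (lower, upper) limits; a sample belongs to a
    slice if lower <= wavelength < upper.
    """
    slices = []
    for lower, upper in ranges_nm:
        idx = [i for i, w in enumerate(wavelengths_nm) if lower <= w < upper]
        slices.append(([wavelengths_nm[i] for i in idx],
                       [intensities[i] for i in idx]))
    return slices


# Hypothetical camera row: 400 spectral pixels from 500 nm to 899 nm
wavelengths = [500.0 + k for k in range(400)]
row = [1.0] * len(wavelengths)

# Hypothetical, non-overlapping spectral ranges, one per lateral position x_n
spectral_ranges = [(500, 600), (600, 700), (700, 800), (800, 900)]

sub_spectra = split_by_spectral_range(wavelengths, row, spectral_ranges)
for n, (w, s) in enumerate(sub_spectra):
    print(f"x_{n}: {len(s)} samples from {w[0]} nm to {w[-1]} nm")
```

The sketch also makes the trade-off visible: narrower spectral ranges allow more lateral positions but leave fewer spectral samples per slice for the height evaluation.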