1 Introduction

THz time-domain spectroscopy (THz-TDS) is currently the most powerful technique for characterizing the complex dielectric properties of a wide range of materials in the far-infrared part of the spectrum. This has opened interesting opportunities for applications in areas as diverse as industry [1,2,3], medicine [4,5,6], and art inspection [7, 8], among many others.

THz-TDS measures the electric field waveform of a terahertz pulse directly in the time domain, and the spectrum is obtained through a Fourier transform. From this information it is possible to calculate the optical parameters of the sample. In most spectra, the amplitude decreases at high frequencies until it is no longer distinguishable from the noise floor [9, 10]. To avoid misinterpreting spectral features, it is important to estimate the cutoff frequency at which the signal-to-noise ratio (SNR) approaches one. The position of this cutoff determines the usable bandwidth of each particular spectrum.

The measurable bandwidth limit of a THz spectrum is usually not defined through a rigorous analysis because, for a trained eye, it is straightforward to distinguish the frequency at which the spectral amplitude decays to the noise floor, so this cutoff frequency is typically estimated “by eye.” However, when a large volume of data is analyzed, such as THz images that contain thousands of spectra each, the identification of the cutoff frequency must be automated for every spectrum in the data set. In this work, we propose and test an algorithm to estimate the cutoff frequency at which the amplitude of the spectrum decays to the level of the noise floor.

2 Cutoff Frequency Calculation

The function \(g_{\text{THz}}(t) = (t-t_{0}) e^{-(t-t_{0})^{2}/\sigma ^{2}}\) is commonly used to analytically model a single-cycle terahertz pulse in the time domain [11]. In our case, without loss of generality, we take \(t_{0} = 0\) and \(\sigma = 1\) ps in order to keep this analysis as simple as possible. We also assume that the pulse contains white noise \(n_{w}\), which has a constant power spectral density. Therefore, the pulse is expressed as \(g(t)=t e^{-t^{2}} + n_{w}\). Using the Fourier transform, we obtain the THz spectrum

$$ G(\omega)=-i\pi^{\frac{3}{2}} \omega e^{-\frac{\omega^{2} \pi}{2}}+N, $$

where N is the Fourier transform of \(n_{w}\), which is a constant for the case of white noise. In this analysis, N represents the noise floor in the spectrum, which experimentally originates mainly from the THz detectors and laser fluctuations [12, 13]. At low frequencies, G(ω) increases proportionally to ω; however, as ω increases, G(ω) decreases as \(e^{-\frac {\omega ^{2} \pi }{2}}\). Eventually, at some cutoff frequency, which we call ωc, the spectral amplitude will have values comparable to N. For frequencies higher than ωc, the first term of Eq. 1 is negligible and, therefore, G(ω) ≈ N.

Given the above considerations, if we now multiply G(ω) by ω, the new function G1 = ωG(ω) has two different behaviors separated by ωc. For ω < ωc, the first term of G1 has an \(\omega^{2}\) factor multiplying the exponential, resulting in a less abrupt decay of the spectral amplitude; for ω > ωc, which corresponds to the noise floor in the spectrum, G1 behaves as the linearly increasing function ωN. This change of behavior generates a minimum in G1 at the frequency ωc. The same reasoning can be applied to \(G_{n} = \omega^{n} G(\omega)\), whose minima lie at the frequencies that satisfy

$$ \frac{1}{n}e^{-\omega^{2} \pi} \left[ \pi^{4} \omega^{4} - (n+1) \pi^{3} \omega^{2} \right] = N^{2}. $$
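One way to arrive at this condition is sketched here. The sketch assumes that the signal and noise amplitudes add in quadrature, \(|G|^{2} = \pi^{3}\omega^{2}e^{-\omega^{2}\pi} + N^{2}\), which is not stated explicitly in the text. Setting the derivative of \(|G_{n}|^{2} = \omega^{2n}|G|^{2}\) to zero gives

$$ \frac{d}{d\omega}\left[\omega^{2n}\left(\pi^{3}\omega^{2}e^{-\omega^{2}\pi}+N^{2}\right)\right] = 2\omega^{2n-1}\left[ nN^{2} + e^{-\omega^{2}\pi}\left( (n+1)\pi^{3}\omega^{2} - \pi^{4}\omega^{4} \right) \right] = 0, $$

so that \(nN^{2} = e^{-\omega^{2}\pi}\left[\pi^{4}\omega^{4} - (n+1)\pi^{3}\omega^{2}\right]\), which reduces to the expression above after dividing by n.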

Therefore, to obtain a reasonable estimate of the cutoff frequency ωc, it is sufficient to multiply G(ω) by \(\omega^{n}\) and locate the resulting minimum, as shown in Fig. 1 (a). From this figure it is clear that the part of the spectrum corresponding to the noise floor grows faster for higher values of n. As a result, as n increases, the value of ωc decreases slightly; the difference between the values of ωc calculated using G1 and G4 was only 160 GHz, which is equivalent to 8.4% of the total bandwidth in this example. In the Additional Material found in the “Appendix” section at the end of this article, we provide a MATLAB/Octave implementation of the algorithm described above.
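The procedure can be tried directly on the analytic model. The following is a minimal sketch in Python; it is not the MATLAB/Octave implementation from the Appendix. It assumes that signal and noise add in quadrature, so that \(|G|^{2} = \pi^{3}\omega^{2}e^{-\omega^{2}\pi} + N^{2}\), and it takes N = 0.0334 from the caption of Fig. 1(a). The names `model_amplitude`, `index_of_dip`, and `cutoff_from_model` are hypothetical. The rule used to locate the minimum (the largest relative drop below the running maximum, which skips the trivial minimum at ω → 0) is an assumption of this sketch, since the text does not specify how the minimum is searched for.

```python
import math

N_FLOOR = 0.0334  # noise-floor level quoted in the caption of Fig. 1(a)


def model_amplitude(omega, noise=N_FLOOR):
    """|G(omega)| of the model pulse; signal and noise added in quadrature (assumption)."""
    signal_sq = math.pi ** 3 * omega ** 2 * math.exp(-math.pi * omega ** 2)
    return math.sqrt(signal_sq + noise ** 2)


def index_of_dip(weighted):
    """Index with the largest relative drop below the running maximum to its left."""
    best_index, best_ratio, running_max = 0, 1.0, weighted[0]
    for i, value in enumerate(weighted):
        running_max = max(running_max, value)
        ratio = value / running_max
        if ratio < best_ratio:
            best_index, best_ratio = i, ratio
    return best_index


def cutoff_from_model(n, omega_max=4.0, step=0.001):
    """Frequency (same units as omega) of the dip in omega**n * |G(omega)|."""
    omegas = [i * step for i in range(1, round(omega_max / step) + 1)]
    weighted = [w ** n * model_amplitude(w) for w in omegas]
    return omegas[index_of_dip(weighted)]


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(f"n = {n}: dip at omega = {cutoff_from_model(n):.3f}")
```

The spread between the n = 1 and n = 4 results returned by this sketch can be compared with the 160 GHz difference quoted above.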

Fig. 1

(a) Calculation of the cutoff frequencies ωc through \(G_{n} = \omega^{n} G(\omega)\) for n = 1, 2, 3, 4 and N = 0.0334. For reference, a red dashed line is shown corresponding to ωc calculated for G1. (b)–(d) Calculations of ωc for n = 1, 2, 3, 4 from experimental data

Before analyzing experimental spectra, it is necessary to mention three factors that may affect the performance of our method. The first is random oscillations in the noise floor, which can generate minima with lower values than the minimum generated in Gn. This problem can be solved by smoothing the spectrum. The second is the noise floor itself: in some THz systems the noise floor does not remain constant but varies smoothly with frequency, so the minimum may not be clearly defined. In this case, an independent measurement of the instrument noise floor could fix the problem. The third factor is the existence of absorption lines within the “good” signal-to-noise-ratio region, which is also addressed by appropriate smoothing of the spectrum.

2.1 Results and Discussion

Data supporting the considerations described above are shown in Fig. 1 (b–d), which presents reference spectra from three different THz systems in transmission configuration. The original time-domain data were acquired with the following spectrometers:

  • A home-built spectrometer based on a Ti:sapphire oscillator at a central wavelength of 800 nm with a pulse duration of 35 fs, using an SI-GaAs photoconductive antenna as emitter and a 1 mm [110] ZnTe-crystal-based electro-optic sensor as detector, recording 100 points over a 6.6 ps delay.

  • A commercial spectrometer based on a Yb:fiber laser centered at 1064 nm with a pulse duration of 90 fs, using photoconductive antennas as both emitter and detector, recording 1600 points over 160 ps.

  • A commercial spectrometer based on a Yb:fiber laser centered at 1064 nm with a pulse duration of 90 fs, using photoconductive antennas as both emitter and detector, recording 8000 points over 800 ps.

The spectra in Fig. 1(b–d) correspond to the three spectrometers described above, in that same order. It is worth mentioning that we also tested our algorithm with spectra from three additional spectrometers with various characteristics (not shown), and all the results are comparable.
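For orientation, the frequency grid implied by each scan can be estimated from the number of points and the time window. The short Python sketch below does this under two assumptions that the text does not confirm: the samples are uniformly spaced with a step equal to the window divided by the number of points, and no zero padding is applied before the Fourier transform. The helper name `sampling_summary` and the labels in `SCANS` are hypothetical.

```python
def sampling_summary(points, window_ps):
    """Return (time step in ps, frequency step in THz, Nyquist frequency in THz)."""
    dt_ps = window_ps / points          # assumes uniformly spaced samples
    df_thz = 1.0 / window_ps            # 1/ps equals THz; no zero padding assumed
    nyquist_thz = 1.0 / (2.0 * dt_ps)   # highest frequency on the discrete grid
    return dt_ps, df_thz, nyquist_thz


# (number of points, time window in ps) for the three scans listed above
SCANS = {
    "home-built": (100, 6.6),
    "commercial, 160 ps": (1600, 160.0),
    "commercial, 800 ps": (8000, 800.0),
}

if __name__ == "__main__":
    for name, (points, window_ps) in SCANS.items():
        dt_ps, df_thz, nyquist_thz = sampling_summary(points, window_ps)
        print(f"{name}: dt = {dt_ps:.3f} ps, "
              f"df = {df_thz:.5f} THz, Nyquist = {nyquist_thz:.2f} THz")
```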

The smoothing was done using the moving average method [14], with a width ws equal to 3% of the total number of data points recorded in the waveform. The smoothed spectrum is denoted as G. As we can see in Fig. 1 (b), G1 and G2 give the same result, placing the cutoff frequency at 4.99 THz, although it is easy to notice that ωc must lie at a lower frequency. However, increasing n to 3 shifts the cutoff frequency to 3.18 THz, which is more consistent with the spectrum. Using G4, ωc is calculated at the same frequency as with G3. On the other hand, in the spectrum of Fig. 1 (c), the small peak centered at 3.93 THz is an artifact of the THz system that resulted in a miscalculation of ωc by G1 and G2; however, both G3 and G4 are unaffected by this artifact and agree with the frequency estimated “by eye.” In the case of the spectrum shown in Fig. 1 (d), we obtained three different values of ωc. As in Fig. 1 (b) and (c), G3 and G4 give the same value of ωc, which suggests that G3 is sufficient to correctly estimate the cutoff frequency in a conventional reference spectrum.
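The smoothing step can be sketched as follows in Python. This is an illustrative centered moving average whose window is set as a fraction of the number of samples; the exact variant, edge handling, and rounding used for the results above are not specified here, so those choices (and the name `moving_average`) are assumptions of this sketch.

```python
def moving_average(values, fraction=0.03):
    """Centered moving average with a window of roughly `fraction` of the samples.

    The window spans 2 * (width // 2) + 1 samples and is truncated at the edges.
    """
    width = max(1, round(fraction * len(values)))
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


if __name__ == "__main__":
    spike = [0.0, 0.0, 3.0, 0.0, 0.0]
    print(moving_average(spike, fraction=0.6))
```

With `fraction=0.6` on five samples the window is three samples wide, so the single spike is spread evenly over its two neighbours.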

As mentioned in the previous section, since the method is based on locating the minimum generated by multiplying the spectrum by \(\omega^{n}\), any other minimum originating from random noise may result in an incorrect value of ωc. For this reason, smoothing is another essential element of our processing, since it facilitates the calculation of the cutoff frequency by suppressing the random oscillations in the experimental signal. In order to establish the limits of the degree of smoothing that can be applied to the signals, Fig. 2 shows the spectra and their cutoff frequencies calculated using different values of ws and multiplying the smoothed spectrum by \(\omega^{3}\).

Fig. 2

Spectra and their corresponding cutoff frequencies using ws = 3%, 6%, 9%, 12%, and 15% for n = 3. Insets show ωc as a function of ws

In the spectrum shown in Fig. 2(a), which corresponds to a reference spectrum recorded from a metallic foil sample, the cutoff frequency calculated using ws = 3% of the total number of data points in the spectrum was ωc = 2.1 THz. Although ωc changes as ws increases, the difference is minimal; increasing ws to 15% decreased the cutoff frequency by only 107 GHz, which is equivalent to 5.1% of the initially calculated bandwidth. This result suggests that increasing ws in the smoothing of reference spectra does not lead to a significant improvement. It is also worth mentioning that we tested several smoothing methods, including the Savitzky-Golay, moving average, and non-centered moving average methods, the latter being the one that gave the best results.

On the other hand, Fig. 2 (b) shows the spectrum of a metallic sample with an irregular surface. In this case, the cutoff frequency calculated using ws = 3% is ωc = 2.72 THz. However, when ws was increased to 6%, the cutoff frequency decreased to ωc = 1.66 THz, with no further changes for ws greater than 9%. Based on the results shown in Figs. 1 and 2, as well as many others that we do not show, we find empirically that ws ≥ 6% and n ≥ 3 are suitable for processing most spectra, and that larger values do not significantly change the calculated ωc.
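Putting the pieces together, the sketch below applies the recommended settings (ws = 6%, n = 3) to a synthetic spectrum. It is a hedged Python illustration, not the Appendix code: the test spectrum is the analytic model with an artificially jittered noise floor rather than measured data, the smoothing is a centered moving average, and the minimum is located as the largest relative drop below the running maximum. All function names (`smooth`, `index_of_dip`, `estimate_cutoff`, `synthetic_spectrum`) and the jitter amplitude are assumptions made for this example.

```python
import math
import random


def smooth(values, fraction):
    """Centered moving average; window is roughly `fraction` of the samples."""
    half = max(1, round(fraction * len(values))) // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def index_of_dip(weighted):
    """Index with the largest relative drop below the running maximum to its left."""
    best_index, best_ratio, running_max = 0, 1.0, weighted[0]
    for i, value in enumerate(weighted):
        running_max = max(running_max, value)
        ratio = value / running_max
        if ratio < best_ratio:
            best_index, best_ratio = i, ratio
    return best_index


def estimate_cutoff(freqs, amplitude, n=3, ws=0.06):
    """Smooth the amplitude, weight it by freq**n, and return the dip frequency."""
    smoothed = smooth(amplitude, ws)
    weighted = [f ** n * a for f, a in zip(freqs, smoothed)]
    return freqs[index_of_dip(weighted)]


def synthetic_spectrum(points=400, f_max=4.0, floor=0.0334, seed=0):
    """Model spectrum plus a jittered noise floor (synthetic data, not measured)."""
    rng = random.Random(seed)
    freqs = [f_max * (i + 1) / points for i in range(points)]
    amplitude = []
    for f in freqs:
        signal = math.pi ** 1.5 * f * math.exp(-math.pi * f ** 2 / 2)
        noise = floor * (1.0 + 0.3 * rng.uniform(-1.0, 1.0))
        amplitude.append(math.hypot(signal, noise))
    return freqs, amplitude


if __name__ == "__main__":
    freqs, amplitude = synthetic_spectrum()
    print(f"estimated cutoff: {estimate_cutoff(freqs, amplitude):.2f}")
```

Keeping `n` and `ws` as keyword arguments mirrors the point made above that both values may need adjusting for a particular data set.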

As shown, the proposed method analyzes reference signals without major difficulties. However, the method is not limited to reference spectra; it can be used with any spectrum, regardless of the characteristics of the sample or the circumstances in which the signals were recorded. To demonstrate this, three data sets were analyzed. The first set, shown in Fig. 3(a), corresponds to a high-density polyethylene (HDPE) sample. Waveforms were recorded in reflection using a lens with a 3-inch focal length. The sample was placed at the focal point of the lens and subsequently moved ± 12 mm along the optical axis, recording a waveform every 3 mm. Moving the sample away from the focal point reduces the spectral amplitude, but not evenly across the entire band, which decreases the effective bandwidth; this data set therefore allows the effectiveness of the algorithm to be validated in a very practical manner. Cutoff frequencies were calculated using ws = 6% and n = 3, and are indicated by circles in each spectrum.

Fig. 3

(a) Spectra of an HDPE sample recorded in reflection geometry. The sample was initially placed at the focal plane of the lens. Subsequently, the sample was displaced up to ± 12 mm from the focal plane, recording a waveform every 3 mm. Circles show the cutoff frequency calculated using ws = 6% and n = 3. (b) ωc as a function of sample displacement

As expected, the highest cutoff frequency, 1.66 THz, corresponds to the sample placed at the focal point; this frequency was the same when the sample was displaced + 3 mm, which is within the Rayleigh length of the focus of the THz beam. In addition, ωc decreases as the sample moves away from the focal point of the lens, falling to 1 THz for a − 12 mm displacement. This can be seen in Fig. 3(b), where the cutoff frequencies are shown as a function of sample displacement.

The last two data sets correspond to six signals obtained from a THz image of an artistic painting with two main areas of interest. The first three spectra correspond to an area where mercury sulfide was detected, which has an absorption line centered at 1.12 THz. This feature is important to analyze because the absorption line can be confused with noise, as it is located in a part of the spectrum where the SNR is low. The last three spectra, taken from the edges of the image, exhibit a very poor measurable bandwidth. These two data sets are of special interest because their spectra are highly complex and allow us to examine the robustness of the method.

Figure 4 (a–c) shows spectra from measurements where mercury sulfide was present. Although the absorption peak is easily distinguishable, the algorithm correctly calculates the cutoff frequency in (a) and (c). The cutoff frequency calculated in (b), however, should be located at a lower frequency, since the spectral amplitude clearly decays to the noise floor before 2.2 THz. To correct the cutoff frequency in this spectrum, ws was increased to 9%, which gave ωc = 1.5 THz. The spectra in Fig. 4(d–f) are extremely noisy, so it would be natural to consider increasing ws. Nevertheless, the calculated cutoff frequencies agree with the “by eye” estimation without any increase in ws.

Fig. 4

(a)–(c) Spectra corresponding to a region of interest containing mercury sulfide, with its spectral fingerprint at 1.12 THz. (d)–(f) Spectra with narrow measurable bandwidth. The red line indicates G3, while the vertical dashed line indicates the cutoff frequency calculated in each spectrum

Finally, it is important to mention that computation time is a critical factor in this work, since the method is designed to analyze large amounts of data. To assess it, eight THz time-domain images of 550×460 pixels were analyzed, for a total of more than two million spectra, which were processed in approximately 1840 seconds, corresponding to \(\sim 900\ \mu\text{s}\) per spectrum. The images were processed on a commercial computer (Intel Core i5-11400H with 8 GB of memory, using MATLAB R2021b).
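As a quick check of these figures, assuming that every pixel of each image contributes exactly one spectrum:

$$ 8 \times 550 \times 460 = 2\,024\,000 \ \text{spectra}, \qquad \frac{1840\ \text{s}}{2\,024\,000} \approx 9.1 \times 10^{-4}\ \text{s} \approx 909\ \mu\text{s per spectrum}. $$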

3 Conclusions

In this work, we present a method that allows the calculation of the measurable bandwidth of a THz spectrum through smoothing and subsequent multiplication by \(\omega^{n}\), which generates a minimum at the frequency where the spectral amplitude decays to the amplitude of the noise floor. By analyzing different spectra, it was shown that ws = 6% and n = 3 make it possible to calculate the cutoff frequency in most cases. However, for spectra with features that lead to an incorrect cutoff frequency, it is advisable to use values of ws higher than 9%. These parameters are not fixed and can be adjusted by the user so that the method works properly for the data characteristics and THz system used.

Furthermore, the method can be applied to any signal from any THz-TDS system, since it analyzes the spectrum itself and depends on neither the system configuration nor the optical parameters of the sample. This was demonstrated by analyzing spectra of out-of-focus samples, spectra with spectral fingerprints, and spectra with narrow measurable bandwidths. Additionally, computation time is not a problem, because the method relies only on a very simple smoothing step, a multiplication, and a minimum search.

The code, which is the central part of this contribution, is provided below as Additional Material. The data sets used for testing the code can be obtained from the corresponding author upon reasonable request.