Algorithm for Determination of Cutoff Frequency of Noise Floor Level for Terahertz Time-Domain Signals

The frequency-dependent signal-to-noise ratio of terahertz time-domain signals is a relevant source of uncertainty for parameters measured with it. It also limits the total usable bandwidth of such signals. In the great majority of cases, the processes to establish the limits of this usable bandwidth are determined based on the experience of the user. Therefore, it is desirable to develop a procedure to automate this calculation. In this work, a method to estimate the bandwidth of terahertz time-domain signals is presented. Different spectra were analyzed, showing the potential of the algorithm in the calculation of cutoff frequencies which delimits the usable bandwidth.


Introduction
THz time-domain spectroscopy (THz-TDS) is currently the most powerful technique for the characterization of the complex dielectric properties of a wide range of materials in the far-infrared part of the spectrum. This has opened interesting opportunities for applications in areas as diverse as industry [1][2][3], medicine [4][5][6], art inspection [7,8], among many others.
THz-TDS measures the electric field waveform of a terahertz pulse directly in the time domain and through Fourier transform one can obtain the spectrum. From this information it is possible to calculate the optical parameters of the sample. Typically in most spectra, the amplitude decreases at high frequencies until it is not longer distinguishable from the noise floor [9,10]. In order to avoid misinterpretation of the spectral features, it is important to estimate the cutoff frequency at which the signal to noise ratio (SNR) approaches one. The position of this cutoff will determine the usable bandwidth for each particular spectra.
Usually, defining the measurable bandwidth limit in THz spectra is not done by a rigorous analysis because, for a trained eye, it is straightforward to distinguish the frequency at which the spectral amplitude decays to the noise floor, so this cutoff frequency is typically calculated "by eye." However, when a large volume of data is analyzed, such as in the analysis of THz images which contain thousands of spectra per image, it is necessary to automate the identification of the cutoff frequency for each spectrum in the data set. In this work, we propose and test an algorithm to estimate the cutoff frequency at which the amplitude of the spectrum decays to the level of the noise floor.

Cutoff Frequency Calculation
The function g THz = (t − t 0 )e −(t−t 0 ) 2 /σ 2 is commonly used to analytically model a single-cycle terahertz pulse in the time domain [11]. In our case, without lose of generality, we take t 0 = 0 and σ = 1 ps in order to keep this analysis as simple as possible. Also, we assume that the pulse contains white noise n w , which has a constant spectral power density. Therefore, the pulse is expressed as g(t) = te −t 2 + n w . Using the Fourier transform, we obtain the THz spectrum where N is the Fourier transform of n w which is a constant for the case of white noise. In this analysis, N represent the noise floor in the spectrum, which experimentally originates, mainly, from THz detectors and laser fluctuations [12,13]. At low frequencies, G(ω) increases proportionally to ω; however, as ω increases, G(ω) decreases as e − ω 2 π 2 . Eventually, at some cutoff frequency which we call ω c , the spectral amplitude will have values comparable to N. For frequencies higher than ω c , the first term of the Eq. 1 will be negligible and, therefore, G(ω) ≈ N. Given the above considerations, if we now multiply G(ω) by ω, the new equation G 1 = ωG(ω) will have two different behaviors bounded by ω c . For ω < ω c , the first part of G 1 will have an ω 2 factor multiplying the exponential, resulting in a less abrupt decay of the spectral amplitude; however, after ω c which correspond to noise floor in the spectrum, G 1 will behave as a linear increasing function ωN. This change of behavior in G 1 will generate a minimum at the frequency ω c . The same reasoning can be applied for G n = ω n G(ω), generating minima at 1 n e −ω 2 π π 4 ω 4 − (n + 1)π 3 ω 2 = N 2 .
( 2 ) Therefore, in order to find a reasonable estimate for the cutoff frequency value ω c , it is sufficient to multiply G(ω) by ω n and find the minimum in order to calculate the cutoff frequency ω c , as shown in Fig. 1 (a). From this figure it is clear that the part of the spectrum corresponding to the noise floor has faster growth rate for higher values of n. As a result, as n increases, the value of ω c decreases slightly; the difference in ω c calculated between using G 1 and G 4 was only 160 GHz, which is equivalent to 8.4 % of the total bandwidth in this example. In the Additional Material found in the "Appendix" section at the end of this article, we provide the implementation of the algorithm described above in MATLAB/Octave. Before analyzing experimental spectra, it is necessary to mention three factors that may affect the performance of our method. The first is the random oscillations in the noise floor, which can generate minima with lower values than the minimum generated in G n . This problem can be solved by smoothing the spectrum. The second is the noise floor itself, since in some THz systems the noise floor does not remain constant, but has smooth variations with frequency; therefore the generation of the minimum would not be clear. If this is the case, an independent measurement of the instrument noise floor could fix the problem. The third factor is the existence of absorption lines within the "good" signal-to-noise-ratio region, which is also addressed by appropriate smoothing of the spectrum.

Results and Discussion
Data to support the considerations described above are shown in Fig. 1 (b-d), where reference spectra of three different THz systems in transmission configuration are shown. For the acquisition of the original time-domain data, various spectrometers were used. In particular • A home-built spectrometer based on a Ti:sapphire oscillator at a central wavelenght of 800 nm with a pulse duration of 35 fs using a SI-GaAs photoconductive antenna as emitter and a 1 mm [110] ZnTe-crystal-based electrooptic sensor as detector, recording 100 points over a 6.6 ps delay. • A commercial spectrometer based on an Yb:fiber laser centered at 1064 nm with a pulse duration of 90 fs using a photoconductive anternnas both as emitter and detector, recording 1600 points over 160 ps. • A commercial spectrometer based on an Yb:fiber laser centered at 1064 nm with a pulse duration of 90 fs using a photoconductive anternnas both as emitter and detector 8000 points over 800 ps.
The spectra in Fig. 1(b-d) correspond to the three spectrometers described above in that same order. It is worth mentioning that we tested our algorithm with spectra of 3 more spectrometers with various characteristics, not shown, and all the results are comparable.
The smoothing was done using the moving average method [14], with a width w s equal to 3% of the total data recorded in the waveform. The smoothed spectra is denoted as G. As we can see on Fig. 1 (b), there is no difference using G 1 and G 2 , calculating the cutoff frequency at 4.99 THz, where it is easy to notice that ω c must be at lower frequencies. However, increasing n = 3, the cutoff frequency is displaced at 3.18 THz, which is more consistent than the previous result. Using G 4 , w c is calculated at the same frequency as G 3 . On the other hand, analyzing the spectrum of Fig. 1 (c), the small peak centered at 3.93 THz is an artifact of the THz system which resulted in a miscalculation of ω c by G 1 and G 2 ; however, both G 3 and G 4 , are not affected by this artifact and agree with the frequency estimated "by eye." In the case of the spectrum shown in Fig. 1 (d), we obtained three different values of ω c . As in Fig. 1 (b) and (c), both G 3 and G 4 have the same value of ω c , which suggests that it would be enough to use G 3 to correctly estimate the cutoff frequency in a conventional reference spectrum.
As mentioned in the previous section, since the method is based on the calculation of the minimum generated by multiplying the spectrum by ω n , any other minimum originated by random noise may result in an incorrect value of ω c . For this reason, another essential element in our processing is the smoothing, since it facilitates the calculation of the cutoff frequency by omitting the random oscillations in the experimental signal. In order to establish the limits of the degree of smoothing that can be performed to the signals, Fig. 2 shows the spectra and their cutoff frequencies calculated using different values of w s and multiplying the smoothed spectrum by ω 3 .
In the spectrum shown in Fig. 2(a), which corresponds to a reference spectrum recorded from a metallic foil sample, the cutoff frequency calculated using w s = 3% of the total number of data in the spectrum was ω c = 2.1 THz. Although there is a Fig. 2 Spectra and their corresponding cutoff frequencies using w s = 3%, 6%, 9%, 12%, and 15% for n = 3. Insets show ω c as a function of w s change in ω c as the value of w s increases, this difference is minimal; by increasing w s to 15%, the cutoff frequency only decreased by 107 GHz, which is equivalent to 5.1% of the initially calculated bandwidth. This result suggests that increasing the value of w s in the smoothing for reference spectra does not lead to a significant improvement. It is also worth mentioning that we attempted various smoothing methods which include the Savitzky-Golay, the moving average and the non-centered moving average methods, being the latter, the one that gave best results On the other hand, Fig. 2 (b) shows the spectrum of a metallic sample with an irregular surface. In this case, the cutoff frequency calculated using w s = 3% is ω c = 2.72 THz. However, as w s increased to 6%, the cutoff frequency decreased to ω c = 1.66 THz, with no further changes for w s greater than 9%. Based on the results shown in Figs. 1 and 2, as well as many others that we do not show, we can empirically say that it is ideal to use w s ≥ 6% and n ≥ 3 for the processing of most spectra, with no significant change in the calculation of ω c .
As shown, the proposed method analyzes reference signals without major difficulties. However, this method is not limited to calculating the cutoff frequency in reference spectra, but can be used in any spectrum, regardless of the characteristic of the sample or the circumstances in which the signals were recorded. For this purpose, three data sets were analyzed. The first set is shown in Fig. 3(a), which correspond to a high-density polyethylene (HDPE) sample. Waveforms were recorded in reflection using a 3-inch focal length lens. The sample was placed at the focal point of the lens and subsequently moved ±12 mm along the optical axis, recording the waveforms every 3 mm. By moving the sample away from the focal point, the amplitude spectrum decays but it is not affected evenly across the entire band, which results in a decrease in the effective bandwidth, so this data set will allow the validation of the effectiveness of the algorithm in a very practical manner. Cutoff frequencies were calculated using w s = 6% and n = 3, and are indicated by circles in each spectrum. Fig. 3 (a) Spectra of a HDPE sample recorded in reflection geometry. The sample was initially placed at the focal plane of the lens. Subsequently, the sample was displaced ± 12 mm out of the focal length, recording the waveforms every ±3 mm. Circles show the cutoff frequency calculated using w s = 6% and n = 3. (b) ω c as a function of sample displacement As expected, the highest cutoff frequency, which was 1.66 THz, corresponds to the sample placed at the focal point; this frequency was the same when the sample was displaced +3 mm, which is within the Rayleigh length of the focus of the THz beam. In addition, ω c decreases as the sample moves away from the focal point of the lens, falling to 1 THz for −12-mm displacement. This can be seen in Fig. 3(b), where the cutoff frequencies are shown as a function of sample displacement.
The last two data sets correspond to six signals obtained from a THz image of an artistic painting with two main areas of interest. The first three spectra correspond to an area where mercury sulfide was detected, which has an absorption line centered at 1.12 THz. This feature is important to analyze since the absorption line can be confused with the noise as it is located in a part of the spectrum where the SNR is low. The last 3 spectra were taken from the edges of the image exhibited a very poor measurable bandwidth. These two data sets are of special interest as they are highly complex spectra and will allow us to examine the robustness of the method. Figure 4 (a-c) shows the spectrum of a measurement where mercury sulfide was present. Although the absorption peak is easily distinguishable, the algorithm correctly calculates the cutoff frequency in (a) and (c). Apparently, the cutoff frequency calculated in (b) should be located at a lower frequency, since clearly the spectral amplitude decays to the noise floor before 2.2 THz. In order to correct the cutoff frequency in this spectrum, the value of w s was increased to 9%, calculating ω c = 1.5 THz. For the spectra in Fig. 4(d-f), it can be seen that these signals are extremely noisy, so it is natural to consider increasing the value of w s . Nevertheless, the calculated cutoff frequencies agree with the "by eye" estimation, this without having increased w s . Finally, it is important to mention that computation time is a critical factor in this work, since this method is designed to analyze large amounts of data. For this purpose, eight THz time-domain images of 550×460 pixels were analyzed, with a total of more than two million spectra, which were processed in approximately 1840 seconds that corresponds to ∼ 900 μs per spectrum. The images were processed in a commercial computer (Intel Core i5-11400H with 8 GB memory, using MATLAB R2021b).

Conclusions
In this work, we present a method that allows the calculation of the measurable bandwidth in THz spectrum through smoothing and subsequent multiplication by ω n , which generates minima at the frequencies where the spectral amplitude decays to the amplitude of noise floor. By analyzing different spectra, it was shown that for values of w s = 6% and n = 3 it is possible to calculate the cutoff frequencies in most cases. However, in spectra that present features that propitiate a bad calculation of the cutoff frequencies, it is advisable to use values of w s higher than 9%. It is important to clarify that these parameters are not fixed and can be adjusted by the user in order for the method to work properly according to the data characteristics and THz system used.
Furthermore, the method can be applied to any signal from any THz-TDS system, since this method analyzes the spectrum itself and does not depend on either the system configuration nor the optical parameters of the sample. This was demonstrated by analyzing spectra of out-of-focus samples, with spectral fingerprints and narrow measurable bandwidths. Additionally, computational time is not a problem by using a very simple smoothing method and by multiplying and finding minima.
The code, which is the central part of this contribution, is provided below as Additional Material. The data sets used for testing the code can be obtained from the corresponding author upon reasonable request.