Optical methods of the delay cells characteristics measurements and their applications

The only efficient method of optical spectrum measurement in the infrared range in the low-light regime is based on time multiplexing. Typically, a single-photon detector with low timing jitter is combined with fast, high-precision electronics to measure photon arrival times. The resulting timing histogram can be used to determine the spectrum of the photons, and better time-detection accuracy yields more precise spectral information. This paper proposes an optical method for characterizing sub-picosecond time-to-digital converters. The method is intended to optimize the implementation of multi-tapped delay lines and thus to yield converters with much improved parameters.

experimental characterization (Frankowski 2006; Wasilewski et al. 2007) and the time-of-flight measurements applied in time-of-flight mass spectrometry, where the isotopic composition of an atomic beam is measured (Song et al. 2006; Zieliński et al. 2003). Furthermore, TDCs have also been employed in laser ranging systems, in telecommunication systems (for skew measurements) and in many other experiments in various fields of quantum physics (Lutz et al. 2014), e.g. optical quantum information processing schemes, quantum cryptography protocols and quantum teleportation.
Since the precision of TDCs largely depends on quantization parameters, it is necessary to seek new solutions that increase TDL resolution and accuracy. The rapid development of microelectronic technology and of quick prototyping makes the new series of highest-performance FPGAs a very interesting platform for future investigations.
The new generation of FPGAs provides (inside the CMOS structure) an array of fast Configurable Logic Blocks (CLBs) whose constituent components, such as simple logic gates, Look-up Tables (LUTs) with four or more inputs and fast dual-edge D-type flip-flops (FFs), are characterized by very short propagation times. These resources make it possible to implement, in suitably large structures, multi-stage tapped delay lines (MTDLs) with picosecond and sub-picosecond resolution (Zhang and Zhou 2015; Park et al. 2015). Implementing such delay lines requires a new characterization method for testing and calibration. The optical delay line (ODL) proposed in this article meets these requirements.
This paper is organized as follows. We first analyze the TDC architectures (Sect. 2) and explain the TDL implementation process (Sect. 3). In Sects. 4 and 5 we present a new method for optical measurement of the delay line characteristics. Using this method, we obtain the probability distributions of counts in each time channel. This information helps to determine the level and character of the random error distribution that describes a direct coding delay line (TCDL), represented by a high-precision TDL and a coding register module. To further improve the one-shot TCDL resolution and effectively minimize the non-linearity errors, we also implemented and characterized (Sect. 6) an equivalent coding delay line (ECDL) (Szplet et al. 2013). Finally, we summarize our work and suggest possible directions for future investigations.
The presented studies concern the implementation and characterization of TDLs in the XC4VFX12 FPGA device.

Flash-like TDC architecture
Usually, the heart of a high-precision time-to-digital converter is the phase measuring (PM) module. Each PM module may contain one or more delay lines in many different configurations. For example, TDC resolution can be improved with vernier (Zieliński et al. 2005), pulse-shrinking (Zhang and Zhou 2015; Szplet and Klepacki 2010) and interpolation techniques (Jansson et al. 2006), where the effective resolution depends on the difference between the propagation times of two different LEs. In other cases, to decrease the dead time and to adapt to multichannel solutions, a Flash-like architecture with direct coding lines is applied (Ugur et al. 2012). Consequently, the choice among this wide spectrum of TDL architectures determines the most important TDC parameters, such as precision, resolution, dead time (the minimum time between two measurements), maximum pulse intensity and measurement range (the maximum time interval that can be measured).
Nowadays, most integrated time counters use advanced time interpolation methods based on the classic Nutt scheme (Nutt 1968). In this scheme, the single-stage conversion process combines a simple counter method with a precise measurement of short time intervals by the PM module (Szplet et al. 2009). It is currently one of the most popular solutions for time-interval quantization based on the time-stamp (TS) measuring method. In practice, the simplest form of PM uses a high-resolution TDL to measure the time interval between the rising edge of the n-th pulse and the nearest clock edge (the detailed relationships are shown in Fig. 1). It is therefore possible to designate a TS value for each pulse. Every registered TS thus delivers precise (depending on the TCDL resolution) information about the relative position of the pulse in time. The time interval between two pulses arriving at random times (registered in the direct coding register and converted from pseudo-thermometric code to natural binary code) can then be calculated by subtracting the two TSs. The calculation uses the following quantities: $n$, $m$, the numbers of the delay cells in which the incoming pulses were registered; $\tau_{DL}$, the resolution of the coding delay line; $T_{CLK}$, the main clock period (the time scale used); $N_k$, $N_p$, the integer numbers of standard clock cycles counted (in the practical implementation) by two binary clock-period counters (operating on opposite signal edges) when the leading edges of the n-th and (n+1)-th pulses appear, respectively (Fig. 1).
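The time-stamp subtraction described above can be sketched in Python. This is a minimal illustration, not the implementation used in this work: the function names, the sign convention (the fine tap position is added to the coarse clock count) and the numeric values of the clock period and tap delay are all assumptions made for the example.

```python
def time_stamp(n_cycles, tap, t_clk_ps, tau_dl_ps):
    """Time-stamp in ps: coarse clock count plus fine tap position.

    Assumed convention: the fine part (tap * tau_dl_ps) is added to the
    coarse part; a real design may use the opposite sign.
    """
    return n_cycles * t_clk_ps + tap * tau_dl_ps


def interval_ps(n_k, m, n_p, n, t_clk_ps, tau_dl_ps):
    """Interval between two pulses as the difference of their time-stamps."""
    later = time_stamp(n_k, m, t_clk_ps, tau_dl_ps)
    earlier = time_stamp(n_p, n, t_clk_ps, tau_dl_ps)
    return later - earlier


# Hypothetical values: 5 ns clock period, 25 ps average tap delay.
dt = interval_ps(n_k=12, m=40, n_p=10, n=100, t_clk_ps=5000.0, tau_dl_ps=25.0)
print(dt)  # 8500.0
```

With a real delay line the uniform step `tau_dl_ps` would be replaced by the measured position of each tap, which is exactly what the characterization in the following sections provides.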

The TCDL implementation
FPGAs usually have a regular structure formed mainly by a symmetric matrix of connections (SMC) and an array of CLBs. The SMC is connected to specialized global clock inputs (GCIs), whose number varies with the device size. Each GCI can be connected directly to any global clock buffer (marked BUFG in Fig. 2a). Global clock buffers give designers access to the global clock trees. To improve clock distribution, the Virtex-4 architecture is divided into several clock regions, and all of the global clock buffers can drive all clock regions. The XC4VFX12 device used in this paper contains sixteen GCIs and distributes the clock over eight clock regions. The main logic resources used to implement sequential and combinatorial modules are located in the CLBs. Each CLB contains four slices (SLs), grouped in two pairs, which may be connected to the SMC. Each pair is organized as a column with an independent fast carry chain interconnection. Figure 2a shows only part of a single such column, without indicating the CLB blocks. Delay cells are often implemented using the logic elements (LEs) placed inside the SLs together with the fast carry chain interconnections (Fig. 2b). The delays of individual delay cells are usually about tens of picoseconds and depend on the FPGA technology. The main purpose of the delay line implementation is to produce a multiphase clock signal and precise phase shifting.
Fig. 1 The principle of the TS measuring method: CNT, CNT, two binary clock-period counters operating on opposite slopes; $t_P$, $t_K$, the time intervals between the n-th or (n+1)-th incoming pulse edges and the nearest clock edges, obtained from the TCDL taps; $\Delta t_i$, the measured time interval between consecutively incoming pulses
A common problem in the implementation process is guaranteeing the linearity of the TCDL characteristics. In practice, the non-linearity arises from differences in the data and clock propagation times. The maximum differences between the propagation times to the individual global clock regions are at the level of twenty picoseconds and depend on the TCDL location in the programmable structure. Other signal propagation times within regions are comparable. It should be noted that all of these propagation times include the FF setup time. Disturbances of the setup and propagation times may also cause random errors during conversion, resulting in an output code that is not pseudo-thermometric (it contains bubbles) (Frankowski 2011). This requires both hardware and software correction. Hardware calibration of the delay-line non-linearities may be realized in several ways. In the simplest form, it consists of selecting one of the two available LE locations in the single delay cell architecture (Fig. 2b). Another method attaches input capacitance (by connecting unused gates) to increase the propagation time of individual delay elements. The last way of hardware linearization is an appropriate choice of the global clock buffer that distributes the clock signal to the clock regions. Further correction can be achieved with the DCM module (marked $\Delta\tau_{PS}$ in Fig. 2c) running in a precision phase-shifting configuration.
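One common software-side way of handling bubbles is to decode the register by counting the ones instead of locating the first 1-to-0 transition. The sketch below illustrates that idea only. It is an assumption made for illustration and is not stated to be the correction used in this work; the function names are hypothetical.

```python
def decode_by_counting(taps):
    """Decode a (pseudo-)thermometric register by counting set bits.

    Counting ones gives the same result as edge detection for a clean
    thermometer code and tolerates isolated bubbles.
    """
    return sum(1 for bit in taps if bit)


def has_bubbles(taps):
    """Return True if the code is not a clean run of ones followed by zeros."""
    ones = decode_by_counting(taps)
    return any(bool(bit) != (i < ones) for i, bit in enumerate(taps))


clean = [1, 1, 1, 1, 0, 0, 0, 0]
bubbly = [1, 1, 1, 0, 1, 0, 0, 0]  # one bubble near the transition
print(decode_by_counting(clean), has_bubbles(clean))    # 4 False
print(decode_by_counting(bubbly), has_bubbles(bubbly))  # 4 True
```

Counting ones preserves the total number of taps reached by the edge, at the cost of hiding which cells misbehaved; `has_bubbles` can be used to flag such events for separate statistics.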

Optical direct method
The TDL time resolution was measured with an optical system constructed in the National Laboratory for Atomic, Molecular and Optical Physics in Toruń, Poland. This system made it possible to generate test pulses with a 40 ps timing jitter. A schematic of the experimental setup designed to determine the DC characteristics, and its practical realization, are shown in Fig. 3. In this solution, the distance between two pulses is changed precisely by shifting the position of the retroreflector. This was done with a precision platform driven by a ball screw and a stepper motor, with a step of 0.1 mm. The precision with which the geometric light path can be changed thus determines the temporal resolution of the ODL. In the present case the signal can be delayed with a resolution of about 0.34 ps. The step resolution can also be refined to 50 µm. The ball screw used in the experiment has a nominal lead of 5 mm per revolution, and the accuracy grade of its lead error amounts to 50 µm. In practice, the positioning accuracy of the stepper motor (200 steps per revolution) is about 5 % of its step. The maximum positioning error is therefore the sum of the lead error and the stepper motor error, and the corresponding maximum delay error is about 0.18 ps (Fig. 4). The experiments were performed with a femtosecond laser system (Wave Pack, Warsaw University) that delivers 40 fs pulses (FWHM) centered around 774 nm at a repetition rate of about 80 MHz. The beam transmitted through the retroreflector was collected fully onto the $D_2$ detector (an id101 avalanche photodiode, idQuantique) using a lens. Neutral density filters were used to perform the experiments at the same light-beam intensities, so that the signals registered by the $D_1$ and $D_2$ detectors had similar characteristics.
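The link between a change of the geometric light path and the resulting delay can be checked with a few lines of Python. The sketch assumes propagation in vacuum and, for the retroreflector, a double pass of the beam; both are assumptions of this example, as are the function names. It does not attempt to reproduce the error budget quoted above.

```python
C_MM_PER_PS = 0.299792458  # speed of light in vacuum, mm per picosecond


def delay_ps(path_change_mm):
    """Delay introduced by a change of the geometric light path in vacuum."""
    return path_change_mm / C_MM_PER_PS


def path_change_mm(retro_shift_mm, passes=2):
    """Path change for a retroreflector shift; two passes is an assumption."""
    return passes * retro_shift_mm


print(round(delay_ps(0.1), 3))                   # 0.1 mm of light path
print(round(delay_ps(path_change_mm(0.05)), 3))  # 50 um shift, double pass
```

Under these assumptions a 0.1 mm path change corresponds to roughly a third of a picosecond, which is of the same order as the ODL resolution quoted in the text.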

Determination of delay cells characteristics
The direct method proposed in Sect. 4 makes it possible to recreate the real characteristics of the delay line. The data registered with this method give the probabilities of counts for each position of the pulse (the number of positions depends on the ODL resolution) inside the experimental time channel to which it was assigned (Fig. 4). The probability distribution functions of particular channels can therefore be strictly defined. This information allows the non-linearity parameters to be determined, such as the differential and integral non-linearities (DNL and INL), as well as the parameters of the envelope functions (EFs) and, from them, the random error distribution of the TCDL module.
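As an illustration of how the registered counts turn into probabilities of counts, the following Python sketch normalizes a small matrix of counts per ODL position. The data layout (`counts[k][n]` for position k and time channel n) and the numbers are hypothetical.

```python
def count_probabilities(counts):
    """Normalize counts[k][n] (ODL position k, time channel n) per position.

    Returns p[k][n], the fraction of events at position k that fell
    into channel n. Positions with no events yield zeros.
    """
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])
    return probs


# Hypothetical counts for 3 ODL positions and 3 time channels.
counts = [[90, 10, 0],
          [40, 60, 0],
          [0, 75, 25]]
for row in count_probabilities(counts):
    print([round(p, 2) for p in row])
```

Reading one column of the result as a function of the ODL position gives the probability profile of a single time channel, which is the quantity fitted by the envelope functions later in the text.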

Distribution of random errors
Fig. 4 Principle of the direct method: the probability of counts in the k-th sub-channel of the n-th delay cell; $\Delta S$, the minimal geometrical change of the light path (the distance that light travels in a vacuum in a given time)
Measurement of TDL characteristics by the optical direct method assumes a strong correlation between the signals propagated into the delay line and the information from the TDL stored in the register. By performing K successive changes of the geometric light path, it was therefore possible to collect the total number of counts for each of the N time channels. From the collected information, the probability of counts in the k-th sub-channel of the n-th delay cell can be determined. Discrepancies between the probabilities of counts collected for the individual DCs depend on the TCDL implementation architecture and allow us to determine various technology-specific parameters of the delay cells. The factors with the greatest impact are the technological properties of the individual FFs or LEs and the disturbances of the propagation times. We propose a mathematical model of this phenomenon based on the cumulative distribution function (CDF) of $\Phi$, the Gaussian (normal) probability density function (PDF) with standard deviation $\sigma_i$ and mean $t_i^p$. To transform it into a simpler form, which is required for further computation, the CDF can be expressed through the erf function, where the classical error function is defined by $\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-u^2}\,du$. An important problem is that the error function cannot be expressed in closed form through elementary functions. Nevertheless, it can be approximated by many numerical methods (Patel and Read 1996). Unfortunately, numerical integration is expensive in computation time.
Numerical integration is therefore not suitable for fast computation and real-time processing. However, an approximate analytic solution for the cumulative normal distribution and the error function is available. For this purpose we use the simple formula proposed by Winitzki (2008).
We also define the complement of this analytical error function (denoted in relation to the erf and erfc functions). From probability theory, a non-zero probability of counts in a given time channel depends on the probabilities of counts in the other time channels, which are represented by the other FFs forming the subsequent delay cells. The probability of these events is thus equal to the product of the individual probabilities. Assuming that the resolution $\tau_{DL}$ is much greater than $3\sigma$, this product reduces to a simplified form. For a delay line that meets this assumption and consists of two delay cells, we obtain the following two expressions:
\[ C_1(t_1, \sigma_1, t_2, \sigma_2) = F_1(t_1, \sigma_1)\,K_2(t_2, \sigma_2) \tag{8} \]
\[ C_2(t_2, \sigma_2, t_3, \sigma_3) = F_2(t_2, \sigma_2)\,K_3(t_3, \sigma_3) \tag{9} \]
The results of fitting the $C_i$ functions (hereafter called EFs) to three randomly selected delay cells by the least squares method are shown in Fig. 5. An EF can be determined for each MTDL time channel; for this purpose each DL tap is approximated by a CDF. In practice, the EFs of individual delay cells are described by different CDF parameters (standard deviation, mean), which causes a large asymmetry of the EFs. Normalizing the EFs yields the corresponding PDFs $H_n$ (Fig. 6). Knowledge of these functions allows each real MTDL channel to be divided into multiple time sub-channels and a two-sample difference histogram to be prepared with any finite resolution greater than the TDC resolution (Frankowski and Zieliński 2015). Furthermore, knowledge of their coefficients enables determination of the random error distribution describing a high-precision multi-tapped delay line with a coding register module (Fig. 7).
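A closed-form approximation of this kind can be sketched as follows. The expression and the constant below are the ones commonly attributed to Winitzki's note and are recalled here from memory, so they should be checked against the cited reference before use; the function names are hypothetical. The normal CDF is then written through the error function in the usual way.

```python
import math

# Constant of the approximation, about 0.140 (assumed from Winitzki's note).
A = 8.0 * (math.pi - 3.0) / (3.0 * math.pi * (4.0 - math.pi))


def erf_approx(x):
    """Closed-form approximation of erf(x); no numerical integration needed."""
    x2 = x * x
    inner = -x2 * (4.0 / math.pi + A * x2) / (1.0 + A * x2)
    return math.copysign(math.sqrt(1.0 - math.exp(inner)), x)


def gaussian_cdf(t, mean, sigma, erf=erf_approx):
    """CDF of a normal distribution expressed through the error function."""
    return 0.5 * (1.0 + erf((t - mean) / (sigma * math.sqrt(2.0))))


# Compare with the library implementation on a grid from -5 to 5.
worst = max(abs(erf_approx(i / 100.0) - math.erf(i / 100.0))
            for i in range(-500, 501))
print(f"max abs error on [-5, 5]: {worst:.1e}")
print(gaussian_cdf(25.0, mean=25.0, sigma=3.0))  # 0.5 at the mean
```

The approximation uses only one exponential and one square root per evaluation, which is the property that makes it attractive for real-time processing.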
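The structure of Eqs. (8) and (9) can be illustrated numerically. The sketch below assumes that $F_i$ is a normal CDF rising at the leading edge of a cell and that $K_{i+1}$ is the complement (one minus the CDF) at the leading edge of the next cell; this reading, the function names and the numeric values are assumptions of the example, and the library `math.erf` is used instead of the analytical approximation for brevity.

```python
import math


def cdf(t, mean, sigma):
    """Normal CDF F(t; mean, sigma)."""
    return 0.5 * (1.0 + math.erf((t - mean) / (sigma * math.sqrt(2.0))))


def envelope(t, t_i, sigma_i, t_next, sigma_next):
    """C_i(t) = F_i(t) * K_{i+1}(t), with K taken here as 1 - F."""
    return cdf(t, t_i, sigma_i) * (1.0 - cdf(t, t_next, sigma_next))


def channel_pdf(ts, t_i, sigma_i, t_next, sigma_next):
    """Envelope sampled on the grid ts and normalized to unit sum."""
    values = [envelope(t, t_i, sigma_i, t_next, sigma_next) for t in ts]
    total = sum(values)
    return [v / total for v in values]


# Hypothetical cell: edges at 0 ps and 25 ps, edge jitter 3 ps and 5 ps.
grid = [0.5 * k - 20.0 for k in range(131)]  # -20 ps ... 45 ps
pdf = channel_pdf(grid, 0.0, 3.0, 25.0, 5.0)
print(round(envelope(12.5, 0.0, 3.0, 25.0, 5.0), 4))
```

Different jitter values on the two edges make the envelope asymmetric, which mirrors the asymmetry of the fitted EFs noted in the text. A least squares fit would adjust the four parameters of `envelope` to the measured probabilities of counts.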

Noise level analysis
The precision of a measurement is always limited by random errors, which can usually be determined by repeating the measurements. In most cases, the main sources of random errors are thermal noise and slope disturbances in the electronic circuits. In practice, thermal noise is white noise with a constant spectral density and a Gaussian amplitude distribution. These random errors are therefore often described by the parameters of a Gaussian (normal) distribution.
Noise or slope disturbances result in jitter, defined as phase fluctuations of the signals propagated inside the global clock and data path trees (Fig. 2a). The former relates mainly to the delay cell architecture, the latter to the global clock trees. Thus, the noise model of the global clock path depends on differences in propagation times and can be approximated in terms of $U_{max}$, the peak amplitude of the noise voltage, and $S$, the signal slope. For a given bandwidth $B$, the root mean square (RMS) value of the thermal noise voltage is given by $U_{rms} = \sqrt{4kTRB}$, where $k$ is Boltzmann's constant and $T$ is the absolute temperature (in kelvins) of the resistive component $R$. To determine the peak amplitude of the noise, we must determine the relevant noise parameters, such as the RMS and mean values. The peak-to-peak noise voltage in the distributed signals depends on the crest factor (CF) and on the RMS value of the noise. The CF is a unitless coefficient that can be used to describe the purity of a voltage signal. In our application it is defined as the ratio of the peak value to the RMS value. From the selected CF we can predict the probability that noise peaks exceed the chosen peak-to-peak voltage limits. In most cases a CF value of 3.9 is accepted as sufficient. For this value, the probability of peaks falling outside the CF limits is about 0.01 %. For this range the peak-to-peak noise voltage of the input signals is obtained from the CF and $U_{rms}$, the statistical standard deviation of the noise signal.
To characterize the time jitter in the delay cell structure (data path tree), we must take into account the contributions of the successive multiplexers placed inside the slices. If we assume that a single multiplexer stage introduces a time jitter $\sigma_{MUX}$ and that the jitter values of the other multiplexers are uncorrelated, we can derive a formula for the error of the n-th tap in which $k$ is the number of multiplexers forming a single delay cell. These phase fluctuations are uncorrelated and described by a Gaussian distribution $\Phi(t - \Delta t_p, \sigma)$ with standard deviation $\sigma$, centered at $\Delta t_p$. The experimental results of the random error measurement for a one-stage TCDL are shown in Fig. 7.
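The noise quantities discussed above can be evaluated with a short Python sketch. It uses the standard thermal-noise expression for a resistor and a Gaussian amplitude distribution. The peak-to-peak value taken as twice the peak, the jitter taken as noise amplitude divided by slope, the function names and the example values (50 ohm, 300 K, 1 GHz) are assumptions made for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def thermal_noise_rms(resistance_ohm, temperature_k, bandwidth_hz):
    """RMS thermal (Johnson-Nyquist) noise voltage of a resistor."""
    return math.sqrt(4.0 * K_B * temperature_k * resistance_ohm * bandwidth_hz)


def peak_to_peak(u_rms, crest_factor=3.9):
    """Peak-to-peak noise taken as twice the peak value CF * U_rms."""
    return 2.0 * crest_factor * u_rms


def exceed_probability(crest_factor):
    """Probability that Gaussian noise exceeds +/- CF * U_rms."""
    return math.erfc(crest_factor / math.sqrt(2.0))


def jitter_from_noise(u_peak, slope_v_per_s):
    """Timing jitter estimated as noise amplitude divided by signal slope."""
    return u_peak / slope_v_per_s


u = thermal_noise_rms(50.0, 300.0, 1e9)  # hypothetical 50 ohm, 300 K, 1 GHz
print(f"U_rms = {u * 1e6:.1f} uV, U_pp = {peak_to_peak(u) * 1e6:.1f} uV")
print(f"P(exceed, CF=3.9) = {exceed_probability(3.9):.1e}")
```

For a crest factor of 3.9 the two-sided exceedance probability of a Gaussian signal is just under one in ten thousand, which is the order of magnitude associated with this CF value.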
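If the multiplexer contributions are uncorrelated, their variances add, so the jitter at a given tap grows with the square root of the number of multiplexers passed. The sketch below shows this scaling; the exact form of the tap-error formula, the function name and the numbers are assumptions made for the example.

```python
import math


def tap_jitter(n, k, sigma_mux_ps):
    """RMS jitter at tap n when each cell holds k uncorrelated multiplexers."""
    return sigma_mux_ps * math.sqrt(n * k)


# Hypothetical: 2 multiplexers per cell, 0.5 ps jitter per multiplexer.
for tap in (1, 50, 200):
    print(tap, round(tap_jitter(tap, 2, 0.5), 2))
```

Under this assumption the last taps of a long line are the noisiest, which is one reason to keep the delay line no longer than a single clock period requires.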

Differential and integral non-linearities
By performing K successive changes of the geometric light path and collecting the total number of counts for each of the N time channels, it was possible to determine the delay cell characteristics of the whole delay line. The width of the n-th time channel is calculated from $x_{nk}$, the number of counts in the k-th sub-channel of the n-th delay cell; $R_{nk}$, the total number of all events registered in the k-th sub-channel of the n-th delay cell; and $\Delta\tau$, the ODL resolution. The average delay segment $\tau_q$, Eq. (16), is determined by summing all of the delay segments within a single TDL and dividing by the number of TDL segments. The DNL error at the n-th point of the conversion characteristic, Eq. (17), is defined as the difference between the width of the n-th delay channel and the average value given by Eq. (16). The maximum DNL error over the whole measuring range follows from Eq. (17). Summing the deviations of the channel widths from the average value gives the INL errors, Eq. (19), and, by analogy to the DNL error, the maximum INL error is obtained directly from Eq. (19). The INL error indicates how large a time-interval error will be made during a measurement. The presented relations allowed a full characterization of the TCDL. The delay cell characteristics obtained for the 200-tap TCDL are shown in Fig. 8. The average quantization step was about 25 ps, and the minimum and maximum quantization steps were 0.1 and 67.4 ps, respectively. These results indicate a wide variation of the delay cells, about ±34 ps. Example DNL and INL characteristics obtained for the same 200-tap TCDL are shown in Fig. 9. The characteristics show a very high non-linearity: the maximum DNL and INL errors were 41.6 and 124.6 ps, respectively.
For further analysis it is also reasonable to estimate the above errors for other TCDLs. The results calculated for sixteen TCDLs are summarized in Table 1. With reference to this information, Sect. 6 explains the construction of an equivalent delay line (Fig. 10) as one of the methods used to improve resolution and to compensate DNL and INL errors effectively.
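The chain from counts to DNL and INL can be sketched in Python as follows. The DNL and INL steps follow the verbal definitions given above. The form of `channel_width`, in which each sub-channel contributes its fraction of counts times the ODL step, is an assumption of this example, as are the function names and the sample data.

```python
def channel_width(x_n, r_n, delta_tau_ps):
    """Width of channel n from per-position counts x_n[k] and totals r_n[k]."""
    return delta_tau_ps * sum(x / r for x, r in zip(x_n, r_n) if r)


def nonlinearity(widths_ps):
    """Return (tau_q, dnl, inl) for a list of measured channel widths."""
    tau_q = sum(widths_ps) / len(widths_ps)   # average quantization step
    dnl = [w - tau_q for w in widths_ps]      # per-channel deviation
    inl, acc = [], 0.0
    for d in dnl:                             # running sum of DNL
        acc += d
        inl.append(acc)
    return tau_q, dnl, inl


# Hypothetical widths of a 4-tap line, in ps.
tau_q, dnl, inl = nonlinearity([20.0, 30.0, 35.0, 15.0])
print(tau_q, max(abs(d) for d in dnl), max(abs(i) for i in inl))  # 25.0 10.0 10.0
```

Because the DNL values sum to zero by construction, the INL returns to zero at the last tap; the maximum INL therefore always occurs somewhere inside the line.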

Multi-stage TCDL architecture
Unfortunately, because the TCDL resolution depends only on the delay cell (DC) parameters and the line is characterized by large non-linearities, a single TCDL is not a satisfactory solution for high-precision measurements. Typical delays of individual delay elements range from several to tens of picoseconds (e.g. Virtex-4, produced in a 90 nm copper CMOS technology). Therefore, to further improve the one-shot resolution and effectively minimize the non-linearity errors, we can use the equivalent coding line (Szplet et al. 2013).

Principle of ADL and FDL realization
Improved resolution can be achieved simply by increasing the number of TCDLs and effectively parallelizing their conversion process. Applying an ECDL to the construction of a high-resolution TDC yields converters with sub-gate-delay time resolution. Each ECDL is composed of a specified number of TCDLs. It is therefore very important to determine precisely the TCDL characteristics and the disturbances of the propagation times $\Delta t_{di}$ (illustrated in Fig. 10) to each TCDL. All of these parameters can be obtained by the optical direct method described in Sect. 4. The propagation times can also be determined graphically from the two-dimensional matrix of count probabilities, as shown in Fig. 11. In general, two methods of ECDL realization can be distinguished. The first realizes an ECDL with a previously assumed resolution (ADL, assumed equivalent delay line), while the second is a direct combination of all TCDLs (FDL, folded equivalent delay line). The principle of both methods is illustrated in Fig. 10. Both methods increase the ECDL resolution in proportion to the number of TCDLs used. For example, a two-stage TCDL (in which the coding delay lines contain m and n quantization steps, respectively) yields a converter with approximately two times better resolution. A special case of the ADL is a line with a resolution similar or equal to that of the TCDLs; it shows how the first level of hardware linearization can be applied in practice. It must be remembered, however, that such a line cannot always be realized: this depends on the finite number of available delay cells with delays similar to the expected resolution.
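One possible reading of the two constructions can be sketched in Python. Here the FDL is modelled as the sorted union of the tap edges of all TCDLs, and the ADL as a selection of the available edges closest to an ideal grid with the assumed step. This interpretation, the function names and the numbers are assumptions made for illustration and may differ from the actual construction.

```python
def tap_edges(offset_ps, widths_ps):
    """Absolute positions of the tap edges of one TCDL."""
    edges, t = [], offset_ps
    for w in widths_ps:
        t += w
        edges.append(t)
    return edges


def folded_line(lines):
    """FDL sketch: merge the edges of all TCDLs into one sorted line."""
    return sorted(e for offset, widths in lines
                  for e in tap_edges(offset, widths))


def assumed_line(edges, step_ps):
    """ADL sketch: for each ideal grid point pick the nearest available edge."""
    n_steps = int(max(edges) // step_ps)
    picked = [min(edges, key=lambda e: abs(e - i * step_ps))
              for i in range(1, n_steps + 1)]
    return sorted(set(picked))


# Two hypothetical TCDLs with different input propagation times.
lines = [(0.0, [24.0, 27.0, 22.0]), (11.0, [26.0, 23.0, 28.0])]
fdl = folded_line(lines)
print(fdl)                      # [24.0, 37.0, 51.0, 60.0, 73.0, 88.0]
print(assumed_line(fdl, 25.0))  # [24.0, 51.0, 73.0]
```

In this toy case the folded line has twice as many edges as a single TCDL, while the assumed line keeps only the edges nearest to the 25 ps grid. If no edge lies close to a grid point, the selected line inherits a large step error, which reflects the limitation noted at the end of the paragraph above.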

Experimental results
We measured the characteristics of several ECDL realizations, each built from sixteen TCDLs whose metrological parameters are summarized in Table 1. First, we prepared an ADL with a resolution similar to that of a TCDL (about 25 ps). The measured characteristics of the 200-tap ADL are shown in Fig. 12. The minimum and maximum quantization steps were 20.53 and 30.17 ps, respectively. The non-linearity errors were effectively reduced to 5.16 ps. The maximum INL and DNL errors are less than 6 ps, which corresponds to a quarter of the ADL quantization step. This is about a seven-fold (DNL) and twenty-fold (INL) improvement over the results described in Sect. 5.
In the second approach, we refined the assumed delay resolution to 10 ps and then to 5 ps, obtaining 498 and 991 quantization steps (for the same time interval). The INL characteristics are shown in Fig. 13. The extreme values of both non-linearities were 7.74 and 18.9 ps, respectively. The minimum and maximum quantization steps were 2.74 and 15.77 ps for the 10 ps ADL, and about 0.32 and 12.91 ps for the 5 ps ADL.
[Figure: assumed delay line, with taps $P_0$ to $P_{n+w}$ and propagation times $t_{d0}$ and $t_{di}$]
In summary, the best results are achieved with the ADL realization with an assumed 10 ps resolution. In this case the DNL error is about 0.77 LSB, at least three times smaller than in the other cases. The FDL shows a higher resolution (1.54 ps), not encountered elsewhere, but is characterized by worse non-linearities (66.7 LSB).
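The LSB figures quoted above follow from dividing an error expressed in picoseconds by the assumed quantization step. The helper below is a hypothetical one-liner that makes the conversion explicit for the two ADL cases.

```python
def to_lsb(error_ps, step_ps):
    """Express an error given in picoseconds in units of the quantization step."""
    return error_ps / step_ps


print(round(to_lsb(7.74, 10.0), 2))  # 10 ps ADL -> 0.77
print(round(to_lsb(18.9, 5.0), 2))   # 5 ps ADL  -> 3.78
```

Expressing errors in LSB makes lines with different resolutions directly comparable, since a given error in picoseconds weighs more heavily on a line with a finer step.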

Timing accuracy analysis
Typically, the precision of a TDC depends on the resolution of its interpolators, on its implementation architecture and on the stability of the reference clock, described by the accumulated jitter. For a small range of measured time intervals, the accumulated jitter can be omitted from the analysis. In this case, only the parameters of the relevant delay cells and their deviations determine the precision of the time-interval measurement.
The differential and integral non-linearity characteristics of the delay line used were presented in Sect. 5.3. These measurements were performed with the optical direct method and demonstrate full agreement with the code density test method (Mota and Christiansen 1999). Using this information we can determine the average quantization step $\tau_q$ and the quantization error. The quantization error was calculated as $\tau_q/\sqrt{12}$ and is equal to 7.2 ps. The ideal form of the quantization error distribution function can be expressed through a predicate $p$ that takes the value one when the relation $-0.5\tau_q \le t \le 0.5\tau_q$ is true and zero otherwise. In practice, the delay cell characteristics are not identical, so the real distribution function must contain additional information about the deviations from delay cell uniformity and about the jitter level. Hence the distribution function $H_q(t)$ is extended by a Gaussian distribution. When the phase measurement module comprises several MTDLs, a single conversion yields as many time-interval results as there are MTDL modules. The MTDL characteristics usually have different parameters. In the classical approach, each of them can be described as shown for the single-stage TDL. Finally, the system uncertainty can be improved according to the $\sqrt{k}$ relation, where $k$ is the number of delay lines. A delay line constructed in this way always requires calibration. Determining the propagation times to each delay line with sub-picosecond resolution is not always possible; the ODL proposed in this article (Sect. 4) is a good solution for this purpose. In addition to the propagation times, it also provides the PDF of each of the newly created delay channels.
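The quantization error and its reduction with the number of delay lines can be checked numerically. The first function is the standard deviation of a uniform distribution of width equal to the quantization step. The second one adds a Gaussian jitter term in quadrature and divides by the square root of the number of lines; this combination rule assumes independent error sources and independent lines, and the jitter value and function names are hypothetical.

```python
import math


def quantization_sigma(tau_q_ps):
    """Standard deviation of a uniform quantization error of width tau_q."""
    return tau_q_ps / math.sqrt(12.0)


def combined_sigma(tau_q_ps, jitter_ps, n_lines=1):
    """Uniform and Gaussian parts added in quadrature, averaged over k lines."""
    single = math.hypot(quantization_sigma(tau_q_ps), jitter_ps)
    return single / math.sqrt(n_lines)


print(round(quantization_sigma(25.0), 1))               # 7.2 for a 25 ps step
print(round(combined_sigma(25.0, 5.0, n_lines=16), 2))  # jitter value is hypothetical
```

For the average step of about 25 ps reported earlier, the first function returns the 7.2 ps quoted in the text.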

Examples of practical applications
The information included in this paper may be helpful in the construction of sub-picosecond TDCs implemented in FPGA structures. The TDC system that we designed and characterized as described above will be applied in various quantum physics experiments. One potential application is cooperation with an array detector of 32 × 32 smart pixels, each comprising a 20 µm single-photon avalanche diode (Tisa et al. 2008). Using high-sensitivity two-dimensional arrays of photodetectors ensures remarkable performance in both photon counting and photon timing measurements (Wasilewski 2008).
Fig. 13 The ADL characteristics obtained from the sixteen 200-tap TCDLs: a the 498-tap ADL with a 10 ps resolution, b the 991-tap ADL with a 5 ps resolution

Summary and conclusions
Knowledge of the real MTDL characteristics (obtained with the optical direct method) results in higher precision of the time-interval measurement. This is a very important solution in many fields of science, especially for high-resolution time-interval measurement systems implemented in modern reprogrammable CMOS FPGA devices.