# Kolmogorov-Smirnov like test for time-frequency Fourier spectrogram analysis in LISA Pathfinder


## Abstract

A statistical procedure for the analysis of time-frequency noise maps is presented and applied to LISA Pathfinder mission synthetic data. The procedure is based on a Kolmogorov-Smirnov like test applied to time-frequency noise maps produced with the spectrogram technique. The influence of finite-size windowing on the test statistic is calculated with a Monte Carlo simulation for 4 different window types. The calculation demonstrates that the test statistic is modified by the correlations introduced in the spectrum by the finite size of the window and by the correlations between different time bins that originate from the overlap between windowed segments. The application of the test procedure to LISA Pathfinder data demonstrates that the test can detect non-stationary features in a noise time series that simulates low-frequency non-stationary noise in the system.

## Keywords

Kolmogorov-Smirnov test, Spectrogram, Noise analysis, Time-frequency map, LISA Pathfinder, Gravitational waves, eLISA, LISA

## 1 Introduction

The Kolmogorov-Smirnov test is a well-known statistical tool for the analysis of data. It allows one to verify with what probability an empirical distribution will tend to a given cumulative distribution function when the number of data points goes to infinity [1]. The great advantage of the Kolmogorov-Smirnov test is its flexibility, since the test statistic does not depend on the particular distribution of the test data. The aim of the present work is to develop a procedure based on the Kolmogorov-Smirnov test for the analysis of time-frequency maps of noisy data in the framework of the LISA Pathfinder mission. LISA Pathfinder (LPF) is a European Space Agency mission that will characterize and analyze all possible sources of disturbance that perturb free-falling test masses from their geodesic motion [2, 3, 4, 5, 6]. One of the final outcomes of the mission will be the definition of a noise model for free-falling test masses that will be used as a reference for the design and realization of future space-based gravitational wave detectors. This will require a technique to quantitatively analyze noise data and to assess the differences between noise measurements and models. Moreover, the analysis of noise is typically performed in the frequency or time-frequency domain; therefore we aim to develop a noise analysis tool that is suited to such data. The problem of the statistical analysis of noise in the frequency domain was already formulated in [7], where a number of data analysis strategies were developed and compared. In this paper we present a further refinement of the Kolmogorov-Smirnov test presented in [7] and we extend its range of application to the time-frequency domain. The analysis of time-frequency data is particularly interesting since it allows non-stationary noise to be identified and characterized.
In LISA Pathfinder, non-stationary noise can be the result of natural processes such as random charging of the test masses due to high-energy particles and thermal drift in the electronics. In Section 2 the statistical properties of the time-frequency spectrogram are discussed, while in Section 3 the Kolmogorov-Smirnov test is introduced and applied to the analysis of time-frequency spectrogram data. The influence on the test statistic of the correlations introduced by the data windowing process is analyzed in detail for 4 different windows. In Section 4 the test is applied to LISA Pathfinder synthetic noise. The noise series is made non-stationary by assuming that the noise provided by the capacitive sensor has an energy that increases quadratically with time. Once applied to the time-frequency noise spectrogram, the Kolmogorov-Smirnov test unambiguously detects the increase of the noise excess with time.

In this paper we simulated an example of non-stationary noise by adding a non-stationary term (quadratic with time) in one of the LPF subsystems (electrostatic actuators). The global noise model used is representative of the current expectations of LPF performance. The non-stationary scenario that has been chosen is only one of the possibilities, since it is currently not possible to predict whether LPF noise will be non-stationary, by what amount and in which subsystem. We know that some subsystems are sensitive to thermal drifts, but the true environment in space will be known only when the mission flies. In any case, the method presented here does not depend on the model assumed. The method detects the differences in the underlying distributions of the noise sample spectrum at two different times independently of the underlying model.

## 2 Statistical properties of the noise spectrogram

The spectrogram is a time-frequency representation of a data series \(x_{0},\ldots ,x_{N-1}\) that is based on the application of a short-time Fourier Transform. A segment of data of length \(M < N\) is selected by a window function \(w\) and the spectrogram elements for a time \(t_{i}\) and a frequency \(f_{j}\) are calculated by:

\[S\left (t_{i},f_{j}\right ) = T \left | \sum _{n=0}^{M-1} w_{n}\, x_{p+n}\, e^{-2\pi \mathrm {i} f_{j} n T} \right |^{2} \tag{1}\]

where \(T\) is the sampling time of the data, \(p\) is the starting point of the data segment of length \(M\), the window function \(w\)^{1} is defined over a segment of length \(M\) starting at 0, the frequency \(f_{j} = j/(TM)\) with \(j = 0,\ldots ,M/2\) and \(t_{i}\) is the time corresponding to the center of the interval \(\left [x_{p}, x_{p+M-1}\right ]\). It is worth noting that the frequency series defined by the spectrogram data for a given \(t_{i}\) is the sample spectrum (sample periodogram) of the reduced time series \(x_{p},\ldots ,x_{p+M-1}\). If the data series \(x_{0},\ldots ,x_{N-1}\) consists of non-stationary noise, then the spectrogram provides the spectral evolution of the noise power with time.
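The definitions above translate fairly directly into code. The following Python sketch is only an illustration under stated assumptions: it uses NumPy, takes \(S(t_{i},f_{j})\) to be \(T\) times the squared modulus of the windowed discrete Fourier transform of each segment, and the function name `spectrogram` and its `overlap` argument are hypothetical, not taken from the original analysis.

```python
import numpy as np

def spectrogram(x, T, M, overlap=0.5, window=None):
    """Windowed sample spectra S[i, j] at times t[i] and frequencies f[j].

    Assumed form: S = T * |sum_n w_n x_{p+n} exp(-2 pi i f_j n T)|^2,
    with the window square-normalized so that sum(w**2) == 1.
    """
    x = np.asarray(x, dtype=float)
    w = np.ones(M) if window is None else np.asarray(window, dtype=float)
    w = w / np.sqrt(np.sum(w**2))                  # square normalization
    step = max(1, int(round(M * (1.0 - overlap)))) # shift between segments
    starts = np.arange(0, len(x) - M + 1, step)    # starting points p
    f = np.arange(M // 2 + 1) / (T * M)            # f_j = j / (T M)
    t = (starts + (M - 1) / 2.0) * T               # centre of [x_p, x_{p+M-1}]
    S = np.empty((len(starts), len(f)))
    for i, p in enumerate(starts):
        X = np.fft.rfft(w * x[p:p + M])            # one-sided DFT of the segment
        S[i] = T * np.abs(X)**2
    return t, f, S

# Usage: unit-variance white noise sampled at T = 1 s, Hann window, no overlap.
rng = np.random.default_rng(0)
x = rng.standard_normal(20000)
t, f, S = spectrogram(x, T=1.0, M=200, overlap=0.0, window=np.hanning(200))
```

With this normalization the expectation value of each element for unit-variance white noise is \(T\), so the map above should fluctuate around 1.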

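For a given time and frequency, the spectrogram element \(z\) is distributed according to a Gamma density:

\[f\left (z; k, \theta \right ) = \frac {z^{k-1}\, e^{-z/\theta }}{\theta ^{k}\, \Gamma \left (k\right )} \tag{2}\]

where \(k = \nu /2\), \(\nu = 2\), \(\theta = 2\lambda \), \(\lambda = E\left [S\left (t_{i},f_{j}\right )\right ]/\nu \), \(z = S\left (t_{i},f_{j}\right )\) and \(E\left [S\left (t_{i},f_{j}\right )\right ]\) is the expectation value for the spectrogram element.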
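With \(\nu = 2\) the parameters quoted above give \(k = 1\) and \(\theta = E\left [S\left (t_{i},f_{j}\right )\right ]\). Reading \(k\) and \(\theta \) as the shape and scale of a Gamma density (an assumption of this sketch), the distribution reduces to an exponential one, whose standard deviation equals its mean. The short NumPy check below illustrates this for Gaussian white noise; the function name, window choice and bin index are hypothetical.

```python
import numpy as np

def white_noise_bin_samples(n_seg=2000, M=256, j=10, seed=1):
    """Return n_seg samples of a windowed sample spectrum at frequency bin j."""
    rng = np.random.default_rng(seed)
    w = np.hanning(M)
    w = w / np.sqrt(np.sum(w**2))                 # square-normalized window
    x = rng.standard_normal((n_seg, M))           # unit-variance Gaussian white noise
    S = np.abs(np.fft.rfft(w * x, axis=1))**2     # sampling time T = 1
    return S[:, j]

z = white_noise_bin_samples()
mean_z = z.mean()                      # estimates E[S] at this bin
cv = z.std(ddof=1) / mean_z            # close to 1 for an exponential density
frac_above = np.mean(z > mean_z)       # close to exp(-1) for an exponential density
```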

The spectrogram elements are correlated along both the frequency axis \(f_{j}\) and the time axis \(t_{i}\). The correlations along the frequency axis are introduced by the data windowing process, which naturally correlates different frequency bins since it is a convolution operation in the frequency domain between the data and the window function. The expression for the correlations as a function of the frequency bin separation \(\Delta f\) is given in [8]. The correlations along the time axis are introduced by the overlap between windowed segments and depend on an overlap shift factor \(k\). The expected values for the different windows of our set are reported in Fig. 1b, where it can be seen that the Blackman-Harris window performs best in terms of suppressing correlations for a given overlap. This property is particularly advantageous for spectrogram estimation since it allows a finer time grid to be obtained without excessively increasing the degree of correlation between the different time bins.
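The dependence of the overlap correlation on the window can be illustrated numerically. The sketch below assumes the overlap correlation in the form commonly quoted from Harris [9], i.e., the normalized sum of products of the window with a shifted copy of itself; this may differ in notation from the expression used in the paper. The window definitions (periodic Hann and 4-term Blackman-Harris) and all function names are choices made for this example only.

```python
import numpy as np

def hann(M):
    """Periodic Hann window of length M."""
    n = np.arange(M)
    return 0.5 - 0.5 * np.cos(2 * np.pi * n / M)

def blackman_harris(M):
    """Periodic 4-term minimum-sidelobe Blackman-Harris window of length M."""
    n = np.arange(M)
    a = (0.35875, 0.48829, 0.14128, 0.01168)
    return (a[0] - a[1] * np.cos(2 * np.pi * n / M)
            + a[2] * np.cos(4 * np.pi * n / M)
            - a[3] * np.cos(6 * np.pi * n / M))

def overlap_correlation(w, overlap):
    """Assumed form: sum_n w[n] w[n + shift] / sum_n w[n]**2."""
    w = np.asarray(w, dtype=float)
    M = len(w)
    shift = int(round((1.0 - overlap) * M))   # samples between segment starts
    if shift >= M:
        return 0.0                             # segments do not overlap
    return float(np.sum(w[:M - shift] * w[shift:]) / np.sum(w**2))

M = 512
c_rect = overlap_correlation(np.ones(M), 0.5)
c_hann = overlap_correlation(hann(M), 0.5)
c_bh = overlap_correlation(blackman_harris(M), 0.5)
```

At 50 % overlap the Blackman-Harris value is well below the Hann and rectangular ones, in line with the qualitative statement above.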

## 3 Kolmogorov-Smirnov test

Let \(X_{1},\ldots ,X_{n}\) be a set of independent random variables with cumulative distribution function \(F\left (x\right )\), and let \(\bar {X}_{1}, \ldots , \bar {X}_{n}\) be the same set sorted in ascending order. We define the empirical distribution of the sample:

\[F_{n}\left (x\right ) = \begin{cases} 0 & x < \bar {X}_{1} \\ k/n & \bar {X}_{k} \le x < \bar {X}_{k+1} \\ 1 & x \ge \bar {X}_{n} \end{cases} \tag{5}\]

For \(n \to \infty \), \(F_{n}\left (x\right ) \to F\left (x\right )\). The Kolmogorov-Smirnov test provides a statistical tool to verify whether an empirical distribution is compatible with a given cumulative distribution function [1]. Moreover, the test can be used to verify whether two empirical distributions share the same asymptotic cumulative distribution function. In this case, given two empirical distributions \(F_{n_{1}}\left (x\right )\) and \(F_{n_{2}}\left (x\right )\), we test the hypothesis that they share the same cumulative distribution function \(F\left (x\right )\) by defining a distance in the space of the cumulative functions:

\[d_{K}\left (x\right ) = \left | F_{n_{1}}\left (x\right ) - F_{n_{2}}\left (x\right ) \right | \tag{6}\]

Here \(d_{K}\left (x\right )\) is defined on the interval \(\left [0,1\right ]\) and \(K = \left (N_{1} N_{2}\right ) /\left (N_{1} + N_{2}\right )\) [1]. The statistical properties of \(d_{K} = \textnormal {max}\left [d_{K}\left (x\right )\right ]\) are independent of the particular distribution \(F\left (x\right )\) that we are testing. This flexibility represents the major advantage of the Kolmogorov-Smirnov test and it allows the test to be implemented for spectrogram data in a straightforward way.
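A two-sample distance of this kind can be sketched in a few lines of Python. The example assumes that the distance is the absolute difference between the two empirical distributions, which is the usual two-sample form and takes values in \(\left [0,1\right ]\); the function names `ks_distance` and `effective_size` are hypothetical.

```python
import numpy as np

def ks_distance(a, b):
    """Maximum absolute difference between two empirical distributions."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])                       # the maximum occurs at a data point
    Fa = np.searchsorted(a, grid, side="right") / a.size  # F_{n1} on the grid
    Fb = np.searchsorted(b, grid, side="right") / b.size  # F_{n2} on the grid
    return float(np.max(np.abs(Fa - Fb)))

def effective_size(n1, n2):
    """K = N1 N2 / (N1 + N2), as defined in the text."""
    return n1 * n2 / (n1 + n2)

d_same = ks_distance([1, 2, 3], [1, 2, 3])
d_disjoint = ks_distance([0, 1], [2, 3])
d_shifted = ks_distance([1, 2, 3, 4], [3, 4, 5, 6])
```

Identical samples give a distance of 0, while samples with no overlap in range give the maximum distance of 1.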

As already discussed, the statistic of the sample spectrum at each frequency bin depends on the expectation value at that frequency (equation (2)), so the frequency bins of a colored spectrum are not identically distributed. This problem is solved if we consider a normalized (pre-whitened) spectrum, which is obtained by dividing the sample spectrum by its expectation value. In this case the statistic of each frequency bin becomes the same and the Kolmogorov-Smirnov test can be applied to the data. Therefore, given a normalized spectrum or white noise, the Kolmogorov-Smirnov test can be easily applied to the analysis of non-stationary noise in time-frequency maps obtained with the Fourier spectrogram technique. The expectation value of the sample spectrum that is used for its normalisation (pre-whitening) is typically not known a priori; therefore it has to be estimated from the data itself or from a previous noise run. As a consequence, the model used for the normalisation cannot be an exact representation of the expectation value but, since the same model is used for all the sample spectra corresponding to different time bins of the spectrogram, the effect of the model inaccuracy cancels out and the test results preserve their reliability.

Given a spectrogram, we select the sample spectrum corresponding to the first time bin as reference and construct the reference cumulative distribution from it according to equation (5). We then compare the spectra corresponding to the other time bins with the reference using the Kolmogorov-Smirnov test as formulated in equation (6).
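The procedure just described can be sketched as follows. This is a minimal illustration, assuming a normalized spectrogram stored as a NumPy array with one row per time bin and the two-sample distance taken as the maximum absolute difference between empirical distributions; the function names and the synthetic data are hypothetical.

```python
import numpy as np

def ks_distance(a, b):
    """Maximum absolute difference between two empirical distributions."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])
    Fa = np.searchsorted(a, grid, side="right") / a.size
    Fb = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(Fa - Fb)))

def ks_vs_first_bin(S_norm):
    """Distance of each later time bin from the first (reference) time bin."""
    S_norm = np.asarray(S_norm, dtype=float)
    reference = S_norm[0]                       # sample spectrum of the first time bin
    return np.array([ks_distance(reference, row) for row in S_norm[1:]])

# Synthetic normalized map: 6 time bins, 500 frequency bins, unit-mean
# exponential statistics; the power of the last time bin is doubled.
rng = np.random.default_rng(2)
S_norm = rng.exponential(1.0, size=(6, 500))
S_norm[5] *= 2.0
d = ks_vs_first_bin(S_norm)
```

In this toy example the last element of `d` stands out from the others, because only the last time bin departs from the statistics of the reference.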

[…]^{2} is enlarged proportionally. The overlap, instead, reduces such fluctuations since the overlapping time series share a given amount of the data points. As a consequence, a large overlap correlation tends to decrease the expected critical values for the test. This can be easily seen in Fig. 3, where we report the calculated critical values at 95 % confidence for the Kolmogorov-Smirnov test, obtained with a Monte Carlo simulation over 5000 independent realizations of white noise. The critical values are shown for 4 different data windows as a function of the overlap and the number of samples in the data series.
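A Monte Carlo estimate of a critical value can be sketched as below. This is a simplified illustration and not the calculation behind Fig. 3: it compares only two adjacent windowed segments of Gaussian white noise, uses far fewer realizations than the 5000 quoted above, drops the zero and Nyquist frequency bins, and all function and parameter names are hypothetical.

```python
import numpy as np

def ks_distance(a, b):
    """Maximum absolute difference between two empirical distributions."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])
    Fa = np.searchsorted(a, grid, side="right") / a.size
    Fb = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(Fa - Fb)))

def critical_value(window, overlap, n_real=500, alpha=0.05, seed=0):
    """(1 - alpha) quantile of the distance between two adjacent segments."""
    rng = np.random.default_rng(seed)
    window = np.asarray(window, dtype=float)
    M = len(window)
    w = window / np.sqrt(np.sum(window**2))          # square-normalized window
    step = max(1, int(round(M * (1.0 - overlap))))   # shift between the segments
    d = np.empty(n_real)
    for r in range(n_real):
        x = rng.standard_normal(M + step)            # white noise covering both segments
        S0 = np.abs(np.fft.rfft(w * x[:M]))**2
        S1 = np.abs(np.fft.rfft(w * x[step:step + M]))**2
        d[r] = ks_distance(S0[1:-1], S1[1:-1])       # skip zero and Nyquist bins
    return float(np.quantile(d, 1.0 - alpha))

# Rectangular window, no overlap: the two sample spectra are independent.
cv_rect = critical_value(np.ones(128), overlap=0.0, n_real=300)
```

Repeating the call for other windows and overlaps gives a rough picture of how the critical value depends on both.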

## 4 Application to LISA Pathfinder synthetic data

LISA Pathfinder is a controlled three-body system composed of two test masses and the enclosing spacecraft. One test mass is free falling along the principal measurement axis and is used as the reference for the drag-free controller of the spacecraft. The second test mass is actuated at very low frequencies (below 1 mHz) in order to follow the free-falling test mass. This actuation scheme provides a measurement bandwidth \(1 \le f \le 100\) mHz in which both test masses can be considered effectively free falling. The system has two output channels along the principal measurement axis: one measures the displacement of the spacecraft relative to one free-falling test mass and the other measures the relative displacement between the test masses. From the displacement signals, an effective force-per-unit-mass, \(a_{\text {eff}}\), acting on the test masses can be extracted by a data reduction procedure that projects displacement data into force-per-unit-mass using a model for the spacecraft dynamics [11]. Thanks to the common mode rejection between the two test masses, the differential force-per-unit-mass is not affected by the spacecraft noise, while it is largely dominated by test mass noise at frequencies \(f < 10\) mHz and by the interferometer readout noise for \(f > 10\) mHz. The test mass noise is in turn determined by the combination of different contributions such as magnetic noise, thermal gradients, test mass charging and capacitive actuation noise.

[…] \(f < 10\) mHz; combining this range with the measurement bandwidth, we get a frequency band of interest for our experiment of \(1 \le f \le 10\) mHz. We adopted the procedure reported in [12] for the generation of two-channel cross-correlated data series. We then converted the raw displacement time series into effective force-per-unit-mass and we calculated the spectrogram for the differential force-per-unit-mass using a Blackman-Harris data window and 50 % overlap between different segments. We then divided the sample spectra at each time bin by an expected model^{3} for the acceleration noise in order to obtain a normalized time-frequency map, as reported in Fig. 5.

The quantity \(d_{K}\) reported in Fig. 6 corresponds to the Kolmogorov-Smirnov statistic \(d_{K} = \textnormal {max}\left [d_{K}\left (x\right )\right ]\), where \(d_{K}\left (x\right )\) is defined in equation (6).

As can be observed in Fig. 6, the Kolmogorov-Smirnov statistic shows an increasing trend with time, which indicates a departure from the statistic of the spectrum corresponding to the first time bin. The dashed lines on the plot are the thresholds corresponding to different confidence levels. As can be seen, all the lines are crossed as time increases, with the exception of the 99.99 % confidence line. In order to have a quantitative comparison we can look at the values of the in-band energy excess for the different time bins. In particular, at times of \(1.5 \times 10^{5}\), \(2 \times 10^{5}\) and \(2.5 \times 10^{5}\) seconds we observe an excess energy with respect to the first time bin of 12, 16 and 21 %, respectively.
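An in-band energy excess of this kind can be computed from the spectrogram itself. The sketch below is only one plausible definition, assumed for illustration: the energy of each time bin is taken as the sum of the spectrogram elements inside the band, and the excess is its fractional increase over the first time bin. The function name and the toy numbers are hypothetical and are not the values of the simulation.

```python
import numpy as np

def in_band_excess(S, f, f_lo, f_hi):
    """Fractional in-band energy excess of each time bin over the first one."""
    S = np.asarray(S, dtype=float)
    f = np.asarray(f, dtype=float)
    band = (f >= f_lo) & (f <= f_hi)          # frequency bins inside the band
    energy = S[:, band].sum(axis=1)           # in-band energy per time bin
    return energy / energy[0] - 1.0

# Toy map with 2 time bins and 4 frequency bins (frequencies in Hz).
f = np.array([1e-3, 5e-3, 1e-2, 5e-2])
S = np.array([[1.0, 1.0, 1.0, 5.0],
              [1.2, 1.2, 1.2, 9.0]])
excess = in_band_excess(S, f, 1e-3, 1e-2)     # band of interest: 1 to 10 mHz
```

Here the second time bin has 20 % more in-band energy than the first; the out-of-band element does not contribute.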

## 5 Conclusions

A procedure for the statistical analysis of time-frequency noise maps was presented and applied to LISA Pathfinder synthetic data. The procedure is based on the Kolmogorov-Smirnov test which, thanks to its flexibility, can be applied in a straightforward way to the analysis of time-frequency maps. The influence of the correlations introduced by the data windowing process was classified and quantified by means of a Monte Carlo calculation over 5000 independent realizations of a Gaussian white noise process. The application of the test to LISA Pathfinder synthetic noise data has demonstrated its capability of detecting non-stationary features in the noise data series. The proposed experiment simulated a failure in the capacitive actuation hardware that introduced a quadratically increasing power term into the test mass noise time series. Applied to a normalized time-frequency map, the test clearly showed an evolution of the Kolmogorov-Smirnov statistic with time, which is a consequence of the change in the power content of the noise time series. In particular we have observed, in our test, that the Kolmogorov-Smirnov statistic convincingly crosses the 95 % confidence threshold for in-band energy excess greater than 12 %.

## Footnotes

1. \(w\) is assumed to be square normalized to 1 so that \(\sum _{i} {w_{i}^{2}} = 1\).
2. Critical values are cut-off values that define regions in which the test statistic has a probability lower than \(\alpha \) of falling if the null hypothesis is true. \(\alpha \) is the significance level such that the confidence level is \(1-\alpha \). The null hypothesis is rejected if the test statistic lies within this region, which is often referred to as the rejection region [10].
3. The expected model was obtained by a fit to a sample spectrum realized with all the noise sources kept stationary at their nominal values.

## Notes

### Acknowledgements

This research was supported by the Centre National d’Études Spatiales (CNES).

## References

1. Feller, W.: Ann. Math. Statist. **19**(2), 177 (1948)
2. Congedo, G., Ferraioli, L., Hueller, M., De Marchi, F., Vitale, S., Armano, M., Hewitson, M., Nofrarias, M.: Phys. Rev. D **85**(12), 122004 (2012)
3. Antonucci, F., et al.: Class. Quantum Gravity **28**(9), 094002 (2011)
4. Antonucci, F., et al.: Class. Quantum Gravity **28**(9) (2011)
5. Antonucci, F., et al.: Class. Quantum Gravity **28**(9), 094006 (2011)
6. Armano, M., et al.: Class. Quantum Gravity **26**(9), 094001 (2009)
7. Ferraioli, L., Congedo, G., Hueller, M., Vitale, S., Hewitson, M., Nofrarias, M., Armano, M.: Phys. Rev. D **84**, 122003 (2011)
8. Percival, D.B., Walden, A.T.: Spectral Analysis for Physical Applications. Cambridge University Press, Cambridge, UK (1993)
9. Harris, F.: IEEE Proc. **66**(1), 51 (1978)
10. NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/ (2013)
11. Ferraioli, L., Hueller, M., Vitale, S.: Class. Quantum Gravity **26**(9), 094013 (2009)
12. Ferraioli, L., Hueller, M., Vitale, S., Heinzel, G., Hewitson, M., Monsky, A., Nofrarias, M.: Phys. Rev. D **82**(4), 042001 (2010)