The variance inflation factor to account for correlations in likelihood ratio tests: deformation analysis with terrestrial laser scanners

The measurement noise of a terrestrial laser scanner (TLS) is correlated. Neglecting those correlations affects the dispersion of the parameters when the TLS point clouds are mathematically modelled: statistical tests for the detection of outliers or deformation become misleading. The account for correlations is, thus, mandatory to avoid unfavourable decisions. Unfortunately, fully populated variance covariance matrices (VCM) are often associated with computational burden. To face that challenge, one answer is to rescale a diagonal VCM with a simple und physically justifiable variance inflation factor (VIF). Originally developed for a short-range correlation model, we extend the VIF to account for long-range dependence coming from, for example, atmospheric turbulent effects. The validation of the VIF is performed for the congruency test for deformation with Monte Carlo simulations. Our real application uses data from a bridge under load.


Introduction
The noise of geodetic sensors exhibits temporal correlations: the value recorded at a given time depends on its temporal neighbours. The causes are multiple, from the influence of the medium travelled, to cables and electronic components (Hooge 1994;Buchanan 1996;Wheelon 2001), to cite but a few. The correlations can be (i) short-term, decreasing exponentially to zero from the origin with increasing time, or (ii) long-term, decreasing rapidly from the origin but staying close to a constant value after a few time lags, and for a long time (Karagiannis et al. 2004). Various models B Gaël Kermarrec kermarrec@meteo.uni-hannover.de 1 have been proposed to describe temporal correlations; we cite the autoregressive (AR) process and all its variations (Box et al. 2016), the general Matérn function or rational quadratic covariance functions (Rasmussen and Williams 2006;Lilly et al. 2017), exemplarily.
Within a geodetic context, the description of the temporally correlated noise from the global navigation satellite system (GNSS) or terrestrial laser scanner (TLS) observations has been the topic of various publications (Li et al. 2008;Kermarrec and Schön 2014 for GNSS;Holst and Kuhlmann 2016;Jurek et al. 2017;Kermarrec et al. 2021b for TLS). The growing data rate of those sensors is often linked to the idea that the observations may have more "in common", which is often associated with an exponential decay of the autocorrelation function (ACF) as the time lag increases. Unfortunately, the description based on the empirical ACF has strong limitations: the decrease of the correlation function at the origin and the level of correlations remaining after a given time lag have a strong impact on the inverse of the variance covariance matrix (VCM), but are difficult to catch from empirical fitting (Beran 1994;Stein 1999;Karagiannis et al. 2004). To face that challenge, a modelization of the noise correlations should be preferred, rather than an estimation of its ACF. A model has the advantage that no iteration or a posteriori computation is needed, unlike variance covariance estima-tion (Teunissen and Amiri-Simkooei 2008). It is physically traceable, which is expected to increase acceptance and use. In this contribution, we will restrict ourselves to the noise correlations of a TLS. We will use the results of Kermarrec et al. (2021b) and Kermarrec and Hartmann (2021), who proposed a description of the temporal long-range dependence of the range and angles noises from a TLS. Unlike Schmitz et al. (2021), we will model the correlations as being temporal. This assumption will be discussed in Appendix 2, as well as the concept of atmospheric correlations in Appendix 3. The description of the variance of the noise is taken from Wujanz et al. (2017) for a phase TLS (see also Schmitz et al. 2019, and the references within; Kermarrec et al. 2018 for a simplification).
In statistical hypothesis tests, an incorrect stochastic model arises, e.g., when correlations are neglected. This simplification leads to a sensitive, distorted and not true to expectations quantity in its quality as quadratic form (Koch 2000;Jäger et al. 2005, p. 195;Teunissen 2000). Koch (2000) proposed using an additional covariance matrix through pseudo-observations to circumvent the problem of unrealistic stochastic description, in addition to applying a Bayes statistical test procedure. While the parameters estimated do not change with this method, the sensitivity of the test could be controlled by the additional covariance matrix. Unfortunately, Koch (2000) left the way open to determine this covariance matrix. Our proposal, called the variance inflation factor or VIF, aims to fill this gap and propose an intuitive, yet, easy-to-understand way to account indirectly for correlations. Correlations are known to act to decrease the number of available observations. In the literature, this can be found under various terminologies like "effective number of independent data", "effective degree of freedom" or "effective sample size" (Taubenheim 1974;Thiébaux and Zwiers 1984;Kirkup and Frenkel 2006;Holst and Kuhlmann 2016). Alternatively, the term "autocorrelation time", abbreviated as AT, is employed (Thompson 2010). One of the most popular method to evaluate the AT is the initial sequence estimator, as derived in Straatsma et al. (1986). This formulation was used, for example, in Wang (2022) to derive the confidence interval of GNSS-derived site velocities, for GPS positioning (Jansson and Persson 2013), or in meteorology and climatology (Dickinson 1985;Kamenkovich et al. 2011), see also Vallejos and Acosta (2021) for an application to soil contamination for multivariate data. The AT formula links the variance of the mean with the variance itself and depends on the empirical ACF. Another possibility to estimate the AT is to model the time series as an autoregressive process (AR) of a given order p following Wei (2006). All these methods involve the computation of the empirical ACF. Here we introduce the VIF, which is based on the inverse of the covariance matrix of the AR process (Wise 1955), to avoid the estimation of the ACF. We apply the VIF for the congruency test (Mahalanobis 1936;Pelzer 1971;Kargoll 2012;Manly 2005), and we place ourselves within the framework of deformation analysis without lack of generality regarding other likelihood ratio (LR) test statistics, such as an outlier test.
Our main intentions can be summarized as follows: • Point out the impact of correlations on the congruency test in deformation analysis by providing an intuitive understanding of how correlations act, • Propose a strategy to account for correlations based on an inflation factor and compare with more usual strategies to estimate the AT. We introduce the VIF of high order to account for long-range dependence and extend the previous derivation of Kermarrec and Schön (2016), • Show how the results from simulations can be transferred to real cases for testing of deformation.
We make use of simulated point clouds from a plane, and consider a rigid deformation of the object as, for example, a variation of the range between two epochs. The real observations are taken from a bridge under load (Schacht et al. 2017).
In the first part of this contribution, we will briefly describe the models for temporal correlations of the TLS range and angle noises. We will introduce the AT and its different formulations, and derive the VIF. A second section will be devoted to Monte Carlo (MC) simulations for the congruency test for a plane adjustment. The last part of this article presents the results for testing of deformation using a real data set. We conclude this contribution with some recommendations.

Mathematical background
The AT or effective sample size is as a way to account for correlations by reweighting the variance of the sample mean. In this section, we introduce the VIF and propose a model for the correlated measurement noise of a TLS.

The autocorrelation time AT
The autocorrelation time τ measures the convergence rate of the sample mean of a stationary Markov chain with bounded variance and is defined as the value so that n τ where X n is the sample mean of (X 1 , X 2 , ..., X n ) with {X i } n i 1 the n values of a scalar function of the states of a Markov chain with E({X i }) μ and var({X i }) σ 2 , the mean and the variance, respectively. The variance of the sample mean is called σ 2 X σ 2 n. The AT is sometimes introduced as an "effective sample size", which is similar as searching a corrected variance of the mean (Thiébaux and Zwiers 1984;Zieba 2010).
In the following, we present two ways of estimating the AT and introduce an alternative based on the inverse of the VCM of an AR process.

The initial sequence estimator
Following early publications from Bartels (1935) or Straatsma et al. (1986), see also chapter 2.1.5 of Box et al. (2016), the AT τ AT can be expressed as: with ρ k being the ACF of the series at lag k. Equation (2) comes from expressing the variance of the sample mean as the sum of all elements of the covariance matrix and is based on the infinite sum of all elements of the autocorrelation matrix of the process under consideration. Here the main challenge is to approximate empirically the ACF. This can be done, for example, by expressing ρ k asρ k 1 nσ 2 n−k i 1 X i − X n X i+k − X n for small lags, withσ 2 the (estimated) sample variance of {X i } n i 1 and. stating for "estimated" quantities. Here the corresponding τ AT is calledτ AT as it is estimated fromρ k . Unfortunately, the partial sum ofρ k does not have a variance that goes to zero as n grows to infinity. To overcome that challenge, Geyer (1992) proposed to truncate the sum when the sum of adjacent sample ACF values is negative, see also Thiébaux and Zwiers (1984). Alternative forms of (2) are called the initial monotone sequence estimator or the initial convex sequence estimator and are described in Thompson (2010).

The AT and the autoregressive process
A second approach to estimate the AT is to model the time series: the AR process as introduced in Box et al. (2016) is a prominent example of modelling, see Kargoll et al. (2020) and the references inside for applications in geodesy. In that case, a value from a time series is regressed on previous values from that same time series, i.e. we suppose a model of order p such as: X t− p is the lag value of the process at t − p. The values a 1: p a 1 , ..., a p are called the AR coefficients; they can be estimated with the Yule-Walker method (Box et al. 2016, Ch. 2) or by minimizing the standard sum of squared forwardprediction errors. We further callρ 1: p ρ 1 , ...,ρ p the vector of estimated autocorrelations for the AR process.
Estimating the order of the process The order p of the process can be chosen by an information criterion (IC), such as the Akaike or the Bayesian IC, abbreviated as AIC and BIC, respectively (Burnham and Anderson 2002;Montillet and Bos 2020 for applications within a geodetic context). In this contribution, we restrict ourselves to the AIC defined as AIC n ln σ 2 ε + 2p.
We callσ 2 ε the estimated variance of the model residuals. The first term of the AIC in (4) measures the difference between the log-likelihood of the fitted model and the loglikelihood of the model under consideration. The second term penalizes models with a high number of parameters to avoid overfitting. A minimum of the AIC is searched by varying the order p, so that the best model provides both a high goodness of fit and an optimum number of parameters.
The AIC does not penalize excessively models with many parameters, as the BIC would do. This important property of the AIC seems preferable for modelling long-range dependence with the aim not to underestimate the number of coefficients, and, per extension the VIF.

Estimating the AT for AR processes
The approach to estimate the AT for an AR process is related to its power spectrum (Thiébaux and Zwiers 1984). Here we start from the approximation for the sample mean variance σ 2 X defined in (1) and given by σ 2 X ≈ 2π S X X (0) n, with S X X (0) the power spectral density (psd) function of the observed time series at the origin. For an AR process, we follow Wei (2006) and express the moment estimator of σ 2 u aŝ σ 2 u σ 2 1 −â 1ρ1 − · · · −â pρ p withρ 1: p estimated by solving the Yule-Walker system of equations. Because the process under consideration is modelled as being AR, we have S X X (ω) σ 2 u 2π 1 |1−a1e −iω −···−a p e −iω | 2 . Correspondingly, the AT can be estimated here aŝ where . T denotes matrix transpose. The formula (5) is based on empirical values and, thus, is an approximation.

The VIF: starting from the inverse of the VCM
To avoid the computation of empirical ACF, we derive the τ VIF as an alternative form of (5) for the AT of an AR process. This formulation relies on the close expression of the inverse of the VCM of an AR of order p, see Wise (1955) or Shaman (1973), and is not based on the variance of the sample mean directly. As (5), the derivation starts from the spectral density of the process.
To that aim, we define the n × n auxiliary identity matrix U as The AR process (3) can be expressed in a matrix form as It follows from the definition of the autocovariance matrix

The VIF for an AR(1)
To illustrate how −1 is filled, we take the example of an AR process of order 1. In that case, the inverse will have a band Toeplitz structure-except for the first and last row-(see Rao and Toutenburg 1999), i.e.
The AT for an AR(1) process can be estimated from (7) aŝ . This is similar to summing the terms of one line of the inverse VCM from the second one in (7), and having taken the inverse to be homogeneous with a variance, i.e.τ AR(1) −2â 1 +1+â 2 We define the variance inflation factor for the AR(1) process based on the inverse of the VCM as VIF AR(1) σ 2τ VIF/AR(1) . Calling I the identity matrix, the simplified form AR(1)_equi VIF AR(1) I leads to an identical VCM of the least-squares (LS) estimates as the fully populated AR(1) (Luati and Proietti 2009;Kermarrec and Schön 2016). As aforementioned, this simplification can be considered as a rescaling of the variance to account for correlations. Here we follow Thiébaux and Zwiers (1984) and avoid using the term "equivalent number of degrees of freedom" as proposed in Livezey and Chen (1983): the degrees of freedom will not change in case of correlations, only the effective number of observations, or better said the effective sample size has to be adapted.

The VIF for an AR(p)
Analogically to the AR(1) process, we extend the derivation of the VIF AR(1) for an AR model of order p as follows: We illustrate how the VIF AR( p) can be computed by considering the AR(2) process with two coefficients a 1 , a 2 . Using (6), the exact inverse of the VCM is given in its closed form by We define, similarly to the AR(1) process, VIF AR(2) There is a strong analogy between (5) and (7):ρ 1: p is replaced by the estimated AR coefficientsâ 1: p . However, the derivation of (7) is not based on the definition of the sample mean but on the inverse of the VCM. It is not exactly an AT per se: the strength of the VIF AR( p) is that it can be used in an LS adjustment when −1 is involved; it is a simplification of the fully populated VCM of the estimated parameters by a weighted identity matrix. There is no need to solve the Yule-Walker equations to compute the autocorrelation coefficients: an additional approximation is here avoided, but the derivation is still based on the AR assumption.
Hereafter, we will skip the subscript AR( p) and explain how the VIF can be used within the context of stochastic modelling for TLS observations.

The stochastic model of TLS measurements
A TLS acquires a point cloud by emitting laser pulses toward points on the surface scanned (Vosselman and Maas 2010).
The measurements recorded are a slant range and two associated polar angles in the horizontal and vertical planes passing through the centre of the instrument. The range measurement is a measure of time and can be made either by pulse ranging, phase difference or a mixed form combining both methods (Rüeger 1996). In this contribution, we will focus only on the measurements from a phase scanner. This choice does not affect the generality of our results regarding, e.g., the atmospheric correlations and the congruency test; the numerical values of the VIF may have to be adapted accordingly, as developed in Sect. 2.3.3 and Appendix 3.

Temporal correlations and the fGn
Due to the high scanning rate of the instrument, we presume that the correlated noise of the TLS polar observations has a long-range dependence and corresponds to a stationary fractional Gaussian noise (fGn, see, e.g. Molz et al. 1997 for an example). This noise has the particularity that the degree of dependence between the observations will stay strong as the time lag τ increases. The corresponding temporal ACF C H (τ ) is said to be fat-tailed and is given by where H is the so-called Hurst exponent (Mandelbrot and Van Ness 1968). The fGn has a power law spectrum, i.e.
S fGn ( f ) ∝ 1 f β , with f , β being the frequency and the power law of the process, respectively, and H (β + 1) 2. This model is used for various kinds of physical processes, particularly atmospheric ones (Vyushin et al. 2009). Its use is well-founded to model correlations of the range noise, which are expected to depend on the random variation of the refractivity index of the atmosphere, similar to GNSS phase observations (Kermarrec and Schön 2020;Wheelon 2001). In that case, H 5 6, i.e. β 2 3 (refer to Appendix 3 for more details). We mention that the flicker noise corresponds to the limit as H → 1 (β 1) and the white noise (WN) to H 1 2 (β 0). A combination of fGn with different Hurst parameters can approximate various physical processes, i.e. the noise due to atmospheric effects and due to electronic instruments (mostly flicker and WN).

Temporal correlations versus spatial correlations for TLS measurements
In this contribution, we model the correlations of the TLS as being temporal. This section aims to justify our choice.
The link with what is called "spatial" correlations is further described in Appendix 2.

Range correlations
The loosely called "raw" observations of the TLS are time differences. Consecutively, we con-sider the correlations as being temporal by placing ourselves at the level of the sensor. These time differences are used to compute the range and do not carry information about the reflected surface from which the laser light comes from. The fact that the object may have been black, white, rough or smooth is mainly summarized in the intensity, together with deterministic information about the atmosphere (Wujanz et al. 2017). However, the random variations of the atmospheric refractivity index are not accounted for in the intensity. Following Wheelon (2001, Ch. 5 and 6), such variations correlate the measurements. These atmospheric correlations can be modelled by a fGn with H ≈ 0.8 in (8), see Wheelon (2001, Ch. 6, Eq. 6.84). Here we make use of the plane wave assumption. Kermarrec et al. (2021b) show empirically that the correlation structure of range noise for a phase TLS can be set to H ≈ 0.7 in a first approximation. This corresponds to a combination of WN and atmospheric noise. The sum of stochastic processes modelled as one stochastic process leads to a smaller value of H than expected for pure atmospheric correlations. Making a parallel with AR processes, this is similar to saying that an AR( p) to which a WN is added corresponds to an AR(q) with q < p (see Sect. 2.3.3. and Granger and Morris 1976). In this contribution, we will estimate both the ratio of WN and fGn and H in a first calibration step by maximum likelihood. From empirical investigations, we found a ratio of variance less than 0.1 for the flicker noise with respect to the sum of WN and atmospheric noise. Correspondingly, we propose to neglect this noise for the sake of simplification. The reader is referred to Appendix 3 for more details on the maximum likelihood estimation.
Angle correlations Encoders measure the rotational position of the optical mirrors and provide output that corresponds to the displacement by counting the number of rotations of the motor axis. Based on physical considerations, the correlation structure of angle measurements can be explained as a combination of flicker noise for approximately 60% and WN for 40% (Kermarrec and Hartmann 2021). The variance of the cross-correlations between vertical and horizontal angles could be shown to be up to ten times smaller than that for simple correlations (one angle with itself). We allow ourselves to neglect these cross-correlations in a first approximation.

Modelling with an AR process: the VIF
In this contribution, we simplify the fGn to an AR process to compute the VIF for the noise of TLS measurements (range and angles). This simplification is justified in Appendix 1.
We start by fitting a small plane to the object and estimate the parameters of the physical model, by analysing the residuals of the approximation (Sect. 2.3.2 and Appendix 3 for more details). From the ratio WN/atmospheric noise, we To that aim, 1 000 realizations of the estimated fGn are generated with different random vectors. For each run, thê τ VIF is estimated following (10). Here, we set the order of an AR process using the AIC. Finally, the mean value ofτ VIF over the 1000 samples is taken.
This MC-based method permits to start from a physical modelization of the noise and allow to fix the VIF for the whole experiment: there is no need to estimate the VIF each time an approximation of the point cloud is made. We mention that from (9), the VIF will depend on the variance of the process, which we supposed to have been empirically determined by an intensity model (Kermarrec et al. 2018;Wujanz et al. 2017).
With the example of a Hurst exponent of H 5 6 ≈ 0.8, a combination of 70% fGn and 30% WN for the range and 60% flicker noise and 40% WN for the angles, we found in meanτ VIF,r 17 andτ VIF,φ τ VIF,λ 48 for a sample size of 1 000. We note thatτ VIF,range <τ VIF,angle : this is logical as the correlation structure of the angle corresponds to the one of a flicker noise. This noise has a stronger long-range dependence than the atmospheric one.
The VIF can be used to account for correlations each time the inverse of the VCM of some estimated parameters is involved. It replaces the fully populated inverse of the VCM of the measurement noise. In case of small sample size, the VIF may have to be recomputed. Indeed, the order of the AR process has to be restricted to ensure an accurate computation of its coefficients (Granger and Moris 1976). The procedure adopted to estimate the VIF based on a physical model is summarized in a flowchart in Fig. 1.
We point out the possibility to estimate the correlation structure directly from the psd of the residuals of a LS approximation. This method was proposed in Kermarrec et al. (2021a) with an application to Kalman filtering. Here we prefer a calibration step from a plane adjustment to estimate the correlation structure as we assume this latter to be constant for the experiment. For point clouds with large range differences, or for measurements lasting several hours, variations of atmospheric conditions may occur and necessitate to reiterate the calibration procedure. Fortunately, and from our personal experience, no strong differences in the correlation parameters should be expected. Such investigations are let to a next contribution.
We further mention that the AT as proposed in (2) and (5) involves the empirical computation of the autocorrelation coefficients and need to be computed for each approximation.
Here we circumvent this challenge as bothρ 1: p and theâ 1: p may be inaccurate for small samples, resulting in an additional degree of uncertainty forτ AT andτ AT/AR .

The temporal VCM for TLS measurements: angle and range
The information about the temporal correlation structure of the noise of the TLS measurements can be summarized in a symmetric and positive semidefinite matrix called the VCM: • Its diagonal elements correspond to the variances of the observations. Whereas the angle variances are often taken as constant, a popular empirical model for the TLS range variance is based on the point-wise intensity (Wujanz et al. 2017(Wujanz et al. , 2018. Heteroscedasticity is accounted for since a different variance is associated with each point. • The off-diagonal elements of the VCM describe the correlations between the observations and are set up using the proposed model as a combination of fGn and WN. The VCM is, thus, not obligatory Toeplitz.
The VCM for the two polar angles φ , λ and the range r is called ll . It is built by sorting the measurements temporally as follows: with Q ll being the cofactor matrix and σ 2 0 the variance of unit weight. Figure 2 (left) shows a visualization of φ by setting the standard deviation σ φ of the angle noise to 0.005°. Figure 2 (right) is the corresponding ACF, modelled as a mix of flicker noise and WN. It illustrates the strong long-range dependence of the measurements simulated: the correlation decays quickly at the origin but stays at a low level for a long time.
This matrix can be simplified: (i) By neglecting the correlations between the angle noise, so that both φ , λ become diagonal matrices. We call the corresponding VCM ll_noCangle , (ii) By considering the noise as non-correlated. In that case, ll is a diagonal matrix called ll_diag . It is the usual WN assumption made when correlations are said to be "neglected". Heteroscedasticity is still taken into consideration. (iii) By using the VIF for both angles and range, VIF φ σ 2 φτ VIF,φ , VIF λ σ 2 λτ VIF,λ , VIF r σ 2 rτ VIF,r . Herê τ VIF can be replaced byτ AT orτ AT/AR following the derivation presented in Sect. 2. Heteroscedasticity can be accounted for with a point-wise variance. The corresponding diagonal VCM is given by

Simulations
In this section, we will demonstrate the advantages of the VIF computed withτ VIF compared to the VIF with theτ AT andτ AT/AR formulations. To that aim, we will make use of simulated observations from a plane. Planes have a simple geometry and are often used for calibration purpose (Wujanz et al. 2017). We explain how neglecting correlations will affect the results of the LR test and its distribution. Our aim is not to perform a sensitivity analysis of the test statistics as, for example, Neumann and Kutterer (2007). We further point out that B-splines surface fitting could be used in a similar manner ).

Plane fitting
The infinitively extended surface of a plane is defined by a normal vector n T n x n y n z , with n T n 1, and a distance parameter d as follows: n T P i d. P i is an arbitrary Cartesian point lying on the plane (Bronshtein 2007, p. 214f). Since n is a normalized vector, d represents the shortest distance of the plane from the origin of the frame. In order to estimate the plane parameters x T n x n y n z d with LS method, the TLS polar observations are converted into Cartesian coordinates by where n refers to the number of points observed. Following Lösler (2020), numerically stable equation systems are more likely for models using Cartesian coordinates instead of polar observations. An LS estimation allows the derivation of the parameters of the plane given a set of points not lying on a straight line. Here the objective function (v) v T −1 ll_car v is minimized, where v are the zero-mean observational residuals of the adjustment considered as being normally distributed. We call ll_car the transformed VCM ll of the polar measurements of size 3n × 3n after the propagation of uncertainty, i.e. ll_car F ll F T , with F the matrix of the linearized transformation between polar and Cartesian coordinates. ll can be one of the previously introduced VCM in Sect. 2. The parameters of the plane and the observational residuals are obtained by an errors-in-variables model. Regarding a detailed derivation, the reader is referred to the contribution by Neitzel (2010).
The a posteriori variance factor of the unit weight is defined asσ 2 0 (v) n − 3 for a plane adjustment. It can be tested against σ 2 0 using the global test (Teunissen 2000). The reader should refer to Kermarrec and Schön (2016) for an application of the VIF in the global test. In this contribution, we focus on the congruency test, which is based on the Mahalanobis distance.

Data generation
The TLS observations are generated as being from a plane of size 1 m × 1 m. The distance from the vertical plane to the TLS is fixed at 5 m. The simulated point cloud is aligned centrically at the height of the tilting axis of the TLS. The number of observations is 700 per plane, and the time between two measurements is set at 0.08 ms to keep the simulations computationally manageable. Decreasing the simulated scanning rate does not affect the generality of the results due to the long-range dependence of the simulated correlation structure. The value of 0.001 for the ACF for the range noise is reached within one scanning line so that increasing the number of observations will not affect our conclusions. We refer to Appendix 3 for a discussion on "gluing" the residuals of the LS fitting to estimate the noise parameters and the impact of this method on the spectral content of v.
We add a noise vector to each TLS-simulated polar coordinates consisting of one range and two angles. The standard deviations for range and angles are chosen following the manufacturer's datasheet for a Zoller + Fröhlich Imager 5006: 1.5 mm for the range and 0.005°for the two angles. The correlation structure is chosen following the derivation of Sect. 2.
We generate noise vectors r c L T r w corresponding to the fully populated VCM ll using a Cholesky decomposition (Koch 2017). We set r w ∼ N(0, 1), and decompose ll LL T with L being a real lower triangular matrix with positive diagonal entries. The noise vectors are generated within an MC simulation, so that the underlying random vectors are different at each run. We recall that four main cases of VCM will be considered in the LS adjustment of the simulated observations: 1. The correct fully populated VCM ll : we will refer to this case as the "reference" in the following, 2. The fully populated VCM built by neglecting the correlations between angle measurements called ll_noCangle , 3. A diagonal VCM accounting for heteroscedasticity with all correlations disregarded and named ll_diag , 4. A diagonal VCM ll_VIF for which range and angles correlations are accounted for by means of the VIF. We have four subcases to compute ll_VIF : (i) Using (2): based on the AR empirical ACF:τ AT .
(ii) Using (5): based on the AR empirical ACF using an AR noise model:τ AT/AR . (iii) Using (7): based on using the AR coefficients computed individually for each run of the MC (thus, residual-based):τ VIF . (iv) Based on a mean value of (iii) for theτ VIF , as proposed in Sect. 2.3.3.
With these configurations, we can investigate: (i) how a simplification of the stochastic model by neglecting correlations gradually acts on the LR test, as well as (ii) the impact of four different formulations of the VIF.

Congruency test
In order to test for deformation between two epochs, we use an LR test, called a test of the congruence, congruency test, or Mahalanobis distance test (Mahalanobis 1936;Tagushi et al. 2001, see also Caspary andBorutta 1987;Zaminpardaz and Teunissen 2022). Before presenting the results of our investigations, we need to clarify the term "deformation" to avoid misunderstanding.

Note on deformation and hypothesis testing
Deformations are caused by physical changes of an object. The goal in deformation analysis is to detect these changes through measurements. Unfortunately, measurements are always affected by measurement uncertainties. In LS adjustment, these uncertainties as well as correlations are specified in the stochastic model. Whereas the noise corresponds to the dispersion of the measurement, the signal relates to the physical change of the object. Hence, the challenge in deformation analysis is to separate between the noise and the signal. Hypothesis testing is a common statistical tool to decide between noise and signal, but requires a reliable and realistic stochastic model. Disregarding correlations affects the test statistic, and the probability density function of such a test may not belong to known statistical test distributions.
Applying critical values taken from improper distributions leads to wrong decisions, as shown by Lehmann and Lösler (2018). This problem can be solved by improving critical values or by deriving test statistics that follow known statistical test distributions using a proper stochastic model. According to Lehmann (2012), adapting critical values requires a Monte Carlo integration: this leads to additional numerical effort, which is usually undesirable. For that reason, we focus on the test statistic and investigate the VIF as an appropriate approximation of the stochastic model.

We callˆ
x 2 −x 1 the vector corresponding to the differences of the estimated plane parameters between two epochs designed with the subscripts 1 and 2. The uniformly most powerful invariant test for deformation under the null hypothesis is based on the test statistic: with Qˆ ˆ Qx 1 + Qx 2 the dispersion cofactor matrices of the estimated parameters. Q + ˆ is the pseudo-inverse of Qˆ ˆ .
We call p f rank(Qˆ ˆ ), which is three for a plane adjustment. We define the hypotheses for deformation analysis as follows: H 0 : 0 vs H 1 : 0 so that the null hypothesis states that no deformation occurred.
Here we set the significance level of the LR test to α 0.05: Lehmann and Lösler (2018) showed that the loss of power of the test due to nonlinearity is small for that choice. Under some regularity conditions, the test statistic T under H 0 follows a χ 2 p f (chi2) distribution. Here the normality of the estimated parameters is assumed, as well as the existence of derivatives of the likelihood function with respect to the parameters. Another key condition is that the region over which the likelihood function is positive cannot depend on unknown parameter values. It is questionable whether these conditions taken from the Wilks' theorem hold in case of a misspecified stochastic model (Wilks 1938). In order to check the discrepancy with these assumptions, we make use of the Gamma distribution . This 2-parameter distribution (a, b) with a the shape and b the scale corresponds to a χ 2 p f for the a p f 2 and b 2, i.e. p f 2, 2 χ 2 p f . By using this distribution, we can investigate how misspecifications of the stochastic model affect the distribution of T , i.e. if b 2 and/or a p f 2. We make use of a parametric distribution intentionally to ease the comparison with the chi2. We follow hereby the empirical results of Ferencz et al. (2005) regarding the distribution family of similarity distances, see also Burghouts et al. (2008).

The null hypothesis is accepted if
is the critical value of the chi2 distribution.

Simulating rigid deformations
We simulate a true deformation by varying the distance to the scanner from 5.000 m (reference configuration) to 5.003 m. This corresponds to a shift of the plane along the target axis without tilting. Here we consider the simple case of a rigid body motion: only the distance to the centre of the TLS is changed. The test statistic (12) may not be the most optimal in that particular case, as only one parameter is changed. We tested for the distance only for the sake of completeness. Fortunately, the conclusions presented in the next section were not changed.
For each deformation case, we compute the mean of the test statistic T , as defined in (12), over the 1 000 independent runs. In the following, we investigate the extent to which the VIF is a way to face the challenge of unrealistic test statistics, by using physical information about the correlation structure, as developed in Sect. 2.

No deformation
In this section, no deformation was simulated with the aim of validating the statistical properties of the congruency test with regard to the stochastic model. The results are presented in Table 1. We call "reference", the values obtained with the correct stochastic model ll (first column in Table 1). We propose to guide the discussion along two axes: (i) the impact of the stochastic model and (ii) the four different computations of the VIF. We point out that the Gamma distribution was identified as the best or the second-best parametric distribution with IC. Within the MC simulation, we computed quantiles of the distribution numerically with the given significance level: we found values close to the one from the parametric distribution Gamma. The results are not presented for the sake of shortness.

Impact of the stochastic model
• Neglecting the correlations between angle measurements. Table 1 highlights the nearly equivalence between the results given with the true VCM ll used to generate the correlated noise for the range and the two angles, and ll_noCangle , for which only correlations between range measurements were considered as a simplification. The mean of T and the parameters of the Gamma distribution are similar. To further visualize the near equivalence between the two stochastic models, we have plotted the The mean of T , the parameters of the Gamma distribution (shape and scale) as well as the corresponding critical values are given for each case under consideration. We recall that a Gamma distribution with p f 2,2 corresponds to a χ 2 p f Fig. 3 Eigenvalues decomposition of the dispersion matrix for the four stochastic models under consideration (the VIF corresponds to iv) three eigenvalues of Qˆ ˆ , which is the dispersion matrix involved in the computation of T . Figure 3 shows that the lines (blue and red, corresponding to the two stochastic models under consideration) are confounded. We link this finding with the 10 times smaller standard deviation of the angle measurements with respect to the range.
• Neglecting all correlations. When ll_diag is used (Table 1, last column), neither T nor the parameters of the Gamma distribution is corresponding to what we would expect.
T is unrealistic and too high by up to a factor 10 with respect to the reference. The shape of the Gamma distribution is 1, which does not correspond to the correct value (a p f 2 1.5, b 2), which was found with the reference model ll . Thus, there is a strong difference when correlations are neglected and heteroscedasticity is accounted for only: the χ 2 p f is not the correct distribution when correlations exist and an uncorrected diagonal VCM is assumed. We further point out that this example is independent of the fitted object: the stochastic model has a low impact on the parameters estimated by LS as long as the number of generated observations is large. The difference found is linked with the structure of the dispersion matrix. Figure 3 shows that all three eigenvalues of ll_diag are much smaller than for the reference matrix. Correspondingly, a higher value of T is found when the inverse is taken.
• Using the VIF to account for correlations. Even if there are discrepancies between the four methods under consideration to compute the VIF, the mean values of T are closer to the expected one than with ll_diag . The mean of T for (iv) corresponds to the one obtained with the correct VCM and is similar to the reference value compared to (i), (ii) and (iii). Correspondingly, the correct critical value corre-sponding to the χ 2 p f can be used with ll_VIF with a high trustworthiness. The effect of fully populated VCM compared with a diagonal VCM is seen in the parameters of the Gamma distribution: they are not corresponding to the correct p f , i.e. the shape a near 1 is common to all diagonal VCM, including ll_diag , but the critical value is close to the correct one for iv only.
• Impact of the VIF computations. We found that the mean of T computed using ll_VIF with methods (iii) and (iv) is the closest to the reference value. We recall that method (iv) uses the approximation of the VIF. Method (iii) is our proposal for computation the weighted variance for which the VIF is estimated by fitting the residuals with an AR process for each MC run independently.
Our simulations could validate the use of the VIF with our proposal to estimateτ VIF based on physical consideration. This latter gives test statistics and critical values closer to the one obtained with the reference VCM, compared to the proposals (ii) and (i), which make use of the empirical ACF. Correspondingly, it is plausible to use the VIF in the congruency test to replace the correct fully populated VCM of the noise. This conclusion was shown to vary little with the number of generated observations, provided that it was higher than 500.

Deformation
When the distance changes from 5.000 m to 5.003 m, the null hypothesis H 0 is rejected from a deformation of magnitude between 1.0 mm and 1.5 mm for the three models (Fig. 4). We note that this value is slightly higher (1.5 mm) when correlations are considered, w.r.t. the one obtained with ll_diag and ll_VIF . This result is more than acceptable considering the advantages of not having to perform matrix inversion of fully populated VCM.
If we disregard correlations and use the ll_diag with the critical value derived from the chi2 (χ 2 Fig. 4 Test statistics T by varying the distance to the plane (y-axis in log scale). Three VCMs are compared ( ll , ll_diag , ll_VIF ). We plot the critical values corresponding to the confidence level of 95% for the Gamma/chi2 distribution given in Table 1

General conclusions
In the following, we summarize the main findings of the simulations: • The stochastic model can be simplified by neglecting the correlations between angle measurements without affecting the results of the congruency test. • The diagonal model, which neglects entirely correlations, should be avoided in case of correlated observations: the test statistics T is unrealistic. • The VIF provides a good alternative to using the fully populated VCM. Our proposal is not based on the empirical ACF but only on the computation of the AR coefficients.
The results for T and the associated critical value were in good adequacy with the one found using the true model and slightly better compared to using usual proposals for computing the AT. • The shape parameter of the Gamma distribution using the stochastic model of reference was corresponding to half the degree of freedom, as expected. Correspondingly, the critical value of 7.81 can be used with α 0.05 to make the test decision for a plane estimation (3 degrees of freedom). • In case of correlated measurements, a deformation 1.5 times lower than the standard deviation of the range noise can be detected (H 0 cannot be rejected with a significance level of α 0.05).

Real data analysis
In this section, we propose to apply the VIF with our proposal to computeτ VIF , to the testing of deformation using the congruency test with real TLS observations. The data set corresponds to an artificially loaded bridge and provides a framework for testing the deformation of small magnitudes. In the first section, we will describe the data set briefly. We will present results for small patches of the point cloud adjusted as a plane.

Description of the experiment
The TLS data set chosen corresponds to a scanned historic masonry arch bridge located over the river Aller near Verden in Germany. During the experiment under consideration, the bridge was artificially loaded to simulate deformation due to a traffic jam. The increasing load steps are called E00 (reference), E11, E22, E33, E44 and E55 (see Paffenholz and Stenz 2017;Schacht et al. 2017, for a more detailed description of the experiment). Here we aim to investigate the impact of accounting for correlations of the TLS observations on the congruency test. The 3D point cloud acquisition was carried out using a Zoller + Fröhlich Imager 5006H TLS, in periods of a constant load on the bridge. After registration, we selected the same small patches of observations in three point clouds (E00, E11 and E33) using the software CloudCompare. They consist of approximately 700 for the smallest patch 2 and 2500 points (patch 1 and 3) located as shown in Fig. 5 (right). The smooth shape of the bridge combined with the small patches selected allowed us to perform an adjustment to a plane following the methodology developed in Sect. 3. We used the aforementioned correlation models and variances to compute three VCM as described in Sect. 3: ll , ll_VIF (method iv), and ll_diag . We make use of the VIF values proposed in Sect. 2 for both range and angles. We further neglect cross-correlations between angles. The reader is referred to Appendix 3 for a discussion on the correlation structure and the way to estimate it empirically. We use the intensity model proposed by Wujanz et al. (2017Wujanz et al. ( , 2018 and the simplification of Kermarrec et al. (2018) justified by the small and homogeneous surfaces chosen. This corresponds to a standard deviation of approximately 2 mm.
The global test could not reject the null hypothesis regarding the functional model with a confidence level of 95%. Here a compatibility test of the individual epoch is not mandatory as we do not use a posteriori variance. The registration uncertainty σ xyz given by the software Scantra (2.0.1.20, technet GmbH; Berlin) had a maximum of 0.4 mm for the area under consideration. This value represents the positional accuracy of a scan after registration and is computed by means of error propagation. It is considered as a value under which no deformation can be detected. We do not consider this information as being stochastic in the test statistics.

Results
We compute the deformation magnitude as the Euclidian distance between each LS approximation for the epochs of load  under consideration. The results are presented in Table 2. We do not start a discussion if deformation "really" occurred, see Sect. 3.3. The output of the statistical test in (12) depends on the sensor under consideration. Correspondingly, one may conclude of the statistical significance of the deformation for a laser tracker measurement but not for a TLS measurements due to their respective noise property. We recall that the focus of this contribution is to show that neglecting correlations leads to unrealistic test statistics and to present the VIF as a simple alternative. We restrict ourselves to two representative epochs of loading: the first and the third one: from epoch 4, the deformation magnitude is strong enough so that the null hypothesis is always rejected, independently of the stochastic model used.
The main findings are summarized as follows: • Realistic results: The test statistics presented in Table 2 validate the results deduced from the simulations of Sect. 3. The VIF using a modelization with a high-order AR process allows one to account indirectly for correlations with a diagonal VCM. It gives comparable results regarding the one given by a fully populated VCM in most cases. We point out that the test statistics without accounting for correlations are unrealistic. As stated previously, it is let to the practitioner if he wants to be overoptimistic or not. • We found in Sect. 3.5.2 that only deformation magnitude slightly larger than 1.5 times the range noise can be considered as statistically significant with the approximated correlation structure chosen. Here the deformation of 0.99 mm for path 2 (E00-E33) gives a test statistic above the critical value, exemplarily. This highlights the high trustworthiness of the physical model and of the simplification with the VIF, which led to a similar conclusion. If we consider patch 3 located close to the load, the H 0 is clearly rejected for all models. We point out the need to adapt the critical value for a diagonal model ll_diag to obtain a more realistic test decision: the critical value from the χ 2 3 is not reliable when correlations are neglected. The critical value to choose cannot be derived analytically from mathematical consideration, on the contrary to the one obtained with the true (or close to the truth) VCM.
• Effective sample size: The ll_VIF efficiently corrects the effective number of observations available compared to the diagonal VCM ll_diag which does not account for correlations. • The VIF does not have to be computed for each run or simulated noise vector, provided that one has some knowledge of the correlation structure (see Sect. 2 and Appendix 3 for a proposal to estimate the parameters). We further point out that the χ 2 3 distribution can be used with ll_VIF but we insist that it cannot be used with ll_diag .
We notice that the test statistics close to the critical value will always have to be interpreted with care, as for E00-E11, patch 2 using ll_diag .

Conclusion
A realistic knowledge of the stochastic properties of TLS measurements includes both the description of the variance and the correlation structure of the noise. Neglecting correlations can lead to an unnecessary movement of the decision chain as the critical values for the test decision are unrealistic. Unfortunately, correlations are often neglected: they are misunderstood as being synonymous with a high computational burden.
In this contribution, we have addressed this challenge and developed a new procedure to correct the effective sample size for the congruency test when observations are correlated. This factor has a similar effect as inflating the variance to decrease the number of observations available and can be used to simplify the fully populated VCM of the noise. We compared our derivation with more usual formulation of the effective sample size within the context of deformation from a plane. We pointed out that our method allowed one to account for correlations in a simple way using a diagonal VCM. The mean of the test statistics was close to the correct one, and allowed to use the correct critical value of the chi2 distribution with a higher trustworthiness than usual AT.
We based ourselves on a simplification of the correlation structure using an AR process of a high order. This formulation accounts for the long-range dependence of both the range and angle measurements of TLS observations. We provided first indications how to estimate atmospheric correlations and have shown that the VIF could account for them using a diagonal VCM, with a low computational burden. A correction of the effective sample size led to a more realistic and trustworthy test decision compared to VCM accounting for heteroscedasticity only. The inverses of dispersion matrices of parameters are involved in many parametric test statistics, such as the congruency test for deformation detailed in this contribution. In a next step, we will extend the use of the VIF to further testing procedures, such as outlier detection, and ambiguity resolution for GNSS observations. the results of Appendix 3. The spectral content is compared with the one of an AR(1) for the sake of simplicity. Clearly, there is a slight discrepancy at high frequencies between the two spectra. Indeed, a fGn can only be approximated as a sum of k AR processes of first order (Sørbye et al. 2019). This sum is itself modellable as an autoregressive process with moving average (ARMA) of order (k, k − 1) for the AR and MA components, respectively. From Granger and Morris (1976), we know that an ARMA of order ( p, p) corresponds to an AR( p) process plus WN. The addition of a WN component can (but may not) lead to a change of the order of the corresponding AR process. Correspondingly, it is plausible that a noise model expressed as a combination of fGn plus WN as proposed in Sect. 2.3.2. can be approximated by an AR model of a given order, as Fig. 6 illustrates. Following Luo et al. (2012), this order may be large as case of long-range correlations, as for GPS phase observations (Kermarrec and Schön 2014). We note that a fractional process, such as the fractional autoregressive integrated moving average, can asymptotically approximate a fGn (Kim and Kim 2015). Unfortunately, the inverse of the VCM of the process cannot be expressed in a simple way as for an AR process. We, thus, prefer the simple and effective AR process to approximate a fGn.

Spatial versus temporal correlations
In this appendix, we provide some insights on the link between spatial and temporal correlations. In Sect. 2, we have justified the temporal assumption for the correlations for TLS observations due to the propagation of the laser signal through the atmosphere considered as a random medium.
Here we wish to explain the link with spatial correlations, following a modelization proposed by Schmitz et al. (2021). To that aim, we propose to switch from a temporal to a spa-tial perspective by sorting the residuals "spatially", i.e. with respect to the distance to a reference point. For the sake of simplicity, the reference is taken to be the first measurement: on a scanned plane this corresponds here to the first "point" on the left bottom corner, exemplarily but this choice does not affect our conclusion. Under spatially sorted, we follow the principle of the variogram in Kriging (Stein 1999). Here we won't have a frequency power spectrum but a wavenumber spectrum (Wheelon 2001, Ch. 1).
We start from a simulated plane, as in Sect. 3. We add a temporal correlated noise corresponding to a fGn with slope −2/3 (H 5 6 equivalently) and investigate the spectral content of the residuals. In Fig. 7, we plot the spectral decomposition of the temporally sorted noise using the Lomb periodogram (green line). The use of this periodogram is justified by the fact that the spatially sorted observations may not be exactly equidistant. We further sort the residuals spatially, i.e. per increasing distance. The residuals are still correlated, as shown by the slope of the spatial psd (Fig. 7, blue line). However, the correlation structure is clearly different as the temporal one (green line). We have superimposed the slope of the underlying noise to highlight the differences. In the spatially sorted residuals, we lost the expected -2/3 slope, which is still clearly visible in Fig. 7 (green line) for the temporal counterpart: when investigating spatially sorted residuals, the fine correlation structure linked with atmospheric turbulence disappeared to let place to a combination of WN and flicker noise, most probably. The reason comes from the sorting of the residuals; the "remixing" switches the power spectral content of the noise and makes the determination of the correlation structure coming from physical considerations more challenging. We, thus, prefer to estimate the correlation parameters in the temporal domain, which allows a much closer investigation with respect to atmospheric or electronic correlating effects. This does not mean that the spatially sorted residuals are improper to analyse correlations. For some scanners without time stamp, this may be even the Fig. 7 Left: power spectral density of the noise (yellow) and the residuals (temporal in green and spatial in blue). Right: the sorted line-wise residuals "glued" as a time series only possibility. The method as presented in Appendix 3 can still be used to estimate the correlation parameters.

Atmospheric correlations
In this appendix, we wish to highlight that the spectral decomposition of the temporally sorted residuals from patch 1 is corresponding to the structure expected from physical considerations. The other patches gave similar pattern but contain a small number of observations for an accurate determination of the slope of the psd via maximum likelihood. We conjecture that the correlation structure of the residuals corresponds to an atmospheric noise with a psd slope of -2/3, a white and, eventually, a flicker noise.
We analyse the amount of the different noises using the Hector software (Bos et al. 2013), which allows to fix a given model and estimate its parameters by maximum likelihood.
We did not include a deterministic functional model as we deal with the residuals of an LS approximation; they should not contain any drift or sinusoidal pattern.
In Fig. 8 (left, top), we plot the spectrum corresponding to the atmospheric noise, following the model proposed in Wheelon (2001). The simulated observations are sorted temporally. We assume that a time stamp for each measurement is given and the observations are equidistant. The lomb periodogram may be a wise choice in case of non-equidistance (VanderPlas 2018), as mentioned in Appendix 2. We adopt the principle of "gluing" the residuals of adjacent scanning lines. This leads to a time series which elements are not exactly temporally sorted, i.e. there is a time-jump between each line (Fig. 8, left, bottom). Fortunately, the correlation structure of the residuals is our focus and will not be affected by this procedure because of the stationarity of the residuals. This latter is given since we approximate observations that lie on a plane with a plane adjustment from a Gauss-Helmert model. Fig. 8 Left (top): power spectral density from a simulated atmospheric noise, planar wave approximation. Different ratios of WN were added exemplarily. Right: psd of the glued temporally sorted residuals from the approximation of patch 1 (blue) and residuals corresponding to one scanning line (green). Bottom: the temporally sorted residuals glued together This property can be seen in the slope of the psd, which is linked with the expected Hurst exponent H 5 6. In Fig. 8 (right, top), we plotted both the spectral decomposition of the temporal residuals corresponding to a scanning line and the whole residuals sorted temporally. Gluing the residuals may add low frequencies only in case an inadequate functional model is used, which may introduce periodic components/drift in the line-wise residuals. These latter won't bias the estimation of the correlation parameters, which is a slope and not specific frequencies.
We estimate the slope of the psd as well as the ratio of WN and power law noise by maximum likelihood as proposed in Montillet and Bos (2020), see also Bos et al. (2013). We found an estimate of the slope of -0.65, close to the -0.66 expected from the turbulence theory (H 5 6). The ratio of WN to atmospheric noise was 22/78. We could not find evidence for an additional flicker noise.
In Fig. 8 (right, top), we highlight that the structure of the temporal range correlations validates, indeed, the assumption of atmospheric turbulence by means of a real case example. We point out that specific investigation will be the aim of a dedicated contribution. The link with spatial correlations is developed in Appendix 2 for the sake of completeness.