
1 Introduction

Concepts like stationarity and homogeneity (invariance with respect to transformations) play a central role in the treatment of stochastic processes. According to Brockwell and Davis (1991, Def. 1.3.1), stationary processes are defined on the one hand by an unchanged (marginal) distribution function; according to Brockwell and Davis (1991, Def. 1.3.2), stationarity can also be defined as covariance stationarity. This means that the covariance function is invariant with respect to a shift of the time axis (see also Moritz (1980, Sec. 12)). In practice, many phenomena modelled by stochastic processes do not satisfy this requirement and exhibit a time- or location-varying character.

The term time-variable is often used differently. Following Priestley (1989, Sec. 6.1), time-variable processes can be subdivided into models with a deterministic trend (e.g. polynomial or seasonal) or "explosive" AR models, where the roots of the characteristic polynomial lie not only inside but also outside the unit circle. Here we want to study another type of non-stationary process, where the coefficients of a discrete AR process \({\mathcal {S}}_t,~t\,\in \,\mathbb {Z}^+\) are variable in time.

However, the motion of the coefficients must be constrained to ensure a finite variance of the resulting process, \(E\left \{({\mathcal {S}}_t-E\{{\mathcal {S}}_t\})^2\right \} < \infty \). Korte et al. (2022) restrict the variability of the time-varying coefficients of an AR process by requiring that the roots of the characteristic polynomials remain within the unit circle. Since only linear motions of the roots (poles) are allowed, this requirement can be guaranteed also for higher-order processes. For general (infinite) time-variable AR processes of first order, TVAR(1), the convergence of the product sequence of the time-variable AR coefficients is a sufficient condition to guarantee finite variances/covariances (Knopp 1922, Sec. VII). For finite AR processes these restrictions simplify accordingly.

The article is organized as follows. In Sect. 2, we examine the time-variable AR process of first order and put special focus on the inhomogeneity of the first and second central moments of the density function (expectation and covariances). In Sect. 3 we use this time-variable AR process to construct a time-variable spatio-temporal covariance model for a DInSAR-SBAS point stack of surface displacements from ERS1 and ERS2 data from the Lower-Rhine Embayment in North Rhine-Westphalia. A summary and an outlook conclude the work.

2 Time-Variable Autoregressive Process of First Order (TVAR(1))

The focus of this section is to derive the first and second central moments of a time-variable autoregressive process of first order, TVAR(1), which is defined by

$$\displaystyle \begin{aligned} {\mathcal{S}}_t := \alpha_t\; {\mathcal{S}}_{t-1} + {\mathcal{E}}_t, \qquad t\,\in\,\mathbb{Z} {} \end{aligned} $$
(1)

where \(\big \{\alpha _t\big \}_{\Delta t} \in \mathbb {R}\) form a sequence of time-variable coefficients under the condition that the product sequence \(\lim \limits _{t\rightarrow \infty } \prod \limits _{j=1}^t \alpha _j^2\) converges. \(\big \{{\mathcal {E}}_t\big \}_{\Delta t}\) represents an independent and identically distributed (i.i.d.) sequence of random variables with expectation \({E \left \{ {\mathcal {E}}_t \right \}}=0\) and constant variance \({\Sigma \left \{ {\mathcal {E}}_t \right \}}=\sigma _e^2\). \(\Delta t\) denotes the sampling rate.

Process Definition and Moving Average Representation of a TVAR(1) Process

To find an equivalent representation of the TVAR(1) process by a moving average process, we recursively substitute the past signals

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathcal{S}}_t & = &\displaystyle \alpha_t\; {\mathcal{S}}_{t-1} + {\mathcal{E}}_t \end{array} \end{aligned} $$
(2)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & = &\displaystyle \alpha_t \big(\alpha_{t-1} {\mathcal{S}}_{t-2} + {\mathcal{E}}_{t-1}\big) + {\mathcal{E}}_t \end{array} \end{aligned} $$
(3)

and obtain the general representation

$$\displaystyle \begin{aligned} {\mathcal{S}}_t = \alpha_t \big(\alpha_{t-1}\big(\ldots \alpha_{2}\big(\alpha_{1}\;{\mathcal{S}}_0 + {\mathcal{E}}_1\big)\ldots\big)+ {\mathcal{E}}_{t-1}\big) + {\mathcal{E}}_t \end{aligned} $$
(4)

where \({\mathcal {S}}_0\) denotes the signal at the initial point \(t\!=\!0\). In compact form this reads

$$\displaystyle \begin{aligned} {\mathcal{S}}_t = \prod_{j=1}^t \alpha_j\; {\mathcal{S}}_0 + \sum_{k=1}^t \prod_{j=k+1}^t \alpha_j\; {\mathcal{E}}_k {} \end{aligned} $$
(5)

(notice: \(\prod \limits _{j=t+1}^t\!\!\alpha _j\!:=\!1\)). With this moving average representation of a TVAR(1) process it is now straightforward to compute the expectation and the covariances of the process.
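To make the equivalence concrete, the following minimal sketch simulates a TVAR(1) trajectory with the recursion (1) and checks it against the moving average representation (5); the coefficient sequence, the noise level and the random seed are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)
t_max, sigma_e = 50, 1.0
alpha = 0.7 + 0.1 * np.sin(0.2 * np.arange(1, t_max + 1))  # alpha_1 .. alpha_{t_max}

s0 = rng.normal()                        # initial signal S_0
eps = sigma_e * rng.normal(size=t_max)   # innovations E_1 .. E_{t_max}

# recursion (1): S_t = alpha_t * S_{t-1} + E_t
s_rec, s_prev = np.empty(t_max), s0
for t in range(t_max):
    s_prev = alpha[t] * s_prev + eps[t]
    s_rec[t] = s_prev

# moving average form (5): S_t = prod_{j=1}^t alpha_j * S_0 + sum_k prod_{j=k+1}^t alpha_j * E_k
s_ma = np.empty(t_max)
for t in range(1, t_max + 1):
    w = np.append(np.cumprod(alpha[t - 1:0:-1])[::-1], 1.0)  # weights prod_{j=k+1}^t alpha_j
    s_ma[t - 1] = np.prod(alpha[:t]) * s0 + w @ eps[:t]

assert np.allclose(s_rec, s_ma)
```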

Expectation of a TVAR(1) Process

The expectation

$$\displaystyle \begin{aligned} \begin{array}{rcl} & {}{}{}&\displaystyle {{E \left\{ {\mathcal{S}}_t \right\}} = {E \{ \prod_{j=1}^t \alpha_j\; {\mathcal{S}}_0 + \sum_{k=1}^t \prod_{j=k+1}^t \alpha_j\; {\mathcal{E}}_k \}}} \end{array} \end{aligned} $$
(6)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & = &\displaystyle \prod_{j=1}^t \alpha_j\; {E \left\{ {\mathcal{S}}_0 \right\}} + \sum_{k=1}^t \prod_{j=k+1}^t \alpha_j\; {E \left\{ {\mathcal{E}}_k \right\}} {} \end{array} \end{aligned} $$
(7)

depends on the expectation of the initial state \({\mathcal {S}}_0\) and the stochastic behavior of the noise \({\mathcal {E}}_t\), for which by the definition (1) of the AR process \({E \left \{ {\mathcal {E}}_j \right \}}\!=\!0\) holds for \(j\!=\!1,\ldots ,t\). The expectation of the initial state \({\mathcal {S}}_0\) is unknown, but we can deduce from (7) the conditional expectation of \({\mathcal {S}}_t\) given a known initial condition \({E \left \{ {\mathcal {S}}_0 \right \}}\!=\!s_0\)

$$\displaystyle \begin{aligned} {E \left\{ {\mathcal{S}}_t\arrowvert s_0 \right\}} = \prod_{j=1}^t \alpha_j\; s_0 \ . {} \end{aligned} $$
(8)

In the following we restrict this general formulation of TVAR(1) processes by assuming that \({\mathcal {S}}_0\) has the same stochastic properties as a long convergent AR(1) process with constant coefficient \(\arrowvert \alpha \arrowvert \!<\!1\),

$$\displaystyle \begin{aligned} {\mathcal{S}}_i=\alpha\ {\mathcal{S}}_{i-1} +{\mathcal{E}}_i, \quad \text{for} \quad i=\ldots,-2,-1,0 \ . {} \end{aligned} $$
(9)

Taking the properties of the i.i.d. sequence of the random variables \({\mathcal {E}}_t\) into account, we can state that \({\mathcal {S}}_{i-1}\) and \({\mathcal {E}}_i\) are uncorrelated and, due to the convergence behaviour \(\lim \limits _{t\rightarrow \infty } \alpha ^t\,{=}\,0\), the expectation and variance of \({\mathcal {S}}_0\) are asymptotically independent of the initial state of this process and given by

$$\displaystyle \begin{aligned} {E \left\{ {\mathcal{S}}_0 \right\}}\!=\!0 \qquad \text{and} \qquad \sigma_{{\mathcal{S}}_0}^2 = \frac 1{1-\alpha^2}\;\sigma_e^2 {} \end{aligned} $$
(10)

cf. e.g. Box and Jenkins (1970, pp. 57–58). Inserting the expectation \({E \left \{ {\mathcal {S}}_0 \right \}}\!=\!0\) into (7) or (8) immediately results in

$$\displaystyle \begin{aligned} {E \left\{ {\mathcal{S}}_t \right\}} = 0 {} \end{aligned} $$
(11)

for the TVAR(1) process under the assumption (9) for \({\mathcal {S}}_0\). This choice of the initial state has, of course, an influence on the further derivation of the variances and covariances. In contrast to Wegman (1974), where the second moments are defined as conditional moments, we integrate the stochastic properties (10) of the initial state \({\mathcal {S}}_0\) of a long convergent AR(1) process.

Variance/Covariance of a TVAR(1) Process

The covariance as joint second central moment is defined by

$$\displaystyle \begin{aligned} \hspace{-0mm}{\Sigma \left\{ {\mathcal{S}}_t, {\mathcal{S}}_{t+h} \right\}} \!=\! {E \left\{ ({\mathcal{S}}_t\!-\!{E \left\{ {{\mathcal{S}}_t} \right\}})({\mathcal{S}}_{t+h}\!-\!{E \left\{ {{\mathcal{S}}_{t+h}} \right\}}) \right\}} , \end{aligned} $$
(12)

where t and \(t+h\) denote the time points. Substituting the moving average representation (5) and noting that the expectations vanish due to (11), we obtain

$$\displaystyle \begin{aligned} \begin{array}{rcl} & {}{}{}&\displaystyle { \hspace{-8mm}{\Sigma \left\{ {\mathcal{S}}_t, {\mathcal{S}}_{t+h} \right\}} = E\left\{ \big( \prod_{j=1}^t \alpha_j\; {\mathcal{S}}_0 + \sum_{k=1}^t \prod_{j=k+1}^t \alpha_j\; {\mathcal{E}}_k \big)\right. }\\ & &\displaystyle \left.\big( \prod_{m=1}^{t+h} \alpha_m\; {\mathcal{S}}_0 + \sum_{\ell=1}^{t+h} \prod_{m=\ell+1}^{t+h} \alpha_m\; {\mathcal{E}}_\ell \big) \right\}. \end{array} \end{aligned} $$
(13)

A reordering with respect to the expectation and the products provides

$$\displaystyle \begin{aligned} \begin{array}{rcl} & {}{}{}&\displaystyle { {\Sigma \left\{ {\mathcal{S}}_t, {\mathcal{S}}_{t+h} \right\}} = \prod_{j=1}^t \alpha_j \prod_{m=1}^{t+h}\alpha_m \;{E \left\{ {\mathcal{S}}_0{\mathcal{S}}_0 \right\}} + }\\ & &\displaystyle + \prod_{j=1}^t \alpha_j \sum_{\ell=1}^{t+h} \prod_{m=\ell+1}^{t+h} \alpha_m\; {E \left\{ {\mathcal{S}}_0\; {\mathcal{E}}_\ell \right\}}\ +\\ & &\displaystyle + \prod_{m=1}^{t+h} \alpha_m \sum_{k=1}^t \prod_{j=k+1}^t \alpha_j\; {E \left\{ {\mathcal{E}}_k\;{\mathcal{S}}_0 \right\}} +\\ & &\displaystyle + \sum_{k=1}^t \prod_{j=k+1}^t\!\!\alpha_j \sum_{\ell=1}^{t+h} \prod_{m=\ell+1}^{t+h}\!\!\alpha_m\;{E \left\{ {\mathcal{E}}_k\;{\mathcal{E}}_\ell \right\}} . {} \end{array} \end{aligned} $$
(14)

Taking into account the properties of the i.i.d. sequence of the random variables \({\mathcal {E}}_t\), we can state

$$\displaystyle \begin{aligned} \begin{array}{rcl} {E \{ {\mathcal{E}}_k\;{\mathcal{E}}_\ell \}} & =&\displaystyle \sigma_{{\mathcal{E}}}^2\, \delta_{k\ell} , \\ \quad {E \{ {\mathcal{S}}_0\;{\mathcal{E}}_\ell \}} & =&\displaystyle 0 \quad \text{for}\quad \ell>0, \quad \text{and} \\ {E \{ {\mathcal{S}}_0\;{\mathcal{S}}_0 \}} & =&\displaystyle \sigma_{{\mathcal{S}}_0}^2, \end{array} \end{aligned} $$
(15)

where \(\delta _{k\ell }\) denotes the Kronecker-Delta.

From (14) one obtains

$$\displaystyle \begin{aligned} \begin{array}{rcl} & {}{}{}&\displaystyle { {\Sigma \left\{ {\mathcal{S}}_t, {\mathcal{S}}_{t+h} \right\}} = \prod_{n=t+1}^{t+h} \alpha_n \prod_{j=1}^t \alpha_j^2 \;\sigma_{{\mathcal{S}}_0}^2 +}\\ & &\displaystyle \sum_{k=1}^t \prod_{j=k+1}^t\!\! \alpha_j^2 \prod_{m=t+1}^{t+h}\!\! \alpha_m \; \sigma_{{\mathcal{E}}}^2 \quad \text{for} \ h > 0 \ . \end{array} \end{aligned} $$
(16)

This can be reformulated to

$$\displaystyle \begin{aligned} \begin{array}{rcl} & {}{}{}&\displaystyle {{\Sigma \left\{ {\mathcal{S}}_t, {\mathcal{S}}_{t+h} \right\}} =}\\ & &\displaystyle \prod_{n=t+1}^{t+h}\!\!\alpha_n \left( \prod_{j=1}^t \alpha_j^2 \;\sigma_{{\mathcal{S}}_0}^2 + \sum_{k=1}^t \prod_{j=k+1}^t\!\!\alpha_j^2 \; \sigma_{{\mathcal{E}}}^2 \right), \ h > 0\\ {} \end{array} \end{aligned} $$
(17)

which gives the covariance sequence of a TVAR(1) process for the times t and \(t\!+\!h\), for a positive lag h. Note that because of the symmetry properties of covariances \({\Sigma \left \{ {\mathcal {S}}_t, {\mathcal {S}}_{t+h} \right \}}\!=\!{\Sigma \left \{ {\mathcal {S}}_{t+h}, {\mathcal {S}}_t \right \}}\) holds.

The covariance sequence (17) can be split into two parts: the first part involves only the future events, connected with the index n, while the second part involves only the events up to and including time t, linked to the indices j and k. The expression in the brackets defines the variance (i.e. lag 0) at time t

$$\displaystyle \begin{aligned} \gamma_t(0) := {\Sigma \left\{ {\mathcal{S}}_t, {\mathcal{S}}_{t} \right\}} = \prod_{j=1}^t \alpha_j^2 \;\sigma_{{\mathcal{S}}_0}^2 + \sum_{k=1}^t \prod_{j=k+1}^t \!\!\alpha_j^2 \; \sigma_{{\mathcal{E}}}^2 \ . {} \end{aligned} $$
(18)

The variance \(\gamma _t(0)\) is influenced by two quantities: the variance of the initial value \({\mathcal {S}}_0\) characterizes the warm-up behavior of the process, while the constant noise variance \(\sigma _{{\mathcal {E}}}^2\), in connection with the time-variable coefficients \(\alpha _j\), establishes the further variance behaviour of the process. It is thus clear that the variance of the TVAR(1) process is time-variant (see also Fig. 1).

Fig. 1

Variance sequence of a TVAR(1) process. The time-variable coefficients follow the third degree polynomial \(\alpha _t = 0.7 + 0.016 t -0.00035 t^2 +0.000002 t^3\) (a). The variances (b) are computed in three different ways: (1) by (18), (2) by the filter approach, and (3) by the filter approach supplemented by the influence of the variance of the initial state \({\mathcal {S}}_0\). The figure shows that all three approaches deliver the same result when the warm-up phase of the filter approach, which uses no prior information on the statistics of the initial signal \({\mathcal {S}}_0\), is taken into account

Often it is convenient to use a simple recursion formula instead of (18). If we rewrite (18) with \(t\!-\!1\) as the upper bound instead of t we get

$$\displaystyle \begin{aligned} \begin{array}{rcl} & {}{}{}&\displaystyle { \gamma_t(0) := \alpha_t^2 \prod_{j=1}^{t-1} \alpha_j^2 \;\sigma_{{\mathcal{S}}_0}^2 +}\\ & &\displaystyle \prod_{j=t+1}^t \alpha_j^2 \sigma_{{\mathcal{E}}}^2 + \sum_{k=1}^{t-1} \alpha_t^2 \prod_{j=k+1}^{t-1} \alpha_j^2 \; \sigma_{{\mathcal{E}}}^2 \end{array} \end{aligned} $$
(19)

which can be rewritten to

$$\displaystyle \begin{aligned} \gamma_t(0) := \alpha_t^2 \left(\prod_{j=1}^{t-1} \alpha_j^2 \;\sigma_{{\mathcal{S}}_0}^2 + \sum_{k=1}^{t-1} \prod_{j=k+1}^{t-1} \alpha_j^2 \; \sigma_{{\mathcal{E}}}^2 \right) + \sigma_{{\mathcal{E}}}^2\ . \end{aligned} $$
(20)

Here the term in the brackets represents \(\gamma _{t-1}(0)\). Therefore we end up with the simple recursion equation

$$\displaystyle \begin{aligned} \gamma_t(0) := \alpha_t^2 \gamma_{t-1}(0) + \sigma_{{\mathcal{E}}}^2 \ {} \end{aligned} $$
(21)

for the variances at time t, with the initial state \(\gamma _{0}(0) = \sigma _{{\mathcal {S}}_0}^2\). The covariances with respect to a time lag h follows from (17) and (18)

$$\displaystyle \begin{aligned} {\Sigma \left\{ {\mathcal{S}}_t, {\mathcal{S}}_{t+h} \right\}} = \prod_{n=t+1}^{t+h}\! \alpha_n \ \gamma_t(0)\ , \qquad h > 0 \ . {} \end{aligned} $$
(22)

The covariance matrix \({\boldsymbol {{\Sigma }}}_{{\mbox{ {\hspace{-0.1em}{$\boldsymbol {\mathcal {S}}$}}}}}\) of the TVAR(1) process of finite length \(t_{max}\) can now be computed by arranging the covariances \({\Sigma \left \{ {\mathcal {S}}_t, {\mathcal {S}}_{t+h} \right \}}\) for \(t\,{=}\,0,\ldots ,t_{max}\) and \(h\,{=}\,0,\ldots ,t_{max}-t\) into the upper triangle of a matrix. The lower triangular part follows by symmetry. Figure 2 gives an example for the covariance matrix of a TVAR(1) process; a possible implementation is sketched below.
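In the following sketch the lag-0 variances follow the recursion (21) and the off-diagonal entries follow (22); the coefficient polynomial is the one from Fig. 1, while \(\sigma _{{\mathcal {E}}}^2\) and the AR(1) value for \(\sigma _{{\mathcal {S}}_0}^2\) from (10) are illustrative choices.

```python
import numpy as np

def tvar1_covariance(alpha, sigma_e2, sigma_s02):
    """Covariance matrix of S_0 .. S_t for alpha = [alpha_1, ..., alpha_t]."""
    t_max = len(alpha)
    gamma0 = np.empty(t_max + 1)
    gamma0[0] = sigma_s02                  # gamma_0(0) = sigma_S0^2
    for t in range(1, t_max + 1):
        gamma0[t] = alpha[t - 1]**2 * gamma0[t - 1] + sigma_e2  # recursion (21)

    cov = np.diag(gamma0)
    for t in range(t_max + 1):
        prod = 1.0
        for h in range(1, t_max + 1 - t):
            prod *= alpha[t + h - 1]       # running product prod_{n=t+1}^{t+h} alpha_n
            cov[t, t + h] = cov[t + h, t] = prod * gamma0[t]    # eq. (22)
    return cov

tt = np.arange(1, 101)
alpha = 0.7 + 0.016*tt - 0.00035*tt**2 + 0.000002*tt**3  # polynomial of Fig. 1
sigma_e2 = 1.0
sigma_s02 = sigma_e2 / (1 - 0.7**2)        # eq. (10) with alpha = 0.7
Sigma = tvar1_covariance(alpha, sigma_e2, sigma_s02)
```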

Fig. 2

Covariance matrix \({ \boldsymbol {{\Sigma }}}_{{ \mbox{ {\hspace{-0.1em}{$ \boldsymbol {\mathcal {S}}$}}}}}\) of a TVAR(1) process. The time-variable coefficients follow the third degree polynomial \(\alpha _t = 0.7 + 0.016 t -0.00035 t^2 +0.000002 t^3\). The computation of the matrix can be performed by (17) or by the recursion (21) in connection with (22)

Filter Representation and Covariance Matrix

It should be mentioned that the same covariance matrix \({\boldsymbol {{\Sigma }}}_{{\mbox{ {\hspace{-0.1em}{$\boldsymbol {\mathcal {S}}$}}}}}\) can be derived from the filter approach (cf. Schuh and Brockmann (2020)). The covariance matrix \({\boldsymbol {{\Sigma }}}_{{\mbox{ {\hspace{-0.1em}{$\boldsymbol {\mathcal {S}}$}}}}}\) consists of the filter part \({\boldsymbol {{\Sigma }}}_{{\mbox{ {\hspace{-0.1em}{$\boldsymbol {\mathcal {S}}$}}}}}^F\) and the warm-up part \({\boldsymbol {{\Sigma }}}_{{\mbox{ {\hspace{-0.1em}{$\boldsymbol {\mathcal {S}}$}}}}}^W\)

$$\displaystyle \begin{aligned} {\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S}}$}}}}}={\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S}}$}}}}}^F\!+\!{\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S}}$}}}}}^W\ . {} \end{aligned} $$
(23)

The filter matrix H is defined by

$$\displaystyle \begin{aligned} H = \left[\begin{array}{cccccc} 1 & \\ -\alpha_1 & 1\\ & -\alpha_2 & 1\\ & & \ddots & \ddots\\ & & & -\alpha_t & 1\\ \end{array}\right] \end{aligned} $$
(24)

and the covariance matrix \({\boldsymbol {{\Sigma }}}_{{\mbox{ {\hspace{-0.1em}{$\boldsymbol {\mathcal {S}}$}}}}}^F\) for the filter part follows from

$$\displaystyle \begin{aligned} {\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S}}$}}}}}^F = H^{-1} \big(H^{-1}\big)^T \ . \end{aligned} $$
(25)

(cf. Schuh and Brockmann (2020, Sec. 5)). Because of the warm-up phase of the filter approach this covariance matrix must be modified with respect to the influence of \({\mathcal {S}}_0\) by the matrix \({\boldsymbol {{\Sigma }}}_{{\mbox{ {\hspace{-0.1em}{$\boldsymbol {\mathcal {S}}$}}}}}^W\), whose elements are computed according to (17) by

$$\displaystyle \begin{aligned} \begin{array}{rcl} \sigma_{{{\mathcal{S}}_0^i}^2}^W &=& \prod_{n=1}^i \alpha_n^2\; \sigma_{{\mathcal{S}}_0}^2 ,\quad i=1,\ldots,t \end{array} \end{aligned} $$
(26)
$$\displaystyle \begin{aligned} \begin{array}{rcl} \sigma_{{\mathcal{S}}_0^{ij}}^W &=& \sigma_{{\mathcal{S}}_0^{ji}}^W =\!\! \prod_{n=i+1}^j\!\! \alpha_n\ \sigma_{{{\mathcal{S}}_0^i}^2}^W , \ \begin{array}{l}i=1,\ldots,t\\j=i,\ldots,t\ .\end{array} \end{array} \end{aligned} $$
(27)

The complete covariance matrix \({\boldsymbol {{\Sigma }}}_{{\mbox{ {\hspace{-0.1em}{$\boldsymbol {\mathcal {S}}$}}}}}\) of a TVAR(1) process results from the sum of the two parts according to (23) and is identical to the covariance matrix computed by (17), or by the recursion (21) in connection with (22) (see also Fig. 1).
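The identity (23) can be verified numerically. The sketch below (reusing alpha, sigma_e2, sigma_s02 and Sigma from the previous listing) follows one consistent reading: the filter part carries only the innovations \({\mathcal {E}}_1,\ldots ,{\mathcal {E}}_t\), i.e. (25) with unit innovation variance and no prior information on \({\mathcal {S}}_0\), while the warm-up part propagates \(\sigma _{{\mathcal {S}}_0}^2\) according to (26) and (27).

```python
import numpy as np

n = len(alpha) + 1                             # states S_0 .. S_t
H = np.eye(n)
H[np.arange(1, n), np.arange(n - 1)] = -alpha  # filter matrix (24)
Hinv = np.linalg.inv(H)

d = np.full(n, sigma_e2)
d[0] = 0.0                                     # no prior information on S_0
Sigma_F = Hinv @ np.diag(d) @ Hinv.T           # filter part, cf. (25)

cum = np.cumprod(np.r_[1.0, alpha])            # prod_{n=1}^i alpha_n for i = 0 .. t
Sigma_W = sigma_s02 * np.outer(cum, cum)       # warm-up part, eqs. (26)-(27)

assert np.allclose(Sigma_F + Sigma_W, Sigma)   # identity (23)
```

Note that \(\Sigma ^W_{ij} = \prod _{n=1}^i \alpha _n \prod _{n=1}^j \alpha _n\, \sigma _{{\mathcal {S}}_0}^2\) is exactly the rank-one outer product formed above, which reproduces (26) on the diagonal and (27) off the diagonal.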

3 Time-Variable Collocation of a DInSAR Point Stack

We apply these inhomogeneous covariances to model the temporal component of a spatio-temporal point stack derived from a DInSAR-SBAS analysis. The test region is the Lower-Rhine Embayment in North Rhine-Westphalia, Germany, with the still active open-cast mines Garzweiler, Hambach and Inden and the already closed coal mines Sophia-Jacoba in the mining region Erkelenz and Emil Mayrisch in the mining region Aachen. The Remote Sensing Software Graz (RSG) is used to analyze the data from the ERS1 and ERS2 missions. This results in a spatio-temporal point stack of surface displacements with respect to the initial frame, covering May 5th, 1992 to Dec. 12th, 2000 (cf. Esch et al. (2019)).

The construction of a time-variable spatio-temporal covariance model allows us to use the least squares collocation approach to estimate the surface displacements at any place and at any time and to report the uncertainty of this estimation.

When evaluating the deformations, the estimation error of the prediction should be minimized according to the Wiener-Kolmogorov principle. For this purpose, we consider the measured deformations as a particular realization of a random process. Since the distribution function of this random process is unknown and no assumptions are to be made about it, we choose a linear approach via the principle of the Best Linear Predictor (BLP) (Teunissen 2007). Due to the pre-processing of the DInSAR data it can be assumed that the expected value of the signal (deformations) is zero over the entire area. This implicitly turns the best linear predictor into the Best Linear Unbiased Predictor (BLUP) (cf. e.g. Schuh (2016, Sec. 3.2) or Teunissen (2007, Corollary I(i))). The BLUP corresponds to the least squares collocation approach (cf. e.g. Moritz (1980, Sec. 11) or Schuh (2016)) and the predictor is defined by

$$\displaystyle \begin{aligned} {\hspace{.1ex}\widetilde{\hspace{-.1ex}{\boldsymbol{\MakeLowercase{s}}}}}_p\!=\!\ {\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S}}$}}}}}\{{\boldsymbol{\MakeLowercase{x}}}_p,{\boldsymbol{\MakeLowercase{x}}}_o\} \Big( \underbrace{{\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S}}$}}}}}\{{\boldsymbol{\MakeLowercase{x}}}_o,{\boldsymbol{\MakeLowercase{x}}}_o\}\!+\!{\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{N}}$}}}}}\{{\boldsymbol{\MakeLowercase{x}}}_o,{\boldsymbol{\MakeLowercase{x}}}_o\}}_{ := {\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S} + \mathcal{N}}$}}}}} \{{\boldsymbol{\MakeLowercase{x}}}_o,{\boldsymbol{\MakeLowercase{x}}}_o\}}\Big)^{-1} \!\!\!{\Delta {\boldsymbol{\MakeLowercase{\ell}}}} {} \end{aligned} $$
(28)

where \({\boldsymbol {{\Sigma }}}_{{\mbox{ {\hspace{-0.1em}{$\boldsymbol {\mathcal {S}}$}}}}}\{{\boldsymbol {\MakeLowercase {x}}}_o,{\boldsymbol {\MakeLowercase {x}}}_o\}\) denotes the covariance matrix between the observed locations \({\boldsymbol {\MakeLowercase {x}}}_o\), whereas \({\boldsymbol {{\Sigma }}}_{{\mbox{ {\hspace{-0.1em}{$\boldsymbol {\mathcal {S}}$}}}}}\{{\boldsymbol {\MakeLowercase {x}}}_p,{\boldsymbol {\MakeLowercase {x}}}_o\}\) represents the covariance matrix between the observed locations \({\boldsymbol {\MakeLowercase {x}}}_o\) and the locations \({\boldsymbol {\MakeLowercase {x}}}_p\) to be predicted. \({\boldsymbol {{\Sigma }}}_{{\mbox{ {\hspace{-0.1em}{$\boldsymbol {\mathcal {N}}$}}}}}\{{\boldsymbol {\MakeLowercase {x}}}_o,{\boldsymbol {\MakeLowercase {x}}}_o\}\) reflects the noise characteristics. Here \({\Delta {\boldsymbol {\MakeLowercase {\ell }}}}\) represents the observed displacements of the point stack. In this example 144,302 scatterers are identified in 64 time frames. The data points are clustered in urban regions. To achieve a homogeneous data distribution in urban as well as in rural regions, the whole area is divided into \(9\times 7\) tiles and in each tile the same number of points is randomly selected, as sketched below.
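A minimal sketch of this tile-based thinning follows; the \(9\times 7\) layout is taken from the text, while the per-tile sample size, the coordinates and the seed are invented for illustration.

```python
import numpy as np

def thin_per_tile(xy, n_x=9, n_y=7, per_tile=200, seed=0):
    """Draw up to per_tile random points from each tile of the bounding box."""
    rng = np.random.default_rng(seed)
    ix = np.minimum((n_x * (xy[:, 0] - xy[:, 0].min())
                     / np.ptp(xy[:, 0])).astype(int), n_x - 1)
    iy = np.minimum((n_y * (xy[:, 1] - xy[:, 1].min())
                     / np.ptp(xy[:, 1])).astype(int), n_y - 1)
    keep = []
    for tile in range(n_x * n_y):
        members = np.flatnonzero(ix * n_y + iy == tile)
        if members.size:
            keep.append(rng.choice(members, min(per_tile, members.size),
                                   replace=False))
    return np.concatenate(keep)

xy = np.random.default_rng(1).uniform(0.0, 100.0, size=(144302, 2))  # stand-in coordinates
selection = thin_per_tile(xy)
```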

The huge computational effort to solve

$$\displaystyle \begin{aligned} \Big({\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S} + \mathcal{N}}$}}}}} \{{\boldsymbol{\MakeLowercase{x}}}_o,{\boldsymbol{\MakeLowercase{x}}}_o\} \Big)^{-1} {\Delta {\boldsymbol{\MakeLowercase{\ell}}}} \ , \end{aligned} $$
(29)

for which the dimension follows from the number of measurements, can be significantly reduced if the covariances can be separated into a spatial and a temporal domain and if finite covariance functions are used (Schuh 1989).

Spatial Covariance Model

To make the covariances in space independent of time, we only consider observed displacements with the same time difference when computing the spatial empirical covariance function. For each chosen time difference the empirical covariance function is computed and provides a sample of the stochastic behavior. All samples are documented in Fig. 3 (left). The plotted confidence regions of the estimates indicate that the spatial behavior is homogeneous with respect to time. These samples of empirical covariance functions are approximated by a finite covariance function, which is constructed by the autocorrelation of truncated polynomial base functions (cf. Schubert and Schuh (2022)). The resulting positive definite, finite analytical covariance function can be seen in Fig. 3 (right). Due to its finite support the covariance matrix is sparse.
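To illustrate why the finite support pays off computationally, the following sketch assembles a sparse spatial covariance matrix using the classical spherical model, itself the normalized autocorrelation of a ball indicator (i.e. the simplest truncated base function), as a stand-in for the fitted model of Schubert and Schuh (2022); positions, sill and support radius are invented.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.spatial import cKDTree

def spherical_cov(d, variance, support):
    """Compactly supported spherical covariance model."""
    h = np.minimum(d / support, 1.0)
    return variance * (1.0 - 1.5 * h + 0.5 * h**3)

rng = np.random.default_rng(1)
xy = rng.uniform(0.0, 100.0, size=(5000, 2))   # scatterer positions [km]
variance, support = 25.0, 5.0                  # sill [mm^2], support radius [km]

tree = cKDTree(xy)
pairs = tree.query_pairs(support, output_type="ndarray")  # all pairs with d < support
d = np.linalg.norm(xy[pairs[:, 0]] - xy[pairs[:, 1]], axis=1)
v = spherical_cov(d, variance, support)

n = len(xy)
i = np.concatenate([pairs[:, 0], pairs[:, 1], np.arange(n)])
j = np.concatenate([pairs[:, 1], pairs[:, 0], np.arange(n)])
vals = np.concatenate([v, v, np.full(n, variance)])       # symmetric part + diagonal
Sigma_sp = csr_matrix((vals, (i, j)), shape=(n, n))
print(f"non-zero fraction: {Sigma_sp.nnz / n**2:.2%}")
```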

Fig. 3

Spatial covariance functions of the DInSAR point stack. (left) empirical covariance functions of the distortions with respect to equal time differences; (right) analytic model of the spatial covariance function

Temporal Covariance Model

The data characteristics in the time domain are determined by the epochs of the available SAR recordings. Especially for the images of the ERS1 and ERS2 satellites, the recorded data are irregularly distributed in time. From the variance plot in Fig. 4 (a) the time dependence of this signal is obvious. We approximate these variances by an equidistant TVAR(1) process with a sampling rate twice as high as the typical time difference between the ERS1 and ERS2 recordings. The time variation of the coefficients is modeled by a polynomial of degree three.

Fig. 4

Temporal covariance modeling of the DInSAR point stack. (a) empirical variances of the distortions with respect to the dates, compared with the variances derived from the approximated TVAR(1) model; (b) empirically determined temporal covariance matrix; (c) covariance matrix derived from the TVAR(1) model, thinned to the measurement dates

The variances of the TVAR(1) model are also shown in Fig. 4 (a). The covariance matrix for all equidistant time points of the TVAR(1) model follows from (22) and can be downsampled to the measurement epochs. The identification of the measurement dates is done by a nearest-neighbour search. Thus, we obtain from the TVAR(1) model the temporal covariance matrix at the identified measurement dates, which is shown in Fig. 4 (c).
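The thinning step itself reduces to a nearest-neighbour index lookup, sketched below with an invented equidistant grid and invented epochs; Sigma denotes a TVAR(1) covariance matrix built as in Sect. 2.

```python
import numpy as np

# equidistant time grid underlying the TVAR(1) model (spacing is illustrative)
t_grid = np.linspace(1992.4, 2000.9, Sigma.shape[0])
# irregular measurement epochs (invented; in practice the SAR acquisition dates)
t_obs = np.sort(np.random.default_rng(5).uniform(1992.4, 2000.9, size=64))

idx = np.abs(t_grid[None, :] - t_obs[:, None]).argmin(axis=1)  # nearest grid points
Sigma_t = Sigma[np.ix_(idx, idx)]    # 64 x 64 temporal covariance at the epochs
```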

It should be mentioned that the temporal covariance can be computed only for discrete times, but with arbitrarily small time intervals.

Separable Spatio-Temporal Collocation Approach

The above investigations have shown that the spatio-temporal covariance function can be separated into a time-variable temporal component \(\gamma _{t}(t,t+h)\) and a homogeneous spatial component \(\gamma _{sp}({\Delta {\MakeLowercase {x}}})\),

$$\displaystyle \begin{aligned} \gamma({\Delta {\MakeLowercase{x}}},t,t+h) = \gamma_{t}(t,t+h) \, \cdot\, \gamma_{sp}({\Delta {\MakeLowercase{x}}})\ . \end{aligned} $$
(30)

Since only permanent scatterers, which are detected in all recordings, are included in the SBAS solution, the temporal distances are the same for all scatterers. This allows for a compact representation of the covariance matrices by the Kronecker product

$$\displaystyle \begin{aligned} {\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S}}$}}}}}\{{\boldsymbol{\MakeLowercase{x}}}_k, {\boldsymbol{\MakeLowercase{t}}}_k;{\boldsymbol{\MakeLowercase{x}}}_o,{\boldsymbol{\MakeLowercase{t}}}_o\} = {\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S}}$}}}}}^t\{{\boldsymbol{\MakeLowercase{t}}}_k,{\boldsymbol{\MakeLowercase{t}}}_o\} \otimes {\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S}}$}}}}}^{sp}\{{\boldsymbol{\MakeLowercase{x}}}_k,{\boldsymbol{\MakeLowercase{x}}}_o\}, \end{aligned} $$
(31)

with \(k\in \{p,o\}\) and (28) can thus be represented by

$$\displaystyle \begin{aligned} \begin{array}{rcl} & {}{}{}&\displaystyle {{\hspace{.1ex}\widetilde{\hspace{-.1ex}{\boldsymbol{\MakeLowercase{s}}}}}_p = {\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S}}$}}}}}^t\{{\boldsymbol{\MakeLowercase{t}}}_p,{\boldsymbol{\MakeLowercase{t}}}_o\} \otimes {\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S}}$}}}}}^{sp}\{{\boldsymbol{\MakeLowercase{x}}}_p,{\boldsymbol{\MakeLowercase{x}}}_o\} }\\ & &\displaystyle \hspace{-3mm}\Big(\underbrace{{\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S}}$}}}}}^t\{{\boldsymbol{\MakeLowercase{t}}}_o,{\boldsymbol{\MakeLowercase{t}}}_o\} \otimes {\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S}}$}}}}}^{sp}\{{\boldsymbol{\MakeLowercase{x}}}_o,{\boldsymbol{\MakeLowercase{x}}}_o\} + {\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{N}}$}}}}}}_{ {\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S} + \mathcal{N}}$}}}}} \{{\boldsymbol{\MakeLowercase{x}}}_o,{\boldsymbol{\MakeLowercase{t}}}_o; {\boldsymbol{\MakeLowercase{x}}}_o,{\boldsymbol{\MakeLowercase{t}}}_o\}} \Big)^{-1} \!\!\!\!{\Delta {\boldsymbol{\MakeLowercase{\ell}}}}\;, \end{array} \end{aligned} $$
(32)

where \({\boldsymbol {{\Sigma }}}_{{\mbox{ {\hspace{-0.1em}{$\boldsymbol {\mathcal {N}}$}}}}}\) characterizes the noise component. If the noise is designed appropriately, the calculation of the estimator can be split into a temporal and a spatial component according to the rules of array algebra (cf. e.g. Blaha (1977); Rauhala (1974))

$$\displaystyle \begin{aligned} \begin{array}{rcl} & {}{}{}&\displaystyle {{\hspace{.3ex}\widetilde{\hspace{-.3ex}{\boldsymbol{S}}}}_p = {{\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S}}$}}}}}^{sp}\{{\boldsymbol{\MakeLowercase{x}}}_p,{\boldsymbol{\MakeLowercase{x}}}_o\} \big({\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S}}$}}}}+{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{N}}$}}}}}^{sp}\{{\boldsymbol{\MakeLowercase{x}}}_o,{\boldsymbol{\MakeLowercase{x}}}_o\}\big)^{-1}} }\\ & &\displaystyle \Delta{\boldsymbol{{L}}} \ {\big({\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S}}$}}}}+{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{N}}$}}}}}^t\{{\boldsymbol{\MakeLowercase{t}}}_o,{\boldsymbol{\MakeLowercase{t}}}_o\}\big)^{-1} {\boldsymbol{{\Sigma}}}_{{\mbox{{\hspace{-0.1em}{$\boldsymbol{\mathcal{S}}$}}}}}^t\{{\boldsymbol{\MakeLowercase{t}}}_o,{\boldsymbol{\MakeLowercase{t}}}_p\}}\ . \end{array} \end{aligned} $$
(33)

Here the observations are arranged in the matrix \(\Delta {\boldsymbol {{L}}}\); each column represents the displacements of all scatterers for a specific epoch, i.e.

$$\displaystyle \begin{aligned} \Delta{\boldsymbol{{L}}} := \text{reshape}({\Delta {\boldsymbol{\MakeLowercase{\ell}}}},n^{sp}_o,n^{t}_o)\ , \end{aligned} $$
(34)

where \(n^{sp}_o\) denotes the number of observed scatterers and \(n^{t}_o\) the number of recordings (time frames). The same rearrangement is done for the predicted values

$$\displaystyle \begin{aligned} {\hspace{.3ex}\widetilde{\hspace{-.3ex}{\boldsymbol{S}}}}_p := \text{reshape}({\hspace{.1ex}\widetilde{\hspace{-.1ex}{\boldsymbol{\MakeLowercase{s}}}}}_p,n^{sp}_p,n^{t}_p)\ , \end{aligned} $$
(35)

with \(n^{sp}_p\) the number of points to predict and \(n^{t}_p\) the number of time frames to be predicted. According to the rules of array algebra the noise can be designed in two different ways without destroying the Kronecker structure, either

(36)

But in both cases the interpretation of the noise behaviour is not straightforward. A much more obvious choice for the noise would be

(37)

As shown in Schuh et al. (2022), an eigenvalue decomposition of \({\boldsymbol {{\Sigma }}}_{{\mbox{ {\hspace{-0.1em}{$\boldsymbol {\mathcal {S}}$}}}}}^{sp}\{{\boldsymbol {\MakeLowercase {x}}}_o,{\boldsymbol {\MakeLowercase {x}}}_o\}\) into \({\boldsymbol {\mathcal {U}}}_{sp} {\boldsymbol {{\Lambda }}}_{sp} {\boldsymbol {\mathcal {U}}}_{sp}^T\) or of \({\boldsymbol {{\Sigma }}}_{{\mbox{ {\hspace{-0.1em}{$\boldsymbol {\mathcal {S}}$}}}}}^{t}\{{\boldsymbol {\MakeLowercase {t}}}_o,{\boldsymbol {\MakeLowercase {t}}}_o\}\) into \({\boldsymbol {\mathcal {U}}}_{t} {\boldsymbol {{\Lambda }}}_{t} {\boldsymbol {\mathcal {U}}}_{t}^T\) again gives a separable form for the prediction,

(38)
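Independently of the specific eigenvalue-based form (38), the basic separation from (32) to (33) can be checked numerically. With the column-major vec convention implied by the reshape (34), \((A\otimes B)\,\text{vec}(X)=\text{vec}(B\,X\,A^T)\) holds, so the Kronecker solve splits into one temporal and one spatial solve; all matrices below are small random stand-ins, not the DInSAR covariances.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sp, n_t, n_sp_p, n_t_p = 30, 10, 5, 4

def spd(n):                               # random symmetric positive definite matrix
    a = rng.normal(size=(n, n))
    return a @ a.T + n * np.eye(n)

C_t_oo, C_sp_oo = spd(n_t), spd(n_sp)     # signal + noise parts at the observations
C_t_po = rng.normal(size=(n_t_p, n_t))    # temporal cross-covariances to prediction
C_sp_po = rng.normal(size=(n_sp_p, n_sp)) # spatial cross-covariances to prediction
dl = rng.normal(size=n_sp * n_t)          # stand-in for the observation vector

# Kronecker form, eq. (32)
s_kron = np.kron(C_t_po, C_sp_po) @ np.linalg.solve(np.kron(C_t_oo, C_sp_oo), dl)

# separated form, eq. (33): one spatial and one temporal solve
dL = dl.reshape(n_sp, n_t, order="F")     # eq. (34), column-major
S_p = C_sp_po @ np.linalg.solve(C_sp_oo, dL) @ np.linalg.solve(C_t_oo, C_t_po.T)
assert np.allclose(s_kron, S_p.reshape(-1, order="F"))    # eq. (35)
```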

The great advantage of the collocation approach is that, besides the predicted values, the accuracy of the prediction can also be determined by variance propagation (Moritz 1980, Sec. 17). These calculations can likewise be separated into a temporal and a spatial component (cf. Schuh et al. (2022)).
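For reference, a dense one-dimensional sketch of the predictor (28) together with this variance propagation is given below; the Gaussian covariance model and all numbers are illustrative, and in the separated spatio-temporal case both steps factor in the same way.

```python
import numpy as np

rng = np.random.default_rng(4)
n_o, n_p = 200, 50
x_o = np.sort(rng.uniform(0.0, 10.0, n_o))   # observed locations
x_p = np.linspace(0.0, 10.0, n_p)            # prediction locations

def cov(a, b, variance=25.0, scale=1.5):     # illustrative Gaussian covariance model
    return variance * np.exp(-0.5 * ((a[:, None] - b[None, :]) / scale) ** 2)

Sigma_oo = cov(x_o, x_o) + 4.0 * np.eye(n_o)  # signal plus i.i.d. noise
Sigma_po = cov(x_p, x_o)
dl = rng.multivariate_normal(np.zeros(n_o), Sigma_oo)     # synthetic observations

s_pred = Sigma_po @ np.linalg.solve(Sigma_oo, dl)         # predictor, eq. (28)
# prediction variance: diag( Sigma_pp - Sigma_po Sigma_oo^{-1} Sigma_op )
var_pred = cov(x_p, x_p).diagonal() - np.einsum(
    "ij,ji->i", Sigma_po, np.linalg.solve(Sigma_oo, Sigma_po.T))
```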

Results of the Rigorous Collocation of a DInSAR-Stack

Our test region is, as mentioned above, the Lower-Rhine Embayment in North Rhine-Westphalia, Germany. For ERS1 and ERS2, the DInSAR-SBAS analysis results in a spatio-temporal point stack with 144,302 permanent scatterers in 64 time frames. The covariances are separated into a time-variant temporal component and a homogeneous spatial component. Since the data are available strictly at the respective recording times, a Kronecker representation of the covariance matrices is possible, which allows the calculations to be split into a temporal and a spatial part. Thus, the numerical complexity of the task is reduced significantly and it becomes possible to compute this very extensive collocation task on a workstation or notebook within about one to two hours.

With the collocation method, surface deformations can be predicted for any location at any discrete time point. The tailored collocation approach elaborated here thus provides a continuous prediction in space for freely definable discrete time points. Besides the predicted values, their uncertainty is also quantified. In Fig. 5 the effects of the groundwater management of the active opencast mines Garzweiler, Hambach and Inden are clearly recognizable as subsidence, whereas in the already closed coal mining regions an uplift is taking place. The accuracy (standard deviation) of the prediction is in the range of 5–15 [mm] and it is immediately apparent that this accuracy is very heterogeneous. The bright points correspond to the measured permanent scatterers, while for the brown areas measurements in the vicinity are missing or they correspond to an extrapolation outside the image scene.

Fig. 5

Predicted surface displacements (left) and their uncertainties (right) in the Lower-Rhine Embayment in North Rhine-Westphalia

The orange line from the northwest to the southeast in Fig. 5 (left) marks a profile. Figure 6 (left) displays the behaviour of the displacement in time along this profile. Figure 6 (right) shows the displacement for the time span 1992.4 to 2000.9 [yr] and its predicted accuracies. These are just a few examples to illustrate the many possibilities of the collocation approach. In Schuh et al. (2022) the patterns of movement in time are provided as an animation.

Fig. 6

Time-dependent behavior of the surface displacement along the northwest-southeast profile (shown in Fig. 5). (left) predicted behavior in time along the profile; (right) distortion in a fixed time span and its uncertainties

To study the benefit of the time-variable covariance model, it is of interest to show the difference between modeling with a time-variable and with a static temporal covariance function. As static covariance function, a Gaussian function with a standard deviation of 5 [mm] and a half-value width of approx. 4 [yr] is fitted to the mean empirical covariances calculated from all 'training points'. To stabilize the temporal covariance matrix, an additional i.i.d. noise with a standard deviation of 2 [mm] has to be introduced. Using the residuals between predicted and measured deformations at 912 randomly selected test locations, the mean, standard deviation, RMS and maximum deviation for each time epoch are empirically determined and compared with the predicted formal standard deviations. Figure 7 summarizes the differences between static and time-variable modeling. While only minor differences can be observed in the predicted values, the predicted formal variances show a significantly different behavior: in the static case the variances are constant over time, whereas the time-variable modeling shows a steady increase of the variances in line with the empirical values. In contrast to static modeling, the time-variable formulation thus results in a more consistent behavior between the model and the data and opens up further possibilities to fit the model even better to the data.

Fig. 7

Statistics of the residuals between predicted and measured distortions at 912 randomly distributed test locations. (left) static; (right) time-variable temporal covariance function

4 Summary and Outlook

In addition to the many advantages of the collocation method as a data-adaptive method, it is repeatedly stated that it lacks flexibility in the modeling of the covariances and that the models cannot be implemented due to the enormous computational effort. In this work, we demonstrate that these limitations can be overcome by appropriate methodological approaches. The advantages of the collocation method can be exploited even in the case of time-varying behavior and extensive sets of measurement points. Since the collocation method can be used to estimate function values and their uncertainties for arbitrary locations and times, it is also very well suited for the fusion of SAR data with other data, e.g. epoch-wise levelling campaigns, which will be investigated in the future.