Modeling trends and periodic components in geodetic time series: a unified approach

Kermarrec, Gaël; Maddanu, Federico; Klos, Anna; Proietti, Tommaso; Bogusz, Janusz

doi:10.1007/s00190-024-01826-5

Original Article
Open access
Published: 04 March 2024

Volume 98, article number 17, (2024)

Journal of Geodesy

Gaël Kermarrec (ORCID: orcid.org/0000-0001-5986-5269)¹, Federico Maddanu², Anna Klos³, Tommaso Proietti⁴ & Janusz Bogusz⁵

Abstract

Geodetic time series are usually modeled with a deterministic approach that includes trend, annual, and semiannual periodic components having constant amplitude and phase-lag. Although simple, this approach neglects the time-variability or stochasticity of trend and seasonal components, and can potentially lead to inadequate interpretations, such as an overestimation of global navigation satellite system (GNSS) station velocity uncertainties, up to masking important geophysical phenomena. In this contribution, we generalize previous methods for determining trends and seasonal components and address the challenge of their time-variability by proposing a novel linear additive model, according to which (i) the trend is allowed to evolve over time, (ii) the seasonality is represented by a fractional sinusoidal waveform process (fSWp), accounting for possible non-stationary cyclical long-memory, and (iii) an additional serially correlated noise captures the short term variability. The model has a state space representation, opening the way for the evaluation of the likelihood and signal extraction with the support of the Kalman filter (KF) and the associated smoothing algorithm. Suitable enhancements of the basic methodology enable handling data gaps, outliers, and offsets. We demonstrate the advantage of our method with respect to the benchmark deterministic approach using both observed and simulated time series and provide a fair comparison with the Hector software. To that end, various geodetic time series are considered which illustrate the ability to capture the time-varying stochastic seasonal signals with the fSWp.

1 Introduction

Trends and periodic signals are present in many domains and can be related to climate change (Mudelsee 2019), temperatures, ethane or carbon dioxide concentrations (Maddanu and Proietti 2023), stratospheric ozone (Bloomfield et al. 1994), global water storage (Schmidt et al. 2008), or total solar irradiance (Montillet et al. 2022). The analysis of the time-evolution of trend, as well as the time-varying phase and amplitude of seasonality, is also valuable within the context of structural health monitoring, such as bridge oscillations (Omidalizarandi et al. 2020), wind turbines (Zivanovic et al. 2023), or stock exchange dynamics (Stratimirović et al. 2018).

In the context of geodetic time series, trends and seasonal signals can be readily identified. We cite, for instance, the altimetry and tide-gauge sea-level records (Benveniste et al. 2020), the displacements recorded by permanent stations of Global Navigation Satellite System (GNSS) or Doppler Orbitography and Radiopositioning Integrated by Satellite (DORIS) (Altamimi et al. 2023; Chanard et al. 2020), GNSS time series used for frame realization (Freymueller 2009), or terrestrial water storage time series derived from the Gravity Recovery and Climate Experiment (GRACE) satellite mission. The ability to extract the amplitudes and phase-lags of seasonal signals is crucial for revealing geophysical anomalies or interesting variability at low frequencies. Not only the periodicity is worthy of study but also linear or stochastic trends, involving change-points, accelerated increases, or nonlinear behavior, which are linked to climate changes (Mudelsee 2019). In these contexts, significant trends and seasonal variations are attributed to factors such as changes in global sea level, plate motion, extended droughts due to climate or human activities, gravitational excitation, variations in soil moisture, and fluctuations in atmospheric or oceanic loading. These factors are explored in studies such as (Altamimi et al. 2023; Rodell et al. 2018; Cheng et al. 2021). Unfortunately, inconsistent conclusions may arise if the model for the periodic and linear components is misspecified. A notable example is the divergence found in the surface air temperature of the Northern Hemisphere between (Thompson and Wallace 1998) and (Barbosa and Andersen 2009), employing different estimation methods for the same dataset (complex demodulation versus dynamic linear model representation of autoregressive (AR) process).

For simplicity, it is commonly assumed in geodesy that the trend and the amplitudes and phase-lags of periodic components are deterministic, estimated using the least-squares method (Tiao et al. 1990). However, Davis et al. (2012) and Wernicke and Davis (2010) (and the references therein) highlighted that the seasonal signals in various geodetic time series should be treated as stochastic. They used the displacement time series (DTS) observed by GNSS stations and DTS derived from GRACE records for GNSS locations to support their view. Since then, various methods have been proposed to account for time variations of trend and seasonal components and have been constantly improved. These methods include moving average least-squares (Klos et al. 2018a), trigonometric functions, sample-splitting to detect changes, or local averaging (Artemov et al. 2015; Nadaraya 1964), among others. Alternative proposals are based on wavelet analysis (Ji et al. 2020), singular spectrum analysis (Chen et al. 2013), principal component analysis (Shen et al. 2013), the adaptive Wiener filter (Klos et al. 2020), and empirical mode decomposition and its extensions (Huang et al. 1998; Li and Guo 2023). For a comparison of some of these methods and further references, we refer to Deng and Fu (2019) or Klos et al. (2017).

Models of the unobserved stochastic components of trends and seasonal components provide the flexibility needed to capture interesting variability in geodetic time series. The statistical analysis of these data sets can be conducted using state space methods, which are fundamental tools for estimating underlying parameters, as well as for signal extraction and prediction. The approach of Davis et al. (2012) or Didova et al. (2016) marked a first step toward a more data-coherent approach to modeling seasonal components in geodetic time series using the Kalman Filter (KF). Applications include estimating the realistic velocity of tectonic plates together with its uncertainty and gaining a better understanding of seasonal variations, such as those due to droughts (Van Loon et al. 2014). Previous attempts to use the KF for extracting time-varying seasonal signals in geodesy were restricted to a random walk (RW), as in Davis et al. (2012), or an AR model of first order (AR(1), also called red noise or first-order Markov process (Didova et al. 2016)). A similar approach was proposed in Ming et al. (2019), who developed a network-based KF using a generalized simulated annealing algorithm, where each component, except for the trend, was allowed to vary over time, and their amplitudes were estimated by maximizing the likelihood function. In this contribution, we propose to generalize those representations of the trend and periodic components and further extend the basic statistical treatment to account for data gaps, outliers, or offsets.

This paper makes a twofold contribution. Firstly, we introduce a new process to geodesy, the fractional Sinusoidal Waveform process for periodic components, referred to as fSWp (Proietti and Maddanu 2022). This process is defined by the modulation of trigonometric functions by two independent fractional noise (FracN) processes, discrete-time counterparts of changes in fractional Brownian motion. These processes share the same memory and variance and can potentially be non-stationary, with RW as a limiting case. In the FracN models, the persistence or long-range dependence of the underlying stochastic processes is regulated by a memory parameter $d \in (0,1)$. Mandelbrot and Van Ness (Mandelbrot and Ness 1968) proposed the name FracN (sometimes called fractal noise) to emphasize that the exponent of the spectrum could take non-integer values. The Hurst exponent $H$ introduced by Mandelbrot and Ness (1968) is related to the memory parameter via the relation $d=H-0.5$ (cf. Hosking 1981). The (discrete) flicker noise used in geodesy can be approximated using a FracN and refers to a process with a power spectral density proportional to $1/f$, with $f$ the frequency (Rekhviashvili 2006).

The fSWp is more general than previous proposals, encompassing traditional models of stochastic and deterministic seasonality. The stochastic component is inherently ’rough’,^{Footnote 1} i.e., having potentially a RW-like behavior for a memory parameter less than unity, as the time-varying amplitude and phase are driven by FracN. This allows for a different interpretation than the usual short-memory assumption, as proposed in Klos et al. (2020). From a statistical standpoint or goodness of fit, the fSWp appears particularly suitable for studying stochastic seasonal signals of climatic origin (Proietti and Pedregal 2023). We extend this method to time series with an additional trend (stochastic or deterministic), whenever the decay of the autocovariance function with time is power-like and slower than exponential, such that the process exhibits long-range dependence or long-memory (Hassler 2019).

Our second contribution is empirical and focuses on illustrating the potential of unobserved stochastic components in trend and seasonal periods for modeling a variety of geodetic time series. We specifically use geodetic time series known to have strong trends and seasonal components for illustration, including vertical DTS recorded by GNSS permanent stations (both preprocessed and not preprocessed), vertical DTS predicted by environmental loading models (non-tidal hydrospheric and atmospheric loading models), and GNSS-derived precipitable water vapor (PWV) time series. We use the IGS station called DRAO located in Penticton (British Columbia, Canada) for that purpose, without loss of generality. Our modeling significantly influences the statistical significance of the parameters, providing a basis for a trustworthy climatological interpretation (Alshawaf et al. 2018). We will compare some of our results with those given by the Maximum Likelihood-based software called Hector (Bos et al. 2013) using the noise model "VaryingAnnual", which relies on the work of Langbein (2004). Information criteria and autocorrelation analysis of residuals will be utilized to assess the goodness of fit of the various models under consideration. We will discuss further filtering strategies such as the Savitzky-Golay filter for extracting stochastic trends (Langbein 2004; Savitzky and Golay 1964). Throughout the manuscript, we will ensure that a reader with a geodetic background can understand the statistical developments that we propose to present rigorously. This article should be considered a well-founded introduction to the fSWp and will be followed by a second one, where a systematic analysis of the differences with, e.g., the Hector software will be conducted.

The remainder of this paper is structured as follows: The first section introduces the fSWp and the KF methodology. In the second section, we describe the various datasets and discuss their optimal fitting, comparing them with the widely used fully deterministic approach. Monte-Carlo simulations, presented in an appendix, will complement this contribution, highlighting the high potential of the fSWp within a geodetic context.

2 Methodology

The theory described below is based on the recent contributions provided by Proietti and Maddanu (2022) and Maddanu and Proietti (2023) in climate time series analysis. Section 2.1 presents a process for long-range dependent cyclical/seasonal time series, embedding both stochastic and deterministic cycles. Section 2.4 introduces a general parametric model to describe periodical geodetic signals, based on the theory of the previous section. Finally, Sect. 2.6 provides statistical inference according to the augmented KF methodology (see de Jong (1989) and Proietti and Luati (2013)). Outliers, offsets, data gaps, and model restrictions are discussed in the remaining sections.

2.1 The fractional sinusoidal waveform process

2.1.1 Introduction

A deterministic periodic component is generated by the following harmonic model:

$$\begin{aligned} s_t= a \cos (\lambda t) + a^* \sin (\lambda t), \end{aligned}$$

(1)

for $t=1, \ldots , n$. Equation (1) defines a sinusoidal wave with period $2 \pi /\lambda $, constant amplitude, $ \text{ A }=\sqrt{a^2 + a^{*2}} $, and phase displacement equal to $\text{ Ph }= \arctan (-\frac{a^{*}}{a})$. The properties of $s_t$ depend on $a$, $a^*$ and $\lambda $. By assuming that $a$ and $a^{*}$ are drawn from independent normal distributions with mean zero and variance $\sigma _{a}^2$ and defining $\gamma (k) = E(s_t s_{t-k})$ as the autocovariance function of the process, it holds that $\gamma (k) = \sigma _{a}^2 \cos (\lambda k)$. A deterministic cycle is "smooth" and perfectly predictable (i.e., with zero forecast error) using a finite realization.
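As a small numerical illustration of Eq. (1), the sketch below converts the pair $(a, a^*)$ into amplitude and phase and checks that $a\cos (\lambda t)+a^*\sin (\lambda t)=\text{A}\cos (\lambda t+\text{Ph})$. The function names and the numerical values are illustrative assumptions, and `atan2` is used in place of a plain arctangent so that the quadrant is resolved when $a<0$.

```python
import math

def amplitude_phase(a, a_star):
    """Amplitude A and phase Ph such that
    a*cos(lam*t) + a_star*sin(lam*t) == A*cos(lam*t + Ph)."""
    amplitude = math.hypot(a, a_star)
    # atan2 agrees with arctan(-a_star/a) when a > 0 and fixes the quadrant otherwise.
    phase = math.atan2(-a_star, a)
    return amplitude, phase

def harmonic(t, a, a_star, lam):
    """Deterministic periodic component of Eq. (1)."""
    return a * math.cos(lam * t) + a_star * math.sin(lam * t)

lam = 2.0 * math.pi / 365.25          # annual frequency for daily data
A, Ph = amplitude_phase(3.0, -4.0)    # illustrative coefficients
```

With these values the amplitude is 5, and the two forms of the wave agree at every epoch to machine precision.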

A stochastic periodic component arises by letting the parameters that regulate the amplitude and phase-lag evolve stochastically over time, i.e.,

$$\begin{aligned} s_t= a_t \cos (\lambda t) + a_t^* \sin (\lambda t), \end{aligned}$$

(2)

where we let $a_t$ and $a_t^*$ evolve over time according to two orthogonal colored noise processes (Williams 2003a). A general specification, encompassing the most popular models of stochastic components, posits $(1-\phi L)^d a_t= \eta _t$ and $(1-\phi L)^d a_t^*= \eta _t^*$, where $\eta _t$ and $\eta _t^*$ are orthogonal white noise sequences with variance $\sigma _{\eta }^2$, and $L$ is the lag operator.

2.1.2 Special cases

For $0<\phi < 1$ and $d=1$, $a_t$ and $a_t^*$ are generated by a red noise process, also called first-order Markov process or autoregressive process of order 1 abbreviated as AR(1). As a result, $s_t$ is a stationary cycle with ARMA(2,1) representation exhibiting short memory^{Footnote 2} (Hannan 1964).
For $\phi =d=1$, $a_t$ and $a_t^*$ are generated by a brown noise process (integrated white noise or RW) and $s_t$ is a non-stationary (unit root) cycle at the frequency $\lambda $. We recall that brown noise has a frequency density proportional to $1/f^2$. Pink noise (sometimes called flicker noise or 1/f noise) falls between brown and white noise. Both can be generated by sequencing a zero-mean white noise through an AR filter of order N following Kasdin (1995). The more general FracN has a frequency density proportional to $1/f^{2d}$ with $d \in (0,1)$.
The fSWp proposed by Proietti and Maddanu (2022) arises in the case $\phi =1$ and $d>0$, such that $a_t$ and $a_t^*$ are FracN processes (Hosking 1981; Anděl 1986),
$$\begin{aligned} a_t&= (1-L)^{-d} \eta _t, \qquad \eta _t\sim \text{ WN }(0,\sigma ^2_\eta ),\\ a_t^*&= (1-L)^{-d} \eta _t^*, \qquad \eta _t^*\sim \text{ WN }(0,\sigma ^2_\eta ), \end{aligned}$$
(3)
with infinite moving average (MA) representation $a_t= \sum _{i=0}^{\infty } \psi _i(d) \eta _{t-i}$ and $a^*_t= \sum _{i=0}^{\infty } \psi _i(d) \eta ^*_{t-i}$, where the coefficients $\psi _i(d)=\frac{\Gamma (i+d)}{\Gamma (i+1)\Gamma (d)}$ come from the binomial expansion of $(1-L)^{-d}$.
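The weights $\psi _i(d)$ can be computed without evaluating Gamma functions of large arguments, since the ratio of consecutive terms gives $\psi _0=1$ and $\psi _i=\psi _{i-1}(i-1+d)/i$. The following sketch (function names are illustrative assumptions) implements this recursion and cross-checks it against the closed form above.

```python
import math

def psi_weights(d, n):
    """First n MA weights of (1 - L)^(-d), from psi_0 = 1 and
    psi_i = psi_{i-1} * (i - 1 + d) / i."""
    psi = [1.0]
    for i in range(1, n):
        psi.append(psi[-1] * (i - 1 + d) / i)
    return psi

def psi_closed_form(d, i):
    """Gamma(i + d) / (Gamma(i + 1) * Gamma(d)), evaluated through log-gamma."""
    return math.exp(math.lgamma(i + d) - math.lgamma(i + 1) - math.lgamma(d))

weights = psi_weights(0.4, 6)   # slowly decaying weights for d = 0.4
```

For $d=1$ all weights equal one, which recovers the RW as a cumulative sum of the white noise.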

2.1.3 Properties

The properties of the FracN process depend on the memory parameter $d$. The process is stationary if $d < \frac{1}{2}$, and non-stationary for $d \ge \frac{1}{2}$. More explicitly, if $0<d<\frac{1}{2}$:

1.
The spectral density and autocovariance functions of both $a_t$ and $a_t^*$ are, respectively,
$$\begin{aligned} f_a(\omega ) = \frac{\sigma _{\eta }^2}{2 \pi } \biggr ( 2 \sin (\frac{\omega }{2}) \biggr )^{-2d}, \end{aligned}$$
(4)
and
$$\begin{aligned} \gamma _a(k)=\sigma _{\eta }^2 \frac{\Gamma (1-2d)\Gamma (d+k)}{\Gamma (k+1-d)\Gamma (d)\Gamma (1-d)}, \end{aligned}$$
(5)
where $\Gamma (u)= \int _0^\infty z^{u-1} e^{-z} dz$ is the Euler Gamma function (see Hosking (1981) and Anděl (1986) for the proof of Eqs. 4 and 5).
2.
The fSWp $s_t$ has spectral density
$$\begin{aligned} f(\omega )= \frac{\sigma ^2_\eta }{4\pi } \left[ \left( 2\sin \left( \frac{\omega -\lambda }{2}\right) \right) ^{-2d} + \left( 2\sin \left( \frac{\omega +\lambda }{2}\right) \right) ^{-2d} \right] \end{aligned}$$
(6)
and autocovariance function
$$\begin{aligned} \gamma (k) = \gamma _a(k) \cos (\lambda k), \end{aligned}$$
(7)
as shown by Proietti and Maddanu (2022). Hence, the process displays cyclical long-memory in the sense specified by Oppenheim and Viano (2004), since as $k \rightarrow \infty $ the autocovariance function is a cosine wave modulated by a hyperbolically decaying sequence, $\gamma (k) \sim \frac{k^{2d-1}}{\Gamma (d)} \cos (\lambda k)$, and the spectral density is unbounded at the frequency $\lambda $, i.e., $f(\omega ) \sim \frac{\sigma ^2_\eta }{4 \pi } |\omega - \lambda |^{-2d}$, as $\omega \rightarrow \lambda $.
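Equations (5) to (7) are straightforward to evaluate numerically. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, log-gamma is used to avoid overflow at large lags, and an absolute value is taken inside the power in the spectral density so that the expression stays real for $\omega <\lambda $.

```python
import math

def fracn_autocov(k, d, sigma2=1.0):
    """Eq. (5): autocovariance of a stationary FracN, valid for 0 < d < 1/2."""
    if not 0.0 < d < 0.5:
        raise ValueError("the stationary formula requires 0 < d < 1/2")
    k = abs(k)
    log_ratio = (math.lgamma(1.0 - 2.0 * d) + math.lgamma(d + k)
                 - math.lgamma(k + 1.0 - d) - math.lgamma(d) - math.lgamma(1.0 - d))
    return sigma2 * math.exp(log_ratio)

def fswp_autocov(k, d, lam, sigma2=1.0):
    """Eq. (7): cosine wave modulated by the FracN autocovariance."""
    return fracn_autocov(k, d, sigma2) * math.cos(lam * k)

def fswp_spectrum(omega, d, lam, sigma2=1.0):
    """Eq. (6); abs() is an added assumption to handle omega < lam."""
    left = abs(2.0 * math.sin((omega - lam) / 2.0)) ** (-2.0 * d)
    right = abs(2.0 * math.sin((omega + lam) / 2.0)) ** (-2.0 * d)
    return sigma2 / (4.0 * math.pi) * (left + right)
```

Evaluating `fswp_spectrum` on a grid approaching $\lambda $ shows the pole at the seasonal frequency, while `fswp_autocov` exhibits the slow hyperbolic decay of its envelope.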

A nice feature of the fSWp model is that it encompasses non-stationary persistent cycles ($\frac{1}{2} < d\le 1$) as well as deterministic cycles which are a limiting case when $d\rightarrow \frac{1}{2}$ from the left and $\sigma ^2_\eta \rightarrow 0$. In that case, $\gamma (k)\rightarrow \sigma ^2_a \cos (\lambda k)$.

2.2 Simplification

The model of Eq. (2) with $a_t= (1-\phi L)^{-d} \eta _t, \;\;\;a_t^*= (1-\phi L)^{-d} \eta _t^*,$ is the most general one. However, joint inferences on $(\phi ,d)$ are problematic when $\phi $ is close to 1 and $d$ is in the non-stationary region^{Footnote 3} (the information matrix is almost singular as with an RW, i.e., $\phi =1$ and $d=1$). To address this challenge, we suggest setting $\phi =1$ and focusing on the fSW model as the primary case of interest for geodetic time series. To accommodate the non-stationary scenario, we assume that the FracN processes governing the amplitude and phase of the cycle initiate at time $t=0$. Consequently, we can express their MA representation in a finite form as follows:

$$\begin{aligned} a_t= a_0 + \sum _{i=0}^{t-1} \psi _i(d) \eta _{t-i}, \;\;\;a_t^*= a_0^* + \sum _{i=0}^{t-1} \psi _i(d) \eta _{t-i}^*, \end{aligned}$$

(8)

where $a_0$ is a constant term embedding the past information: $a_0 = \sum _{i=t}^{\infty } \psi _i(d) \eta _{t-i}$. A similar representation holds for $a_0^*$. The time series model for $a_{t}$ and $a_t^*$ in Eq. (8) is referred to as a type II fractionally integrated noise (Marinucci and Robinson 1999).

The fSWp as in Eqs. (2) and (8) defines a stochastic periodic model, characterized by time-varying amplitude, $A_{t}=\sqrt{(a_{t}-a_{0})^2 + (a_{t}^*-a_{0}^*)^{2}}$, and phase $\text{ Ph}_{t}=\arctan ((a_{t}^*-a_{0}^*)/(a_{t}-a_{0})).$ The processes driving their evolution are independent FracNs with the same variance and memory parameter. The parameters d and $\sigma ^2_\eta $ regulate the persistence and the smoothness of the cycle.
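A realization of the fSWp can be sketched directly from Eqs. (2) and (8). The code below is an illustration under assumptions, not the authors' implementation: the function names, the seeds, and the use of Gaussian white noise are choices made here, and the double loop costs $O(n^2)$ operations, which is acceptable only for short series.

```python
import math
import random

def simulate_fracn_type2(n, d, sigma=1.0, a0=0.0, rng=None):
    """Type II FracN of Eq. (8): a_t = a0 + sum_{i=0}^{t-1} psi_i(d) * eta_{t-i}."""
    if rng is None:
        rng = random.Random(0)
    psi = [1.0]
    for i in range(1, n):
        psi.append(psi[-1] * (i - 1 + d) / i)
    eta = [rng.gauss(0.0, sigma) for _ in range(n)]   # eta[0] holds eta_1
    return [a0 + sum(psi[i] * eta[t - 1 - i] for i in range(t))
            for t in range(1, n + 1)]

def simulate_fswp(n, d, lam, sigma=1.0, a0=0.0, a0_star=0.0, seed=0):
    """Eq. (2) driven by two independent type II FracN processes.
    Returns the cycle s_t and the time-varying amplitude A_t of the text."""
    rng = random.Random(seed)
    a = simulate_fracn_type2(n, d, sigma, a0, rng)
    a_star = simulate_fracn_type2(n, d, sigma, a0_star, rng)
    s = [a[t - 1] * math.cos(lam * t) + a_star[t - 1] * math.sin(lam * t)
         for t in range(1, n + 1)]
    amp = [math.hypot(a[t] - a0, a_star[t] - a0_star) for t in range(n)]
    return s, amp

cycle, amplitude = simulate_fswp(400, d=0.7, lam=2.0 * math.pi / 365.25, sigma=0.1)
```

Setting `sigma=0` collapses the simulation to the deterministic cycle of Eq. (1) with coefficients $a_0$ and $a_0^*$, which is a convenient sanity check.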

To illustrate the FracN process, Fig. 1a displays time series of length 5,000 simulated from Eq. (3), with $\sigma ^2_\eta = 1$ and d ranging from 0.1 (blue line) to 0.7 (yellow line). In the latter case, the FracN process is non-stationary. Panel (b) exhibits the corresponding periodogram, which, as anticipated from Eq. (3), displays a power-law dependence with respect to frequency. Panel (c) presents realizations of fSWp (Eq. 2) with $\lambda =2$. The corresponding periodograms are plotted in panel (d), along with their theoretical spectrum. They show the typical power-law dependence with frequency within a given bandwidth. The broad peak at the yearly frequency ($\omega = 0.017$ rad) is the distinctive indicator of the fSWp, interpreting the stylized characteristics of periodicity in geodetic time series, see Davis et al. (2012) and Klos et al. (2020).

2.3 A parallel with the bandpass noise model

Langbein (2004) accounted for the aforementioned widening of the power spectrum due to the amplitude modulation of a periodic signal using the concept of an additive bandpass noise. The autocovariance of this process as implemented in the software Hector (Bos et al. 2013) under "Varyingseasonal" for a maximum likelihood estimation, is inspired by the development of Klos et al. (2020) and reads:

$$\begin{aligned} \gamma _a(k)=\sigma _{\eta }^2 \frac{\phi _{d=1}^k}{2(1-\phi _{d=1}^2)} \cos (\lambda k). \end{aligned}$$

(9)

where $\phi _{d=1}$ corresponds to the modulation of the periodic signal with an AR(1) process. The parallel with Eq. (7) is strong, but the underlying philosophy differs. While both processes can be non-stationary, the fSWp can model long-range dependencies through the memory parameter $d$ of the FracN process. Nevertheless, among the models available in the Hector software, this one appears to be the closest to our proposal. The Wiener filter as proposed by Klos et al. (2020) requires approximate values for the noise and has the same aforementioned limitation. We will show further in Sect. 3 that the fitted time series using the fSWp are more realistic than with the Wiener filter approach. We will, thus, restrict ourselves to a comparison of the fSWp with the band-pass noise model, abbreviated as BP in the following.
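The difference between the two envelopes can be made concrete by comparing how much each shrinks when the lag is doubled: the AR(1) envelope of Eq. (9) decays geometrically, whereas the FracN envelope of Eq. (5) decays hyperbolically, with a ratio close to $2^{2d-1}$. The sketch below uses illustrative parameter values ($\phi =0.99$, $d=0.4$) and hypothetical function names.

```python
import math

def bp_autocov(k, phi, lam, sigma2=1.0):
    """Eq. (9): autocovariance of the band-pass (AR(1)-modulated) model."""
    if not abs(phi) < 1.0:
        raise ValueError("the AR(1) modulation requires |phi| < 1")
    k = abs(k)
    return sigma2 * phi ** k / (2.0 * (1.0 - phi ** 2)) * math.cos(lam * k)

def fracn_envelope(k, d, sigma2=1.0):
    """Eq. (5): FracN autocovariance, the envelope of Eq. (7), for 0 < d < 1/2."""
    log_ratio = (math.lgamma(1.0 - 2.0 * d) + math.lgamma(d + k)
                 - math.lgamma(k + 1.0 - d) - math.lgamma(d) - math.lgamma(1.0 - d))
    return sigma2 * math.exp(log_ratio)

def doubling_ratio(envelope, k):
    """Envelope at lag 2k divided by envelope at lag k."""
    return envelope(2 * k) / envelope(k)

# lam = 0 removes the cosine so that only the envelope is compared.
geometric = doubling_ratio(lambda k: bp_autocov(k, 0.99, 0.0), 1000)
hyperbolic = doubling_ratio(lambda k: fracn_envelope(k, 0.4), 1000)
```

Even with $\phi $ as close to one as 0.99, the geometric envelope has essentially vanished after a few years of daily lags, while the hyperbolic one has lost only a modest fraction of its value.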

2.4 A general parametric model for periodic geodetic signals

In the analysis of geodetic time series, linear additive models are fundamental. The primary objective is to provide a coherent representation of the data, distinguishing between the trend component that characterizes the long-term behavior and the seasonal component responsible for the majority of the series’ variability. Linear trends, coupled with colored noise (such as RW or flicker noise, Williams 2003a; Klos et al. 2018b; Montillet and Bos 2020), are commonly employed to model the nonseasonal aspects. Simultaneously, deterministic trigonometric cycles, specifically at annual and semiannual frequencies, are used to capture the seasonality. As pointed out by Maddanu and Proietti (2023) and Friedrich et al. (2020), a deterministic model may mask important features in the data. Considering the trend and seasonal components as stochastic, with deterministic features included as limiting cases, is a more general and preferable assumption. In the following, we let $y_t$ for $t=1,2,\ldots ,n$ be a daily geodetic time series. We propose a linear additive specification:

$$\begin{aligned} y_t = \mu _{t} + s_t + u_t, \end{aligned}$$

(10)

where $\mu _t$ is the trend component, modeled as the local linear trend process (Harvey 1990):

$$\begin{aligned} \begin{array}{lclcr} \mu _{t} &{}=&{} \mu _{t-1} + \tau _{t-1} + \varepsilon _t, &{} &{} \varepsilon _t \sim \text{ WN }(0,\sigma ^2_{\varepsilon }), \\ \tau _{t} &{}=&{} \tau _{t-1} + \zeta _t, &{} &{} \zeta _t \sim \text{ WN }(0,\sigma ^2_{\zeta }), \\ \end{array} \end{aligned}$$

(11)

and $\text{ E }(\varepsilon _t\zeta _s)=0, \forall (t,s);$ $\sigma ^2_{\varepsilon }>0$ allows the level of the trend to evolve stochastically over time, while $\sigma ^2_{\zeta }>0$ accounts for the trend variation. The same trend specification has been considered by (Didova et al. 2016). Note that a standard RW with drift is obtained if $\sigma ^2_{\zeta }=0$, while the deterministic trend arises as a special case if both $\sigma ^2_{\varepsilon }$ and $\sigma ^2_{\zeta }$ are restricted to zero. Indeed, $\sigma ^2_{\zeta }=0$ implies $\tau _{t} = \tau _{t-1} = \tau _0$, where $\tau _0$ is a constant, such that $\mu _{t} = \tau _{0} + \mu _{t-1} + \varepsilon _t$. Then, restricting also $\sigma ^2_{\varepsilon }=0$, we obtain by recursion $\mu _{t} = \mu _{0} + \tau _{0}t$, where $\mu _{0}$ is a constant.
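The recursion of Eq. (11) and its special cases can be checked with a few lines of code. In this sketch the function name, the Gaussian disturbances, and the seed are assumptions made for illustration.

```python
import random

def simulate_local_linear_trend(n, mu0, tau0, sigma_eps=0.0, sigma_zeta=0.0, seed=0):
    """Simulate mu_1, ..., mu_n from Eq. (11), starting at (mu_0, tau_0)."""
    rng = random.Random(seed)
    mu, tau = mu0, tau0
    path = []
    for _ in range(n):
        # The level is updated with the previous slope, then the slope evolves.
        mu = mu + tau + rng.gauss(0.0, sigma_eps)
        tau = tau + rng.gauss(0.0, sigma_zeta)
        path.append(mu)
    return path

# Both variances set to zero: the deterministic line mu_t = mu_0 + tau_0 * t.
line = simulate_local_linear_trend(10, mu0=1.0, tau0=0.5)
# Only the slope variance set to zero: a random walk with drift tau_0.
drifting = simulate_local_linear_trend(10, mu0=1.0, tau0=0.5, sigma_eps=0.2)
```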

The periodic, or seasonal, component, $s_t$, is modeled as the sum of $m$ fSW processes defined at the annual frequency, $\lambda _1 = 2\pi /P$, $P = 365.25$, and at the harmonic frequencies $\lambda _j = 2\pi j/P$, $j=2, \ldots , m$:

$$\begin{aligned} s_t &= \sum \limits _{j=1}^{m} s_{jt}, \\ s_{jt} &= a_{jt}\cos (\lambda _j t) + a^*_{jt}\sin (\lambda _j t), \\ a_{jt} &= a_{j0} + \sum \nolimits _{i=0}^{t-1} \psi _{ji}(d_j)\eta _{j,t-i}, \qquad \eta _{jt} \sim \text{ WN }(0,\sigma ^2_{\eta j}), \\ a_{jt}^* &= a_{j0}^* + \sum \nolimits _{i=0}^{t-1} \psi _{ji}(d_j)\eta ^*_{j,t-i}, \qquad \eta ^*_{jt} \sim \text{ WN }(0,\sigma ^2_{\eta j}). \end{aligned}$$

(12)

The parameters $\sigma ^2_{\eta j}$ are specific to each cycle, but they may also be restricted to be invariant with j. Typically, only the annual and semiannual cycles are needed to characterize the seasonal pattern ($m=2$).
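For daily data, the frequencies in Eq. (12) and the value of the seasonal component at a given epoch can be sketched as follows. The helper names are hypothetical, and the coefficient lists stand for the values of $a_{jt}$ and $a^*_{jt}$ at the epoch of interest, however they were generated.

```python
import math

def harmonic_frequencies(m, period=365.25):
    """lambda_j = 2*pi*j / P for j = 1, ..., m."""
    return [2.0 * math.pi * j / period for j in range(1, m + 1)]

def seasonal_value(t, a, a_star, period=365.25):
    """s_t of Eq. (12), given the coefficients a[j-1] = a_{jt}, a_star[j-1] = a*_{jt}."""
    lams = harmonic_frequencies(len(a), period)
    return sum(a_j * math.cos(lam * t) + b_j * math.sin(lam * t)
               for a_j, b_j, lam in zip(a, a_star, lams))

annual, semiannual = harmonic_frequencies(2)   # the typical case m = 2
```

With $m=2$, the semiannual term completes a full cycle after half a year, whereas the annual term has only reached its opposite phase.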

Finally, $u_t$ is assumed to be a (possibly power-law) noise term following a standard ARMA(r, q) process as developed in Bhootna et al. (2023):

$$\begin{aligned} u_t = \sum ^r_{i=1} \varphi _i u_{t-i} + \sum ^q_{i=1} \upsilon _i \xi _{t-i} + \xi _t \end{aligned}$$

(13)

where $\xi _t \sim N(0,\sigma ^2_\xi )$.^{Footnote 4}
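Equation (13) can be applied recursively to any innovation sequence. The sketch below assumes zero pre-sample values for both $u_t$ and $\xi _t$; the function name and that start-up convention are choices made here for illustration.

```python
def arma_filter(xi, ar, ma):
    """Eq. (13): u_t = sum_i ar[i-1]*u_{t-i} + sum_i ma[i-1]*xi_{t-i} + xi_t,
    with u and xi taken as zero before the first sample."""
    u = []
    for t in range(len(xi)):
        value = xi[t]
        for i, coeff in enumerate(ar, start=1):
            if t - i >= 0:
                value += coeff * u[t - i]
        for i, coeff in enumerate(ma, start=1):
            if t - i >= 0:
                value += coeff * xi[t - i]
        u.append(value)
    return u

# Response of an ARMA(1,1) with varphi_1 = 0.5 and upsilon_1 = 0.2 to a unit impulse.
response = arma_filter([1.0, 0.0, 0.0, 0.0], ar=[0.5], ma=[0.2])
```

Feeding Gaussian draws instead of an impulse yields a simulated realization of the short-term noise component.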

The specification of the model is completed by the assumption that the disturbances $\varepsilon _t$, $\zeta _{t}$, $ \xi _t$, $\eta _{jt}$ and $\eta ^*_{jt}$ for $j=1,2$, are mutually uncorrelated. We refer to Kasdin (1995) for a description of how to generate colored noise models using ARMA representation. We further point out that the approach consisting of approximating the FracN by a truncated ARMA model was developed in Hartl and Jucknewitz (2022a). Focusing on the state space approximations, they find that the ARMA(3,3) and ARMA(4,4) approximations exhibit very similar performance, which did not seem to depend on the specification of $d$. In this contribution, we used an ARMA(3,3) model since a higher order does not provide significant advantages in terms of goodness of fit. This problem has also been investigated by Dmitrieva et al. (2015) within a geodetic context using a sum of AR(1) models as an alternative.

2.5 State space representation

The state space representation of a linear Markovian time series model paves the way for the statistical estimation of parameters and unobserved components, facilitated by computationally efficient algorithms such as the KF and the associated smoothing algorithms. A time series model is Markovian if $y_t$ can be expressed as a linear combination of a finite number of functions of the past, or states.

Long-memory models are not Markov, but a finite approximation is obtained by truncating the infinite MA or AR representation of the process (see for instance Chan and Palma 1998; Dissanayake et al. 2016). An alternative strategy, leading to greater parsimony, deals with approximating the FracN processes $a_{jt}$ and $a_{jt}^*$ by an ARMA(p, p) model, e.g., $(1-\sum _{i=1}^p {\phi }_i(d) L^i) \tilde{a}_t = (1+\sum _{i=1}^p {\beta }_i(d) L^i){\tilde{\eta }}_t$. This approach is proposed by Hartl and Jucknewitz (2022b). Usually, setting $p=3$ suffices.
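To make the approximation concrete, the sketch below computes the MA weights implied by a given ARMA(p, p) model and evaluates the weighted squared distance to the FracN weights $\psi _i(d)$, in the form of the criterion stated in the remainder of this section. Only the evaluation is shown: the numerical minimization over the ARMA coefficients is omitted, and all function names are assumptions.

```python
def fracn_psi(d, n):
    """Target weights psi_0, ..., psi_{n-1} of (1 - L)^(-d)."""
    psi = [1.0]
    for i in range(1, n):
        psi.append(psi[-1] * (i - 1 + d) / i)
    return psi

def arma_psi(phi, beta, n):
    """MA weights of (1 + sum_i beta_i L^i) / (1 - sum_i phi_i L^i)."""
    psi = []
    for j in range(n):
        value = 1.0 if j == 0 else (beta[j - 1] if j <= len(beta) else 0.0)
        for i, coeff in enumerate(phi, start=1):
            if j - i >= 0:
                value += coeff * psi[j - i]
        psi.append(value)
    return psi

def weighted_mse(d, phi, beta, n):
    """(1/n) * sum_{i=1}^{n} (n - i + 1) * (psi_i(d) - psi~_i)^2."""
    target = fracn_psi(d, n + 1)
    approx = arma_psi(phi, beta, n + 1)
    return sum((n - i + 1) * (target[i] - approx[i]) ** 2
               for i in range(1, n + 1)) / n

# A random walk (d = 1) is reproduced exactly by an AR(1) with unit coefficient.
exact = weighted_mse(1.0, phi=[1.0], beta=[0.0], n=200)
```

In practice `weighted_mse` would be passed to a numerical optimizer for each trial value of $d$, so that the ARMA coefficients become functions of $d$.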

The coefficients of the ARMA approximating model are obtained by minimizing a least-squares criterion. For that purpose, we let $\tilde{a}_t = \tilde{a}_0 + \sum _{i=0}^{t-1} {\tilde{\psi }}_{i} \eta _{t-i}$ be the process approximating $a_t$ in (8), where the coefficients ${\tilde{\psi }}_i$ satisfy

$$\begin{aligned}\biggl ( 1-\sum _{i=1}^p \phi _i(d) L^i\biggr )\left( \sum _{j=0}^\infty {\tilde{\psi }}_{j} L^j\right) =1+\sum _{i=1}^p {\beta }_i(d) L^i.\end{aligned}$$

The coefficients $(\phi _i (d),\beta _i(d))$, $i=1, \ldots , p,$ are obtained by minimizing the mean square approximation error

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^n(n-i+1)\left( \psi _{i}(d) - {\tilde{\psi }}_{i}\right) ^2. \end{aligned}$$

where $\psi _{i}(d)=\frac{\Gamma (i+d)}{\Gamma (i+1)\Gamma (d)}$. The linear Markovian approximating model is then cast in state space form with the following measurement and transition equations (cf. Durbin and Koopman (2012)):

$$\begin{aligned} \begin{array}{lllll} y_t &{}=&{} \textbf{z}_t'\varvec{\alpha }_{t} + \varvec{w}_t'\varvec{\delta }, &{} &{} \\ \varvec{\alpha }_{t+1} &{}=&{} \textbf{T}\varvec{\alpha }_{t} + \textbf{h}\varvec{\epsilon }_t,&{} &{} \end{array} \end{aligned}$$

(14)

We further assume that $m=2$, so that, with $\text{0 }_p'$ denoting a row vector of length $p$ with all elements equal to zero,

$$\begin{aligned} \varvec{w}_t&= (1, \cos (\lambda _1 t), \sin (\lambda _1 t), \cos (\lambda _2 t), \sin (\lambda _2 t))' ,\\ \varvec{\delta }&= (a_{00}, a_{10}, a_{10}^*, a_{20}, a_{20}^*)' , \\ \textbf{z}_t'&= (1,1, \cos (\lambda _1 t), \text{0 }_p', \sin (\lambda _1 t), \text{0 }_p',\cos (\lambda _2 t), \text{0 }_p', \sin (\lambda _2 t), \text{0 }_p',1,\text{0 }'_{\max (r,q+1)}). \end{aligned}$$

(15)

The state vector has dimension $(2+4p+\max (r,q+1))$,

$$\begin{aligned} \varvec{\alpha }_t = (\mu _t,\tau _t,\varvec{\alpha }^{'}_{\textit{Cy},1t}, \varvec{\alpha }^{*'}_{\textit{Cy},1t}, \varvec{\alpha }^{'}_{\textit{Cy},2t}, \varvec{\alpha }^{*'}_{\textit{Cy},2t},\varvec{\alpha }^{'}_{\textit{No},t})'. \end{aligned}$$

The trend component is represented by the first two elements in $\varvec{\alpha }_t$. The ARMA(p, p) processes approximating the FracN processes regulating the amplitude and phase of the j-th cyclical component $s_{jt}$ in Eq. (12) are represented, for $j=1,2$, by

$$\begin{aligned}\varvec{\alpha }_{\textit{Cy},jt}= \begin{bmatrix} \tilde{a}_{jt}-\tilde{a}_{j0} \\ \phi _2(d_j)(\tilde{a}_{j,t-1}-\tilde{a}_{j0}) + \cdots + \phi _p(d_j)(\tilde{a}_{j,t-p+1}-\tilde{a}_{j0}) + \beta _1(d_j) \eta _{j,t-1}+\cdots + \beta _{p}(d_j) \eta _{j,t-p} \\ \phi _3(d_j)(\tilde{a}_{j,t-2}-\tilde{a}_{j0}) + \cdots + \phi _p(d_j)(\tilde{a}_{j,t-p+1}-\tilde{a}_{j0}) + \beta _2(d_j) \eta _{j,t-1}+\cdots + \beta _{p}(d_j) \eta _{j,t-p+1} \\ \vdots \\ \beta _p(d_j)\eta _{j,t-1} \end{bmatrix}, \end{aligned}$$

and $\varvec{\alpha }^*_{\textit{Cy},jt}$ has identical structure. Finally, the noise term $u_t$ is represented in state space form by

$$\begin{aligned}\varvec{\alpha }_{\textit{No},t}= \begin{bmatrix} {u}_{t} \\ \varphi _2 {u}_{t-1} + \cdots + \varphi _r {u}_{t-r+1} + \upsilon _1 \xi _{t-1} + \cdots + \upsilon _{q} \xi _{t-q} \\ \varphi _3 {u}_{t-2} + \cdots + \varphi _r {u}_{t-r+1} + \upsilon _2 \xi _{t-1}+\cdots + \upsilon _{q} \xi _{t-q+1} \\ \vdots \\ \upsilon _q \xi _{t-1} \end{bmatrix}. \end{aligned}$$

The parameter $a_{00}$ in the coefficient vector $\varvec{\delta }$ represents the initial level. The transition matrix is block diagonal,

$$\begin{aligned} \textbf{T}&= \begin{bmatrix} \textbf{T}_\textit{Tr} & \text{0 } & \text{0 } & \text{0 } & \text{0 } & \text{0 }\\ \text{0 } & \textbf{T}_\textit{Cy,1} & \text{0 } & \text{0 } & \text{0 } & \text{0 } \\ \text{0 } & \text{0 } & \textbf{T}^*_\textit{Cy,1} & \text{0 } & \text{0 } & \text{0 } \\ \text{0 } & \text{0 } & \text{0 } & \textbf{T}_\textit{Cy,2} & \text{0 } & \text{0 } \\ \text{0 } & \text{0 } & \text{0 } & \text{0 } & \textbf{T}^*_\textit{Cy,2} & \text{0 } \\ \text{0 } & \text{0 } & \text{0 } & \text{0 } & \text{0 } & \textbf{T}_\textit{No} \end{bmatrix}, \\ \textbf{h}&= \begin{bmatrix} \textbf{h}_\textit{Tr} & \text{0 } & \text{0 } & \text{0 } & \text{0 } & \text{0 }\\ \text{0 } & \textbf{h}_\textit{Cy,1} & \text{0 } & \text{0 } & \text{0 } & \text{0 } \\ \text{0 } & \text{0 } & \textbf{h}^*_\textit{Cy,1} & \text{0 } & \text{0 } & \text{0 } \\ \text{0 } & \text{0 } & \text{0 } & \textbf{h}_\textit{Cy,2} & \text{0 } & \text{0 } \\ \text{0 } & \text{0 } & \text{0 } & \text{0 } & \textbf{h}^*_\textit{Cy,2} & \text{0 } \\ \text{0 } & \text{0 } & \text{0 } & \text{0 } & \text{0 } & \textbf{h}_\textit{No} \end{bmatrix}, \end{aligned}$$

where

$$\begin{aligned} \begin{array}{ccc} \textbf{T}_\textit{Tr} = \left[ \begin{array}{cc} 1 &{}\quad 1 \\ 0 &{}\quad 1 \end{array} \right] , &{} &{}\quad \textbf{h}_\textit{Tr} = \left[ \begin{array}{cc} 1 &{} \quad 0 \\ 0 &{}\quad 1 \end{array} \right] \end{array} \end{aligned}$$

(16)

for $j=1,2$

$$\begin{aligned} \begin{array}{ccc} \textbf{T}_\textit{Cy,j} = \left[ \begin{array}{cccc} \phi _1(d_j) &{}\quad 1 &{} \quad 0 &{} \quad 0 \\ \vdots &{}\quad 0 &{}\quad \ddots &{}\quad 0 \\ \phi _p(d_j) &{} \quad 0 &{}\quad \ddots &{}\quad 1\\ 0 &{}\quad &{}\quad \cdots &{} \quad 0 \end{array} \right] , &{} &{} \textbf{h}_\textit{Cy,j} = \begin{bmatrix} 1 \\ \beta _1(d_j) \\ \vdots \\ \beta _p(d_j) \end{bmatrix}, \end{array}\nonumber \\ \end{aligned}$$

(17)

(similar expressions hold for $\textbf{T}^*_\textit{Cy,j}$ and $\textbf{h}^*_\textit{Cy,j}$) and

$$\begin{aligned} \begin{array}{ccc} \textbf{T}_\textit{No} = \left[ \begin{array}{cccc} \varphi _1 &{} \quad 1 &{}\quad 0 &{}\quad 0 \\ \vdots &{}\quad 0 &{}\quad \ddots &{}\quad 0 \\ \varphi _r &{} \quad 0 &{}\quad \ddots &{}\quad 1\\ 0 &{} &{}\quad \cdots &{} \quad 0 \end{array} \right] , &{} &{} \textbf{h}_\textit{No} = \begin{bmatrix} 1 \\ \upsilon _1 \\ \vdots \\ \upsilon _q \end{bmatrix}. \end{array} \end{aligned}$$

(18)

All the disturbances $(\varepsilon _{t},\zeta _t,\eta _{1,t},\eta ^*_{1,t}, \eta _{2,t},\eta ^*_{2,t},\xi _t )$ are collected in the vector $\varvec{\epsilon }_t$, according to the above representation. The initial state vector $\varvec{\alpha }_1$ has a Gaussian distribution centered around $\text{0 }$ with variance-covariance matrix $\textbf{h}\text{ Var }(\varvec{\epsilon }_t)\textbf{h}'$, where $\text{ Var }(\varvec{\epsilon }_t) = \text{ diag }(\sigma ^2_\varepsilon , \sigma ^2_\zeta , \sigma ^2_{1,\eta }, \sigma ^2_{1,\eta }, \sigma ^2_{2,\eta }, \sigma ^2_{2,\eta }, \sigma ^2_{\xi })$.
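To make the block structure of Eqs. (16)–(18) concrete, the following minimal Python sketch assembles a block-diagonal transition matrix from the trend block and an ARMA noise block. The helper names `companion_block` and `block_diag`, as well as the numerical values chosen for $\varphi_1$ and $\upsilon_1$, are hypothetical and serve illustration only. The cyclical blocks $\textbf{T}_\textit{Cy,j}$ are left out for brevity and would be placed on the diagonal in the same way.

```python
def companion_block(ar_coeffs, dim):
    """Square block with the AR coefficients in the first column and ones on
    the superdiagonal, mirroring the layout of T_No in Eq. (18)."""
    if dim < len(ar_coeffs):
        raise ValueError("dim must be at least the number of AR coefficients")
    block = [[0.0] * dim for _ in range(dim)]
    for i, phi_i in enumerate(ar_coeffs):
        block[i][0] = phi_i
    for i in range(dim - 1):
        block[i][i + 1] = 1.0
    return block


def block_diag(*blocks):
    """Place square blocks along the diagonal of an otherwise zero matrix."""
    size = sum(len(b) for b in blocks)
    out = [[0.0] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                out[offset + i][offset + j] = value
        offset += len(b)
    return out


# Trend block of Eq. (16).
T_trend = [[1.0, 1.0], [0.0, 1.0]]

# Hypothetical ARMA(1,1) noise: illustrative coefficient values, not estimates.
phi = [0.6]
upsilon = [0.3]
dim = max(len(phi), len(upsilon) + 1)
T_noise = companion_block(phi, dim)
# Loading vector (1, upsilon_1, ..., upsilon_q)', zero-padded to the block size.
h_noise = ([1.0] + upsilon + [0.0] * dim)[:dim]

T = block_diag(T_trend, T_noise)
for row in T:
    print(row)
```

Keeping each component in its own block means that a restricted model is obtained simply by dropping or replacing one block, without touching the others.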

2.6 Statistical inference and signal extraction

Under the normality assumption, the likelihood of the observed time series can be evaluated with the support of the KF (Durbin and Koopman 2012; Proietti and Luati 2013). Maximum likelihood estimation of the hyperparameters $\varvec{\theta }=(\sigma ^2_\varepsilon , \sigma ^2_\zeta , \sigma ^2_{1,\eta }, \sigma ^2_{1,\eta }, \sigma ^2_{2,\eta }, \sigma ^2_{2,\eta }, \sigma ^2_{\xi })'$ and the initial conditions in the vector $\varvec{\delta }$ can be conducted through a numerical optimization routine, employing a quasi-Newton algorithm. A reparameterization is applied to guarantee that the estimates of hyperparameters fall within their admissible range.
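The interplay between the KF likelihood and the reparameterization can be illustrated on a much simpler model than the one of this paper. The sketch below, written in Python, evaluates the Gaussian log-likelihood of a local level model (a random-walk level observed with white noise) through the prediction error decomposition. The variances are passed as logarithms, so that any real-valued input maps to an admissible positive variance. The log transform, the function name `local_level_loglik`, and the default initial conditions are assumptions made for this example and are not taken from the authors' toolbox.

```python
import math


def local_level_loglik(log_variances, y, a1=0.0, p1=1e7):
    """Gaussian log-likelihood of a local level model via the Kalman filter.

    log_variances = (log sigma2_obs, log sigma2_level); the exponential map
    keeps both variances positive for any real-valued optimizer input.
    a1 and p1 are the mean and variance of the initial level (assumed values).
    """
    sigma2_obs, sigma2_level = (math.exp(v) for v in log_variances)
    a, p = a1, p1                    # predicted state mean and variance
    loglik = 0.0
    for obs in y:
        v = obs - a                  # one-step-ahead prediction error
        f = p + sigma2_obs           # variance of the prediction error
        loglik += -0.5 * (math.log(2.0 * math.pi * f) + v * v / f)
        k = p / f                    # Kalman gain
        a = a + k * v                # measurement update of the level
        p = p * (1.0 - k)
        p = p + sigma2_level         # time update for the random-walk level
    return loglik


# Two observations, unit variances (log variance = 0), unit initial variance.
print(local_level_loglik((0.0, 0.0), [0.5, 1.0], a1=0.0, p1=1.0))
```

The negative of this function could be handed to a quasi-Newton routine, for instance `scipy.optimize.minimize` with `method="BFGS"` if SciPy is available. Autoregressive coefficients and memory parameters would need their own transformations, which are not shown here.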

Point and interval estimates of the various unobserved components in the model described by Eqs. (10)–(12), conditional on the observed time series and the parameter estimates, are obtained via the smoothing algorithm by de Jong (1989), according to Appendix A. If the assumption of Gaussianity is relaxed, the smoothing algorithm still provides optimal linear estimates of the unobserved components.

Dealing with data gaps and outliers

Geodetic signals often exhibit data gaps, outliers, and shifts in the mean value, referred to as offsets. To address these challenges, we enhanced our fSWp to accommodate non-preprocessed time series. Missing values are managed by excluding the updating operations in the KF recursions (refer to Appendix A for mathematical details). When the location of an outlier or offset is identified, the state space model can incorporate the impact of suitable intervention variables. In the case of an additive outlier, this takes the form of a pulse dummy, i.e., a variable that equals one at the time of the intervention and zero otherwise. Offsets are addressed by including a step dummy among the regression effects, i.e., a variable that equals 1 after the break and 0 before the break (Montillet and Bos 2020).
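The two ingredients described above, intervention dummies and the skipping of the KF update at missing epochs, can be sketched in a few lines of Python. The example again uses a local level model rather than the full model of this paper. The function names, the use of `None` to mark a gap, and the convention that the step dummy already equals one at the break epoch itself are assumptions of this illustration.

```python
def pulse_dummy(n, t0):
    """Additive-outlier regressor: one at epoch t0, zero elsewhere."""
    return [1.0 if t == t0 else 0.0 for t in range(n)]


def step_dummy(n, t0):
    """Offset regressor: zero before the break, one from epoch t0 onward
    (whether the break epoch itself counts as 'after' is an assumption)."""
    return [1.0 if t >= t0 else 0.0 for t in range(n)]


def filter_with_gaps(y, sigma2_obs, sigma2_level, a1=0.0, p1=1e7):
    """Kalman filter for a local level model; None marks a missing value.

    At a missing epoch the measurement update is skipped, so the state mean
    is carried forward and its variance grows by the level variance.
    """
    a, p = a1, p1
    filtered = []
    for obs in y:
        if obs is not None:
            f = p + sigma2_obs       # prediction error variance
            k = p / f                # Kalman gain
            a = a + k * (obs - a)    # measurement update
            p = p * (1.0 - k)
        filtered.append((a, p))
        p = p + sigma2_level         # time update (always performed)
    return filtered


print(pulse_dummy(5, 2))
print(step_dummy(5, 2))
for mean, var in filter_with_gaps([1.0, None, 1.2], 1.0, 0.5):
    print(mean, var)
```

In the printed output, the state variance at the second epoch is larger than at the first, which is how a data gap translates into wider confidence bands around the extracted components.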

Situations in which outlier contamination occurs randomly across the available sample are more suitably handled by a robustification of the filter, as in Proietti and Pedregal (2023) and the references therein. In the presence of an outlying observation, as indicated by a notably large one-step-ahead prediction error, the robust KF shrinks the real-time estimate of the state components toward the one-step-ahead prediction, which does not incorporate the current contaminated observation. In effect, the robust KF replaces the observation with a cleaned value obtained by removing the contaminated part, which mitigates the influence of the outlier on the state estimation.
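The shrinkage idea can be illustrated with a generic Huber-type clipping of the standardized prediction error. This is a deliberately simple stand-in and not the scheme of Proietti and Pedregal (2023). The function name `robust_update`, the scalar local level setting, and the threshold of three standard deviations are all assumptions of this sketch.

```python
import math


def robust_update(a_pred, p_pred, obs, sigma2_obs, threshold=3.0):
    """One robustified measurement update for a scalar state.

    The standardized prediction error is clipped at +/- threshold, so a gross
    outlier moves the state estimate by a bounded amount only. Returns the
    filtered state mean and the cleaned observation actually used.
    """
    f = p_pred + sigma2_obs              # prediction error variance
    v = obs - a_pred                     # raw prediction error
    z = v / math.sqrt(f)                 # standardized prediction error
    if abs(z) > threshold:
        # Keep only the part of the error that is compatible with the model.
        v = math.copysign(threshold * math.sqrt(f), v)
    k = p_pred / f                       # Kalman gain
    return a_pred + k * v, a_pred + v


print(robust_update(0.0, 1.0, 1.0, 1.0))     # regular observation, unchanged
print(robust_update(0.0, 1.0, 100.0, 1.0))   # gross outlier, clipped
```

For the regular observation the update coincides with the standard KF update. For the outlier, the filtered state stays close to the prediction, and the cleaned observation replaces the contaminated one.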

2.7 Model evaluation and selection

The model introduced in Sect. 2.4 provides a broad specification encompassing several particular cases that are useful in practice. Model selection based on information criteria (IC), goodness of fit assessment, and a thorough assessment of the test error will suggest the specification that is most suitable for the series under consideration. We envisage five relevant restricted versions of the general model in Eqs. (10)–(12):

1. Model 1: The trend component is deterministic ($\sigma ^2_\varepsilon = \sigma ^2_\zeta = 0$) and $u_t$ is an AR(1) (red noise) process ($r=1$ and $q=0$ in Eq. (13)).
2. Model 2: The trend component is deterministic ($\sigma ^2_\varepsilon = \sigma ^2_\zeta = 0$) and $u_t$ is an ARMA(1,1) process ($r=1$ and $q=1$ in Eq. (13)).
3. Model 3: The noise component $u_t$ is an AR(1) process ($r=1$ and $q=0$ in Eq. (13)).
4. Model 4: The trend component is a RW with constant drift ($\sigma ^2_\zeta = 0$). The noise component is absent ($u_t=0$).
5. Model 5: Both trend and seasonal components are deterministic and $u_t$ is an AR(1) (red noise) process.

Comparison with usual methods

Model 5 is a standard approach in fitting geodetic time series and it will be used in the empirical section as a benchmark. Specialized software has been developed for its estimation, with the most prominent example being the Hector software (Bos et al. 2013). In this contribution, we restrict ourselves to the modeling of $u_t$ as an AR(1) process. In the future, $u_t$ will be considered as a FracN, i.e., a power-law noise for which $d$ can be estimated as described in Sect. 2.5, encompassing both flicker noise and power-law noise in a unifying approach.
Model 1 has been implemented by Proietti and Maddanu (2022) in analyzing inter-annual and intra-annual variability of carbon dioxide (CO2) concentrations at Mauna Loa, even though the cyclical component was assumed to be stationary. A further restriction of the general model in Eqs. (10)–(12) can be found in the contribution by Davis et al. (2012) on the analysis of geodetic signals with stochastic seasonal components. Here, the memory parameters in the fSWp specification have been constrained to be 1, implying that the amplitude and phase governing the stochastic cycle are influenced by RWs.
Models 3 and 4 consider the trend as a non-stationary stochastic process.

Visual inspection of the time series, as well as criteria to judge the goodness of fit, can provide indications toward one or another model. This approach can be compared with high-pass filtering which would extract the component with a frequency over one year through tuning the cutoff frequency. The fine-tuned weighted Savitzky-Golay filter (Savitzky and Golay 1964) is a robust filter that has the main advantage of preserving the shape (peaks and features) of the filtered time series. We refer to, e.g., Liu et al. (2015) for filtering seismic signals. From a principled standpoint, this filter would be close to the stochastic trend modeling.

Table 1 Summary of the main specifications under consideration in this contribution


Table 1 provides an overview of the main specifications of the models under consideration. Please note that various methods exist for modeling a given process. The software Hector is widely utilized in geodetic contexts, known for its flexibility and diverse functionalities. For instance, time-varying periodic components can be modeled using a Chebyshev polynomial (Bennett 2008). With this assumption, neither the amplitude nor the phase is stochastic, representing a distinct approach from the one employed in the fSWp, and providing smoothed periodic components. With such a simple polynomial fitting, the long-range dependent noise is entirely captured in the residuals, typically through a random walk (RW). We acknowledge that this perspective is visually more common, although it is likely suboptimal from a modeling standpoint.

In this contribution, we have opted to model the residuals with an AR(1), a short-memory noise. Our algorithm can be readily adapted to other types of noise, as long as the optimization process aligns with the KF framework, as described in Sect. 2.6. For consistency, AR of higher order or ARMA processes should be considered, given their ease of implementation. Additionally, we recommend employing stochastic trend modeling when in doubt about the presence of an RW, which can be visually identified at first glance through the wandering of the time series.

We do not intend to replace the well-established software Hector but rather to offer an alternative approach to consider the modeling of the same data, providing different insights through the stochastic amplitude and phase FracN assumption. That is the reason why the nearest and fairest comparison of our approach is, in our opinion, the BP noise model described in Sect. 2.3.

Goodness of fit

We judge the goodness of fit of the models using the following indicators:

The log-likelihood of the model, evaluated at the maximum likelihood estimates of the parameters, $LL({\hat{\varvec{\theta }}})$, where ${\hat{\varvec{\theta }}} = \text{ argmax}_\theta LL(\theta ) $. The preferred model should usually have the highest logLikelihood, or equivalently, the smallest deviance, defined as $-2LL({\hat{\varvec{\theta }}})$. Models with different complexity are compared by introducing a penalty for the number of estimated parameters, k. The Akaike IC is defined as $AIC=2k-2LL({\hat{\varvec{\theta }}})$ (Burnham et al. 2010).
Residual autocorrelation. For a correctly specified model, the standardized KF innovations are serially uncorrelated. Departure from the stated assumption is revealed by a large Ljung-Box or Box-Pierce statistic (Box and Jenkins 1976), which is based on the sum of the squares of the first m sample autocorrelations, scaled by the number of observations. In this contribution, we use the descriptive statistic called hereafter Sum2Corr and defined as $\mathbf {\Sigma }_{{\hat{\rho }}^2_\epsilon } =\sum _{t=2}^{n-1} {\hat{\rho }}^2_{t,\epsilon }$, where ${\hat{\rho }}_{t,\epsilon }$ is the sample correlation function of the studentized residuals (Hassani and Yeganegi 2019). When multiplied by n, the Box-Pierce autocorrelation test is obtained. Since our goal in this contribution is not to test the statistical significance of the results but rather to make comparisons between models for the same time series, Sum2Corr is sufficiently descriptive to achieve this objective.
We estimate the uncertainty of the deterministic trend as $\sigma _{bKF} = \sqrt{\text{ Var }(\varvec{\delta })_{(2,2)}}$, see Appendix A. This value increases when more parameters are estimated (the total variance of the model increases) and will be higher for the fSWp compared to a fully deterministic approach. For Hector, we used the standard deviation of the driving noise for comparison. A more usual definition of uncertainty in geodesy follows, e.g., Alshawaf et al. (2018); Tiao et al. (1990). In that case we have $\sqrt{\text{ Var }(\hat{b})} =\frac{\sigma _N}{\sqrt{\sum _{n}(t-\bar{t})^2}} \sqrt{\frac{1+\phi }{1-\phi }}$, which we roughly call "$\sigma _b$" in the following. Here, $\sum _{n}(t-\bar{t})^2$ is the deviance of time, i.e., the sum of squared deviations of the epochs from their mean, $\phi$ is the autoregressive coefficient of the noise, and $\sigma _N$ is the standard deviation of the residuals after reducing the deterministic components. This value can be computed using the BP approach in Hector and an additional AR (or ARMA as in model 2) noise.
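The three indicators listed above can be computed in a few lines, as the following Python sketch shows. It assumes that the denominator of $\sigma _b$ is the square root of the sum of squared deviations of the epochs from their mean, and that $\phi$ is the AR(1) coefficient of the residuals. The range of lags entering Sum2Corr is left as an argument, because the index convention of the sum is not spelled out in this section. All function names are hypothetical.

```python
import math


def aic(loglik, n_params):
    """Akaike information criterion, AIC = 2k - 2LL."""
    return 2.0 * n_params - 2.0 * loglik


def sample_autocorr(x, lag):
    """Sample autocorrelation of x at a given positive lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def sum_squared_autocorr(residuals, lags):
    """Sum of squared sample autocorrelations over the chosen lags (Sum2Corr)."""
    return sum(sample_autocorr(residuals, lag) ** 2 for lag in lags)


def trend_sigma(sigma_n, times, phi):
    """Trend uncertainty sigma_b for residuals with standard deviation sigma_n
    and AR(1) coefficient phi (assumes |phi| < 1)."""
    t_mean = sum(times) / len(times)
    time_deviance = sum((t - t_mean) ** 2 for t in times)
    return sigma_n / math.sqrt(time_deviance) * math.sqrt((1.0 + phi) / (1.0 - phi))


residuals = [1.0, -1.0, 1.0, -1.0]            # toy residuals, not real data
print(aic(-10.0, 3))
print(sum_squared_autocorr(residuals, lags=[1, 2]))
print(trend_sigma(1.0, [0, 1, 2, 3, 4], 0.0))
print(trend_sigma(1.0, [0, 1, 2, 3, 4], 0.6))
```

The last two lines illustrate the inflation of the trend uncertainty by positively correlated residuals: with $\phi = 0.6$ the factor $\sqrt{(1+\phi )/(1-\phi )}$ equals 2, so $\sigma _b$ doubles relative to the white noise case.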

The methodology for the hyperparameter estimation and the related extraction of the unobserved components within the KF framework is summarized in flowchart form in Fig. 2.

3 Case study

In this section, we intend to thoroughly examine the modeling of diverse geodetic time series through the methodology outlined in Sect. 2.6. The diversity of data and processing methods demonstrates that the fSWp can be successfully applied to geodetic time series with a periodical component, requiring only proper adaptation. The reader should be aware that the chosen modeling can be customized, similar to what can be done with, e.g., the software Hector. This aspect, as well as a deeper comparison with other methods, will be addressed in a subsequent contribution. In this present study, we will focus on:

Geodetic time series from the DRAO GNSS permanent station (British Columbia, Canada). To illustrate the variety of models, we have chosen the (i) GNSS vertical DTS processed by the International GNSS Service (IGS), (ii) precipitable water vapor (PWV) time series derived within GNSS processing, (iii) vertical DTS predicted by non-tidal atmospheric loading (NTAL) model for the location of DRAO station and (iv) vertical DTS predicted by hydrological loading (HYDL) model for the location of DRAO station. This station is considered to represent, nominally, the stable North American continent (Khazaradze et al. 1999). It was chosen randomly for illustration.
IGS-processed vertical DTS from two additional stations recognized as affected by strong periodical components following Klos et al. (2020), called NEAH (Neah Bay, USA) and LPAL (La Palma, Canary Islands). These stations demonstrate the necessary adjustments required in the stochastic model compared to the DRAO case study to achieve optimal results. We deliberately selected these stations because the visual representation of fitting the time-varying periodical component using the Wiener filter approach is available in Klos et al. (2020) for comparative analysis.
Non-preprocessed GNSS vertical DTS processed by the Nevada Geodetic Laboratory (NGL) from six neighboring GNSS stations, situated in Wettzell, Germany. We will show how the adapted KF algorithm can deal with offsets, data gaps of various sizes and outliers present in the non-preprocessed DTS. Additionally, we will explore how the modeling with the fSWp can provide novel insights into studying the trend and periodic components from geodetic time series.

All these types of time series contain a trend (potentially small as for NTAL) and seasonal components and play an important role in studying climatological effects as well as geophysical hazards. They are the topic of various publications in which, most of the time, a simple deterministic model (trend and seasonal components) is used. Many contributions focus on analyzing the noise structure of the residuals and finding potential (spatial) dependencies. Unfortunately, as highlighted in, e.g., Davis et al. (2012), this highly restrictive deterministic assumption can obscure the time variability of seasonal components, leading to a cascading effect that biases the interpretation of noise. The fSWp, with its high flexibility, is an answer to such challenges. A comparison with the BP noise model of the Hector software, as described in Sect. 2.3, is provided for orientation. We stress that other modeling choices could have been made but that we restricted ourselves to comparable implementations for fairness. The optimal fitting of geodetic time series is a controversial and unresolved topic with various (nearly philosophical) interpretations. This discussion is not the topic of this contribution. Our intention is not to engage in broad geophysical interpretations at this point; such analyses will be addressed in specific and dedicated studies. Our focus centers on the variety of models, their goodness of fit, and a comparison with a conventional deterministic case and the BP modeling.

3.1 Data description

Before analyzing the results, we provide a short description of the chosen data set in the next sections.

3.1.1 IGS-processed vertical DTS

We use the vertical DTS which results from the third IGS reprocessing campaign “repro3” (http://acc.igs.org/repro3/repro3.html), performed to provide reliable input into the newest International Terrestrial Reference Frame (ITRF2020).

Three different GNSS stations are selected. The DRAO station is located in a region considered as stable with a radial velocity of 0.7 ± 0.01 mm/yr based on 27 years of continuous measurements (Mazzotti et al. 2007). Moreover, we employ two GNSS stations known for their pronounced seasonal variations to compare outcomes with those achieved using the Wiener Filter, as outlined in Klos et al. (2020): NEAH (Neah Bay, USA) and LPAL (La Palma, Canary Islands).

The IGS GNSS vertical DTS were preprocessed by removing outliers using three times the interquartile range rule and offsets using available databases of offsets, supported by manual inspection.
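A rule based on three times the interquartile range can be sketched as follows in Python. The exact variant used for the IGS series is not specified here, so the sketch assumes the common form in which values outside $[Q_1 - 3\,\text{IQR},\, Q_3 + 3\,\text{IQR}]$ are flagged. The function name `flag_outliers` and the toy numbers are hypothetical. For a series with trend and seasonality, such a rule would typically be applied to detrended residuals rather than to the raw heights.

```python
import statistics


def flag_outliers(values, factor=3.0):
    """Flag values lying more than factor * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - factor * iqr, q3 + factor * iqr
    return [v < lower or v > upper for v in values]


# Toy residuals in mm (illustrative numbers only); the last value is a blunder.
residuals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 100.0]
flags = flag_outliers(residuals)
cleaned = [v for v, is_outlier in zip(residuals, flags) if not is_outlier]
print(flags)
print(cleaned)
```

Because quartiles are insensitive to a few extreme values, the fences are not dragged outward by the very outliers they are meant to detect, which is the main reason to prefer this rule over a threshold based on the standard deviation.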

3.1.2 HYDL- and NTAL-predicted vertical DTS

We use the predictions of vertical displacements arising from the non-tidal atmospheric mass loading and hydrospheric mass loading, computed by the Earth System Modeling group of the German Research Center for Geosciences at Potsdam (Dill et al. 2013).

Hydrospheric mass loading modeled by the ESM GFZ group (Dill and Dobslaw 2013) considers the surface loading of water contained in different layers (soil moisture, snow cover, shallow groundwater, and surface water stored in rivers and lakes). The DTS predicted by HYDL are sampled daily and interpolated from their original grid into the location of DRAO station. The DTS predicted by NTAL are sampled every 3 hours and computed from the 3-hourly atmospheric surface-pressure fields provided by the latest operational model of the European Centre for Medium-Range Weather Forecasts (ECMWF) (Memin et al. 2020). We re-sample the 3-hourly time series into epochs consistent with GNSS and interpolate NTAL-predicted DTS into the location of the DRAO station.

3.1.3 PWV time series

We use the tropospheric time series obtained within the GNSS processing, as provided and described by Wang et al. (2016). In a nutshell, zenith wet delays are estimated as part of the GNSS processing and converted to the PWV time series by using the relationship suggested by Bevis et al. (1992). In the data set from Wang et al. (2016), ERA-Interim was used to interpolate the pressure values for the GNSS stations for the same period (2000–2012).
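The conversion from zenith wet delay (ZWD) to PWV mentioned above is commonly written as $\text{PWV} = \Pi \cdot \text{ZWD}$, with a dimensionless factor $\Pi = 10^6 / \left[ \rho _w R_v \left( k_3/T_m + k_2' \right) \right]$ that depends on the weighted mean temperature $T_m$ of the atmosphere. The Python sketch below evaluates this factor. The refractivity constants, the gas constant, and the linear relation $T_m = 70.2 + 0.72\,T_s$ are values commonly quoted in the GNSS meteorology literature. They are assumptions of this example and are not necessarily those used to produce the data set of Wang et al. (2016).

```python
RHO_WATER = 1000.0        # density of liquid water in kg/m^3
R_VAPOR = 461.5           # specific gas constant of water vapor in J/(kg K)
K2_PRIME = 22.1 / 100.0   # assumed constant, converted from K/hPa to K/Pa
K3 = 3.739e5 / 100.0      # assumed constant, converted from K^2/hPa to K^2/Pa


def weighted_mean_temperature(surface_temp_k):
    """Linear surface-temperature relation for T_m in kelvin (assumed form)."""
    return 70.2 + 0.72 * surface_temp_k


def conversion_factor(surface_temp_k):
    """Dimensionless factor Pi that maps ZWD to PWV."""
    t_m = weighted_mean_temperature(surface_temp_k)
    return 1.0e6 / (RHO_WATER * R_VAPOR * (K3 / t_m + K2_PRIME))


def zwd_to_pwv(zwd_mm, surface_temp_k):
    """Convert a zenith wet delay in mm to precipitable water vapor in mm."""
    return conversion_factor(surface_temp_k) * zwd_mm


# Hypothetical epoch: 100 mm of wet delay at a surface temperature of 288.15 K.
print(conversion_factor(288.15))
print(zwd_to_pwv(100.0, 288.15))
```

With these constants the factor is close to 0.16, so a wet delay of 100 mm corresponds to roughly 16 mm of PWV. The temperature dependence of $\Pi$ is one path by which seasonal signals can enter the PWV series.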

PWV time series are of great interest for meteorological and climatological studies (Poli et al. 2007). They serve to improve the prediction of local heavy rainfall but also for a better understanding of the hydrological cycle and greenhouse gas effects (Guerova et al. 2016). The linear trend in GNSS-derived PWV time series has been regarded as a very important resource for investigating climate change (Alshawaf et al. 2018). We anticipate that a stochastic modeling approach holds even greater promise in this context.

3.1.4 NGL-processed vertical DTS

We further use NGL-processed vertical DTS from the Wettzell site. The GNSS observations were processed using the GIPSY-OASIS II Release 6.1 from JPL with IERS2010/IGS08 standards/models (Precise Point Positioning with ambiguity resolution using the WLPB method). We aim to underscore the capability of the Kalman Filter (KF) and the developed methodology in overcoming the challenges outlined in Sect. 2.6. Consequently, we intentionally did not remove outliers and offsets in a pre-processing step for the sake of illustration.

We have considered six nearby Wettzell stations in total: WTZA, WTZJ, WTZR and WTZS, as illustrated in Fig. 3, and additionally the two stations WTZZ and WTZL. WTZR is called a "gold" station and has recorded observations continuously for more than 20 years. Interestingly, the vertical DTS of the different antennas as depicted in Fig. 7a–g, top, differ although all stations should record the same geophysical phenomena. An improved model for fitting the DTS with a stochastic trend and time-varying periodical components should allow for deeper investigations of the DTS compared to a full deterministic model.

Table 2 DRAO station: IGS DTS, PWV, NTAL, and HYDL. Model 1: deterministic trend, FracN or RW, periodical component with FracN amplitude


3.2 Results

In the first part, we discuss the modeling of time series from the DRAO station with the fSWp. Following that, we present the outcomes for two stations, NEAH (USA) and LPAL (Canary Islands), known for their prominent periodical components. A third section is dedicated to the analysis of the six Wettzell stations in Germany.

We mention that all computations were done on a computer with an 11th Gen Intel(R) Core(TM) i7-1185G7 processor @ 3.00 GHz and 32 GB of RAM. The MATLAB toolboxes are freely available on the GitHub repository and can be downloaded for testing purposes, see Sect. 4. For further testing purposes, we added Model 5 (fully deterministic) with $u_t$ described as a FracN. Please note that throughout the empirical case studies, we did not consider the draconitic frequency, which is very close to the annual component and difficult to reliably separate from it (Amiri-Simkooei 2013).

3.2.1 Case study: DRAO station

Comments on the trend

Table 2 presents a comparison between the different models described in Table 1.

For all cases under consideration (GNSS DTS, NTAL-predicted DTS and PWV time series), the superiority of modeling with the fSWp (model 1 or 2) compared to a full deterministic model (model 5) is highlighted. The logLikelihoods are nearly similar for NTAL-predicted DTS and GNSS DTS but the Sum2Corr is 3 times (for GNSS DTS) and 2 times (for NTAL-predicted DTS) lower compared to the deterministic model. In all cases, the AIC is lower with the fSWp modeling, although the number of model parameters is larger. The studentized residuals are less correlated when an fSWp is used, which is reflected by the lower $\sigma _b$ in all cases. Thus, the trends are estimated with a higher confidence with model 1 than with model 5. We note that the trends do not strongly differ from the deterministic model and are even the same for the PWV time series. This result is expected as the improved modeling should affect the uncertainty of the trend rather than its conditional mean.
For the stable station DRAO IGS, the fitting results with the fSWp are similar to those obtained with the software Hector using the BP modelization. This finding lends credibility to the fairness of the comparison. The likelihoods given by both software packages are nearly identical.

Correlation of the residuals

The lower correlation of the studentized residuals from model 1 is further highlighted in Fig. 5. We used the Lomb-Scargle periodogram (Lomb 1976) to detect periodicities in irregularly spaced and incompletely observed time series, as demonstrated in, e.g., Gobron et al. (2021), within a geodetic context. The periodograms are flat and show spurious peaks neither at the yearly angular frequency ($\omega =0.0172 $ rad) nor at the sub-yearly frequency. For the DRAO PWV time series, a slight dip in the periodogram around 0.03 rad may be attributed to the unnecessary modeling of the yearly frequency as stochastic. This is further emphasized by the logLikelihood of model 5 (full deterministic), which is higher ($-$1.8323 versus $-$1.8397), and by its smaller AIC, indicating a better fit from an information-criterion standpoint. Our results highlight the closeness of the models in that case: the choice for one model or another is left to individual visual inspection depending on the application. The trend from the PWV time series is estimated with a lower uncertainty in model 1 (0.0317 compared to 0.0326 for model 5), favoring its suitability for climatic trend analysis (Alshawaf et al. 2018).
We further show the sample autocorrelations, obtained by considering the inverse discrete Fourier transform of the Lomb-Scargle periodogram and displayed at the bottom of each subfigure. The latter do not present any serial correlations for model 1. This supports the small value of Sum2Corr when using the improved model. We note, however, that the fitting is less satisfactory for NTAL-predicted DTS, although an ARMA(1,1) process, instead of a short-memory AR(1) process, is used to model the dependencies of the residuals. We attribute this lower goodness of fit to the strong heteroscedasticity of the time series. Here, a Box-Cox transformation may be favorable (Box and Cox 1964).
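For readers unfamiliar with the Lomb-Scargle periodogram, the following Python sketch implements its classical (unnormalized) form for a single angular frequency and applies it to a synthetic daily series with a data gap. The yearly angular frequency of $2\pi /365.25 \approx 0.0172$ rad per day matches the value quoted above. The function name, the synthetic sinusoid, and the position of the gap are assumptions of this illustration and are unrelated to the DRAO data.

```python
import math


def lomb_scargle(times, values, omega):
    """Classical Lomb-Scargle power at angular frequency omega.

    The time offset tau makes the sine and cosine terms orthogonal on the
    actual (possibly irregular) sampling epochs.
    """
    mean = sum(values) / len(values)
    y = [v - mean for v in values]
    tau = math.atan2(sum(math.sin(2.0 * omega * t) for t in times),
                     sum(math.cos(2.0 * omega * t) for t in times)) / (2.0 * omega)
    c = [math.cos(omega * (t - tau)) for t in times]
    s = [math.sin(omega * (t - tau)) for t in times]
    yc = sum(a * b for a, b in zip(y, c))
    ys = sum(a * b for a, b in zip(y, s))
    cc = sum(a * a for a in c)
    ss = sum(a * a for a in s)
    return 0.5 * (yc * yc / cc + ys * ys / ss)


omega_year = 2.0 * math.pi / 365.25           # rad per day, about 0.0172
# Four years of daily epochs with a 100-day gap (synthetic example).
times = [float(t) for t in range(1461) if not 300 <= t < 400]
values = [math.sin(omega_year * t) for t in times]

p_year = lomb_scargle(times, values, omega_year)
p_other = lomb_scargle(times, values, 3.0 * omega_year)
print(round(omega_year, 4), p_year > p_other)
```

Applied to residuals instead of raw data, a remaining peak at `omega_year` would indicate that the seasonal signal has not been fully absorbed by the model, which is the diagnostic use made of the periodogram in Fig. 5.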
The uncertainty $\sigma _b$ with Hector for the DRAO IGS time series was found to be slightly higher than with the fSWp, a phenomenon we attribute to the higher correlation of the residuals and the capturing of long-range dependency in the stochastic amplitude of the periodical component with the fSWp. As aforementioned, this is almost a philosophical question, the two AIC being close. Due to the station’s stability and quality, we prefer residuals with low correlation. For the DRAO NTAL time series, the modeling with Hector was slightly more favorable from an information criterion standpoint, but the Sum2Corr was higher, despite modeling the residuals with an ARMA model. This has a significant impact on trend estimation, leading to a high uncertainty with Hector (BP noise model). Due to its heteroscedasticity, the NTAL time series is challenging to model, and in this case, the fSWp seems to outperform the modeling with Hector from a parameter perspective. The HYDL time series is similarly badly fitted with Hector using the BP noise model, although the trends are comparable. The residuals stay strongly correlated as Hector failed at estimating an RW in addition to the BP noise. We recall that the goal of this contribution is not to find the best model with Hector, which could be the topic of the next contribution. It clearly shows, however, that the fSWp is flexible and easy to tune due to the KF.

Deterministic versus stochastic trend

The HYDL-predicted DTS was fitted using a deterministic trend, an additional correlated noise and an RW as the amplitude of the periodical components, as in Davis et al. (2012). The results are shown in Fig. 4a, bottom. The difference between a deterministic and stochastic trend manifests itself by the linear temporal evolution of the trend for NTAL-predicted DTS, GNSS DTS, and PWV time series compared to its random behavior for HYDL-predicted DTS. This example illustrates how the trend can be represented by a stochastic process (i.e., an RW) modeling a long-range dependency nicely. The higher logLikelihood of model 4 compared to model 5 supports that finding. Alternative approaches would have necessitated a filtering of the time series within a given bandwidth, i.e., a two-step method. Unfortunately, we were not able to find an appropriate modelization with the Hector software. The simple polynomial fitting of the trend is not comparable with the modeling as an RW, and the noise model RW + estimation of a trend was unfavorable in that case. We do not investigate that point further, as it is not the focus of this contribution and is potentially due to the small sample length.
The sample autocorrelation may be slightly worse than for GNSS DTS and PWV time series (more lags are visible). This behavior is due to the cycle estimation as an fSWp with FracN components very close to an RW. This assumption fits the long-range dependency of the trend but challenges the MATLAB optimization algorithm. It is noteworthy that fitting a full deterministic model (model 5) led to strongly correlated studentized residuals with numerous significant lags (not displayed for brevity). This is corroborated by the 50 times higher Sum2Corr compared to model 4. We recognize that fitting the HYDL-predicted DTS using a stochastic trend and time-varying amplitude as an RW is less conventional than employing a full deterministic model. However, this approach is expected to yield novel insights into hydrological mass loading and its periodic variations.

Comments on the yearly component

The amplitudes of the yearly components for the four time series under consideration are shown in Fig. 4b, together with the yearly components themselves in panel (c). The amplitudes from GNSS DTS, PWV time series and NTAL-predicted DTS are modeled as $\text{ Amp}_{1t} =\sqrt{(a_{1t}-a_{10})^2 + (a_{2t}^*-a_{20}^*)^{2}}$ according to Eq. (12) and are "noisy" as shown in Fig. 1. We note that:

The amplitude of the fSWp component for HYDL-predicted DTS is close to an RW and nearly non-stationary, leading to a smooth trend.
The aforementioned heteroscedasticity of the NTAL-predicted DTS is still visible in the amplitude. The strong high frequencies contained in NTAL-predicted DTS (Klos et al. 2021) are split between the additional ARMA(1,1) noise of the linear component and the stochastic amplitude of the seasonal components. The logLikelihood of model 2, together with Sum2Corr as presented in Table 2, makes us confident that the modeling of the NTAL-predicted DTS as an fSWp is still more realistic than model 5.

First geophysical interpretation

It is challenging to identify a clear similarity in amplitudes between NTAL-predicted DTS and GNSS DTS from Fig. 4b. However, NTAL-predicted DTS and PWV time series follow each other from 1996 to 2002 and are comparable with the stochastic trend of HYDL-predicted DTS. These quantities depend on the same physical processes (redistribution of water), making this finding plausible.

Interestingly, we observe that the stochastic trend of the HYDL-predicted DTS closely mirrors the amplitude of the yearly component from the PWV time series. This observed similarity is plausible given the physical relationship between PWV and hydrological mass loading. Such a clear relationship is visible neither in the original time series nor when a deterministic model (constant amplitude of the periodical components) is fitted.

The HYDL-predicted DTS are often considered as 0-mean, masking the smooth random character of the trend at the price of a valuable geophysical interpretation. The fSWp modeling is very promising for further investigations on the response between hydrological mass loading and PWV during heavy rain events (Kim et al. 2023). To enhance the comparison between the amplitudes of the yearly components or the yearly components themselves, specific distances could be used for feature analysis from Fig. 5c rather than the correlation coefficient. We cite the dynamic time warping distance as an example, which is particularly worth investigating in the case of noisy time series with periodic components (Świtoński et al. 2019). This is beyond the scope of this contribution.

Note on the memory parameter d

To guide the Monte-Carlo simulations outlined in Appendix B, it is worth mentioning that the fractional parameter $d$ of the fSWp was found to be 0.59 for NTAL-predicted DTS, 0.37 for PWV time series and 0.9 for HYDL-predicted DTS, the latter being close to an RW ($d=1$). The variance $\sigma ^2_{\eta }$ was 4.42 for GNSS DTS, 1.59 for NTAL-predicted DTS, 1.14 for PWV time series and 0.005 for HYDL-predicted DTS. These values provide insights into the long-range dependencies and the strength of the FracN from the yearly component.
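The role of the memory parameter can be made tangible through the weights of the fractional operators. The Python sketch below assumes that a FracN process $x_t$ obeys $(1-L)^d x_t = \eta _t$, with $L$ the lag operator. This is an assumption about the specification, which is defined outside this section. The coefficients of $(1-L)^d$ and of its inverse follow from simple recursions. The function names are hypothetical.

```python
def frac_diff_weights(d, n_weights):
    """Coefficients pi_k of (1 - L)^d = sum_k pi_k L^k, with pi_0 = 1."""
    weights = [1.0]
    for k in range(1, n_weights):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_integration_weights(d, n_weights):
    """Coefficients psi_k of (1 - L)^(-d), i.e., the weights with which past
    shocks eta_{t-k} enter x_t."""
    weights = [1.0]
    for k in range(1, n_weights):
        weights.append(weights[-1] * (k - 1 + d) / k)
    return weights


# Memory parameters reported above for PWV, NTAL and HYDL, plus the RW limit.
for d in (0.37, 0.59, 0.9, 1.0):
    psi = frac_integration_weights(d, 200)
    print(d, round(psi[1], 3), round(psi[10], 3), round(psi[199], 3))
```

For $d=1$ all weights equal one, which is the RW case: every past shock is remembered in full. For $d=0.9$ the weights decay very slowly, whereas for $d=0.37$ a shock has lost most of its influence after a few epochs while still decaying more slowly than in a short-memory AR(1) process.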

We emphasize that when using the Hector BP noise model or the Wiener filter, the noise of the periodical component is constrained to an AR(1) model with $d=1$.

3.3 DTS with strong annual component

In this subsection, we will not conduct additional comparisons with the Hector software, as the chosen time series were processed with the Wiener filter in Klos et al. (2020), to which we refer.

Station NEAH

The optimal model for fitting the GNSS DTS from station NEAH was found to be model 1, as in Sect. 3.2.1. The value for Sum2Corr was 0.0027 versus 0.0054 for the deterministic model 5. The uncertainty on the trend $\sigma _{b}$ was 1.25 times smaller with model 1 than with model 5.
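As a reading aid, a residual-correlation criterion of this kind can be sketched as follows. The sketch assumes that Sum2Corr denotes the sum of squared sample autocorrelations of the residuals (in the spirit of Hassani and Yeganegi 2019); the exact definition and normalization used in this article are given in the methodology section and may differ. The function name `sum2corr`, the maximum lag, and the two synthetic residual series are hypothetical.

```python
import math
import random


def sum2corr(residuals, max_lag):
    """Sum of squared sample autocorrelations for lags 1..max_lag (assumed definition)."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    c0 = sum(e * e for e in dev)  # lag-0 autocovariance (unnormalized)
    total = 0.0
    for k in range(1, max_lag + 1):
        ck = sum(dev[t] * dev[t + k] for t in range(n - k))
        total += (ck / c0) ** 2
    return total


random.seed(0)
# Residuals with no structure left versus residuals with an unmodeled yearly cycle
white = [random.gauss(0.0, 1.0) for _ in range(120)]
seasonal = [math.sin(2 * math.pi * t / 12) for t in range(120)]

print("white:", sum2corr(white, 20), "seasonal:", sum2corr(seasonal, 20))
```

Smaller values indicate residuals closer to white noise, which is consistent with the way the criterion is used here to rank the models.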

The vertical DTS, together with the fitted trend, the yearly component, and the Lomb-Scargle periodogram, are presented in Fig. 6a. We note an increase of the fSWp variance on the yearly component from 2015 (middle panel). This behavior was not discernible with the Wiener filter (see Fig. 9 in Klos et al. (2020)), where only an increase in the amplitude of the yearly component could be detected from 2010.
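For readers unfamiliar with the Lomb-Scargle periodogram, the sketch below implements the classical formula of Lomb (1976) for unevenly sampled data using only the Python standard library. It is an unnormalized illustration and not the routine used to produce the figures; the function name `lomb_scargle`, the synthetic weekly series with a data gap, and the frequency grid are assumptions.

```python
import math


def lomb_scargle(t, y, freqs):
    """Unnormalized classical Lomb-Scargle power at the given frequencies."""
    mean = sum(y) / len(y)
    yc = [v - mean for v in y]
    power = []
    for f in freqs:
        w = 2.0 * math.pi * f
        # Time offset tau makes the sine and cosine terms orthogonal
        tau = math.atan2(sum(math.sin(2 * w * ti) for ti in t),
                         sum(math.cos(2 * w * ti) for ti in t)) / (2 * w)
        c = [math.cos(w * (ti - tau)) for ti in t]
        s = [math.sin(w * (ti - tau)) for ti in t]
        yc_c = sum(a * b for a, b in zip(yc, c))
        yc_s = sum(a * b for a, b in zip(yc, s))
        power.append(0.5 * (yc_c ** 2 / sum(a * a for a in c)
                            + yc_s ** 2 / sum(a * a for a in s)))
    return power


# Weekly samples over six years (time in years) with a gap of about 200 days
days = [i for i in range(0, 6 * 365, 7) if not 800 <= i < 1000]
t = [i / 365.25 for i in days]
# Yearly signal plus a weaker semiannual one
y = [5.0 * math.sin(2 * math.pi * ti) + math.sin(4 * math.pi * ti) for ti in t]

freqs = [0.05 * k for k in range(5, 81)]  # 0.25 to 4 cycles per year
power = lomb_scargle(t, y, freqs)
peak = freqs[power.index(max(power))]
print("dominant frequency (cycles/year):", peak)
```

When the model has absorbed the seasonal signal, the periodogram of the residuals is expected to be flat, without a remaining peak at the yearly or sub-yearly frequencies.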

With a KF approach combined with the fSWp, the amplitude of the yearly component exhibits an increase around the year 2000, a modest decrease between 2006 and 2008, and remains relatively constant thereafter. Thus, temporal variations in amplitude are identified with a higher level of detail than with a deterministic model or a simplified stochastic model, such as the short-memory AR(1) in Klos et al. (2020). The FracN nicely models the underlying long-range dependency of the seasonal process (Proietti and Maddanu 2022). We note that the fractional parameter d of the fSWp was found to be 0.35 with a variance $\sigma _{\eta }^2$ of $6.6\text { mm}^2$. These values closely align with those found for the GNSS DTS of the DRAO station, as discussed in Sect. 3.2.1. Additionally, we highlight that the KF framework effectively addresses data gaps in 2003, as described in Sect. 2.5.

Station LPAL

The vertical DTS from the LPAL station cannot be accurately modeled with a deterministic trend, as shown in Fig. 6b, top. This is apparent from the smooth and slow variations of the DTS, specifically a slight increase between 2008 and 2010 followed by a decrease. With a linear trend as in Fig. 6a, top, for the station NEAH, this shape would be lost to the benefit of simplicity (and a simple yet meaningless trend value). A stochastic trend modeled as an RW (model 4) is more appropriate from a statistical point of view and leads to a higher logLikelihood compared to models 1 and 5. The smaller value of Sum2Corr (more than 9 times smaller than for models 1 and 5) supports that viewpoint. The quality of the fit is further highlighted in the periodogram: its flatness and the lack of additional peaks at the yearly or sub-yearly frequencies strongly support the fSWp as an optimal model. Interestingly, the amplitude of the yearly component varies with time and decreases significantly from 2014. This behavior was not evident from visual inspection of the time series and is not found in Klos et al. (2020) (their Fig. 9), who used a simplified AR(1) model for the stochastic amplitude of the periodic component and a deterministic trend with additional power law noise. Our modeling seems more plausible from a statistical point of view, although a geophysical interpretation (e.g., a potential link with hydrological or atmospheric effects) is beyond the scope of this contribution.

The fractional parameter d of the fSWp reaches 0.42 with a variance $\sigma _{\eta }^2$ of $1.31\text { mm}^2$.

This time series highlights the importance of visual inspection, both for identifying the most suitable underlying model before fitting and for setting up the initial conditions appropriately.

3.3.1 Neighboring GNSS stations in Wettzell, Germany

We used the vertical DTS from six GNSS stations in Wettzell, Germany as described in Sect. 3.1.4.

The optimal model identified corresponds to model 3, as indicated in Table 1. It is important to note that our primary objective was not to explicitly detect offsets (shifts of mean value), outliers, or data gaps in the DTS. Instead, our focus was on demonstrating how the KF can robustly perform modeling where a traditional least-squares adjustment with a fully deterministic model would likely fail.

The following comments can be made:

The similarity between WTZA and WTZZ DTS is evident from Fig. 7a and b: in both cases, the amplitudes of the yearly component depicted in Fig. 7h, like the DTS themselves, have a higher standard deviation than for, e.g., WTZR or WTZS. The stochastic trends are consistent and exhibit a gradual decrease over time. It is worth noting that a mean trend could be computed for comparison purposes, as demonstrated in, for instance, Maddanu and Proietti (2023) and Didova et al. (2016). The periodogram of WTZA reveals a subtle loss of power at low frequency, possibly attributed to the declining DTS from the year 2020.
The station WTZR is considered a "gold" station, as highlighted in Fig. 7d, top, by the low standard deviation of the DTS. The introduction of an RW trend allows for the modeling of an offset in the year 2009. Notably, the amplitude of the stochastic seasonal signal exhibits a gradual increase starting from the year 2008, as illustrated in Fig. 7g. We found a similar pattern in the yearly component for station WTZS (the black box corresponds to the same time range), although the magnitude increases more strongly for WTZS than for WTZR from the year 2015. The parameter d was 0.41 and 0.45, respectively, comparable to the values for the DRAO and NEAH stations. We point out that modeling the yearly component as purely deterministic would mask the time-variability of the amplitude and, thus, the associated geophysical phenomena (most probably hydrological). The observed slight differences in shape and magnitude between the WTZS and WTZR models should be considered in light of the potential loss of information and non-optimality associated with a deterministic approach.
The data gap in the WTZS raw time series (Fig. 7e) in the year 2008 affects neither the estimation of the stochastic trend nor that of the stochastic periodic component. The similarities between WTZS and WTZR in the yearly component and trend from the year 2010 may be attributed to their equipment, as depicted in Fig. 3. Our analysis using the fSWp shows that the noise induced by the antennas and potentially the receivers affects the standard deviation of the FracN on the yearly component when the time series are fitted with a stochastic trend. The parameters d were close (0.28 and 0.36 for the stations WTZA and WTZZ, respectively). From the Lomb-Scargle periodograms, the studentized residuals can be considered nearly uncorrelated for all time series.
The stations WTZJ and WTZL are of poor quality. However, fitting DTS with long data gaps and strong heteroscedasticity is made possible with an fSWp within a KF framework (WTZL). Although the stochastic trend is smooth and has a similar shape compared to the other stations (Fig. 7c), outliers present in WTZJ are hardly detected and affect the amplitude of the yearly component, leading to a loss of power at low frequency visible in the Lomb-Scargle periodogram. Here we found a high value of 0.64 for the parameter d (non-stationary region).

Our investigations highlight that the fSWp effectively models raw time series with a high degree of reliability, even when confronted with offsets, outliers, and data gaps, with the implementation proposed in Sect. 2.5. The results from the Monte-Carlo simulations conducted in Appendix B corroborate this finding. We refer to Proietti and Maddanu (2021) for further simulations to assess the goodness of fit of the approach for varying parameter combinations (amplitude, memory parameters).

4 Conclusion: the fSWp for modeling geodetic time series

Our analysis of various geodetic time series has highlighted the high potential of the fSWp to model time-varying seasonal signals and trends with high reliability. Our model improvement should avoid masking important geophysical phenomena and relationships between time series, as illustrated with the GNSS DTS from the DRAO station and the corresponding environmental loadings. The fully deterministic model usually used within a geodetic context showed its weaknesses both in terms of goodness of fit and by visual inspection of the residuals. We found a parallel between the modeling with an fSWp, a BP filter, and the Wiener filter approach despite the inability of the last two methods to capture the long-range dependency of the stochastic periodic component. Further, stochastic trend modeling is a promising approach for specific time series. This feature is not implemented in commonly used software such as Hector. The estimation of an RW in addition to the deterministic trend could be an alternative but was not satisfactory from a parameter estimation perspective in our case study.

The specific improvements detailed in this contribution enable the KF to estimate the time-varying seasonal component and trend, even in the presence of data gaps, outliers, and offsets of various sizes. Monte-Carlo simulations have validated and confirmed this capability.

We found that in most cases, GNSS DTS can be modeled with model 1, the PWV time series with either a fully deterministic model or model 1, and NTAL-predicted DTS with model 1, extended to an ARMA model for the stochastic amplitude on the seasonal component to account for heteroscedasticity. We refer to Table 1 for a description of the various possibilities. Non-preprocessed DTS were well fitted with a stochastic trend, as were some particular GNSS DTS with large seasonal variations. Here, no rule of thumb can be given. We tend to think that each type of time series should be handled separately, depending, e.g., on the sampling (daily, monthly). Further investigations will focus on the possibility of grouping stations according to some criteria for determining an optimal fitting, and on geophysical analysis (atmospheric loading or zenith wet delay for climatological analysis, hydrological loading for geophysical investigations, or GNSS DTS to study deformations of the Earth’s crust and its periodic components more accurately).

Our methodology is flexible and can be adapted to various time series, as highlighted in this contribution. It is easy to implement and computationally inexpensive, i.e., less than one minute of processing time for the 25-year-long DRAO GNSS daily vertical DTS. We further provide all Matlab functions freely for the sake of dissemination. An extension to the fully deterministic model using a FracN for the residuals is available. A comparison with the power-law noise from the software Hector is targeted in the next contribution. Our intention is not to replace established software like Hector but rather to provide an alternative solution when a different insight into the geophysical process is needed or when the usual solution may be doubtful or unrealistic.

Data availability

All time series used in this article are available publicly as mentioned in the article and can be downloaded freely.

Code availability

The Matlab functions are publicly available in a GitHub repository at https://github.com/kermarrecg/fSWp and can be downloaded for testing purposes together with the corresponding time series used for this manuscript.

Notes

Roughness refers to a feature of the fractional Brownian Motion with Hurst exponent $H<1/2$.
We refer to the concept of long and short memory as in Hassler (2019), such that a stationary process displays long memory if its autocovariance function is not summable or, equivalently, if its spectral density is unbounded at one or more frequencies. Otherwise, it displays short memory.
The process $a_t= (1-\phi L)^{-d} \eta _t$ with $\phi <1$ and $d>0$ is a discrete-time version of a Matérn process, see Lilly et al. (2017) and the supplementary materials in Maddanu (2023).
In our empirical applications we did not find evidence for values of r and q greater than 1.

References

Alshawaf F, Zus F, Balidakis K et al (2018) On the statistical significance of climatic trends estimated from GPS tropospheric time series. J Geophys Res Atmos 123(19):10967–10990. https://doi.org/10.1029/2018JD028703
Altamimi Z, Rebischung P, Collilieux X et al (2023) ITRF2020: an augmented reference frame refining the modeling of nonlinear station motions. J Geod 97:47. https://doi.org/10.1007/s00190-023-01738-w
Amiri-Simkooei A (2013) On the nature of GPS draconitic year periodic pattern in multivariate position time series: GPS position time series analysis. J Geophys Res Solid Earth 118:2500–2511. https://doi.org/10.1002/jgrb.50199
Anděl J (1986) Long memory time series models. Kybernetika 22(2):105–123
Artemov A, Burnaev E, Lokot A (2015) Nonparametric decomposition of quasi-periodic time series for change-point detection. In: Verikas A, Radeva P, Nikolaev D (eds) Eighth international conference on machine vision (ICMV 2015). International society for optics and photonics. SPIE, vol 9875, p 987520. https://doi.org/10.1117/12.2228370
Barbosa SM, Andersen OB (2009) Trend patterns in global sea surface temperature. Int J Climatol 29(14):2049–2055. https://doi.org/10.1002/joc.1855
Bennett R (2008) Instantaneous deformation from continuous GPS: contributions from quasi-periodic loads. Geophys J Int 174:1052–1064. https://doi.org/10.1111/j.1365-246X.2008.03846.x
Benveniste J, Birol F, Calafat F et al (2020) Coastal sea level anomalies and associated trends from Jason satellite altimetry over 2002–2018. Sci Data 7:357. https://doi.org/10.1038/s41597-020-00694-w
Bevis M, Businger S, Herring TA et al (1992) GPS meteorology: remote sensing of atmospheric water vapor using the global positioning system. J Geophys Res Atmos 97(D14):15787–15801. https://doi.org/10.1029/92JD01517
Bhootna N, Dhull MS, Kumar A et al (2023) Humbert generalized fractional differenced ARMA processes. Commun Nonlinear Sci Numer Simul 125:107412. https://doi.org/10.1016/j.cnsns.2023.107412
Bloomfield P, Hurd HL, Lund RB (1994) Periodic correlation in stratospheric ozone data. J Time Ser Anal 15(2):127–150. https://doi.org/10.1111/j.1467-9892.1994.tb00181.x
Bos M, Fernandes R, Williams S et al (2013) Fast error analysis of continuous GNSS observations with missing data. J Geod 87:351–360. https://doi.org/10.1007/s00190-012-0605-0
Box GEP, Cox DR (1964) An analysis of transformations. J R Stat Soc Ser B (Methodol) 26(2):211–243. https://doi.org/10.1111/j.2517-6161.1964.tb00553.x
Box G, Jenkins G (1976) Time series analysis: forecasting and control, Revised (edn). Holden Day, San Francisco
Burnham KP, Anderson DR, Burnham KP (2010) Model selection and multimodel inference: a practical information-theoretic approach. Springer, Berlin
Chan NH, Palma W (1998) State space modeling of long-memory processes. Ann Stat 26(2):719–740
Chanard K, Métois M, Rebischung P et al (2020) A warning against over-interpretation of seasonal signals measured by the global navigation satellite system. Nat Commun 11:1375. https://doi.org/10.1038/s41467-020-15100-7
Cheng X, Ou N, Chen J et al (2021) On the seasonal variations of ocean bottom pressure in the world oceans. Geosci Lett 8(1):1–12. https://doi.org/10.1186/s40562-021-00199-3
Chen Q, van Dam T, Sneeuw N et al (2013) Singular spectrum analysis for modeling seasonal signals from GPS time series. J Geodyn 72:25–35. https://doi.org/10.1016/j.jog.2013.05.005
Davis JL, Wernicke BP, Tamisiea ME (2012) On seasonal signals in geodetic time series. J Geophys Res Solid Earth. https://doi.org/10.1029/2011JB008690
de Jong PJ (1989) Smoothing and interpolation with the state-space model. J Am Stat Assoc 84:1085–1088. https://doi.org/10.1080/01621459.1989.10478876
Deng Q, Fu Z (2019) Comparison of methods for extracting annual cycle with changing amplitude in climate series. Clim Dyn 52:5059–5070. https://doi.org/10.1007/s00382-018-4432-8
Didova O, Gunter B, Riva R et al (2016) An approach for estimating time-variable rates from geodetic time series. J Geod 90:1207–1221. https://doi.org/10.1007/s00190-016-0918-5
Dill R, Dobslaw H (2013) Numerical simulations of global-scale high-resolution hydrological crustal deformations. J Geophys Res Solid Earth 118(9):5008–5017. https://doi.org/10.1002/jgrb.50353
Dill R, Dobslaw H, Thomas M (2013) Combination of modeled short-term angular momentum function forecasts from atmosphere, ocean, and hydrology with 90-day eop predictions. J Geod 87(6):567–577. https://doi.org/10.1007/s00190-013-0631-6
Dissanayake G, Peiris M, Proietti T (2016) State space modeling of gegenbauer processes with long memory. Comput Stat Data Anal 100:115–130. https://doi.org/10.1016/j.csda.2014.09.014
Dmitrieva K, Segall P, Demets C (2015) Network-based estimation of time-dependent noise in GPS position time series. J Geod 89:591–606. https://doi.org/10.1007/s00190-015-0801-9
Durbin J, Koopman SJ (2012) Time series analysis by state space methods. OUP catalogue. Oxford University Press, Oxford
Freymueller J (2009) Seasonal position variations and regional reference frame realization. In: Drewes H (ed) Geodetic reference frames. International association of geodesy symposia, vol 134. Springer, Berlin. https://doi.org/10.1007/978-3-642-00860-3_30
Friedrich M, Beutner E, Reuvers H et al (2020) A statistical analysis of time trends in atmospheric ethane. Clim Change 162(1):105–125. https://doi.org/10.1007/s10584-020-02806-2
Gobron K, Rebischung P, Van Camp M et al (2021) Influence of aperiodic non-tidal atmospheric and oceanic loading deformations on the stochastic properties of global GNSS vertical land motion time series. J Geophys Res Solid Earth 126:e2021JB022370. https://doi.org/10.1029/2021JB022370
Guerova G, Jones J, Douša J et al (2016) Review of the state of the art and future prospects of the ground-based GNSS meteorology in Europe. Atmos Meas Tech 9(11):5385–5406. https://doi.org/10.5194/amt-9-5385-2016
Hannan EJ (1964) The estimation of a changing seasonal pattern. J Am Stat Assoc 59(308):1063–1077
Hartl T, Jucknewitz R (2022) Approximate state space modelling of unobserved fractional components. Econ Rev 41(1):75–98. https://doi.org/10.1080/07474938.2020.1841444
Harvey AC (1990) Forecasting. In: Structural time series models and the Kalman filter. Cambridge University Press. https://doi.org/10.1017/CBO9781107049994
Hassani H, Yeganegi MR (2019) Sum of squared ACF and the Ljung-Box statistics. Phys A Stat Mech Appl 520:81–86. https://doi.org/10.1016/j.physa.2018.12.028
Hassler U (2019) Time series analysis with long memory in view. Wiley, New York
Hosking JRM (1981) Fractional differencing. Biometrika 68(1):165–176. https://doi.org/10.1093/biomet/68.1.165
Huang N, Shen Z, Long S et al (1998) The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc R Soc Lond Ser A Math Phys Eng Sci 454:903–995. https://doi.org/10.1098/rspa.1998.0193
Ji K, Shen Y, Wang F (2020) Signal extraction from GNSS position time series using weighted wavelet analysis. Remote Sens 12(6):992. https://doi.org/10.3390/rs12060992
Jong PD (1991) The diffuse Kalman filter. Ann Stat 19(2):1073–1083. https://doi.org/10.1214/aos/1176348139
Kasdin N (1995) Discrete simulation of colored noise and stochastic processes and $1/f^{\alpha }$ power law noise generation. Proc IEEE 83(5):802–827. https://doi.org/10.1109/5.381848
Kermarrec G, Lösler M, Guerrier S et al (2022) The variance inflation factor to account for correlations in likelihood ratio tests: deformation analysis with terrestrial laser scanners. J Geod 96(11):86. https://doi.org/10.1007/s00190-022-01654-5
Khazaradze G, Qamar A, Dragert H (1999) Tectonic deformation in western Washington from continuous GPS measurements. Geophys Res Lett 26:3153–3156. https://doi.org/10.1029/1999GL010458
Kim YJ, Jee JB, Lim B (2023) Investigating the influence of water vapor on heavy rainfall events in the southern Korean peninsula. Remote Sens 15(2):340. https://doi.org/10.3390/rs15020340
Klos A, Bos M, Bogusz J (2017) Detecting time-varying seasonal signal in GPS position time series with different noise levels. GPS Solut 22:1–11. https://doi.org/10.1007/s10291-017-0686-6
Klos A, Bos MS, Fernandes RMS et al (2018) Noise-dependent adaption of the wiener filter for the GPS position time series. Math Geosci 51:53–73
Klos A, Hunegnaw A, Teferle FN et al (2018) Statistical significance of trends in zenith wet delay from re-processed GPS solutions. GPS Solut 22(2):1–12. https://doi.org/10.1007/s10291-018-0717-y
Klos A, Dobslaw H, Dill R et al (2021) Identifying the sensitivity of GPS to non-tidal loadings at various time resolutions: examining vertical displacements from continental Eurasia. GPS Solut. https://doi.org/10.1007/s10291-021-01135-w
Klos A, Bogusz J, Bos MS (2020) Modelling the GNSS time series: different approaches to extract seasonal signals. In: Geodetic time series analysis in earth sciences. Springer. https://doi.org/10.1007/978-3-030-21718-1_7
Langbein J (2004) Noise in two-color electronic distance meter measurements revisited. J Geophys Res. https://doi.org/10.1029/2003JB002819
Li W, Guo J (2023) Extraction of periodic signals in GNSS vertical coordinate time series using adaptive ensemble empirical modal decomposition method. Nonlinear Process Geophys Discuss 2023:1–25. https://doi.org/10.5194/npg-2023-23
Lilly JM, Sykulski AM, Early JJ et al (2017) Fractional Brownian motion, the Matérn process, and stochastic modeling of turbulent dispersion. Nonlinear Process Geophys 24(3):481–514. https://doi.org/10.5194/npg-24-481-2017
Liu Y, Dang B, Li Y et al (2015) Applications of Savitzky-Golay filter for seismic random noise reduction. Acta Geophys 64:101–124. https://doi.org/10.1515/acgeo-2015-0062
Lomb NR (1976) Least-squares frequency analysis of unequally spaced data. Astrophys Space Sci 39(2):447–462. https://doi.org/10.1007/bf00648343
Maddanu F (2023) Forecasting highly persistent time series with bounded spectrum processes. Stat Pap 64:285–319. https://doi.org/10.1007/s00362-022-01321-z
Maddanu F, Proietti T (2023) Trends in atmospheric ethane. Clim Change 176(5):53. https://doi.org/10.1007/s10584-023-03508-1
Mandelbrot BB, Ness JWV (1968) Fractional Brownian motions, fractional noises and applications. SIAM Rev 10(4):422–437
Marinucci D, Robinson P (1999) Alternative forms of fractional Brownian motion. J Stat Plann Inference 80(1):111–122. https://doi.org/10.1016/S0378-3758(98)00245-6
Mazzotti S, Lambert A, Courtier N et al (2007) Crustal uplift and sea level rise in northern Cascadia from GPS, absolute gravity, and tide gauge data. Geophys Res Lett. https://doi.org/10.1029/2007GL030283
Memin A, Boy JP, Santamaría-Gómez A (2020) Correcting GPS measurements for non-tidal loading. GPS Solut. https://doi.org/10.1007/s10291-020-0959-3
Ming F, Yang Y, Zeng A et al (2019) Decomposition of geodetic time series: a combined simulated annealing algorithm and Kalman filter approach. Adv Space Res 64(5):1130–1147. https://doi.org/10.1016/j.asr.2019.05.049
Montillet JP, Bos MS (2020) Geodetic time series analysis in earth sciences. In: Geodetic time series analysis in earth sciences
Montillet JP, Finsterle W, Kermarrec G et al (2022) Data fusion of total solar irradiance composite time series using 41 years of satellite measurements. J Geophys Res Atmos 127(13):e2021JD036146. https://doi.org/10.1029/2021JD036146
Mudelsee M (2019) Trend analysis of climate time series: a review of methods. Earth-Sci Rev 190:310–322. https://doi.org/10.1016/j.earscirev.2018.12.005
Nadaraya EA (1964) On estimating regression. Theory Probab Appl 9(1):141–142. https://doi.org/10.1137/1109020
Omidalizarandi M, Herrmann R, Kargoll B et al (2020) A validated robust and automatic procedure for vibration analysis of bridge structures using mems accelerometers. J Appl Geod 14(3):327–354. https://doi.org/10.1515/jag-2020-0010
Oppenheim G, Viano MC (2004) Aggregation of random parameters Ornstein-Uhlenbeck or AR processes: some convergence results. J Time Ser Anal 25:335–350. https://doi.org/10.1111/j.1467-9892.2004.01775.x
Poli P, Moll P, Rabier F et al (2007) Forecast impact studies of zenith total delay data from European near real-time GPS stations in Météo France 4DVAR. J Geophys Res Atmos. https://doi.org/10.1029/2006JD007430
Proietti T, Maddanu F (2022) Modelling cycles in climate series: the fractional sinusoidal waveform process. J Econ. https://doi.org/10.1016/j.jeconom.2022.04.008
Proietti T, Pedregal DJ (2023) Seasonality in high frequency time series. Econ Stat 27:62–82. https://doi.org/10.1016/j.ecosta.2022.02.001
Proietti T, Luati A (2013) Maximum likelihood estimation of time series models: the Kalman filter and beyond. In: Handbook of research methods and applications in empirical macroeconomics. Edward Elgar Publishing
Proietti T, Maddanu F (2021) Modelling cycles in climate series: the fractional sinusoidal waveform process. CEIS: Centre for Economic and International Studies Working Paper Series
Rekhviashvili S (2006) Simulation of flicker noise by fractional integro-differentiation. Tech Phys 51:803–805. https://doi.org/10.1134/S1063784206060181
Rodell M, Famiglietti J, Wiese D et al (2018) Emerging trends in global freshwater availability. Nature 557:651–659. https://doi.org/10.1038/s41586-018-0123-1
Savitzky A, Golay MJE (1964) Smoothing and differentiation of data by simplified least squares procedures. Anal Chem 36(8):1627–1639. https://doi.org/10.1021/ac60214a047
Schmidt R et al (2008) Periodic components of water storage changes from grace and global hydrology models. J Geophys Res Solid Earth. https://doi.org/10.1029/2007JB005363
Shen Y et al (2013) Spatiotemporal filtering of regional GNSS network’s position time series with missing data using principle component analysis. J Geod 88:1–12. https://doi.org/10.1007/s00190-013-0663-y
Stratimirović D, Sarvan D, Miljković V et al (2018) Analysis of cyclical behavior in time series of stock market returns. Commun Nonlinear Sci Numer Simul 54:21–33. https://doi.org/10.1016/j.cnsns.2017.05.009
Świtoński A, Josiński H, Wojciechowski K (2019) Dynamic time warping in classification and selection of motion capture data. Multidimens Syst Signal Process 30:1437–1468. https://doi.org/10.1007/s11045-018-0611-3
Thompson DWJ, Wallace JM (1998) The arctic oscillation signature in the wintertime geopotential height and temperature fields. Geophys Res Lett 25(9):1297–1300. https://doi.org/10.1029/98GL00950
Tiao GC, Reinsel GC, Xu D et al (1990) Effects of autocorrelation and temporal sampling schemes on estimates of trend and spatial correlation. J Geophys Res Atmos 95(D12):20507–20517. https://doi.org/10.1029/JD095iD12p20507
Van Loon AF, Tijdeman E, Wanders N et al (2014) How climate seasonality modifies drought duration and deficit. J Geophys Res Atmos 119(8):4640–4656. https://doi.org/10.1002/2013JD020383
Wang X, Zhang K, Wu S et al (2016) Water vapor-weighted mean temperature and its impact on the determination of precipitable water vapor and its linear trend. J Geophys Res Atmos 121(2):833–852. https://doi.org/10.1002/2015JD024181
Wang X, Zhang K, Wu S, et al (2016) Long-term global GPS-derived precipitable water vapor data set. https://doi.org/10.1594/PANGAEA.862525. supplement to: Wang, X et al. (2016): Water vapor-weighted mean temperature and its impact on the determination of precipitable water vapor and its linear trend. Journal of Geophysical Research: Atmospheres, 121(2), 833-852, https://doi.org/10.1002/2015JD024181
Wernicke B, Davis J (2010) Detecting large-scale intracontinental slow-slip events (SSEs) using geodograms. Seismol Res Lett 81:694–698. https://doi.org/10.1785/gssrl.81.5.694
Williams S (2003) The effect of coloured noise on the uncertainties of rates estimated from geodetic time series. J Geod 76:483–494. https://doi.org/10.1007/s00190-002-0283-4
Zivanovic M, Plaza A, Iriarte X et al (2023) Instantaneous amplitude and phase signal modeling for harmonic removal in wind turbines. Mech Syst Signal Process 189:110095. https://doi.org/10.1016/j.ymssp.2023.110095

Acknowledgements

FM gratefully acknowledges financial support from the CY Initiative of Excellence (grant “Investissements d’Avenir” ANR-16-IDEX-0008), Project “EcoDep” PSI-AAP2020-0000000013. AK and JB are supported by the National Science Centre, Poland, grant no. UMO-2021/41/B/ST10/01458.

Funding

Open Access funding enabled and organized by Projekt DEAL. This study is supported by the Deutsche Forschungsgemeinschaft under the project KE2453/2-1 for correlation analysis within the context of optimal fitting.

Author information

Federico Maddanu and Anna Klos have contributed equally to this work.

Authors and Affiliations

Institute for Meteorology and Climatology, Leibniz Universität Hannover, Herrenhäuserstr. 4, 30169, Hannover, Germany
Gaël Kermarrec
Laboratory AGM UMR8088, CY Cergy Paris Université, 2 av. Adolphe Chauvin, 95302, Cergy-Pontoise, France
Federico Maddanu
Faculty of Civil Engineering and Geodesy, Military University of Technology, gen. S. Kaliskiego 2, Warsaw, Poland
Anna Klos
Department of Economics and Finance, Università di Roma “Tor Vergata”, via Columbia 2, 00133, Rome, Italy
Tommaso Proietti
Faculty of Civil Engineering and Geodesy, Military University of Technology, gen. S. Kaliskiego 2, Warsaw, Poland
Janusz Bogusz

Corresponding author

Correspondence to Gaël Kermarrec.

Ethics declarations

Conflicts of interest

The authors declare no conflict of interest.

Appendices

Appendix A: Augmented KF and smoothing algorithm

The (augmented) Kalman filter is the basic recursive algorithm for prediction and likelihood evaluation for the state space model in Eq. (14). The presentation of the algorithm follows de Jong (1989), Jong (1991) and Proietti and Luati (2013), which provide a proof of the filter. The augmentation deals with initial and regression effects.

The augmented KF, initialized by $\textbf{A}_{0} = \varvec{0}$, $\textbf{q}_0= \varvec{0}$ and $\textbf{S}_0=\varvec{0}$, and adapted for the presence of missing observations, is given by the following equations, which are computed recursively for $t=1,\ldots ,n$:

if $y_t$ is observed:
$$\begin{aligned} \begin{array}{ll} \nu _t=y_t-\textbf{z}_t'{\hat{\varvec{\alpha }}}_{t \vert t-1}, & \textbf{V}_t'= \varvec{w}_t' -\textbf{z}_t'\textbf{A}_{t-1}, \\ f_t=\textbf{z}_t' \textbf{P}_{t \vert t-1} \textbf{z}_t, & \textbf{K}_t=(\textbf{T}\textbf{P}_{t \vert t-1} \textbf{z}_t )/f_t, \\ {\hat{\varvec{\alpha }}}_{t+1 \vert t}=\textbf{T}{\hat{\varvec{\alpha }}}_{t \vert t-1}+ \textbf{K}_t \nu _t, & \textbf{A}_t = \textbf{T}\textbf{A}_{t-1} +\textbf{K}_t\textbf{V}_t', \\ \textbf{P}_{t+1 \vert t}=\textbf{T}\textbf{P}_{t \vert t-1} \textbf{T}' + \textbf{H}\textbf{H}'- f_t \textbf{K}_t \textbf{K}_t', & \textbf{q}_t= \textbf{q}_{t-1} + \nu _t\textbf{V}_t'/f_t, \\ & \textbf{S}_t= \textbf{S}_{t-1} + \textbf{V}_t'\textbf{V}_t/f_t; \end{array} \end{aligned}$$
(A1)
else, if $y_t$ is missing:
$$\begin{aligned} \begin{array}{ll} {\hat{\varvec{\alpha }}}_{t+1 \vert t}=\textbf{T}{\hat{\varvec{\alpha }}}_{t \vert t-1}, &{} \textbf{A}_t = \textbf{T}\textbf{A}_{t-1}, \\ \textbf{P}_{t+1 \vert t}=\textbf{T}\textbf{P}_{t \vert t-1} \textbf{T}'+ \textbf{H}\textbf{H}', &{} \\ \textbf{q}_t= \textbf{q}_{t-1}, &{} \textbf{S}_t= \textbf{S}_{t-1}, \\ \end{array} \end{aligned}$$
(A2)

where ${\hat{\varvec{\alpha }}}_{t \vert t-1} = \text{ E }(\varvec{\alpha }_t \vert \mathcal {F}_{t-1}, \varvec{\delta }= 0)$ and $\textbf{P}_{t \vert t-1} = \text{ Var }(\varvec{\alpha }_t \vert \mathcal {F}_{t-1},\varvec{\delta }= 0) $ are the one-step-ahead prediction of the expectation and variance of the state, with $\mathcal {F}_{t}$ defined as the information set available at time t, $\mathcal {F}_{t}=\lbrace y_1, y_2, \cdots , y_t \rbrace $, $f_t= \text{ Var }(y_t \vert \mathcal {F}_{t-1},\varvec{\delta }= 0)$ is the one-step-ahead prediction error variance and $\textbf{K}_t$ is known as the Kalman gain. All these quantities are conditional on $\varvec{\delta }= 0$ (i.e., are obtained by the standard Kalman filter for the model without initial and regression effects).
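The recursions (A1) and (A2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `augmented_kf`, the observation equation $y_t = \textbf{z}'\varvec{\alpha }_t + \varvec{w}_t'\varvec{\delta }$ with time-invariant $\textbf{z}$, $\textbf{T}$ and $\textbf{H}$, a known initial mean `a1` and covariance `P1`, and the coding of missing values as NaN are all choices made here. The vector $\textbf{V}_t$ is stored as a one-dimensional array, so the transposes of the text do not appear.

```python
import numpy as np

def augmented_kf(y, w, z, T, H, a1, P1):
    """Augmented Kalman filter sketch with missing data (NaN in y).

    Assumed model (hypothetical, for illustration only):
        y_t = z' alpha_t + w_t' delta,   alpha_{t+1} = T alpha_t + H eps_t,
    with alpha_1 ~ (a1, P1) not depending on delta.
    """
    n, k = w.shape
    m = len(a1)
    a = np.asarray(a1, dtype=float).copy()   # state prediction given delta = 0
    P = np.asarray(P1, dtype=float).copy()   # its covariance matrix
    A = np.zeros((m, k))                     # A_0 = 0
    q = np.zeros(k)                          # q_0 = 0
    S = np.zeros((k, k))                     # S_0 = 0
    nu = np.full(n, np.nan)
    f = np.full(n, np.nan)
    Q = H @ H.T
    for t in range(n):
        if np.isnan(y[t]):                   # Eq. (A2): prediction step only
            a, A, P = T @ a, T @ A, T @ P @ T.T + Q
            continue
        nu[t] = y[t] - z @ a                 # innovation for delta = 0
        V = w[t] - z @ A                     # innovation of the regressors
        f[t] = z @ P @ z                     # prediction error variance
        K = T @ P @ z / f[t]                 # Kalman gain (column vector)
        a = T @ a + K * nu[t]
        A = T @ A + np.outer(K, V)
        P = T @ P @ T.T + Q - f[t] * np.outer(K, K)
        q += V * nu[t] / f[t]
        S += np.outer(V, V) / f[t]
    delta_hat = np.linalg.solve(S, q)        # GLS estimate of delta
    return delta_hat, np.linalg.inv(S), nu, f, q, S
```

For a stationary AR(1) state observed without measurement noise, the returned `delta_hat` should coincide with the generalized least squares estimate computed from the exact covariance matrix of the observed epochs, which gives a simple way to check the sketch.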

The log-likelihood of $\lbrace y_1, y_2, \ldots , y_t, \ldots , y_n \rbrace $ is

$$\begin{aligned} LL(\varvec{\theta }) = -\frac{1}{2} \bigg ( (\tilde{n} - k)\log (2 \pi ) + \sum _{t \in \tilde{N}} \log (f_t) + \log ( \vert \textbf{S}_{\tilde{n}} \vert ) + \sum _{t \in \tilde{N}} \frac{\nu _t^2}{f_t} - \textbf{q}_{\tilde{n}}' {\hat{\varvec{\delta }}} \bigg ), \end{aligned}$$

(A3)

where k is the number of elements in the vector $\varvec{\delta }$, $\tilde{n}$ is the number of observations that are not missing and $\tilde{N}$ is the subset of natural numbers for which $t \in \tilde{N}$ if and only if $y_t$ is observed. The parameter $\varvec{\delta }$ can be concentrated out of the likelihood and its estimator is $ {\hat{\varvec{\delta }}} = \textbf{S}_{\tilde{n}}^{-1} \textbf{q}_{\tilde{n}}, $ with $\text{ Var }({\hat{\varvec{\delta }}}) = \textbf{S}_{\tilde{n}}^{-1}.$
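As a small illustration, Eq. (A3) can be evaluated from the filter output as follows. The function name `concentrated_loglik` and the convention that missing epochs carry NaN in the innovation arrays are assumptions of this sketch; the last term is evaluated at ${\hat{\varvec{\delta }}} = \textbf{S}^{-1}\textbf{q}$, as in the concentrated likelihood.

```python
import numpy as np

def concentrated_loglik(nu, f, q, S):
    """Evaluate the log-likelihood (A3) with delta concentrated out.

    nu, f : innovations and their variances (NaN where y_t is missing)
    q, S  : final values of the augmented-filter accumulators
    """
    obs = ~np.isnan(nu)                      # epochs with an observation
    n_obs = int(obs.sum())                   # number of non-missing data
    k = len(q)                               # dimension of delta
    delta_hat = np.linalg.solve(S, q)        # estimator of delta
    _, logdet_S = np.linalg.slogdet(S)       # log |S|, numerically stable
    return -0.5 * ((n_obs - k) * np.log(2.0 * np.pi)
                   + np.sum(np.log(f[obs]))
                   + logdet_S
                   + np.sum(nu[obs] ** 2 / f[obs])
                   - q @ delta_hat)
```

In a maximum likelihood routine, this value would be recomputed for each trial value of the hyperparameters after running the filter.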

Finally, the smoothed estimates of the state vector, defined as ${\hat{\varvec{\alpha }}}_{t \vert n}=\text{ E }(\varvec{\alpha }_t \vert \mathcal {F}_{n})$, and their covariance matrix $\text{ Var }(\varvec{\alpha }_t \vert \mathcal {F}_{n})= \textbf{P}_{t \vert n}$ are obtained via the smoothing algorithm. Starting at $t = n$, with initial values $\textbf{r}_n = \varvec{0}$, $\textbf{R}_n = \varvec{0}$ and $\textbf{N}_n = \varvec{0}$, the algorithm evaluates the following recursive formulae backwards for $t = n-1, \ldots ,1$:

if $y_t$ is observed, defining $\textbf{L}_t = \textbf{T}-\textbf{K}_t\textbf{z}_t'$,
$$\begin{aligned} \begin{array}{ll} \textbf{r}_{t-1} = \textbf{L}_t' \textbf{r}_t + f_t^{-1}\textbf{z}_t \nu _t, & \textbf{R}_{t-1}= \textbf{L}_t' \textbf{R}_t + f_t^{-1}\textbf{z}_t \textbf{V}_t, \\ & \textbf{N}_{t-1} =\textbf{L}_t'\textbf{N}_t \textbf{L}_t + f_t^{-1}\textbf{z}_t\textbf{z}_t',\\ {\hat{\varvec{\alpha }}}_{t \vert n} = {\hat{\varvec{\alpha }}}_{t \vert t-1} + \textbf{P}_{t \vert t-1} ( \textbf{r}_{t-1} - \textbf{R}_t {\hat{\varvec{\delta }}} ), & \textbf{A}^*_t = \textbf{A}_{t} + \textbf{P}_{t \vert t-1}\textbf{R}_t, \\ \textbf{P}_{t \vert n} = \textbf{P}_{t \vert t-1}-\textbf{P}_{t \vert t-1}\textbf{N}_{t-1}\textbf{P}_{t \vert t-1}+\textbf{A}^*_t\text{ Var }({\hat{\varvec{\delta }}})\textbf{A}^{*'}_t, & \end{array} \end{aligned}$$
(A4)
else, if $y_t$ is missing (so that no gain is computed and $\textbf{L}_t = \textbf{T}$):
$$\begin{aligned} \begin{array}{rclcrl} \textbf{r}_{t-1} &{}= &{}\textbf{L}_t' \textbf{r}_t &{} \textbf{R}_{t-1} &{}= &{}\textbf{L}_t' \textbf{R}_t \\ &{}&{}&{} \textbf{N}_{t-1} &{}=&{} \textbf{L}_t'\textbf{N}_t \textbf{L}_t \\ {\hat{\varvec{\alpha }}}_{t \vert n} &{}=&{} {\hat{\varvec{\alpha }}}_{t \vert t-1} + \textbf{P}_{t \vert t-1} ( \textbf{r}_{t-1} - \textbf{R}_t {\hat{\varvec{\delta }}} ) &{} \textbf{A}^*_t &{}=&{} \textbf{A}_t + \textbf{P}_{t \vert t-1}\textbf{R}_t \\ \textbf{P}_{t \vert n} &{}=&{} \textbf{P}_{t \vert t-1}-\textbf{P}_{t \vert t-1}\textbf{N}_{t-1}\textbf{P}_{t \vert t-1} +\textbf{A}^*_t\text{ Var }({\hat{\varvec{\delta }}})\textbf{A}^{*'}_t. &{} &{} \\ \end{array}\nonumber \\ \end{aligned}$$
(A5)

For more details and proofs of the above formulae, see de Jong (1989), de Jong (1991) and Proietti and Luati (2013).
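To illustrate the backward pass, the sketch below implements the classical fixed-interval smoother, i.e., the recursions for $\textbf{r}_{t-1}$ and $\textbf{N}_{t-1}$ in (A4) and (A5) without the terms in $\textbf{R}_t$, $\textbf{A}^*_t$ and ${\hat{\varvec{\delta }}}$. It therefore assumes a model without initial and regression effects, which is a simplification made here and not the full algorithm of the paper. The function name, the NaN coding of missing data and the time-invariant system matrices are likewise assumptions. The backward loop starts at the last epoch so that every smoothed state is returned.

```python
import numpy as np

def filter_and_smooth(y, z, T, H, a1, P1):
    """Kalman filter and fixed-interval smoother (delta = 0, NaN = missing)."""
    n, m = len(y), len(a1)
    a = np.zeros((n, m))                 # predicted states a_{t|t-1}
    P = np.zeros((n, m, m))              # predicted covariances P_{t|t-1}
    nu, f, K = np.zeros(n), np.ones(n), np.zeros((n, m))
    a_t, P_t = np.asarray(a1, float), np.asarray(P1, float)
    Q = H @ H.T
    for t in range(n):                   # forward pass
        a[t], P[t] = a_t, P_t
        if np.isnan(y[t]):
            a_t, P_t = T @ a_t, T @ P_t @ T.T + Q
            continue
        nu[t] = y[t] - z @ a_t
        f[t] = z @ P_t @ z
        K[t] = T @ P_t @ z / f[t]
        a_t, P_t = (T @ a_t + K[t] * nu[t],
                    T @ P_t @ T.T + Q - f[t] * np.outer(K[t], K[t]))
    r, N = np.zeros(m), np.zeros((m, m))  # r_n = 0, N_n = 0
    a_s, P_s = np.zeros((n, m)), np.zeros((n, m, m))
    for t in range(n - 1, -1, -1):        # backward pass
        if np.isnan(y[t]):
            L = T                         # no gain when y_t is missing
            r = L.T @ r
            N = L.T @ N @ L
        else:
            L = T - np.outer(K[t], z)
            r = L.T @ r + z * nu[t] / f[t]
            N = L.T @ N @ L + np.outer(z, z) / f[t]
        a_s[t] = a[t] + P[t] @ r          # smoothed state
        P_s[t] = P[t] - P[t] @ N @ P[t]   # smoothed covariance
    return a_s, P_s
```

For a Gaussian model, the output can be checked against the conditional mean and variance computed directly from the joint covariance matrix of states and observations.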

Hyperparameter estimation

To keep the estimates within the admissible parameter space, we do not estimate the model’s parameters directly but rather the so-called hyperparameters, unconstrained quantities to which the parameters are related through a transformation.

For instance, suppose we want to estimate the parameters d and $\sigma ^2$ via the hyperparameters $\theta _1$ and $\theta _2$, with $\theta _1,\theta _2 \in \mathbb {R}$. The parameter space is defined by $d \in (0,1)$ and $\sigma ^2 > 0$. The transformations

$$\begin{aligned} d=e^{\theta _1}/(1+e^{\theta _1}) \>\>\>\text{ and } \>\>\>\sigma ^2=e^{2\theta _2}\end{aligned}$$

imply that $d, \sigma ^2 \rightarrow 0 $ as $\theta _1,\theta _2 \rightarrow -\infty $ and $d \rightarrow 1$, $\sigma ^2 \rightarrow +\infty $ as $\theta _1,\theta _2 \rightarrow +\infty $.
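A minimal sketch of this reparameterization is given below. The function names `to_parameters` and `to_hyperparameters` are hypothetical; the mapping itself is the one stated above, with the logistic function written in a form that avoids overflow for large $\vert \theta _1 \vert $.

```python
import math

def to_parameters(theta1, theta2):
    """Map unconstrained hyperparameters to d in (0, 1) and sigma^2 > 0."""
    if theta1 >= 0.0:
        d = 1.0 / (1.0 + math.exp(-theta1))   # same as e^t / (1 + e^t)
    else:
        e = math.exp(theta1)
        d = e / (1.0 + e)
    sigma2 = math.exp(2.0 * theta2)
    return d, sigma2

def to_hyperparameters(d, sigma2):
    """Inverse map, e.g. to build starting values for the optimizer."""
    theta1 = math.log(d / (1.0 - d))          # logit of d
    theta2 = 0.5 * math.log(sigma2)
    return theta1, theta2

d, sigma2 = to_parameters(0.0, 0.0)           # d = 0.5, sigma2 = 1.0
```

An unconstrained optimizer can then search over $(\theta _1, \theta _2) \in \mathbb {R}^2$ while the likelihood is always evaluated at admissible values of d and $\sigma ^2$.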

A last remark concerns the dimension of the hyperparameter vector. A model identification problem may arise as the dimension increases, affecting in particular the cyclical components. If, for instance, we simulate a process with two cyclical components, the estimates of the respective $d_j$ and $\sigma _j$ parameters may be biased, since the components can compete with each other. In our experience, however, the fit remains good, in the sense that both the true values and the estimated ones provide good results in terms of residual diagnostics.

Table 3 Linear trend: results of the Monte-Carlo simulation for the deterministic trend case with fixed slope $b_{true}=0.03$ mm and periodic fSWp with $d_{true}$ equal to 0.2, 0.4, and 0.7, and $\sigma _\eta ^2$=2.5; the expected outlier contamination was fixed at 1% of the sample, and 5% of the observations are missing. The total sample size is 3,000

Appendix B: Monte-Carlo simulations

This appendix reports the result of a Monte-Carlo simulation experiment carried out to illustrate and evaluate the methodology proposed in this paper.

B.1 Data generation

Following Klos et al. (2018a), we generate 500 synthetic time series of length 3,000, which is a typical sample size for geodetic time series, according to the model presented in Sect. 10. The components were calibrated to mimic the properties of the empirical time series considered in the real-life illustrations presented in Sect. 3.

As for the trend component, we considered the following cases: (i) $\mu _t$ is a deterministic trend with slope $b=0.03$ mm (as, for instance, for the DRAO or NEAH GNSS DTS); (ii) $\mu _t$ is a stochastic trend with a RW specification. The variance of the stochastic trend was set to $\sigma _{\epsilon }^2=0.05\text { mm}^2$, with $\sigma _{\zeta }^2=0$. This setting corresponds to the HYDL-predicted DTS and to the GNSS DTS from the LPAL station (see Sect. 3.3) or from Wettzell (see Sect. 3.3.1).

For the seasonal component, we assumed that it was generated by a fSWp with different values of the fractional parameter, $d = 0.2, 0.4, 0.7$, and variance $\sigma _{\eta }^2=2\text { mm}^2$, defined at the fundamental frequency ($q=1$) in Eq. (2). We did not include the component $u_t$.

The simulated series were contaminated by outliers: an outlier indicator was sampled independently for each observation from a Bernoulli distribution with success probability 0.01, and a random draw from a normal distribution with mean zero and variance $1.2\text{ mm}^2$ was added to every observation whose indicator equals 1. A single level shift was also added at a random location, with a size equal to twice the standard deviation of $\Delta \mu _t$. We further modeled one level shift or offset for the stochastic trend, as for the Wettzell stations. The variance of the normally distributed variable determining the time at which the shift occurs is obtained by multiplying the sample mean of the simulated process by 2.
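The contamination step can be sketched as follows. The function name `contaminate` and its arguments are hypothetical. Two simplifications are assumptions of this sketch rather than the paper's procedure: the shift epoch is drawn uniformly at random, and the missing epochs (5% of the sample, as stated in the captions of Tables 3 and 4) are drawn uniformly without replacement.

```python
import numpy as np

def contaminate(series, mu, rng, p_outlier=0.01, var_outlier=1.2,
                frac_missing=0.05):
    """Add outliers, one level shift and data gaps to a simulated series.

    series : clean simulated series (trend plus seasonal component)
    mu     : trend component, used to scale the level shift
    """
    n = len(series)
    out = np.asarray(series, dtype=float).copy()
    # Bernoulli outlier indicator and additive N(0, var_outlier) draws
    is_outlier = rng.random(n) < p_outlier
    out[is_outlier] += rng.normal(0.0, np.sqrt(var_outlier),
                                  int(is_outlier.sum()))
    # single level shift: twice the standard deviation of the trend increments
    shift_time = int(rng.integers(1, n))     # assumption: uniform location
    shift_size = 2.0 * np.std(np.diff(mu))
    out[shift_time:] += shift_size
    # data gaps coded as NaN (assumption: uniformly located)
    n_missing = int(round(frac_missing * n))
    missing = rng.choice(n, size=n_missing, replace=False)
    out[missing] = np.nan
    return out, is_outlier, shift_time, shift_size

rng = np.random.default_rng(0)
mu = np.cumsum(rng.normal(0.0, np.sqrt(0.05), 3000))   # RW trend, 0.05 mm^2
y_cont, is_outlier, shift_time, shift_size = contaminate(mu, mu, rng)
```

In the usage lines, the RW trend alone plays the role of the clean series; in the experiment described above, the fSWp seasonal component would be added before contamination.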

B.2 Results

Table 3 presents the average values of the parameter estimates for the data generating process featuring a deterministic trend, computed across the 500 replications. The bottom row reports the mean square estimation error of the parameters.

From Table 3, we see that the memory parameter d is accurately estimated in all cases with a low estimation error variance, although the performance deteriorates slightly for $d=0.7$. Similar considerations hold for $\sigma _{\eta }^2$ and the MSE. The MSE of the model increases with d. The presence of outliers does not seem to affect the good performance of the estimation method. No substantial bias emerges in the estimation of the slope of the trend: the mean value of the estimation error $\hat{b} - b_{true}$ is close to zero with a very small variance. In view of the values found in real cases, these Monte-Carlo simulations make us confident in the physical interpretation of the amplitude of the periodic component, as well as of the trend parameter.

The results obtained for the Monte-Carlo simulations with a stochastic trend, presented in Table 4, confirm the excellent performance of our estimation strategy. Again, the accuracy declines as d increases, but it remains solid for $d= 0.2$ and 0.4. The mean square estimation error remains small for the remaining hyperparameters, $\sigma _{\epsilon }$ and $\sigma _{\eta }$. As the simulated data include a level shift, randomly distributed outliers, and data gaps, the outcomes give confidence in the results obtained for the Wettzell stations in Sect. 3.3.1.

Table 4 Stochastic trend: results of the Monte-Carlo simulation for the RW trend with size $\sigma _{\epsilon }^2=0.05\text { mm}^2$ and periodic fSWp with $d_{true}$ equal to 0.2, 0.4, and 0.7, and $\sigma _\eta ^2$=2.5; the expected outlier contamination was fixed at 1% of the sample, 5% of the observations are missing, and one level shift is simulated. The total sample size is 3,000

As a final comment, we report that the mean of Sum2Corr, the diagnostic of the residual autocorrelations defined in Sect. 2.7, was lower than 0.003 for all cases with deterministic trend and $d<0.7$. For $d \ge 0.7$, we found a mean of Sum2Corr of 0.005, as for the stochastic trend. This result highlights that, even in the challenging near non-stationary case, the level of residual correlations is very low. Correspondingly, any additional correlated noise found in the residuals should not be due to the estimation process, provided that the observations correspond to an fSWp; that is, an additional correlated noise is due either to a misfit (particularly for a fully deterministic model) or to a physically related noise (such as atmospheric noise; Kermarrec et al. 2022).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Kermarrec, G., Maddanu, F., Klos, A. et al. Modeling trends and periodic components in geodetic time series: a unified approach. J Geod 98, 17 (2024). https://doi.org/10.1007/s00190-024-01826-5

Received: 12 June 2023
Accepted: 03 February 2024
Published: 04 March 2024
DOI: https://doi.org/10.1007/s00190-024-01826-5
