1 Introduction

State-space models have been extensively used in diverse areas of application for modeling and forecasting time series. An important special case is the class of dynamic linear models (hereafter dlm). This class includes the ordinary static linear model as a special case and assumes that the parameters can change over time, thus incorporating into the observational system variations that can significantly affect the observed behavior of the process of interest. The dlm is defined by

$$\begin{aligned} {\textbf{Y}}_t&= {\textbf{F}}^{\top }_t\varvec{\theta }_t + \varvec{\nu }_t\quad ({\text {observation equation}}),\\ \varvec{\theta }_t&= {\textbf{G}}_t\varvec{\theta }_{t-1} + \varvec{\omega }_t\quad ({\text {system or state equation}}), \end{aligned}$$

for \(t=1,\ldots , T\), where \({\textbf{Y}}_t\) is an \(r \times 1\) response vector, \({\textbf{F}}_t\) is a \(p \times r\) matrix that links the observed data with \(\varvec{\theta }_t\), a \(p \times 1\) vector of latent states at time t, and \({\textbf{G}}_t\) is a \(p\times p\) transition matrix that describes the evolution of the state parameters. The terms \(\varvec{\nu }_t\) and \(\varvec{\omega }_t\) are mutually independent white-noise vectors of dimension \(r\times 1\) and \(p \times 1\), respectively, with zero means and constant variance-covariance matrices \({\textbf{V}}\) and \({\textbf{W}}\). The most popular case is the Gaussian dlm, which assumes that

$$\begin{aligned}&\varvec{\nu }_t \buildrel \text {ind}.\over \sim N_r(\varvec{0},{\textbf{V}}),\qquad \varvec{\omega }_t \buildrel \text {ind}.\over \sim N_p(\varvec{0},{\textbf{W}}),\qquad \varvec{\theta }_0 \sim N_p(\varvec{m}_0,\varvec{C}_0), \end{aligned}$$

all being mutually independent for each \(t=1,\ldots , T\), where \(N_k(\varvec{\mu },\varvec{\Sigma })\) denotes the k-variate normal distribution with mean vector \(\varvec{\mu }\) and variance-covariance matrix \(\varvec{\Sigma }\). For an extensive introduction to dlms from a Bayesian perspective, refer to West and Harrison (1997) and Petris et al. (2009). In this article, we focus on a setting where \({\textbf{F}}_t\) and \({\textbf{G}}_t\) are known, consistent with the classical Kalman filter setting (Kalman 1960) and recent developments in state-space models (e.g., Fasano et al. 2021).

Leveraging the properties of the multivariate normal distribution and the structure of the Gaussian dlm, it is possible to derive closed-form expressions for the predictive and filtering distributions and to conduct dynamic inference on the states \(\varvec{\theta }_t\) via the Kalman filter, conditioning on \({\textbf{F}}_t,{\textbf{G}}_t, {\textbf{W}}\) and \({\textbf{V}}\). However, Naveau et al. (2005) observed that the Gaussian assumption may be questionable in a large number of applications, as many distributions used in a state-space model can be skewed. To mitigate this issue, Naveau et al. (2005) assumed that the initial state parameter vector \(\varvec{\theta }_0\) follows a multivariate closed skew-normal distribution, preserving the typical assumptions of independence and normality for the error sequences \(\varvec{\nu }_t\) and \(\varvec{\omega }_t\). Building on this work, several authors have proposed different mechanisms to obtain dlms with skewness. For instance, Kim et al. (2014) extended the results in Naveau et al. (2005) by assuming a scale mixture of closed skew-normal distributions for the initial state parameter vector \(\varvec{\theta }_0\); Cabral et al. (2014) proposed a Bayesian dlm relaxing the assumption of normality and assuming an extended skew-normal distribution (Azzalini and Capitanio 1999) for the initial distribution of the state parameter; Arellano-Valle et al. (2019) proposed a dlm in which the error sequence \(\varvec{\nu }_t\) in the observation equation is assumed to have a multivariate skew-normal distribution. Several other authors have addressed related problems; see, for example, Gualtierotti (2005), Pourahmadi (2007), and Corns and Satchell (2007), among many others.

In this work we take a similar perspective and derive a novel dlm that induces asymmetry in the initial distribution of the state parameter \(\varvec{\theta }_0\) by means of a single scalar parameter. Our purpose is to replace the normal distribution for \(\varvec{\theta }_0\) with a more flexible one, incorporating asymmetry via a two-piece normal (tpn) mixing distribution. With this simple device, we obtain an extension of the classic Kalman filter and closed-form expressions for the one-step-ahead and filtering distributions. These results are further combined into a Markov chain Monte Carlo procedure via a forward filtering-backward sampling algorithm, providing posterior inference on the covariance matrices \(\textbf{V}\) and \(\textbf{W}\) of the error terms.

2 Two-piece normal and skew normal distributions

2.1 Two-piece normal distributions

According to Arellano-Valle et al. (2005), a continuous random variable Y follows a two-piece normal (tpn) distribution with location \(\mu\), scale \(\sigma\) and asymmetry parameter \(\gamma\) if its density function for any \(y\in {\mathbb {R}}\) can be written as

$$\begin{aligned} h(y) = \frac{2}{\sigma (a(\gamma ) + b(\gamma ))}\left\{ \phi \left( \frac{y-\mu }{\sigma a(\gamma )}\right) I_{[\mu ,\infty )}(y) + \phi \left( \frac{y-\mu }{\sigma b(\gamma )}\right) I_{(-\infty ,\mu )}(y)\right\} , \end{aligned}$$
(1)

where \(\phi (x)\) denotes the density of a standard Gaussian and \(I_A(x)\) denotes the indicator function of the set A; we write such a distribution compactly as \(Y\sim TPN(\mu , \sigma , \gamma )\). Skewness is controlled via two functions \(a(\gamma )\) and \(b(\gamma )\) satisfying the following properties:

  (i) \(a(\gamma )\) and \(b(\gamma )\) are positive-valued functions for \(\gamma \in (\gamma _L,\gamma _U)\), a possibly infinite interval;

  (ii) one of the functions is (strictly) increasing and the other is (strictly) decreasing;

  (iii) there exists a unique value \(\gamma _* \in (\gamma _L, \gamma _U)\) such that \(a(\gamma _*) = b(\gamma _*)\), and therefore the tpn density (1) becomes

    $$\begin{aligned} h(y)=\frac{2}{\sigma a(\gamma _*)}\phi \left( \frac{y-\mu }{\sigma a(\gamma _*)}\right) . \end{aligned}$$

In addition, the tpn distribution enjoys several useful formal properties related to its stochastic construction (Arellano-Valle et al. 2005). The following list includes those most relevant for the purposes of this work:

P1. The tpn density (1) can be expressed as a finite mixture of two truncated normal densities \(f_a\) and \(f_b\) given by

$$\begin{aligned} f_a(y) = \frac{2}{\sigma a(\gamma )}\phi \left( \frac{y-\mu }{\sigma a(\gamma )}\right) I_{[\mu ,\infty )}(y),\quad f_b(y) = \frac{2}{\sigma b(\gamma )}\phi \left( \frac{y-\mu }{\sigma b(\gamma )}\right) I_{(-\infty ,\mu )}(y). \end{aligned}$$

That is,

$$\begin{aligned} h(y)=\pi _a f_a(y) + \pi _b f_b(y),\quad y \in {\mathbb {R}}, \end{aligned}$$
(2)

where

$$\begin{aligned} \pi _s = \frac{s(\gamma )}{a(\gamma ) + b(\gamma )}, \quad s = a, b. \end{aligned}$$
(3)

P2. If \(Y \sim h\), then \(Y \buildrel d\over = \mu + \sigma W_\gamma V\), where the notation \(\buildrel d\over =\) indicates equality in distribution. Specifically, \(V \sim TN(0,1, [0, \infty ))\), a standard normal truncated to the positive real line, while \(W_\gamma\) is an independent discrete random variable with probability function

$$\begin{aligned} p{(w;\gamma )} = {\left\{ \begin{array}{ll} \pi _a &{} \quad \text {if } w = a(\gamma ), \\ \pi _b &{} \quad \text {if } w = - b(\gamma ), \\ 0 &{} \quad \text {otherwise}, \end{array}\right. } \end{aligned}$$

which can be rewritten as

$$\begin{aligned} p{(w;\gamma )} = \pi _a^{(1+s)/2}\pi _b^{(1-s)/2}I_{\{-1,1\}}(s), \end{aligned}$$

with \(s = \text{ sign }(w)\) and \(\pi _s\) defined by (3). Equivalently, if \(Y = \mu + \sigma W_\gamma |X|\), where \(X \sim N(0,1)\) and is independent of \(W_\gamma\), then \(Y \sim h\). This stochastic representation allows one to obtain the mean and variance of Y leveraging the law of total expectation; refer to Arellano-Valle et al. (2020) for further details.
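To fix ideas, the following minimal Python sketch draws from a tpn via the representation in P2, \(Y = \mu + \sigma W_\gamma |X|\). The pair \(a(\gamma )=1+\gamma\), \(b(\gamma )=1-\gamma\), used again in Sect. 5, is one admissible choice satisfying (i)-(iii); the function name rtpn is ours.

```python
import numpy as np

def rtpn(n, mu, sigma, gamma, rng=None):
    """Draw n values from TPN(mu, sigma, gamma) via property P2,
    with a(gamma) = 1 + gamma and b(gamma) = 1 - gamma."""
    rng = np.random.default_rng() if rng is None else rng
    a, b = 1.0 + gamma, 1.0 - gamma
    pi_a = a / (a + b)                                  # mixing weight, Eq. (3)
    # W_gamma equals a(gamma) with prob. pi_a and -b(gamma) with prob. pi_b
    w = np.where(rng.uniform(size=n) < pi_a, a, -b)
    return mu + sigma * w * np.abs(rng.standard_normal(n))

y = rtpn(100_000, mu=3.0, sigma=np.sqrt(3.0), gamma=0.5)
print(y.mean(), y.var())   # compare with the moments from the law of total expectation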

2.2 Skew-normal distribution

A random vector \({\textbf{Y}}\) has a multivariate skew-normal (sn) distribution with location vector \(\varvec{\xi }\), positive definite scale matrix \(\varvec{\Omega }\) and skewness/shape vector \(\varvec{\lambda }\), denoted by \({\textbf{Y}} \sim SN_p (\varvec{\xi }, \varvec{\Omega }, \varvec{\lambda })\), if its density function is given by

$$\begin{aligned} f({\textbf{y}}; \varvec{\xi },\varvec{\Omega },\varvec{\lambda }) = 2\phi _p({\textbf{y}}; \varvec{\xi },\varvec{\Omega })\Phi ({\varvec{\lambda }}^{\top } {\varvec{\Omega }}^{-1/2}({\textbf{y}} - \varvec{\xi })),\quad \quad {\textbf{y}} \in {\mathbb {R}}^p. \end{aligned}$$

Here, \(\phi _p(\cdot ;\varvec{\xi },\varvec{\Omega })\) denotes the density function of the p-variate normal distribution with mean vector \(\varvec{\xi }\) and variance-covariance matrix \(\varvec{\Omega }\), and \(\Phi (\cdot )\) is the cumulative distribution function of a standard normal. The sn random vector \({\textbf{Y}} \sim SN_p (\varvec{\xi }, \varvec{\Omega }, \varvec{\lambda })\) can be introduced as the location-scale transformation \({\textbf{Y}} = \varvec{\xi } + {\varvec{\Omega }}^{1/2}{\textbf{X}}\), where \({\textbf{X}}\) has the following stochastic representation:

$$\begin{aligned} {\textbf{X}} \buildrel d\over = \varvec{\delta }\vert X_0\vert + {\textbf{X}}_1, \end{aligned}$$
(4)

where \(\varvec{\delta } = \varvec{\lambda }/{(1 + {\varvec{\lambda }}^{\top }\varvec{\lambda })}^{1/2}\), and \(X_0 \sim N(0,1)\) and \({\textbf{X}}_1 \sim N_p (\varvec{0}, \varvec{I}_p - \varvec{\delta }{\varvec{\delta }}^{\top })\) are independent. From (4) it follows that if \({\textbf{Y}} \sim SN_p (\varvec{\xi }, \varvec{\Omega }, \varvec{\lambda })\), then there exist two independent random quantities Z and \({\textbf{U}}\), with \(Z \buildrel d\over = \vert X_0 \vert\) and \({\textbf{U}} \buildrel d\over = {\varvec{\Omega }}^{1/2}{\textbf{X}}_1\), such that

$$\begin{aligned} {\textbf{Y}} = \varvec{\xi } + \varvec{\Delta }Z + {\textbf{U}}, \end{aligned}$$
(5)

where \(\varvec{\Delta } = {\varvec{\Omega }}^{1/2}\varvec{\delta }\). Note that \(Z \sim HN(0,1)\), a standard half-normal, and \({\textbf{U}} \sim N_p(\varvec{0}, \varvec{\Omega } - \varvec{\Delta }{\varvec{\Delta }}^{\top })\). Thus, using (5), it can be shown that the mean vector and variance-covariance matrix of \({\textbf{Y}}\) are given respectively by

$$\begin{aligned}&E({\textbf{Y}}) = \varvec{\xi } + \sqrt{\frac{2}{\pi }}\varvec{\Delta }\quad \text{ and } \quad \text{ var }({\textbf{Y}}) = \varvec{\Omega } - {\frac{2}{\pi }}\varvec{\Delta }{\varvec{\Delta }}^{\top }. \end{aligned}$$
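As an illustration, the sketch below samples from \(SN_p(\varvec{\xi },\varvec{\Omega },\varvec{\lambda })\) through the representation (5), taking the symmetric square root of \(\varvec{\Omega }\) for \({\varvec{\Omega }}^{1/2}\); the function name rsn is ours.

```python
import numpy as np

def rsn(n, xi, Omega, lam, rng=None):
    """Draw n vectors from SN_p(xi, Omega, lam) via Eq. (5):
    Y = xi + Delta * Z + U, with Z half-normal."""
    rng = np.random.default_rng() if rng is None else rng
    delta = lam / np.sqrt(1.0 + lam @ lam)
    vals, vecs = np.linalg.eigh(Omega)                  # symmetric square root Omega^{1/2}
    Delta = (vecs @ np.diag(np.sqrt(vals)) @ vecs.T) @ delta
    Z = np.abs(rng.standard_normal(n))                  # Z ~ HN(0, 1)
    U = rng.multivariate_normal(np.zeros(len(xi)), Omega - np.outer(Delta, Delta), size=n)
    return xi + np.outer(Z, Delta) + U

Y = rsn(100_000, xi=np.zeros(2), Omega=np.eye(2), lam=np.array([3.0, 1.0]))
print(Y.mean(axis=0))   # approaches xi + sqrt(2 / pi) * Delta
```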

3 A two-piece normal dynamic linear model

3.1 The initial state distribution

Our proposal in this section is to derive a more flexible dlm that regulates asymmetry through a simple scalar parameter. Specifically, preserving the classical independence assumptions, we consider the dlm defined by

$$\begin{aligned} {\textbf{Y}}_t = {\textbf{F}}^{\top }_t\varvec{\theta }_t + \varvec{\nu }_t,\qquad \varvec{\nu }_t \sim N_r(\varvec{0},{\textbf{V}}), \end{aligned}$$
(6)
$$\begin{aligned} \varvec{\theta }_t = {\textbf{G}}_t\varvec{\theta }_{t-1} + \varvec{\omega }_t,\qquad \varvec{\omega }_t \sim N_p(\varvec{0},{\textbf{W}}), \end{aligned}$$
(7)

for \(t=1,\ldots , T\), and replacing the distribution of the initial state parameter \(\varvec{\theta }_0\) with the following hierarchical specification:

$$\begin{aligned} \varvec{\theta }_0 \vert \varphi \sim N_p(\varvec{m}_0 + \varphi \varvec{\beta }_0,\varvec{C}_0), \end{aligned}$$
(8)
$$\begin{aligned} \varphi \sim {TPN} (\mu ,\sigma _0,\gamma _0). \end{aligned}$$
(9)

The model defined by Eqs. (6, 7) and (8, 9) will be referred to as the two-piece normal dynamic linear model (hereafter tpn-dlm).

As a first important result, we note that the hierarchical specification (8, 9) leads to a mixture of two multivariate skew-normal distributions as the initial distribution for \(\varvec{\theta }_0\). The proof is a direct extension of Proposition 2 in Arellano-Valle et al. (2020), and so it is omitted here.

Proposition 3.1

Under the hierarchical representation defined by Eqs. (8, 9), the initial density of \(\varvec{\theta }_0\) is given by

$$\begin{aligned} p(\varvec{\theta }_0)=2\pi _a\phi _p(\varvec{\theta }_0;\varvec{\xi }_0,\varvec{\Omega }_a) \Phi (\varvec{\eta }^{\top }_a(\varvec{\theta }_0 -\varvec{\xi }_0))+2\pi _b\phi _p(\varvec{\theta }_0;\varvec{\xi }_0,\varvec{\Omega }_b) \Phi (\varvec{\eta }^{\top }_b(\varvec{\theta }_0-\varvec{\xi }_0)), \end{aligned}$$
(10)

where, for \(s=a,b\), \(\pi _s\) is defined by (3), and

$$\begin{aligned} \varvec{\xi }_0&=\varvec{m}_0+\mu \varvec{\beta }_0,\quad \varvec{\alpha }_s=\sigma _0 s(\gamma _0)\varvec{\beta }_0,\\ \varvec{\Omega }_s&=\varvec{C}_0+\varvec{\alpha }_s\varvec{\alpha }^\top _s,\quad \varvec{\eta }_s=(1-\varvec{\alpha }^\top _s\varvec{\Omega }^{-1}_s\varvec{\alpha }_s)^{-1/2} {\varvec{\Omega }_s}^{-1}\varvec{\alpha }_s. \end{aligned}$$

Here it should be noted that from the well-known Sherman-Morrison matrix inversion formula

$$\begin{aligned} {\varvec{\Omega }}^{-1}_s=(\varvec{C}_0+\varvec{\alpha }_s{\varvec{\alpha }}^\top _s)^{-1}={\varvec{C}_0}^{-1} -\frac{{\varvec{C}_0}^{-1}\varvec{\alpha }_s{\varvec{\alpha }}^\top _s {\varvec{C}_0}^{-1}}{1+{\varvec{\alpha }}^\top _s{\varvec{C}_0}^{-1}\varvec{\alpha }_s}, \end{aligned}$$

we get, for \(s=a,b\), that \(1-{\varvec{\alpha }}^\top _s{\varvec{\Omega }}^{-1}_s\varvec{\alpha }_s = {(1+{\varvec{\alpha }}^\top _s{\varvec{C}_0}^{-1}\varvec{\alpha }_s)}^{-1}>0\) and \({\varvec{\alpha }}^\top _s{\varvec{\Omega }}^{-1}_s={(1+{\varvec{\alpha }}^\top _s{\varvec{C}_0}^{-1}\varvec{\alpha }_s)}^{-1}{\varvec{\alpha }}^\top _s{\varvec{C}_0}^{-1}\), so that the term \(\varvec{\eta }_s\) defined in Proposition 3.1 can be rewritten as

$$\begin{aligned} \varvec{\eta }_s={(1+{\varvec{\alpha }}^\top _s{\varvec{C}_0}^{-1}\varvec{\alpha }_s)}^{-1/2} {\varvec{C}_0}^{-1}{\varvec{\alpha }}_s. \end{aligned}$$
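A quick numerical check of this identity, with values chosen arbitrarily for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
C0 = np.eye(3) + 0.1 * np.ones((3, 3))         # an arbitrary positive definite C_0
alpha = rng.standard_normal(3)
C0inv = np.linalg.inv(C0)
lhs = np.linalg.inv(C0 + np.outer(alpha, alpha))
rhs = C0inv - np.outer(C0inv @ alpha, C0inv @ alpha) / (1.0 + alpha @ C0inv @ alpha)
print(np.allclose(lhs, rhs))                   # True: the Sherman-Morrison identity
```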

The density of the initial random vector \(\varvec{\theta }_0\) can thus be written as

$$\begin{aligned} p(\varvec{\theta }_0)&= 2\pi _a\phi _p(\varvec{\theta }_0;\varvec{m}_0 +\mu \varvec{\beta }_0,\varvec{C}_0+\varvec{\alpha }_a\varvec{\alpha }^\top _a) \Phi \left( \frac{{\varvec{\alpha }}^\top _a{\varvec{C}}^{-1}_0(\varvec{\theta }_0-\varvec{\xi }_0)}{\sqrt{1+{\varvec{\alpha }}^\top _a{\varvec{C}}^{-1}_0{\varvec{\alpha }}_a}}\right) \nonumber \\&\quad +2\pi _b\phi _p(\varvec{\theta }_0;\varvec{m}_0+\mu \varvec{\beta }_0,\varvec{C}_0 +\varvec{\alpha }_b\varvec{\alpha }^\top _b)\Phi \left( \frac{{\varvec{\alpha }}^\top _b {\varvec{C}}^{-1}_0(\varvec{\theta }_0-\varvec{\xi }_0)}{\sqrt{1+{\varvec{\alpha }}^\top _b{\varvec{C}}^{-1}_0{\varvec{\alpha }}_b}}\right) , \end{aligned}$$
(11)

which corresponds to the density of a two-component mixture of the multivariate skew-normal densities introduced in Sect. 2.2. Specifically, from Proposition 3.1, we see that the initial state parameter is distributed as

$$\begin{aligned} \varvec{\theta }_0\sim \pi _a SN_p(\varvec{\xi }_a,\varvec{\Omega }_a,\varvec{\lambda }_a) + \pi _b SN_p(\varvec{\xi }_b,\varvec{\Omega }_b,\varvec{\lambda }_b), \end{aligned}$$
(12)

where

$$\begin{aligned} \varvec{\xi }_s={\varvec{\xi }}_0\quad \textrm{and}\quad \varvec{\lambda }_s=(1-{\varvec{\alpha }}^\top _s{\varvec{\Omega }}^{-1}_s \varvec{\alpha }_s)^{-1/2}{\varvec{\Omega }}^{-1/2}_s\varvec{\alpha }_s,\quad \quad s=a,b. \end{aligned}$$
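In practice, draws from the initial law (12) are easiest to obtain through the hierarchy (8)-(9) itself, as in the following sketch; the helper name rtheta0 is ours, and the pair \(a(\gamma )=1+\gamma\), \(b(\gamma )=1-\gamma\) is again assumed. A histogram of such draws can be checked against the mixture density (10).

```python
import numpy as np

def rtheta0(n, m0, beta0, C0, mu, sigma0, gamma0, rng=None):
    """Draw theta_0 via the hierarchy (8)-(9)."""
    rng = np.random.default_rng() if rng is None else rng
    a, b = 1.0 + gamma0, 1.0 - gamma0
    w = np.where(rng.uniform(size=n) < a / (a + b), a, -b)
    phi = mu + sigma0 * w * np.abs(rng.standard_normal(n))       # phi ~ TPN, Eq. (9)
    eps = rng.multivariate_normal(np.zeros(len(m0)), C0, size=n)
    return m0 + np.outer(phi, beta0) + eps                       # theta_0 | phi, Eq. (8)
```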

3.2 The Kalman filter

Our next step is to develop a Kalman filter based on the new initial distribution given by (12), and assuming that the conditional distribution of \(\varphi\) corresponds to a mixture of two truncated Gaussian densities.

Let \(D_t = \{{\textbf{y}}_1,\ldots ,{\textbf{y}}_t\}\) denote the available information at time t, where \({\textbf{y}}_i\) indicates a realization of the random variable \({\textbf{Y}}_i\). In the proposed tpn-dlm we consider a conditionally normal distribution for \(\varvec{\theta }_0\) given \(\varphi\), with the tpn initial distribution (9) for \(\varphi\). Furthermore, we assume by induction that

$$\begin{aligned} \varvec{\theta }_{t-1} \vert \varphi ,D_{t-1}&\sim N_p(\varvec{m}_{t-1} + \varphi \varvec{\beta }_{t-1},\varvec{C}_{t-1}), \nonumber \\ \varphi \vert D_{t-1}&\sim \pi _{t-1}^a TN(\eta _{t-1}^a,\tau _{t-1}^a, [\mu ,\infty )) + \pi _{t-1}^b TN(\eta _{t-1}^b,\tau _{t-1}^b,(-\infty ,\mu )). \end{aligned}$$
(13)

Specifically, the conditional distribution of \(\varphi\) corresponds to a mixture of two truncated Gaussians with locations \(\eta _{t-1}^s\), scales \(\tau _{t-1}^s\), mixing weights \(\pi ^s_{t-1}\), \(s=a,b\), and truncation point \(\mu\), defined by the initial distribution given in (9).

Leveraging the conditional independence properties of the tpn-dlm implied by Eq. (7), the one-step-ahead predictive distribution of \({\varvec{\theta }}_t\) given \((\varphi ,D_{t-1})\) is given by

$$\begin{aligned} \varvec{\theta }_{t} \vert (\varphi ,D_{t-1}) \buildrel d\over = \varvec{G}_t(\varvec{\theta }_{t-1} \vert (\varphi ,D_{t-1}))+\varvec{\omega }_t \sim N_p(\varvec{a}_t + \varphi \varvec{b}_t, \varvec{R}_t), \end{aligned}$$
(14)

where

$$\begin{aligned} \varvec{a}_t = \varvec{G}_t\varvec{m}_{t-1},\quad \varvec{b}_t = \varvec{G}_t\varvec{\beta }_{t-1},\quad \varvec{R}_t = \varvec{G}_t\varvec{C}_{t-1} \varvec{G}^{\top }_t + \varvec{W}. \end{aligned}$$
(15)

Similarly, using (6) we find that the one-step-ahead predictive distribution of \({\textbf{Y}}_t\) given \((\varphi ,D_{t-1})\) becomes

$$\begin{aligned} {\textbf{Y}}_{t} \vert (\varphi ,D_{t-1}) \buildrel d\over = \varvec{F}^{\top }_t(\varvec{\theta }_{t} \vert (\varphi ,D_{t-1}))+\varvec{\nu }_t \sim N_r(\varvec{F}^{\top }_t\varvec{a}_t + \varphi \varvec{F}^{\top }_t\varvec{b}_t, \varvec{\Sigma }_t), \end{aligned}$$
(16)

where

$$\begin{aligned} \varvec{\Sigma }_t = \varvec{F}^{\top }_t\varvec{R}_{t} \varvec{F}_t + \varvec{V}. \end{aligned}$$
(17)

In other words, from (14) and (16) we have that

$$\begin{aligned} \begin{bmatrix}{\textbf{Y}}_t\\ \varvec{\theta }_t\end{bmatrix} \vert (\varphi ,D_{t-1})\buildrel d\over = \begin{bmatrix} \varvec{F}^{\top }_t\varvec{G}_{t} \\ \varvec{G}_t \end{bmatrix}(\varvec{\theta }_{t-1} \vert (\varphi ,D_{t-1})) + \begin{bmatrix} \varvec{F}^{\top }_t & \varvec{I}_r \\ \varvec{I}_p & \varvec{0} \end{bmatrix} \begin{bmatrix} \varvec{\omega }_{t} \\ \varvec{\nu }_t \end{bmatrix}, \end{aligned}$$

and therefore

$$\begin{aligned} \begin{bmatrix}{\textbf{Y}}_t\\ \varvec{\theta }_t\end{bmatrix} \vert (\varphi ,D_{t-1}) \sim N_{p+r} \left( \begin{bmatrix} \varvec{F}^{\top }_t\varvec{a}_{t} \\ \varvec{a}_t \end{bmatrix} + \varphi \begin{bmatrix} \varvec{F}^{\top }_t\varvec{b}_{t}\\ \varvec{b}_t \end{bmatrix}, \begin{bmatrix} \varvec{\Sigma }_t & \varvec{F}^{\top }_t\varvec{R}_{t} \\ \varvec{R}_t\varvec{F}_{t} & \varvec{R}_t \end{bmatrix} \right) . \end{aligned}$$

Finally, by applying the properties of the conditional normal distribution, we obtain the following filtering distribution of \(\varvec{\theta }_t\) given \((\varphi , D_{t})\):

$$\begin{aligned} \varvec{\theta }_t \vert (\varphi , D_{t}) \sim N_p ({\varvec{m}}_t + \varphi {\varvec{\beta }}_t, \varvec{C}_t), \end{aligned}$$
(18)

where

$$\begin{aligned} \varvec{m}_{t}&= \varvec{a}_{t} + \varvec{R}_t\varvec{F}_{t}{\varvec{\Sigma }}^{-1}_t ({\textbf{y}}_t-\varvec{F}^{\top }_t\varvec{a}_{t}), \\ \varvec{\beta }_t&= \varvec{b}_{t} - \varvec{R}_t\varvec{F}_{t} {\varvec{\Sigma }}^{-1}_t\varvec{F}^{\top }_t\varvec{b}_{t},\\ \varvec{C}_t&= \varvec{R}_{t} - \varvec{R}_t\varvec{F}_{t} {\varvec{\Sigma }}^{-1}_t\varvec{F}^{\top }_t\varvec{R}_{t}, \end{aligned}$$
(19)

with \(\varvec{a}_t\), \(\varvec{b}_t\), \(\varvec{R}_t\) and \(\varvec{\Sigma }_t\) defined in (15) and (17).
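The recursions (15), (17) and (19) translate directly into code. The following minimal sketch performs one forward step of the conditional Kalman filter; the names, and the use of an explicit inverse rather than a numerically more stable solver, are our simplifications.

```python
import numpy as np

def tpn_kalman_step(m, beta, C, F, G, V, W, y):
    """One forward step of the conditional recursions (15), (17), (19).
    F is p x r, G is p x p, V is r x r, W is p x p, y is the r-vector y_t."""
    a = G @ m                                   # Eq. (15)
    b = G @ beta
    R = G @ C @ G.T + W
    S = F.T @ R @ F + V                         # Sigma_t, Eq. (17)
    K = R @ F @ np.linalg.inv(S)                # gain R_t F_t Sigma_t^{-1}
    m_new = a + K @ (y - F.T @ a)               # Eq. (19)
    beta_new = b - K @ (F.T @ b)
    C_new = R - K @ (F.T @ R)
    return m_new, beta_new, C_new, a, b, R, S
```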

The above results are formalized below:

Proposition 3.2

Consider the tpn-dlm defined by Eqs. (6)-(7) and (8)-(9), with the induction assumptions (13). Then:

(i):

The one-step-ahead conditional predictive distribution of the states is

$$\begin{aligned} \varvec{\theta }_{t} \vert (\varphi , D_{t-1}) \sim N_p (\varvec{a}_{t} + \varphi \varvec{b}_{t}, \varvec{R}_{t}); \end{aligned}$$
(ii):

The one-step-ahead conditional predictive distribution of the response is

$$\begin{aligned} {\textbf{Y}}_{t} \vert (\varphi , D_{t-1}) \sim N_r ( \varvec{F}^{\top }_t\varvec{a}_{t} + \varphi \varvec{F}^{\top }_t\varvec{b}_{t}, \varvec{\Sigma }_{t}); \end{aligned}$$
(iii):

The conditional filtering distribution of the states is

$$\begin{aligned} \varvec{\theta }_{t} \vert (\varphi , D_{t}) \sim N_p ( \varvec{m}_{t} + \varphi \varvec{\beta }_{t}, \varvec{C}_{t}). \end{aligned}$$

The next proposition establishes the conditional distribution of \(\varphi \vert D_{t}\).

Proposition 3.3

Consider the tpn-dlm defined by Eqs. (6)-(7) and (8)-(9), with the induction assumptions (13). Then the conditional distribution of \(\varphi \vert D_{t}\) is a finite mixture of two truncated Gaussian distributions, given by

$$\begin{aligned} \varphi \vert D_{t} \sim \,\pi _{t}^a \, TN(\eta _{t}^a,\tau _{t}^a, [\mu ,\infty )) + \pi _{t}^b \,TN(\eta _{t}^b,\tau _{t}^b,(-\infty ,\mu )) \end{aligned}$$
(20)

where, for \(s=a,b\),

$$\begin{aligned} \eta ^s_t&=\frac{\eta ^s_{t-1}+\tau _{t-1}^s\varvec{b}^\top _t\varvec{F}_t \varvec{\Sigma }^{-1}_t({\textbf{y}}_t-\varvec{F}^\top _t\varvec{a}_t)}{1+\tau _{t-1}^s\varvec{b}^\top _t\varvec{F}_t\varvec{\Sigma }^{-1}_t\varvec{F}^\top _t \varvec{b}_t},\quad \tau ^s_t=\frac{\tau ^s_{t-1}}{1+\tau ^s_{t-1} \varvec{b}^\top _t\varvec{F}_t\varvec{\Sigma }^{-1}_t\varvec{F}^\top _t\varvec{b}_t}, \end{aligned}$$

and

$$\begin{aligned} \pi ^a_t&=c_t \pi ^a_{t-1}\phi _r({\textbf{y}}_t ; \varvec{F}^{\top }_t(\varvec{a}_{t} + \eta ^a_{t-1}\varvec{b}_{t}) , {\varvec{\Sigma }}_t + \tau ^a_{t-1} \varvec{F}^{\top }_t\varvec{b}_{t}\varvec{b}_t^{\top }\varvec{F}_t)\Phi \left( -\frac{\mu -\eta _t^a}{\sqrt{\tau _t^a}}\right) , \\ \pi ^b_t&=c_t \pi ^b_{t-1}\phi _r({\textbf{y}}_t ; \varvec{F}^{\top }_t(\varvec{a}_{t} + \eta ^b_{t-1}\varvec{b}_{t}) , {\varvec{\Sigma }}_t + \tau ^b_{t-1} \varvec{F}^{\top }_t\varvec{b}_{t}\varvec{b}_t^{\top }\varvec{F}_t)\Phi \left( \frac{\mu -\eta _t^b}{\sqrt{\tau _t^b}}\right) , \end{aligned}$$

where

$$\begin{aligned} c_t^{-1}&=\pi ^a_{t-1}\phi _r({\textbf{y}}_t ; \varvec{F}^{\top }_t(\varvec{a}_{t} + \eta ^a_{t-1}\varvec{b}_{t}) , {\varvec{\Sigma }}_t + \tau ^a_{t-1} \varvec{F}^{\top }_t\varvec{b}_{t}\varvec{b}_t^{\top }\varvec{F}_t)\Phi \left( -\frac{\mu -\eta _t^a}{\sqrt{\tau _t^a}}\right) \\&\quad +\pi ^b_{t-1}\phi _r({\textbf{y}}_t ; \varvec{F}^{\top }_t(\varvec{a}_{t} + \eta ^b_{t-1}\varvec{b}_{t}) , {\varvec{\Sigma }}_t + \tau ^b_{t-1} \varvec{F}^{\top }_t\varvec{b}_{t}\varvec{b}_t^{\top }\varvec{F}_t)\Phi \left( \frac{\mu -\eta _t^b}{\sqrt{\tau _t^b}}\right) \end{aligned}$$

This representation allows us to characterize the expected value and variance of \(\varphi \mid D_t\), which can be expressed as

$$\begin{aligned} E\{\varphi \mid D_t\} = \pi _t^a\left[ \eta _t^a + \sqrt{\tau _t^a} \frac{\phi \left( \frac{\mu -\eta _t^a}{\sqrt{\tau _t^a}}\right) }{\Phi \left( -\frac{\mu -\eta _t^a}{\sqrt{\tau _t^a}}\right) }\right] + \pi _t^b\left[ \eta _t^b - \sqrt{{\tau _t^b}} \frac{\phi \left( \frac{\mu -\eta _t^b}{\sqrt{\tau _t^b}}\right) }{\Phi \left( \frac{\mu -\eta _t^b}{\sqrt{\tau _t^b}}\right) }\right] \end{aligned}$$
(21)

and

$$\begin{aligned} Var\{\varphi \mid D_t\}&= \pi _t^a\left[ (\eta _t^a)^2 + \tau _t^a + \sqrt{\tau _t^a}(\mu + \eta _t^a) \frac{\phi \left( \frac{\mu -\eta _t^a}{\sqrt{\tau _t^a}}\right) }{\Phi \left( -\frac{\mu -\eta _t^a}{\sqrt{\tau _t^a}}\right) }\right] \nonumber \\&\quad +\pi _t^b\left[ (\eta _t^b)^2 + \tau _t^b - \sqrt{\tau _t^b}(\mu + \eta _t^b) \frac{\phi \left( \frac{\mu -\eta _t^b}{\sqrt{\tau _t^b}}\right) }{\Phi \left( \frac{\mu -\eta _t^b}{\sqrt{\tau _t^b}}\right) }\right] \nonumber \\&\quad -\left[ E\{\varphi \mid D_t\} \right] ^2. \end{aligned}$$
(22)

A simpler expression for \(Var\{\varphi \mid D_t\}\) can be obtained in terms of the chi-square cumulative distribution function, adapting Barr and Sherrill (1999).
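The recursions of Proposition 3.3 and the posterior mean (21) can be coded compactly. A minimal sketch follows, with our own names, storing the two components \(s\in \{a,b\}\) in dictionaries.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

def update_phi(eta, tau, pi, mu, F, a, b, Sigma, y):
    """One step of the recursions in Proposition 3.3."""
    q = float(b @ F @ np.linalg.solve(Sigma, F.T @ b))
    r = float(b @ F @ np.linalg.solve(Sigma, y - F.T @ a))
    new_eta, new_tau, wgt = {}, {}, {}
    for s in ('a', 'b'):
        new_eta[s] = (eta[s] + tau[s] * r) / (1.0 + tau[s] * q)
        new_tau[s] = tau[s] / (1.0 + tau[s] * q)
        z = (mu - new_eta[s]) / np.sqrt(new_tau[s])
        tail = norm.sf(z) if s == 'a' else norm.cdf(z)     # truncation above/below mu
        wgt[s] = pi[s] * multivariate_normal.pdf(
            y, F.T @ (a + eta[s] * b),
            Sigma + tau[s] * np.outer(F.T @ b, F.T @ b)) * tail
    c = sum(wgt.values())
    return new_eta, new_tau, {s: wgt[s] / c for s in ('a', 'b')}

def phi_mean(eta, tau, pi, mu):
    """Posterior mean of phi | D_t, Eq. (21)."""
    za = (mu - eta['a']) / np.sqrt(tau['a'])
    zb = (mu - eta['b']) / np.sqrt(tau['b'])
    return (pi['a'] * (eta['a'] + np.sqrt(tau['a']) * norm.pdf(za) / norm.sf(za))
            + pi['b'] * (eta['b'] - np.sqrt(tau['b']) * norm.pdf(zb) / norm.cdf(zb)))
```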

Immediate consequences of these results are given in the following proposition.

Proposition 3.4

Consider the tpn-dlm defined by Eqs. (6)-(7) and (8)-(9), with the induction assumptions (13). Then:

(i):

The one-step-ahead predictive distribution of \(\varvec{\theta }_{t}\) given \(D_{t-1}\) is

$$\begin{aligned} p(\varvec{\theta }_{t} \vert D_{t-1})&= \pi ^a_{t-1}\phi _p(\varvec{\theta }_{t}; \varvec{a}_t+\eta ^a_{t-1}\varvec{b}_t,\varvec{R}_t+\tau ^a_{t-1}\varvec{b}_t \varvec{b}^\top _t)\Phi \left( -\frac{\mu -\chi _t^a}{\sqrt{\vartheta _t^a}}\right) \\&\quad +\pi ^b_{t-1}\phi _p(\varvec{\theta }_{t};\varvec{a}_t+\eta ^b_{t-1}\varvec{b}_t, \varvec{R}_t+\tau ^b_{t-1}\varvec{b}_t\varvec{b}^\top _t)\Phi \left( \frac{\mu -\chi _t^b}{\sqrt{\vartheta _t^b}}\right) , \end{aligned}$$

where, for \(s=a, b\),

$$\begin{aligned} \chi _t^s =\frac{\eta ^s_{t-1}+\tau ^s_{t-1}\varvec{b}^\top _t\varvec{R}_t^{-1} (\varvec{\theta }_t-\varvec{a}_t)}{1+\tau _{t-1}^s \varvec{b}^\top _t\varvec{R}_t^{-1} \varvec{b}_t},\quad \vartheta _t^s=\frac{\tau _{t-1}^s}{1+\tau _{t-1}^s \varvec{b}^\top _t\varvec{R}_t^{-1}\varvec{b}_t}. \end{aligned}$$
(ii):

The one-step-ahead predictive distribution of \({\textbf{y}}_{t}\) given \(D_{t-1}\) is

$$\begin{aligned} p({\textbf{y}}_{t} \vert D_{t-1})&=\pi ^a_{t-1}\phi _r({\textbf{y}}_t ; \varvec{F}^{\top }_t(\varvec{a}_{t} + \eta ^a_{t-1}\varvec{b}_{t}) , {\varvec{\Sigma }}_t +\tau ^a_{t-1}\varvec{F}^{\top }_t\varvec{b}_{t}\varvec{b}_t^{\top } \varvec{F}_t)\Phi \left( -\frac{\mu -\eta _t^a}{\sqrt{\tau _t^a}}\right) \\&\quad +\pi ^b_{t-1}\phi _r({\textbf{y}}_t ; \varvec{F}^{\top }_t(\varvec{a}_{t} + \eta ^b_{t-1}\varvec{b}_{t}) , {\varvec{\Sigma }}_t + \tau ^b_{t-1}\varvec{F}^{\top }_t\varvec{b}_{t}\varvec{b}_t^{\top }\varvec{F}_t) \Phi \left( \frac{\mu -\eta _t^b}{\sqrt{\tau _t^b}}\right) , \end{aligned}$$

where \(\eta _t^s\) and \(\tau ^s_t\), for \(s=a,b\), are defined above in Proposition 3.3.

(iii):

The filtering distribution is

$$\begin{aligned} p(\varvec{\theta }_{t} \vert D_{t})&= \pi ^a_t\phi _p(\varvec{\theta }_{t}; \varvec{m}_t+\eta _t^a\varvec{\beta }_t,\varvec{C}_t + \tau ^a_t\varvec{\beta }_t\varvec{\beta }^\top _t) \frac{\Phi \left( -\frac{\mu -\delta _t^a}{\sqrt{\upsilon _t^a}}\right) }{\Phi \left( -\frac{\mu -\eta ^a_t}{\sqrt{\tau ^a_t}}\right) } \nonumber \\&\quad +\pi ^b_t\phi _p(\varvec{\theta }_{t}; \varvec{m}_t+\eta _t^b\varvec{\beta }_t,\varvec{C}_t + \tau _t^b\varvec{\beta }_t\varvec{\beta }^\top _t) \frac{\Phi \left( \frac{\mu -\delta _t^b}{\sqrt{\upsilon _t^b}}\right) }{\Phi \left( \frac{\mu -\eta _t^b}{\sqrt{\tau ^b_t}}\right) }, \end{aligned}$$
(23)

where \(\pi _t^s\), \(\eta ^s_t\) and \(\tau ^s_t\), for \(s=a, b\), are defined in Proposition 3.3, and

$$\begin{aligned} \delta _t^s=\frac{\eta _t^s+\tau _t^s\varvec{\beta }^\top _t\varvec{C}_t^{-1}(\varvec{\theta }_{t}-\varvec{m}_t)}{1+\tau _t^s\varvec{\beta }^\top _t\varvec{C}_t^{-1}\varvec{\beta }_t},\quad \upsilon _t^s=\frac{\tau _t^s}{1+\tau _t^s\varvec{\beta }^\top _t\varvec{C}_t^{-1}\varvec{\beta }_t}. \end{aligned}$$

Proposition 3.4 shows that the one-step-ahead predictive distribution of the states is typically skewed, and the same is true for the analogous predictive distribution of the response. It also shows that the filtering distribution is typically skewed. This can be seen by comparing the results of Proposition 3.4 with those for the usual dlm (see, e.g., Petris et al. 2009). Finally, since

$$\begin{aligned}&E(\varvec{\theta }_t \vert D_{t-1})=E\{E(\varvec{\theta }_t \vert \varphi , D_{t-1})\},\\&E({\textbf{Y}}_t \vert D_{t-1})=E\{E({\textbf{Y}}_t \vert \varphi , D_{t-1})\}, \end{aligned}$$

and

$$\begin{aligned}&Var(\varvec{\theta }_t \vert D_{t-1})=E\{Var(\varvec{\theta }_t \vert \varphi , D_{t-1})\}+Var\{E(\varvec{\theta }_t \vert \varphi , D_{t-1})\},\\&Var({\textbf{Y}}_t \vert D_{t-1})=E\{Var({\textbf{Y}}_t \vert \varphi , D_{t-1})\}+Var\{E({\textbf{Y}}_t \vert \varphi , D_{t-1})\}, \end{aligned}$$

then from Eqs. (14), (16) and the moment expressions obtained from property P2 of the tpn distribution (see Sect. 2.1), we obtain the following results:

Proposition 3.5

Under the tpn-dlm defined by Eqs. (6)-(7) and (8)-(9), with the induction assumptions (13), the means and covariance matrices of the one-step-ahead predictive distributions of the states and the response are given by

$$\begin{aligned} E(\varvec{\theta }_t \vert D_{t-1})&= \varvec{a}_{t} + E\{\varphi \mid D_{t-1}\} \varvec{b}_{t},\\ Var(\varvec{\theta }_t \vert D_{t-1})&= {\textbf{R}}_t+ Var\{\varphi \mid D_{t-1}\} \varvec{b}_{t}\varvec{b}^{\top }_t,\\ E({\textbf{Y}}_t \vert D_{t-1})&= \varvec{F}^{\top }_t\varvec{a}_{t} + E\{\varphi \mid D_{t-1}\} \varvec{F}^{\top }_t\varvec{b}_{t},\\ Var({\textbf{Y}}_t \vert D_{t-1})&= \varvec{\Sigma }_t+ Var\{\varphi \mid D_{t-1}\} \varvec{F}^{\top }_t\varvec{b}_{t}\varvec{b}^{\top }_t\varvec{F}_{t}, \end{aligned}$$

where \(E\{\varphi \mid D_{t-1}\}\) and \(Var\{\varphi \mid D_{t-1}\}\) follow from Eqs. (21) and (22) evaluated at time \(t-1\).

4 Outline of Bayesian computation

In this section we combine the results obtained above to derive a forward filtering-backward sampling (ffbs) scheme to conduct full Bayesian inference on the model parameters \(\varvec{\Theta }= (\varvec{\theta }_0, \varvec{\theta }_1, \dots , \varvec{\theta }_T)\), \(\varvec{V}\) and \(\varvec{W}\) via Markov chain Monte Carlo (mcmc). In particular, we assign Inverse-Wishart priors to the error covariances \(\varvec{V}\) and \(\varvec{W}\) as

$$\begin{aligned} \varvec{V} \sim IW_r(\ell , \varvec{M}), \qquad \varvec{W} \sim IW_p(g, {\varvec{Z}}), \end{aligned}$$

where \(\varvec{M}\) and \(\varvec{Z}\) are positive definite matrices of size \(r\times r\) and \(p\times p\), respectively, while \(\ell\) and g are scalars such that \(\ell >(r-1)/2\) and \(g>(p-1)/2\). This choice guarantees that the covariance matrices \({\textbf{V}}\) and \({\textbf{W}}\) are positive definite.

Conditionally on the latent states, such priors are conjugate, as the model is conditionally Gaussian. Therefore, the full conditional distributions of \(\varvec{V}\) and \(\varvec{W}\) are again Inverse-Wishart:

$$\begin{aligned} \varvec{V}|(D_T,\varvec{\theta })&\sim IW_r\left( \ell +\frac{T}{2}, \frac{1}{2} \sum _{t=1}^{T}(\varvec{y}_t-\varvec{F}_t^{\top }\varvec{\theta }_t) (\varvec{y}_t-\varvec{F}_t^{\top }\varvec{\theta }_t)^{\top }+\varvec{M}\right) , \end{aligned}$$
(24)
$$\begin{aligned} \varvec{W}|(D_T,\varvec{\theta })&\sim IW_p\left( g+\frac{T}{2},\frac{1}{2} \sum _{t=1}^{T}(\varvec{\theta }_t-\varvec{G}_t\varvec{\theta }_{t-1}) (\varvec{\theta }_t-\varvec{G}_t\varvec{\theta }_{t-1})^{\top }+\varvec{Z}\right) . \end{aligned}$$
(25)

To sample from \(\varvec{\Theta }|(D_T,\varphi )\) we rely on backward recursions and decompose the distribution of the state parameters following Carter and Kohn (1994) and Frühwirth-Schnatter (1994), exploiting the Markov structure of the states, as

$$\begin{aligned} p(\varvec{\Theta }|D_T,\varphi )&= p(\varvec{\theta }_T|D_T,\varphi )\prod _{t=0}^{T-1}p(\varvec{\theta }_t|\varvec{\theta }_{t+1},\ldots ,\varvec{\theta }_{T},D_T, \varphi ) \end{aligned}$$
(26)
$$\begin{aligned}&= p(\varvec{\theta }_T|D_T,\varphi )\prod _{t=0}^{T-1}p(\varvec{\theta }_t|\varvec{\theta }_{t+1},D_t, \varphi ), \end{aligned}$$
(27)

where

$$\begin{aligned} \varvec{\theta }_t \mid (\varvec{\theta }_{t+1}, D_t, \varphi ) \sim N(\varvec{h}_t, \varvec{H}_t) \end{aligned}$$
(28)

with

$$\begin{aligned} \varvec{h}_t&= \varvec{m}_t + \varphi \varvec{\beta }_t + \varvec{C}_t \varvec{G}_{t+1}^{\top } \varvec{R}_{t+1}^{-1} (\varvec{\theta }_{t+1} - \varvec{a}_{t+1} - \varphi \varvec{b}_{t+1}),\\ \varvec{H}_t&= \varvec{C}_t - \varvec{C}_t \varvec{G}_{t+1}^{\top } \varvec{R}_{t+1}^{-1} \varvec{G}_{t+1} \varvec{C}_t. \end{aligned}$$

4.1 mcmc algorithm

Posterior sampling can be performed by combining the above results in an mcmc algorithm, alternating the Kalman filter with draws from the full conditional distributions. The following pseudo-code illustrates the steps of a single mcmc iteration; a Python sketch of steps 1c-3 is given after the list:

  1. Sample \(\varvec{\Theta }\) using the following modified ffbs algorithm:

     1a. For \(t=1,\dots , T\), update the parameters of the distribution \(\varvec{\theta }_t|D_t\) using the Kalman filter given in Sect. 3.2 (forward filtering);

     1b. For \(t=1, \dots , T\), update the parameters of the distribution \(\varphi |D_t\) outlined in Eq. (20), and sample \(\varphi\) from \(\varphi |D_T\);

     1c. Sample \(\varvec{\theta }_T|D_T\) from the filtering distribution reported in Eq. (23);

     1d. For \(t = T-1, T-2, \dots , 1\), sample \(\varvec{\theta }_t|(\varvec{\theta }_{t+1},D_t, \varphi )\) from the distribution outlined in Eq. (28), conditioning on the \(\varvec{\theta }_{t+1}\) sampled in the previous step (backward sampling);

  2. Sample \(\varvec{V}\) from its Inverse-Wishart full-conditional distribution, outlined in Eq. (24);

  3. Sample \(\varvec{W}\) from its Inverse-Wishart full-conditional distribution, outlined in Eq. (25).
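The sketch below assembles steps 1c-3 into code, given the moments stored during the forward pass and a draw of \(\varphi\). Several details are our assumptions: for coherence with step 1d, \(\varvec{\theta }_T\) is drawn from its filtering law conditional on the sampled \(\varphi\) (Eq. (18)); the helper names are ours; and the mapping of the paper's \(IW(a,\varvec{B})\) notation to scipy's invwishart(df, scale) as df \(=2a\), scale \(=2\varvec{B}\) is inferred from the constraint \(\ell >(r-1)/2\) and should be checked against one's preferred convention.

```python
import numpy as np
from scipy.stats import invwishart

def backward_sample(ms, betas, Cs, Gs, Rs, a_s, b_s, phi, rng):
    """Steps 1c-1d: draw theta_T, ..., theta_0 via Eq. (28), given phi.
    ms[t], betas[t], Cs[t] hold the filtered moments m_t, beta_t, C_t;
    Gs, Rs, a_s, b_s are indexed by time t (index 0 unused where needed)."""
    T = len(ms) - 1
    theta = [None] * (T + 1)
    theta[T] = rng.multivariate_normal(ms[T] + phi * betas[T], Cs[T])   # Eq. (18)
    for t in range(T - 1, -1, -1):
        J = Cs[t] @ Gs[t + 1].T @ np.linalg.inv(Rs[t + 1])
        h = ms[t] + phi * betas[t] + J @ (theta[t + 1] - a_s[t + 1] - phi * b_s[t + 1])
        H = Cs[t] - J @ Gs[t + 1] @ Cs[t]
        theta[t] = rng.multivariate_normal(h, H)
    return np.array(theta)

def draw_covariances(y, theta, Fs, Gs, ell, M, g, Z, rng):
    """Steps 2-3: full-conditional draws of V and W, Eqs. (24)-(25).
    y[t], Fs[t], Gs[t] are indexed by t = 1, ..., T (index 0 unused);
    theta[t] holds the sampled theta_t for t = 0, ..., T."""
    T = len(theta) - 1
    rv = np.stack([y[t] - Fs[t].T @ theta[t] for t in range(1, T + 1)])
    rw = np.stack([theta[t] - Gs[t] @ theta[t - 1] for t in range(1, T + 1)])
    V = invwishart.rvs(df=2 * ell + T, scale=rv.T @ rv + 2 * M, random_state=rng)
    W = invwishart.rvs(df=2 * g + T, scale=rw.T @ rw + 2 * Z, random_state=rng)
    return V, W
```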

5 Simulation

We present a simulation study comparing the performance of the proposed approach against a Gaussian dlm across settings with varying sample sizes. We focus on univariate settings, assuming that the matrices \(\{\varvec{G}_t\}\) and \(\{\varvec{F}_t\}\) are unidimensional and time-invariant, namely \(\varvec{F}=\varvec{G}=1\). We simulated \(T=50\) observations from the dlm defined by

$$\begin{aligned} Y_t&= \theta _t + \nu _t, \\ \theta _t&= \theta _{t-1} + \omega _t, \end{aligned}$$

with different specifications of the initial distribution of \(\theta _0\) and of the disturbances \(\nu _t\) and \(\omega _t\). Specifically, we focus on the following settings (a data-generation sketch for Scenario 1 follows the list):

  (1) Scenario 1: data are generated from a two-piece dlm, with initial distribution \(\theta _0|\varphi \sim N(-3+2\varphi , 2)\) and \(\varphi \sim TPN(3,\sqrt{3},0.5)\), letting \(a(\gamma ) = 1 + \gamma\) and \(b(\gamma )= 1-\gamma\), with Gaussian errors \(\nu _t \sim N(0,5)\) and \(\omega _t \sim N(0,3)\);

  (2) Scenario 2: data are generated from a Gaussian dlm, with \(\theta _0 \sim N(-3, 2)\) and Gaussian errors \(\nu _t \sim N(0,5)\), \(\omega _t \sim N(0,3)\);

  (3) Scenario 3: data are generated from a dlm with heavy tails, simulating \(\theta _0\), \(\nu _t\) and \(\omega _t\) from independent Student's t distributions with 3 degrees of freedom.
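For concreteness, a minimal sketch of the data-generating mechanism for Scenario 1, with \(N(0,\cdot )\) parametrized by its variance as in the rest of the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
T, gamma, mu, sigma0 = 50, 0.5, 3.0, np.sqrt(3.0)
a, b = 1.0 + gamma, 1.0 - gamma
w = a if rng.uniform() < a / (a + b) else -b
phi = mu + sigma0 * w * abs(rng.standard_normal())                # phi ~ TPN(3, sqrt(3), 0.5)
theta = -3.0 + 2.0 * phi + np.sqrt(2.0) * rng.standard_normal()   # theta_0 | phi ~ N(-3 + 2 phi, 2)
ys = []
for _ in range(T):
    theta += np.sqrt(3.0) * rng.standard_normal()                 # omega_t ~ N(0, 3)
    ys.append(theta + np.sqrt(5.0) * rng.standard_normal())       # nu_t ~ N(0, 5)
```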

We chose diffuse inverse-gamma priors for V and W, which in this case are scalars, with parameters \(\ell =M=g=Z=0.001\). We compare our approach with a Gaussian dlm under the same prior distributions, running both algorithms for 5000 iterations after 500 burn-in samples and focusing on the one-step-ahead predictions and the state parameters. Examination of trace plots of the parameters, autocorrelation functions and Gelman-Rubin diagnostics showed no evidence against convergence.

Fig. 1 One-step-ahead predictions and filtered estimates for Scenario 1. Black lines denote the observed time series and the true state parameters

Fig. 2 One-step-ahead predictions and filtered estimates for Scenario 2. Black lines denote the observed time series and the true state parameters

Figures 1, 2, and 3 show the one-step-ahead predictions and filtered estimates in the three scenarios. As expected, the main advantage of the proposed approach is most evident in the initial part of the series, where the impact of the initial distribution is substantial. This is clearly seen in Fig. 1, where the tpn-dlm is correctly specified and the Gaussian dlm tends to underestimate both the state parameter and the one-step-ahead predictions. When data are generated from a Gaussian dlm, as in Fig. 2, the tpn initial distribution is incorrectly specified. However, its impact vanishes after a few steps, and the resulting one-step-ahead predictions are indistinguishable from those of a Gaussian dlm. Lastly, Fig. 3 focuses on a setting where both models are incorrectly specified, in terms of both the initial distribution and the distribution of the errors. We observe that the proposed tpn-dlm is robust against such misspecification, obtaining one-step-ahead predictions and state estimates that are closer to the true level.

These findings are further explored by replicating the simulation scenarios for different sample sizes \(T\in \{10,50,100\}\), with \(T=50\) corresponding to the results in Figs. 1, 2 and 3. Results are reported in Table 1, comparing the mean squared error (mse) of the expected value of the one-step-ahead distributions under both approaches. The empirical results are consistent with the previous discussion, with the tpn-dlm performing particularly well for small sample sizes, under correct specification, and for heavy-tailed processes.

Fig. 3 One-step-ahead predictions and filtered estimates for Scenario 3. Black lines denote the observed time series and the true state parameters

Table 1 Mean squared error for the expected value of the one-step-ahead distribution

6 Analysis of real data

Finally, we illustrate the tpn-dlm by analyzing the quarterly earnings in dollars per Johnson and Johnson share from 1960 to 1980 (Shumway et al. 2000, Example 1.1).

The data exhibit a seasonal component that is larger in the starting and ending years and almost absent in the central years, together with a regular increasing trend. Following Shumway et al. (2000), we model the time series as the sum of trend, seasonal, and white-noise components,

$$\begin{aligned} Y_t = T_t + S_t + \nu _t; \end{aligned}$$

the trend is modelled as

$$\begin{aligned} T_t = \phi T_{t-1} + \omega _{t1} \end{aligned}$$

and the seasonal component is expected to sum to zero, up to noise, over a complete period of four quarters:

$$\begin{aligned} S_t + S_{t-1} + S_{t-2} + S_{t-3} = \omega _{t2}. \end{aligned}$$

We may express the model in state-space form by choosing \([T_t, S_t, S_{t-1}, S_{t-2}]^{\top }\) as the state vector:

$$\begin{aligned} Y_t&= \left[ \begin{array}{cccc} 1 & 1 & 0 & 0 \end{array}\right] \left[ \begin{matrix} T_t \\ S_{t} \\ S_{t-1} \\ S_{t-2} \end{matrix} \right] + v_t,\qquad v_t \sim N(0,V),\\ \left[ \begin{matrix} T_t \\ S_{t} \\ S_{t-1} \\ S_{t-2} \end{matrix} \right]&= \left[ \begin{matrix} \phi & 0 & 0 & 0 \\ 0 & -1 & -1 & -1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{matrix} \right] \left[ \begin{matrix} T_{t-1} \\ S_{t-1} \\ S_{t-2} \\ S_{t-3} \end{matrix} \right] + \left[ \begin{matrix} \omega _{t1} \\ \omega _{t2} \\ 0 \\ 0 \end{matrix} \right] ,\\ \left[ \begin{matrix} \omega _{t1} \\ \omega _{t2} \\ 0 \\ 0 \end{matrix} \right]&\sim N_4 \left( \left[ \begin{matrix} 0 \\ 0 \\ 0 \\ 0 \end{matrix} \right] , \left[ \begin{matrix} W_{11} & 0 & 0 & 0 \\ 0 & W_{22} & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{matrix} \right] \right) . \end{aligned}$$
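In code, the (time-invariant) system matrices of this state-space form read as follows; a minimal sketch, with our own names:

```python
import numpy as np

def jj_matrices(phi, W11, W22):
    """System matrices for the Johnson & Johnson model, with state
    vector [T_t, S_t, S_{t-1}, S_{t-2}] (p = 4, r = 1)."""
    F = np.array([[1.0], [1.0], [0.0], [0.0]])      # p x r observation matrix
    G = np.array([[phi, 0.0, 0.0, 0.0],
                  [0.0, -1.0, -1.0, -1.0],
                  [0.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])
    W = np.diag([W11, W22, 0.0, 0.0])               # singular state noise covariance
    return F, G, W
```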

The parameters to be estimated are the observation noise variance V, the state noise variances associated with the trend, \(W_{11}\), and the seasonal component, \(W_{22}\), together with the transition parameter associated with the growth rate, \(\phi\). Following Shumway et al. (2000), Example 6.27, we write \(\phi =1+\zeta\), where \(0<\zeta \le 1\), and rewrite the trend component as

$$\begin{aligned} T_t-T_{t-1}=\zeta T_{t-1}+\omega _{t1}, \end{aligned}$$

so that, conditionally on the states, \(\zeta\) is the slope of the linear regression of \((T_t-T_{t-1})\) on \(T_{t-1}\), with error \(\omega _{t1}\). We choose a reference uninformative prior on \((\zeta ,\omega _{t1})\) and weakly informative priors for the remaining parameters, letting \(\ell =M=0.001\), \(g=0.05\) and \(\textbf{Z}=\text{ diag }\{0.05,0.05,0,0\}\). We ran the algorithm for 5000 iterations collected after 5000 burn-in samples. Examination of trace plots of the parameters, autocorrelation functions and Gelman-Rubin diagnostics showed no evidence against convergence. Figure 4 compares the estimated trend (\(T_t\)) and trend plus season (\(T_t + S_t\)), along with \(99\%\) credible intervals, for the Gaussian dlm and the tpn-dlm. Figure 5 displays the data and the one-step-ahead predictions for the time series \(Y_t\), again with 99% credible intervals for both models.

Fig. 4 Posterior estimates of the trend (\(T_t\)) and trend plus season (\(T_t + S_t\)), along with corresponding 99% credible intervals, for the Johnson & Johnson data

Figures 4 and 5 show that the \(99\%\) credible intervals for the states and the response differ between the tpn-dlm and the Gaussian dlm; as a consequence, the entire distributions of those quantities differ. In addition, we note that the skewness of the predictive distributions is maintained as time increases, showing the usefulness of the tpn-dlm.

The mean squared error was 0.2131 for the tpn-dlm and 0.3512 for the Gaussian dlm, showing an advantage for the tpn-dlm. We also considered a BIC criterion: for competing models \(k = 1, 2,\ldots , K\), the smaller-is-better criterion is \(\hbox {BIC}_k = n \log MSE_k + m_k \log (n)\), where \(MSE_k\) is the predictive mean squared error and \(m_k\) is the number of independent parameters used to fit model k. We obtained \(-70.18\) for the Gaussian dlm, with \(m_k=4\), and \(-107.69\) for the tpn-dlm, with \(m_k=5\), confirming that, even after accounting for model complexity, the tpn-dlm is preferable.
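The reported values can be reproduced from the formula above; with the 84 quarterly observations of the series (1960-1980), a quick check:

```python
import numpy as np

n = 84                                    # quarterly observations, 1960-1980
for model, (mse, m) in {"dlm": (0.3512, 4), "tpn-dlm": (0.2131, 5)}.items():
    print(model, n * np.log(mse) + m * np.log(n))
# approximately -70.2 and -107.7, matching the reported values up to rounding of the MSEs
```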

Fig. 5 One-step-ahead predictions for the Johnson & Johnson quarterly earnings series. Dotted lines refer to \(99\%\) credible intervals

7 Conclusion

In this article we proposed a flexible dynamic linear model (dlm) for modeling and forecasting multivariate time series, relaxing the assumption of normality for the initial distribution of the state parameter and replacing it with a more flexible class, the two-piece normal distributions. This model allows the initial distribution of the state parameter to be skewed, with the asymmetry controlled by a scalar parameter. We derived a Kalman filter for this model, obtaining two-component mixtures as predictive and filtering distributions that maintain skewness.

In our opinion, the main contribution of this article is a simple and effective tool for modeling time series with possibly skewed distributions, such as the Johnson & Johnson series of Example 1.1 in Shumway et al. (2000) analyzed here. Moreover, since the predictive and filtering distributions are two-component mixtures, the model can simultaneously handle several departures from normality, such as skewness, heavy tails, and multimodality.