A Bayesian nonparametric Markovian model for non-stationary time series

Published in: Statistics and Computing

Abstract

Stationary time series models built from parametric distributions are, in general, limited in scope due to the assumptions imposed on the residual distribution and autoregression relationship. We present a modeling approach for univariate time series data, which makes no assumptions of stationarity, and can accommodate complex dynamics and capture non-standard distributions. The model for the transition density arises from the conditional distribution implied by a Bayesian nonparametric mixture of bivariate normals. This results in a flexible autoregressive form for the conditional transition density, defining a time-homogeneous, non-stationary Markovian model for real-valued data indexed in discrete time. To obtain a computationally tractable algorithm for posterior inference, we utilize a square-root-free Cholesky decomposition of the mixture kernel covariance matrix. Results from simulated data suggest that the model is able to recover challenging transition densities and non-linear dynamic relationships. We also illustrate the model on time intervals between eruptions of the Old Faithful geyser. Extensions to accommodate higher order structure and to develop a state-space model are also discussed.

[Figures 1–7 omitted]

Acknowledgments

The work of the first author was supported by the National Science Foundation under award SES 1131897. The work of the second author was supported in part by the National Science Foundation under awards DMS 1310438 and DMS 1407838. The authors wish to thank three reviewers for constructive feedback and for comments that improved the presentation of the material in this paper.

Author information

Correspondence to Maria DeYoreo.

Appendices

Appendix 1: The Markov chain Monte Carlo algorithm

Here, we provide the details of the MCMC method for posterior simulation from the nonparametric mixture model developed in Sect. 2.1.

The posterior full conditional distributions for \(\alpha \) and the components of the vector \(\varvec{\psi }\) are standard, as these parameters are assigned conditionally conjugate priors. Each \(U_t\), for \(t=2,\ldots ,n\), is sampled from a discrete distribution on \(\{1,\ldots ,L\}\) with probabilities \((\tilde{p}_{1,t},\ldots ,\tilde{p}_{L,t})\), where \(\tilde{p}_{l,t}\propto p_l\mathrm {N}(z_t \mid \mu _{l}^y-\beta _{l}(z_{t-1}-\mu _{l}^x),\delta _{l}^y) \mathrm {N}(z_{t-1} \mid \mu _l^x,\delta _l^x)\), for \(l=1,\ldots ,L\).
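As an illustration, the discrete full-conditional draw for the configuration variables can be sketched as follows; the function and variable names are our own and do not come from the authors' implementation.

```python
import numpy as np

def sample_configurations(z, p, mu_x, mu_y, beta, delta_x, delta_y, rng):
    """Draw U_t, t = 2,...,n, from its discrete full conditional.

    z is the observed series of length n; p holds the L mixture weights;
    the remaining arguments are length-L arrays of kernel parameters.
    Illustrative sketch of the update described in the text.
    """
    def npdf(x, mean, var):
        return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

    n, L = len(z), len(p)
    U = np.empty(n - 1, dtype=int)
    for t in range(1, n):                       # 0-based index of z_t, t = 2,...,n
        w = p * npdf(z[t], mu_y - beta * (z[t - 1] - mu_x), delta_y) \
              * npdf(z[t - 1], mu_x, delta_x)   # \tilde p_{l,t}, l = 1,...,L
        U[t - 1] = rng.choice(L, p=w / w.sum())
    return U
```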

Next, consider the mixing parameters. Letting \(\{U_j^*:j=1,\ldots ,n^*\}\) be the \(n^*\) distinct values of \((U_2,\ldots ,U_n)\), and \(M_l=|\{U_t:U_t=l\}|\), we obtain the full conditional

$$\begin{aligned}&p(\varvec{\eta }_l \mid \ldots ,\mathrm {data}) \propto G_0(\varvec{\eta }_l \mid \varvec{\psi }) \\&\quad \left\{ \prod _{j=1}^{n^*}\prod _{\{t:U_t=U_j^*\}}\mathrm {N}(z_t \mid \mu _{l}^y-\beta _{l}(z_{t-1}-\mu _{l}^x),\delta _{l}^y) \right\} \\&\quad \left\{ \prod _{r=1}^L \prod _{\{t:U_t=r\}}q_r(z_{t-1}) \right\} . \end{aligned}$$

Therefore, if \(l\in \{U_j^*\}\), \(\mu _l^y\) is sampled from a normal distribution with variance \((v^y)^*=[(v^y)^{-1}+M_l(\delta _l^y)^{-1}]^{-1}\) and mean \((v^y)^*[(v^y)^{-1}m^y+(\delta _l^y)^{-1}\sum _{\{t:U_t=l\}}(z_t+\beta _l(z_{t-1}-\mu _l^x))]\). If component l is empty, that is, \(l\notin \{U_j^*\}\), then \(\mu _l^y\sim \mathrm {N}(m^y,v^y)\). The updates for \(\delta _l^y\) and \(\beta _l\) also require only Gibbs sampling. If \(l\in \{U_j^*\}\), then \(\delta _l^y\sim \mathrm {IG}(\nu ^y+0.5M_l,s^y+0.5\sum _{\{t:U_t=l\}}(z_t-\mu _l^y+\beta _l(z_{t-1}-\mu _l^x))^2)\), and \(\beta _l\) is sampled from a normal distribution with variance \(c^*=[c^{-1}+(\delta _l^y)^{-1}\sum _{\{t:U_t=l\}}(z_{t-1}-\mu _l^x)^2]^{-1}\) and mean \(c^*[c^{-1}\theta +(\delta _l^y)^{-1}\sum _{\{t:U_t=l\}}(z_{t-1}-\mu _l^x)(\mu _l^y-z_t)]\). If \(l\notin \{U_j^*\}\), then we sample from \(G_0\): \(\delta _l^y\sim \mathrm {IG}(\nu ^y,s^y)\) and \(\beta _l\sim \mathrm {N}(\theta ,c)\).
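The conjugate updates for an active component can be sketched as below; variable names are illustrative, and the inverse-gamma draw uses the standard reciprocal-gamma construction.

```python
import numpy as np

def update_active_component(z_t, z_tm1, mu_x_l, beta_l, delta_y_l,
                            m_y, v_y, nu_y, s_y, rng):
    """Gibbs draws for (mu_l^y, delta_l^y) of an active component.

    z_t, z_tm1 collect the pairs (z_t, z_{t-1}) with U_t = l.
    Hyperpriors: mu_l^y ~ N(m_y, v_y), delta_l^y ~ IG(nu_y, s_y).
    A sketch of the conjugate updates described in the text.
    """
    M_l = len(z_t)
    # mu_l^y | ... is normal with precision 1/v_y + M_l/delta_l^y
    post_var = 1.0 / (1.0 / v_y + M_l / delta_y_l)
    post_mean = post_var * (m_y / v_y +
                            np.sum(z_t + beta_l * (z_tm1 - mu_x_l)) / delta_y_l)
    mu_y_l = rng.normal(post_mean, np.sqrt(post_var))
    # delta_l^y | ... ~ IG(nu_y + M_l/2, s_y + 0.5 * sum of squared residuals)
    resid = z_t - mu_y_l + beta_l * (z_tm1 - mu_x_l)
    shape = nu_y + 0.5 * M_l
    scale = s_y + 0.5 * np.sum(resid ** 2)
    delta_y_l = scale / rng.gamma(shape)      # IG draw via reciprocal gamma
    return mu_y_l, delta_y_l
```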

No matter the choice of \(G_0\), the full conditionals for \(\mu _l^x\) and \(\delta _l^x\) are not proportional to any standard distribution, as these parameters are contained in the sum of L terms in the denominator of \(q_l(z_{t-1})\). The posterior full conditional \(p(\mu _l^x \mid \ldots ,\mathrm {data})\), when \(l\in \{U_j^*\}\), is given by

$$\begin{aligned}&\mathrm {N}\left( \mu _l^x \mid m^x,v^x\right) \\&\quad \prod _{\{t:U_t=l\}}\mathrm {N}\left( z_t \mid \mu _{l}^y-\beta _{l}\left( z_{t-1}-\mu _{l}^x\right) ,\delta _{l}^y\right) \mathrm {N}(z_{t-1} \mid \mu _l^x,\delta _l^x) \\&\quad \left( \prod _{t=2}^n \sum _{m=1}^L p_m\mathrm {N}(z_{t-1} \mid \mu _m^x,\delta _m^x)\right) ^{-1}. \end{aligned}$$

This can be written as \(p(\mu _l^x|\ldots ,\mathrm {data})\propto \) \(\mathrm {N}(\mu _l^x \mid (m^x)^*,(v^x)^*)(\prod _{t=2}^n \sum _{m=1}^L p_m\mathrm {N}(z_{t-1} \mid \mu _m^x,\delta _m^x))^{-1}\), with \((v^x)^*=[(v^x)^{-1}+M_l(\delta _l^x)^{-1}+M_l\beta _l^2(\delta _l^y)^{-1}]^{-1}\) and \((m^x)^*=(v^x)^*((v^x)^{-1}m^x+(\delta _l^x)^{-1}\sum _{\{t:U_t=l\}}z_{t-1}+(\delta _l^y)^{-1}\beta _l^2\sum _{\{t:U_t=l\}}(z_{t-1}+(z_t-\mu _l^y)/\beta _l))\). We use a random-walk Metropolis step to update \(\mu _l^x\). For \(l\notin \{U_j^*\}\), \(p(\mu _l^x \mid \ldots ,\mathrm {data})\) is proportional to \(\mathrm {N}(\mu _l^x \mid m^x,v^x)\left[ \prod _{t=2}^n \sum _{m=1}^L p_m\mathrm {N}(z_{t-1} \mid \mu _m^x,\delta _m^x)\right] ^{-1}\), and in this case we use a Metropolis–Hastings algorithm, proposing a candidate value \(\mu _l^x\) from the base distribution \(\mathrm {N}(m^x,v^x)\).
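A rough sketch of the random-walk Metropolis step for an active \(\mu _l^x\) follows; the log target combines the prior, the kernel terms for \(\{t:U_t=l\}\), and minus the log of the normalizing product over \(t=2,\ldots ,n\). Array names and layout are our own, not the authors' code.

```python
import numpy as np

def lognorm(x, mean, var):
    return -0.5 * np.log(2.0 * np.pi * var) - 0.5 * (x - mean) ** 2 / var

def update_mu_x_active(l, z, U, p, mu_x, mu_y, beta, delta_x, delta_y,
                       m_x, v_x, step, rng):
    """One random-walk Metropolis step for mu_l^x of an active component.

    z: series of length n; U: configurations for t = 2,...,n (length n-1);
    the remaining arrays hold all L component parameters. Illustrative.
    """
    def log_target(val):
        mu = mu_x.copy()
        mu[l] = val
        idx = np.flatnonzero(U == l) + 1                 # 0-based indices of z_t
        lt = lognorm(val, m_x, v_x)
        lt += np.sum(lognorm(z[idx], mu_y[l] - beta[l] * (z[idx - 1] - val),
                             delta_y[l]))
        lt += np.sum(lognorm(z[idx - 1], val, delta_x[l]))
        # minus log of the normalizing product over t = 2,...,n
        dens = (p * np.exp(lognorm(z[:-1, None], mu, delta_x))).sum(axis=1)
        return lt - np.sum(np.log(dens))

    cand = mu_x[l] + step * rng.normal()
    if np.log(rng.uniform()) < log_target(cand) - log_target(mu_x[l]):
        mu_x[l] = cand
    return mu_x
```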

The full conditional and sampling strategy for \(\delta _l^x\) are similar to those for \(\mu _l^x\). We have

$$\begin{aligned}&p(\delta _l^x \mid \ldots ,\mathrm {data})\propto \mathrm {IG}(\delta _l^x \mid \nu ^x,s^x) \prod _{\{t:U_t=l\}} \mathrm {N}(z_{t-1} \mid \mu _l^x,\delta _l^x) \\&\quad \left( \prod _{t=2}^n \sum _{m=1}^L p_m\mathrm {N}(z_{t-1} \mid \mu _m^x,\delta _m^x)\right) ^{-1}, \end{aligned}$$

which for an active component, is written as proportional to

$$\begin{aligned}&\mathrm {IG}\left( \delta _l^x \mid \nu ^x+0.5M_l,s^x+0.5\sum _{\{t:U_t=l\}}(z_{t-1}-\mu _l^x)^2\right) \\&\quad \left( \prod _{t=2}^n \sum _{m=1}^L p_m\mathrm {N}(z_{t-1} \mid \mu _m^x,\delta _m^x)\right) ^{-1}. \end{aligned}$$

For non-active components, the full conditional is \(\mathrm {IG}(\delta _l^x \mid \nu ^x,s^x)(\prod _{t=2}^n \sum _{m=1}^L p_m\mathrm {N}(z_{t-1} \mid \mu _m^x,\delta _m^x))^{-1}\). We sample \(\delta _l^x\) with a strategy similar to that used for \(\mu _l^x\): for active components, we apply a random-walk Metropolis algorithm on the log-scale, updating \(\log (\delta _l^x)\), and we propose non-active components from \(G_0(\delta _l^x)=\) \(\mathrm {IG}(\nu ^x,s^x)\).
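The log-scale random-walk move can be written generically as below; since the target is the full conditional of \(\delta _l^x\) itself, the move on \(\log (\delta _l^x)\) picks up a Jacobian term. A generic sketch, not the authors' implementation.

```python
import numpy as np

def rw_log_scale(current, log_target, step, rng):
    """Random-walk Metropolis on log(delta) for a positive parameter.

    log_target evaluates the (unnormalized) log full conditional of
    delta itself; the log-scale proposal requires the Jacobian term
    log(delta) in the acceptance ratio.
    """
    log_cand = np.log(current) + step * rng.normal()
    cand = np.exp(log_cand)
    # acceptance ratio includes the Jacobian of the log transformation
    log_acc = (log_target(cand) + np.log(cand)) - \
              (log_target(current) + np.log(current))
    return cand if np.log(rng.uniform()) < log_acc else current
```

The same function serves any positive-valued parameter with an awkward full conditional, which is why the target is passed in as a callable.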

We next discuss the updating scheme for the vector \(\varvec{p}=\) \((p_1,\ldots ,p_L)\), which poses the main challenge for posterior simulation. The full conditional for \(\varvec{p}\) has the form

$$\begin{aligned} f(\varvec{p} \mid \alpha )\prod _{l=1}^L p_l^{M_l} \left( \prod _{t=2}^n \sum _{m=1}^L p_m\mathrm {N}(z_{t-1} \mid \mu _m^x,\delta _m^x) \right) ^{-1}. \end{aligned}$$

In standard DP mixture models, the implied generalized Dirichlet prior for \(f(\varvec{p} \mid \alpha )\) combines with \(\prod _{l=1}^L p_l^{M_l}\) to form another generalized Dirichlet distribution. However, in this case there is an additional term. Metropolis–Hastings algorithms with various proposal distributions were explored to sample the vector \(\varvec{p}\), resulting in very low acceptance rates. We instead devise an alternative sampling scheme, in which we work directly with the latent beta-distributed random variables which determine the probability vector \(\varvec{p}\) arising from the DP truncation approximation. Recall that \(p_1=v_1\), \(p_l=v_l\prod _{r=1}^{l-1}(1-v_r)\), for \(l=2,\ldots ,L-1\), and \(p_L=\prod _{r=1}^{L-1}(1-v_r)\), where \(v_1,\ldots ,v_{L-1}\mathop {\sim }\limits ^{i.i.d.}\mathrm {beta}(1,\alpha )\). Equivalently, let \(\zeta _1,\ldots ,\zeta _{L-1}\mathop {\sim }\limits ^{i.i.d.}\mathrm {beta}(\alpha ,1)\), and define \(p_1=\) \(1-\zeta _1\), \(p_l =\) \((1-\zeta _l)\prod _{r=1}^{l-1}\zeta _r\), and \(p_L=\prod _{r=1}^{L-1}\zeta _r\). Rather than updating \(\varvec{p}\) directly, we work with the \(\zeta _{l}\), a sample of which implies a particular probability vector \(\varvec{p}\).
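The deterministic map from the latent \(\zeta _l\) to the truncated stick-breaking weights can be sketched in a few lines:

```python
import numpy as np

def probs_from_zeta(zeta):
    """Map zeta_1,...,zeta_{L-1} (i.i.d. beta(alpha, 1)) to the truncated
    stick-breaking weights (p_1,...,p_L) as defined in the text."""
    cum = np.concatenate(([1.0], np.cumprod(zeta)))   # prod_{r=1}^{l-1} zeta_r
    return np.append((1.0 - zeta) * cum[:-1], cum[-1])
```

By construction the weights are non-negative and sum to one for any \(\zeta _l\in (0,1)\), so each update of a single \(\zeta _l\) yields a valid probability vector.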

The full conditional for \(\zeta _l\), \(l=1,\ldots ,L-1\), has the form

$$\begin{aligned}&p(\zeta _{l} \mid \ldots ,\mathrm {data}) \propto \mathrm {beta}\left( \zeta _{l} \mid \alpha +\sum _{r=l+1}^{L}M_{r},M_{l}+1\right) \nonumber \\&\quad \left( \prod _{t=2}^{n}d(z_{t-1})\right) ^{-1}, \end{aligned}$$
(8)

where

$$\begin{aligned} d(z_{t-1})= & {} \mathrm {N}(z_{t-1} \mid \mu _{1}^{x},\delta _{1}^{x})(1-\zeta _{1})\\&\quad +\sum _{l=2}^{L-1} \mathrm {N}(z_{t-1} \mid \mu _{l}^{x},\delta _{l}^{x})(1-\zeta _{l})\prod _{s=1}^{l-1}\zeta _{s}\\&\quad +\, \mathrm {N}(z_{t-1} \mid \mu _{L}^{x},\delta _{L}^{x})\prod _{s=1}^{L-1}\zeta _{s}. \end{aligned}$$

Also, let \(c_{t,l}=\mathrm {N}(z_{t-1} \mid \mu _{l}^{x},\delta _{l}^{x})\), which is constant with respect to each \(\zeta _l\). The form of the full conditional in (8) suggests the use of a slice sampler to update each \(\zeta _l\) one at a time. The slice sampler is implemented by drawing auxiliary random variables \(u_{t}\sim \mathrm {uniform}(0,(d(z_{t-1}))^{-1}),\) \(t=2,\ldots ,n,\) and then sampling \(\zeta _{l}\sim \mathrm {beta}(\alpha +\sum _{r=l+1}^{L}M_{r},M_{l}+1)\), but restricted to the set \(\{\zeta _{l}:u_{t}<(d(z_{t-1}))^{-1},t=2,\ldots ,n\}\). The term \(d(z_{t-1})\) can be expressed as

\(d(z_{t-1})=\) \(\zeta _{l} w_{1t}+w_{0t}\), for any \(l=1,\ldots ,L-1\), where

$$\begin{aligned} w_{1t}= & {} -c_{t,l}\prod _{s=1}^{l-1}\zeta _{s}+\left( \sum _{m=l+1}^{L-1}c_{t,m}(1-\zeta _{m})\prod _{s=1,s\ne l}^{m-1}\zeta _{s}\right) \\&\quad +\,c_{t,L}\prod _{s=1,s\ne l}^{L-1}\zeta _{s} \end{aligned}$$

and, if \(l=1,\) \(w_{0t}=c_{t,1}\), otherwise \(w_{0t}=c_{t,1}(1-\zeta _{1})+\sum _{s=2}^{l-1}c_{t,s}(1-\zeta _{s})\prod _{r=1}^{s-1}\zeta _{r}+c_{t,l}\prod _{s=1}^{l-1}\zeta _{s}\). Then, the set \(\{\zeta _{l}:d(z_{t-1})<u_{t}^{-1}\}\) is \(\{\zeta _{l}:\zeta _{l}w_{1t}<u_{t}^{-1}-w_{0t}\}.\) This takes the form of \(\{\zeta _{l}:\zeta _{l}<(u_{t}w_{1t})^{-1}-w_{0t}(w_{1t})^{-1}\}\) when \(w_{1t}\) is positive, and has the form \(\{\zeta _{l}:\zeta _{l}>(u_{t}w_{1t})^{-1}-w_{0t}(w_{1t})^{-1}\}\) otherwise. Therefore, the truncated beta random draw for \(\zeta _l\) must lie in the interval \((\max _{\{t:w_{1t}<0\}}[(u_{t}w_{1t})^{-1}-w_{0t}(w_{1t})^{-1}],\min _{\{t:w_{1t}>0\}}[(u_{t}w_{1t})^{-1}-w_{0t}(w_{1t})^{-1}])\). The inverse CDF random variate generation method can be used to sample from these truncated beta random variables. This strategy results in direct draws for the \(\zeta _l\).
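The interval construction and the truncated beta draw can be sketched as follows, assuming SciPy's beta distribution utilities (`scipy.stats.beta.cdf` and `.ppf`) for the inverse-CDF step; function names are illustrative.

```python
import numpy as np
from scipy.stats import beta as beta_dist

def slice_interval(u, w0, w1):
    """Admissible interval for zeta_l given slice variables u_t and the
    linear representation d(z_{t-1}) = zeta_l * w1[t] + w0[t]."""
    bound = 1.0 / (u * w1) - w0 / w1
    lo = np.max(bound[w1 < 0], initial=0.0)   # lower bounds where w1 < 0
    hi = np.min(bound[w1 > 0], initial=1.0)   # upper bounds where w1 > 0
    return lo, hi

def truncated_beta(a, b, lo, hi, rng):
    """Draw from beta(a, b) restricted to (lo, hi) via the inverse CDF."""
    F_lo, F_hi = beta_dist.cdf([lo, hi], a, b)
    return float(beta_dist.ppf(rng.uniform(F_lo, F_hi), a, b))
```

A full update of \(\zeta _l\) would first draw the \(u_t\), form \(w_{0t}\) and \(w_{1t}\) as in the text, intersect the resulting interval with \((0,1)\), and then call `truncated_beta` with \(a=\alpha +\sum _{r=l+1}^{L}M_{r}\) and \(b=M_{l}+1\).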

Appendix 2: Computing posterior predictive ordinates

We describe here an approach to computing one-step-ahead posterior predictive ordinates, \(p(z_{t} \mid \varvec{z}_{(t-1)})\), where \(\varvec{z}_{(m)}=\) \((z_{2},\ldots ,z_{m})\), for \(m=2,\ldots ,n\), is the observed series up to time m. The objective is to compute \(p(z_{t} \mid \varvec{z}_{(t-1)})\) for any desired number of observations \(z_{t}\), using the samples from the posterior distribution given the full data vector \(\varvec{z}_{(n)}\).

Denote by \(\varvec{\Theta }=\) \((\{ \varvec{\eta }_{l}: l=1,\ldots ,L \},\varvec{p},\alpha ,\varvec{\psi })\) all model parameters, excluding the latent configuration variables. We abbreviate \(f(z_t \,{\mid }\, z_{t-1},G)\) in (3) to \(f(z_t \mid z_{t-1})\), but note that, given the \(\varvec{\eta }_{l}\) and \(\varvec{p}\), the mixture model for the transition density can be computed at any values \(z_{t}\) and \(z_{t-1}\). Let \(B_{(m)}\) be the normalizing constant of the posterior distribution for \(\varvec{\Theta }\) given \(\varvec{z}_{(m)}\), and \(p(\varvec{\Theta })=\) \(\left\{ \prod _{l=1}^{L} G_0(\varvec{\eta }_l \mid \varvec{\psi })\right\} f(\varvec{p}\,{\mid }\, \alpha ) p(\alpha ) p(\varvec{\psi })\) be the prior for \(\varvec{\Theta }\). Then,

$$\begin{aligned} p(\varvec{\Theta } \mid \varvec{z}_{(n-1)})= & {} \frac{p(\varvec{\Theta }) \prod _{t=2}^{n-1} f(z_{t} \mid z_{t-1}) }{B_{(n-1)}}\\= & {} \frac{p(\varvec{\Theta }) \prod _{t=2}^{n} f(z_{t} \mid z_{t-1}) }{B_{(n-1)}\, f(z_{n} \mid z_{n-1})} \\= & {} \frac{ B_{(n)} \, p(\varvec{\Theta } \mid \varvec{z}_{(n)})}{ B_{(n-1)} \, f(z_{n} \mid z_{n-1})} \end{aligned}$$

and therefore \(p(z_{n} \mid \varvec{z}_{(n-1)}) =\) \(\int f(z_{n} \mid z_{n-1}) p(\varvec{\Theta } \mid \varvec{z}_{(n-1)}) \, \text {d}\varvec{\Theta }=\) \(B_{(n)}/B_{(n-1)}\). In addition, \(\int \{ f(z_{n} \mid z_{n-1}) \}^{-1} p(\varvec{\Theta } \mid \varvec{z}_{(n)}) \, \text {d}\varvec{\Theta }=\) \(B_{(n-1)}/B_{(n)}\), and thus

$$\begin{aligned}&p(z_{n} \mid \varvec{z}_{(n-1)})\nonumber \\&\quad = \left( \int \left\{ f(z_{n} \mid z_{n-1})\right\} ^{-1} p\left( \varvec{\Theta } \mid \varvec{z}_{(n)}\right) \, \text {d}\varvec{\Theta } \right) ^{-1}. \end{aligned}$$
(9)

Similarly, \(p(\varvec{\Theta } \mid \varvec{z}_{(n-2)})=\) \(\{ B_{(n)} p(\varvec{\Theta } \mid \varvec{z}_{(n)}) \}/\{ B_{(n-2)} f(z_{n} \mid z_{n-1}) f(z_{n-1} \mid z_{n-2}) \}\). Hence, \(p(z_{n-1} \mid \varvec{z}_{(n-2)})=\) \(\int f(z_{n-1} \mid z_{n-2}) p(\varvec{\Theta } \mid \varvec{z}_{(n-2)}) \, \text {d}\varvec{\Theta }=\) \(\frac{ B_{(n)} }{ B_{(n-2)} } \int \{ f(z_{n} \mid z_{n-1}) \}^{-1} p(\varvec{\Theta } \mid \varvec{z}_{(n)}) \, \text {d}\varvec{\Theta }\). Then, observing that \(\int \{ f(z_{n} \mid z_{n-1}) f(z_{n-1} \mid z_{n-2}) \}^{-1} p(\varvec{\Theta } \mid \varvec{z}_{(n)}) \, \text {d}\varvec{\Theta }=\) \(B_{(n-2)}/B_{(n)}\), we obtain an expression for \(p(z_{n-1} \mid \varvec{z}_{(n-2)})\) that involves the product of the two integrals above. Extending the derivation for \(p(z_{n-1} \mid \varvec{z}_{(n-2)})\), we obtain

$$\begin{aligned} p(z_{t} \mid \varvec{z}_{(t-1)})= & {} \left( \int \left\{ \prod _{s=t}^{n} f(z_{s} \mid z_{s-1}) \right\} ^{-1} p(\varvec{\Theta } \mid \varvec{z}_{(n)}) \, \text {d}\varvec{\Theta } \right) ^{-1} \\&\left( \int \left\{ \prod _{s=t+1}^{n} f(z_{s} \mid z_{s-1}) \right\} ^{-1} p(\varvec{\Theta } \mid \varvec{z}_{(n)}) \, \text {d}\varvec{\Theta } \right) \end{aligned}$$

for any \(t=3,\ldots ,n-1\), with the expression for \(t=n\) given in (9). These expressions allow us to estimate any posterior predictive ordinate \(p(z_{t} \mid \varvec{z}_{(t-1)})\), using Monte Carlo integration based on the samples from \(p(\varvec{\Theta } \mid \varvec{z}_{(n)})\).
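The ratio of the two posterior expectations above can be estimated stably on the log scale. The sketch below assumes the log transition densities have been pre-evaluated at every posterior draw; the array layout is our own choice for illustration.

```python
import numpy as np

def log_mean_exp(a):
    """Numerically stable log of the mean of exp(a)."""
    m = np.max(a)
    return m + np.log(np.mean(np.exp(a - m)))

def ppo(t, log_f):
    """Monte Carlo estimate of the one-step-ahead ordinate p(z_t | z_(t-1)).

    log_f has shape (B, n-1); entry (b, s) is log f(z_{s+2} | z_{s+1})
    evaluated at posterior draw b, so column s corresponds to time s+2.
    Implements the ratio of the two posterior expectations in the text.
    """
    tail_t  = -log_f[:, t - 2:].sum(axis=1)      # -sum_{s=t}^{n}   log f
    tail_t1 = -log_f[:, t - 1:].sum(axis=1)      # -sum_{s=t+1}^{n} log f
    return float(np.exp(log_mean_exp(tail_t1) - log_mean_exp(tail_t)))
```

For \(t=n\) the second sum is empty, and the expression reduces to the single-integral form in (9).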

Cite this article

DeYoreo, M., Kottas, A. A Bayesian nonparametric Markovian model for non-stationary time series. Stat Comput 27, 1525–1538 (2017). https://doi.org/10.1007/s11222-016-9702-x