
Estimation in threshold autoregressive models with correlated innovations

Published in: Annals of the Institute of Statistical Mathematics

Abstract

Large sample statistical analysis of threshold autoregressive models is usually based on the assumption that the underlying driving noise is uncorrelated. In this paper, we consider a model driven by Gaussian noise with a geometric correlation tail and derive a complete characterization of the asymptotic distribution of the Bayes estimator of the threshold parameter.


Figs. 1–4 (not reproduced here)


References

  • Caner, M., Hansen, B. E. (2001). Threshold autoregression with a unit root. Econometrica, 69(6), 1555–1596.

  • Cappé, O., Moulines, E., Rydén, T. (2005). Inference in hidden Markov models. Springer Series in Statistics. New York: Springer.

  • Chan, K. S. (1993). Consistency and limiting distribution of the least squares estimator of a threshold autoregressive model. The Annals of Statistics, 21(1), 520–533.

  • Chan, N. H., Kutoyants, Y. A. (2010). Recent developments of threshold estimation for nonlinear time series. Journal of the Japan Statistical Society, 40(2), 277–303.

  • Chan, N. H., Kutoyants, Y. A. (2012). On parameter estimation of threshold autoregressive models. Statistical Inference for Stochastic Processes, 15(1), 81–104.

  • Dedecker, J., Doukhan, P. (2003). A new covariance inequality and applications. Stochastic Processes and their Applications, 106(1), 63–80.

  • Doukhan, P., Louhichi, S. (1999). A new weak dependence condition and applications to moment inequalities. Stochastic Processes and their Applications, 84(2), 313–342.

  • Hansen, B. E. (2011). Threshold autoregression in economics. Statistics and its Interface, 4(2), 123–127.

  • Ibragimov, I. A., Has’minskiĭ, R. Z. (1981). Statistical estimation: Asymptotic theory. Applications of Mathematics, Vol. 16. New York: Springer.

  • Kutoyants, Y. A. (2012). On identification of the threshold diffusion processes. Annals of the Institute of Statistical Mathematics, 64(2), 383–413.

  • Ling, S., Tong, H. (2005). Testing for a linear MA model against threshold MA models. The Annals of Statistics, 33(6), 2529–2552.

  • Ling, S., Tong, H., Li, D. (2007). Ergodicity and invertibility of threshold moving-average models. Bernoulli, 13(1), 161–168.

  • Liptser, R. S., Shiryaev, A. N. (2001). Statistics of random processes II: Applications. Applications of Mathematics, Vol. 6 (expanded ed.). Berlin: Springer.

  • Liu, W., Ling, S., Shao, Q. M. (2011). On non-stationary threshold autoregressive models. Bernoulli, 17(3), 969–986.

  • Meyer, R. M. (1973). A Poisson-type limit theorem for mixing sequences of dependent “rare” events. Annals of Probability, 1, 480–483.

  • Meyn, S., Tweedie, R. L. (2009). Markov chains and stochastic stability (2nd ed.). Cambridge: Cambridge University Press.

  • Pham, D. T., Chan, K. S., Tong, H. (1991). Strong consistency of the least squares estimator for a nonergodic threshold autoregressive model. Statistica Sinica, 1(2), 361–369.

  • Tong, H. (1983). Threshold models in nonlinear time series analysis. Lecture Notes in Statistics, Vol. 21. New York: Springer.

  • Tong, H. (2011). Threshold models in time series analysis—30 years on. Statistics and its Interface, 4(2), 107–118.

  • Tsay, R. S. (1989). Testing and modeling threshold autoregressive processes. Journal of the American Statistical Association, 84(405), 231–240.


Author information

Correspondence to P. Chigansky.

Additional information

P. Chigansky is supported by ISF grant 314/09.

Appendix: Ergodic lemmas used in the proof


The proofs in Sect. 2 use the ergodic properties of the processes, summarized in the following lemmas. Our standing assumption is (10).
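The lemmas below involve the chain \((X_j,\xi _j)\) only through its transition mechanism, which, from the Gaussian transition density displayed in the proof of Lemma 6, is consistent with \(X_j = f(X_{j-1},\theta _0)+\xi _{j-1}+\varepsilon _j\) and \(\xi _j = a\xi _{j-1}+\zeta _j\) with i.i.d. standard Gaussian \(\varepsilon _j,\zeta _j\). The following is a minimal simulation sketch of such a chain; the piecewise-linear threshold form of \(f\) is assumed for concreteness, and all parameter values are illustrative, not taken from the paper.

```python
import numpy as np

# Hypothetical concretization of the chain (X_j, xi_j): the transition density
# in the proof of Lemma 6 is consistent with
#   X_j  = f(X_{j-1}, theta0) + xi_{j-1} + eps_j,   eps_j ~ N(0, 1),
#   xi_j = a * xi_{j-1} + zeta_j,                   zeta_j ~ N(0, 1),
# and a piecewise-linear threshold form of f is assumed below.
# All parameter values are illustrative only.

def f(x, theta, rho_plus=0.5, rho_minus=-0.3):
    """Assumed threshold autoregression function f(x, theta)."""
    return rho_plus * x if x >= theta else rho_minus * x

def simulate(n, theta0=0.0, a=0.4, x0=0.0, xi0=0.0, seed=0):
    """Simulate n steps of the Markov chain (X_j, xi_j)."""
    rng = np.random.default_rng(seed)
    X, xi = np.empty(n + 1), np.empty(n + 1)
    X[0], xi[0] = x0, xi0
    for j in range(1, n + 1):
        X[j] = f(X[j - 1], theta0) + xi[j - 1] + rng.standard_normal()
        xi[j] = a * xi[j - 1] + rng.standard_normal()
    return X, xi

X, xi = simulate(10_000)
```

With \(|\rho ^\pm |<1\) and \(|a|<1\) the simulated paths remain stochastically bounded, in line with the moment bound (45).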

Lemma 6

For all integers \(j\ge 0\) and \(p\ge 1\),

$$\begin{aligned} \mathbb{P }_{\theta _0}\left( X_j \in [\theta _0,\theta _0+v/n]\right) \le \frac{|v|}{n}, \end{aligned}$$
(44)

and

$$\begin{aligned} \mathbb{E }_{\theta _0} \left( |X_j|^p + |\xi _j|^p\big |X_0=x, \xi _0=y\right) \le r_1^j R_1 \left( |x|^p+|y|^p\right) + R_2, \end{aligned}$$
(45)

with a positive constant \(r_1<1\) and constants \(R_1\) and \(R_2\), independent of \(\theta _0\).

Proof

For \(j\ge 1\),

$$\begin{aligned} \mathbb{P }_{\theta _0}\left( X_j \in [\theta _0,\theta _0+v/n]\right) =\mathbb{E }_{\theta _0} \int _{\theta _0}^{\theta _0+v/n} \frac{1}{\sqrt{2\pi }} {\text{ e }}^{-\frac{1}{2} (x-f(X_{j-1},\theta _0)-\xi _{j-1})^2}{\text{ d }}x \le \frac{|v|}{n}. \end{aligned}$$

Further, by Jensen’s inequality

$$\begin{aligned} |\xi _j|^p \le \left( |a||\xi _{j-1}|+(1-|a|) \frac{1}{1-|a|}|\zeta _j|\right) ^p \le |a| |\xi _{j-1}|^p + (1-|a|)^{1-p} |\zeta _j|^p, \end{aligned}$$

and hence

$$\begin{aligned} \mathbb{E }_{\theta _0}\left( |\xi _j|^p \big |\xi _0=y\right) \le |a|^j |y|^p + C_1. \end{aligned}$$

Similarly, with \(\rho := |\rho ^+|\vee |\rho ^-|\)

$$\begin{aligned} |X_j|^p \le \left( \rho |X_{j-1}| + |\xi _{j-1}|+|\varepsilon _j|\right) ^p \le \rho |X_{j-1}|^p + \left( \frac{2}{1-\rho }\right) ^{p-1}\left( |\xi _{j-1}|^p+|\varepsilon _j|^p\right) , \end{aligned}$$

and

$$\begin{aligned} \mathbb{E }_{\theta _0}\left( |X_{j}|^p\big |X_0=x,\xi _0=y\right) \le \rho ^j |x|^p + C_2 |a|^j |y|^p + C_3, \end{aligned}$$

which gives (45). \(\square \)
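The bound (44) can be probed directly by Monte Carlo: simulate many independent copies of the chain and compare the empirical frequency of \(X_j\in [\theta _0,\theta _0+v/n]\) with \(|v|/n\). This is a sketch under a hypothetical concretization consistent with the transition density above (\(X_j=f(X_{j-1},\theta _0)+\xi _{j-1}+\varepsilon _j\), \(\xi _j=a\xi _{j-1}+\zeta _j\), piecewise-linear \(f\)); all parameter values are illustrative.

```python
import numpy as np

# Monte Carlo sanity check of the bound (44):
#   P(X_j in [theta0, theta0 + v/n]) <= |v|/n.
# Hypothetical concretization: X_j = f(X_{j-1}, theta0) + xi_{j-1} + eps_j,
# xi_j = a*xi_{j-1} + zeta_j, with standard Gaussian noise and a
# piecewise-linear threshold f. Parameter values are illustrative only.

def simulate_many(M, j, theta0=0.0, a=0.4, rho_plus=0.5, rho_minus=-0.3, seed=1):
    """Run M independent copies of the chain for j steps, vectorized over copies."""
    rng = np.random.default_rng(seed)
    X = np.zeros(M)
    xi = np.zeros(M)
    for _ in range(j):
        fX = np.where(X >= theta0, rho_plus * X, rho_minus * X)
        X = fX + xi + rng.standard_normal(M)
        xi = a * xi + rng.standard_normal(M)
    return X

theta0, v_over_n = 0.0, 0.1
X_j = simulate_many(M=200_000, j=5, theta0=theta0)
p_hat = np.mean((X_j >= theta0) & (X_j <= theta0 + v_over_n))
# The one-step transition density of X is bounded by (2*pi)**(-1/2) < 1,
# so the empirical frequency should fall well below v/n.
assert p_hat <= v_over_n
```

In fact the bound is far from tight here: since the Gaussian transition density is at most \((2\pi )^{-1/2}\), the frequency is at most about \(0.4\,|v|/n\).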

Lemma 7

The Markov chain \((X_j, \xi _j)\) has a unique invariant measure under \(\mathbb{P }_{\theta _0}\), with uniformly bounded probability density \(p(x,y;\theta _0)\) satisfying

$$\begin{aligned}&\widetilde{\mathbb{P }}_{\theta _0}\left( X_j \in [\theta _0,\theta _0+v/n]\right) \nonumber \\&\quad = \int _{\theta _0}^{\theta _0+v/n}\int _{\mathbb{R }} p(x,y;\theta _0)\mathrm{{d}}x\mathrm{{d}}y= \frac{v}{n} \int _\mathbb{R }p(\theta _0,y;\theta _0)\mathrm{{d}}y+ O(n^{-2}), \end{aligned}$$
(46)

where \(\widetilde{\mathbb{P }}_{\theta _0}\) is the corresponding stationary probability on \((\Omega ,\mathcal{F })\).

Moreover, the chain is geometrically ergodic, i.e., there exist positive constants \(C\) and \(r<1\), such that for a measurable function \(|h|\le 1\) and \(m\ge k\)

$$\begin{aligned} \Big |\mathbb{E }_{\theta _{0}}\left( h(X_m,\xi _m)|X_k\!=\!x,\xi _k\!=\!y\right) \!-\!\widetilde{\mathbb{E }}_{\theta _0}h(X_k,\xi _k)\Big |\!\le \! C r^{m-k} (|x|\!+\!|y|), \quad x,y\in \mathbb{R },\nonumber \\ \end{aligned}$$
(47)

and consequently, for an \(\mathcal{F }_{m,\infty }\)-measurable random variable \(|H|\le 1\)

$$\begin{aligned} \mathbb{E }_{\theta _0}\Big |\mathbb{E }_{\theta _0}\left( H|\mathcal{F }_k\right) -\widetilde{\mathbb{E }}_{\theta _0} H\Big |\le C r^{m-k}. \end{aligned}$$
(48)

Finally, \((X_j,\xi _j)\) is geometrically mixing, i.e., for measurable functions \(|g|\le 1\), \(|h|\le 1\)

$$\begin{aligned} \Big |\mathbb{E }_{\theta _0}g(X_k,\xi _k)h(X_{k+m},\xi _{k+m})- \mathbb{E }_{\theta _0}g(X_k,\xi _k)\mathbb{E }_{\theta _0} h(X_{k+m},\xi _{k+m}) \Big |\le C r^m(1+r^k).\nonumber \\ \end{aligned}$$
(49)

In particular, (48) and (49) hold with the stationary expectation \(\widetilde{\mathbb{E }}_{\theta _0}\).

Proof

The transition kernel of the process \((X_j,\xi _j)\) has a positive density with respect to the Lebesgue measure:

$$\begin{aligned} (P\mathbf 1 _{\{A\}})(x,y) := \int _A \frac{1}{2\pi } \exp \left( -\frac{1}{2} \left( u-f(x,\theta _0)-y\right) ^2-\frac{1}{2} (v-ay)^2\right) {\text{ d }}u{\text{ d }}v \end{aligned}$$

and hence in the terminology of Meyn and Tweedie (2009), it is \(\psi \)-irreducible and aperiodic. Further, a ball \(B_R\) of radius \(R>0\) around the origin is a small set with respect to, e.g., the measure

$$\begin{aligned} \nu ({\text{ d }}u,{\text{ d }}v):={\text{ e }}^{ -\frac{1}{2} (u^2+ v^2)-(1+R)(|u| +|v|)}{\text{ d }}u{\text{ d }}v \end{aligned}$$

and \(V(x,y)=|x|+|y|\) satisfies the drift condition

$$\begin{aligned} (PV)(x,y) -V(x,y)\le -\frac{1}{2} (1-\rho \wedge |a|) V(x,y)+ 2\mathbf 1 _{\{(x,y)\in B_R\}}, \quad (x,y)\in \mathbb{R }^2, \end{aligned}$$

for sufficiently large \(R\). By Theorem 15.0.1 in Meyn and Tweedie (2009), it follows that there exists a unique invariant probability measure \(\pi \) and for any measurable \(|h(x,y)|\le V(x,y)\),

$$\begin{aligned} \left| P^n h-\int h{\text{ d }}\pi \right| \le Cr^n V(x,y), \end{aligned}$$

with positive constants \(C\) and \(r<1\), i.e., (47) holds. Since \(\widetilde{\mathbb{E }}_{\theta _0} H = \widetilde{\mathbb{E }}_{\theta _0} h(X_m,\xi _m)\) and \(\mathbb{E }_{\theta _0} (H|\mathcal{F }_k) = \mathbb{E }_{\theta _0} (h(X_m,\xi _m)|\mathcal{F }_k)\) with \(h(x,y):= \mathbb{E }_{\theta _0}(H|X_m=x,\xi _m=y)\), the claim (48) follows from (45) and (47). Since the transition kernel \(P\) has a bounded continuously differentiable density with respect to the Lebesgue measure, so does the invariant measure \(\pi \) and (46) follows. The mixing inequality (49) follows from Theorem 16.1.5 in Meyn and Tweedie (2009). \(\square \)
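The geometric ergodicity (47) can be illustrated by synchronous coupling: run two copies of the chain driven by the same noise from different initial conditions and watch the gap close at a geometric rate. The sketch assumes a hypothetical concretization consistent with the proofs (\(X_j=f(X_{j-1},\theta _0)+\xi _{j-1}+\varepsilon _j\), \(\xi _j=a\xi _{j-1}+\zeta _j\), piecewise-linear \(f\)); with \(\theta _0=0\) the map \(x\mapsto f(x,0)\) is Lipschitz with constant \(\rho <1\), so the coupled gap contracts pathwise. All parameter values are illustrative.

```python
import numpy as np

# Synchronous-coupling illustration of the geometric ergodicity in (47):
# two copies of the chain driven by the SAME noise, started from different
# initial conditions, merge geometrically fast. Hypothetical concretization
# with theta0 = 0, where x -> f(x, 0) is Lipschitz with constant
# rho = max(|rho_plus|, |rho_minus|) < 1; parameter values are illustrative.

def couple(m, a=0.4, rho_plus=0.5, rho_minus=-0.3, seed=2):
    """Return the coupled gap |X_j - X'_j| + |xi_j - xi'_j| over m steps."""
    rng = np.random.default_rng(seed)
    f = lambda x: rho_plus * x if x >= 0.0 else rho_minus * x
    x, xi = 5.0, 5.0      # first copy: a "far" initial condition
    xp, xip = 0.0, 0.0    # second copy: started at the origin
    gaps = []
    for _ in range(m):
        eps, zeta = rng.standard_normal(), rng.standard_normal()
        x, xp = f(x) + xi + eps, f(xp) + xip + eps    # common noise eps
        xi, xip = a * xi + zeta, a * xip + zeta       # common noise zeta
        gaps.append(abs(x - xp) + abs(xi - xip))
    return np.array(gaps)

gaps = couple(60)
# For these parameter values the gap is bounded by C * max(rho, |a|)**j,
# matching the geometric rate r in (47).
```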

The theory used in the proof of the previous lemma does not directly apply to the Markov chain \((X_j,\xi _j,\widehat{\Xi }_j)\) [see (16) for the definition of \(\widehat{\Xi }_j\)], since the latter is generated by a \((3+d)\)-dimensional recursion driven by two-dimensional noise; this typically precludes \(\psi \)-irreducibility. Fortunately, for our purposes the following weaker properties suffice:

Lemma 8

The Markov process \((X_j,\xi _j,\widehat{\Xi }_j)\) has a unique invariant measure. Let \(\widetilde{\mathbb{P }}_{\theta _0}\) denote the corresponding stationary probability (by uniqueness, the stationary probabilities \(\widetilde{\mathbb{P }}_{\theta _0}\), introduced in Lemmas 7 and 8, coincide). Then for a measurable function \(h(x,y,z)\) satisfying \(|h(x,y,z)|<1\) and the Lipschitz condition

$$\begin{aligned}&|h(x,y,z)\!-\!h(x,y,z^{\prime })|\!\le \! L\left( 1\!+\!|x|\!+\!|y|\!+\!\Vert z\Vert \!+\!\Vert z^{\prime }\Vert \right) \Vert z\!-\!z^{\prime }\Vert ,\\&\qquad x,y\in \mathbb{R },\; z,z^{\prime }\in \mathbb{R }^{d+1}, \end{aligned}$$

with a positive constant \(L\),

$$\begin{aligned} \mathbb{E }_{\theta _0}\Big |\mathbb{E }_{\theta _0} \left( h(X_m,\xi _m, \widehat{\Xi }_m)\big |\mathcal{F }_\ell \right) -\widetilde{\mathbb{E }}_{\theta _0} h(X_0,\xi _0, \widehat{\Xi }_0)\Big |\le C q^{m-\ell } \end{aligned}$$
(50)

for some positive constants \(C\) and \(q<1\) and all integers \(m\ge \ell \ge 0\).

Proof

Under the stationary measure \(\widetilde{\mathbb{P }}_{\theta _0}\) from Lemma 7, we can extend the definition of \((X_j,\xi _j)\) to the negative integers and define

$$\begin{aligned} \widehat{\Xi }^{k}_0 := c\sum _{i=-\infty }^0 b^{-i} \left( X_i-f(X_{i-1},\theta _0+u_k/n)\right) , \quad k = 0,{\ldots },d \end{aligned}$$
(51)

where \(b:=\frac{a}{1+\gamma }\) and \(c:=\frac{a\gamma }{1+\gamma }\). The distribution of \((X_0,\xi _0,\widehat{\Xi }_0)\) is invariant. To establish uniqueness, let \(\mu \) and \(\mu ^{\prime }\) be two invariant measures and note that by Lemma 7 their \((X,\xi )\) marginals coincide. Hence,

$$\begin{aligned} \mu (\mathrm{d}x,\mathrm{d}y,\mathrm{d}z) = \nu (\mathrm{d}x,\mathrm{d}y)\mu (x,y;\mathrm{d}z),\quad \mu ^{\prime }(\mathrm{d}x,\mathrm{d}y,\mathrm{d}z) = \nu (\mathrm{d}x,\mathrm{d}y)\mu ^{\prime }(x,y;\mathrm{d}z) \end{aligned}$$

where \(\nu \) is the invariant measure of the process \((X_j,\xi _j)\) and \(\mu (x,y;\mathrm{d}z)\) and \(\mu ^{\prime }(x,y;\mathrm{d}z)\) are corresponding regular conditional probabilities. Let \((X_j,\xi _j,\widehat{\Xi }_j)\) and \((X_j,\xi _j,\widehat{\Xi }^{\prime }_j)\) be the solutions of the recursions (2), (3) and (7) with \(u:=u_k\), \(k=0,{\ldots },d\) subject to the initial conditions \((X_0,\xi _0,\widehat{\Xi }_0)\) and \((X_0,\xi _0,\widehat{\Xi }^{\prime }_0)\), where \((X_0,\xi _0)\) is sampled from \(\nu \) and \(\widehat{\Xi }_0\) and \(\widehat{\Xi }^{\prime }_0\) are sampled from \(\mu (X_0,\xi _0;\mathrm{d}z)\) and \(\mu ^{\prime }(X_0,\xi _0;\mathrm{d}z)\). Note that \(\widehat{\Xi }^{\prime }_j-\widehat{\Xi }_j= b^j(\widehat{\Xi }^{\prime }_0-\widehat{\Xi }_0)\) and hence for any uniformly continuous function \(g\)

$$\begin{aligned} \left| \int g {\text{ d }}\mu -\int g{\text{ d }}\mu ^{\prime }\right| \le \mathbb{E }_{\theta _0}\big |g(X_j,\xi _j,\widehat{\Xi }_j)-g(X_j,\xi _j,\widehat{\Xi }^{\prime }_j)\big | \xrightarrow {j\rightarrow \infty }0. \end{aligned}$$

Since uniformly continuous functions form a measure-determining class, the uniqueness follows.

To derive the bound (50), note that for \(\ell \le m\)

$$\begin{aligned} \widehat{\Xi }_{m}^{k}&= \widehat{\Xi }^k_\ell b^{m-\ell } + c\sum _{j=\ell +1}^{m} \left( X_j-f(X_{j-1}, \theta _0+u_k/n)\right) b^{m-j}\\&= b^{\frac{1}{2} (m-\ell )}\left( \widehat{\Xi }^k_\ell b^{\frac{1}{2} (m-\ell )} + c\sum _{j=\ell +1}^{\frac{1}{2} (m+\ell )} \left( X_j-f(X_{j-1}, \theta _0+u_k/n)\right) b^{\frac{1}{2} (m+\ell )-j}\right) \\&\quad +c\sum _{j=\frac{1}{2} (m+\ell )+1}^{m} \left( X_j-f(X_{j-1}, \theta _0+u_k/n)\right) b^{m-j}=: b^{\frac{1}{2} (m-\ell )}J^k_1+J^k_2. \end{aligned}$$

Using the bound (45), we get

$$\begin{aligned} \mathbb{E }_{\theta _0} \big |J^k_1\big |^{2} \!\le \! 2\mathbb{E }_{\theta _0}|\widehat{\Xi }^k_\ell |^{2} \!+\! C_1 \sum _{j=\ell +1}^{\frac{1}{2} (m\!+\!\ell )} \mathbb{E }_{\theta _0}\left( |X_j|\!+\!|X_{j-1}|\right) ^{2} |b|^{\frac{1}{2} (m\!+\!\ell )\!-\!j}\!\le \! C_2. \end{aligned}$$
(52)

By the triangle inequality

$$\begin{aligned}&\mathbb{E }_{\theta _0}\Big |\mathbb{E }_{\theta _0} \left( h(X_m,\xi _m, \widehat{\Xi }_m)\big |\mathcal{F }_\ell \right) -\widetilde{\mathbb{E }}_{\theta _0} h(X_0,\xi _0, \widehat{\Xi }_0)\Big |\nonumber \\&\quad =\mathbb{E }_{\theta _0}\Big |\mathbb{E }_{\theta _0} \left( h(X_m,\xi _m, \widehat{\Xi }_m)\big |\mathcal{F }_\ell \right) -\widetilde{\mathbb{E }}_{\theta _0} h(X_m,\xi _m, \widehat{\Xi }_m)\Big |\nonumber \\&\quad \le \mathbb{E }_{\theta _0}\Big |\mathbb{E }_{\theta _0} \left( h(X_m,\xi _m, b^{\frac{1}{2} (m-\ell )}J_1+J_2)\big |\mathcal{F }_\ell \right) -\mathbb{E }_{\theta _0} \left( h(X_m,\xi _m, J_2)\big |\mathcal{F }_\ell \right) \Big |\nonumber \\&\qquad + \mathbb{E }_{\theta _0}\Big | \mathbb{E }_{\theta _0} \left( h(X_m,\xi _m, J_2)\big |\mathcal{F }_\ell \right) -\widetilde{\mathbb{E }}_{\theta _0} h(X_m,\xi _m, J_2) \Big |\nonumber \\&\qquad + \Big | \widetilde{\mathbb{E }}_{\theta _0} h(X_m,\xi _m, J_2) -\widetilde{\mathbb{E }}_{\theta _0} h(X_m,\xi _m, b^{\frac{1}{2} (m-\ell )}J_1+J_2)\Big |. \end{aligned}$$
(53)

Note that \(h(X_m,\xi _m, J_2)\) is measurable with respect to \(\mathcal{F }_{\frac{1}{2} (m+\ell ),\infty }\) and by (48)

$$\begin{aligned} \mathbb{E }_{\theta _0}\Big | \mathbb{E }_{\theta _0} \left( h(X_m,\xi _m, J_2)\big |\mathcal{F }_\ell \right) -\widetilde{\mathbb{E }}_{\theta _0} h(X_m,\xi _m, J_2) \Big |\le C r^{\frac{1}{2} (m-\ell )}. \end{aligned}$$

By the Lipschitz property of \(h\) and (52), we have

$$\begin{aligned}&\mathbb{E }_{\theta _0}\Big |\mathbb{E }_{\theta _0} \left( h(X_m,\xi _m, b^{\frac{1}{2} (m-\ell )}J_1\!+\!J_2)\big |\mathcal{F }_\ell \right) \!-\!\mathbb{E }_{\theta _0} \left( h(X_m,\xi _m, J_2)\big |\mathcal{F }_\ell \right) \Big |\\&\quad \le \mathbb{E }_{\theta _0}\Big |h(X_m,\xi _m, b^{\frac{1}{2} (m-\ell )}J_1\!+\!J_2) \!-\!h(X_m,\xi _m, J_2)\Big |\\&\quad \le b^{\frac{1}{2} (m-\ell )}\mathbb{E }_{\theta _0}L\left( 1\!+\!|X_m|\!+\!|\xi _m|\!+\!\Vert \widehat{\Xi }_m\Vert \!+\!\Vert J_2\Vert \right) \Vert J_1\Vert \\&\quad \le b^{\frac{1}{2} (m-\ell )}L\left( \mathbb{E }_{\theta _0}\left( 1\!+\!|X_m|\!+\!|\xi _m|\!+\!\Vert \widehat{\Xi }_m\Vert \!+\!\Vert J_2\Vert \right) ^2\right) ^{1/2}\left( \mathbb{E }_{\theta _0}\Vert J_1\Vert ^2\right) ^{1/2}\\&\quad \le C_3 b^{\frac{1}{2} (m-\ell )}. \end{aligned}$$

A similar bound holds for the last term in (53), and the claim follows with \(q:=\sqrt{|b|\vee r}\). \(\square \)
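The key identity in the uniqueness argument, \(\widehat{\Xi }^{\prime }_j-\widehat{\Xi }_j= b^j(\widehat{\Xi }^{\prime }_0-\widehat{\Xi }_0)\), says that the recursion generating \(\widehat{\Xi }_j\) forgets its initialization at the geometric rate \(b\). A minimal numerical check follows; the recursive form \(\widehat{\Xi }_j=b\widehat{\Xi }_{j-1}+c(X_j-f(X_{j-1},\cdot ))\) is read off the decomposition in the proof, the values of \(a\) and \(\gamma \) are illustrative, and the driving sequence is an arbitrary stand-in for \(X_j-f(X_{j-1},\theta _0+u_k/n)\).

```python
import numpy as np

# Numerical check of the exact identity
#   Xi'_j - Xi_j = b**j * (Xi'_0 - Xi_0),
# i.e. the recursion generating Xi-hat forgets its initial condition at the
# geometric rate b. The recursive form Xi_j = b*Xi_{j-1} + c*(X_j - f(...))
# is assumed from the decomposition in the proof; a and gamma are illustrative,
# and `drive` is an arbitrary stand-in for X_j - f(X_{j-1}, theta0 + u_k/n).

a, gamma = 0.4, 0.5                              # illustrative values
b, c = a / (1 + gamma), a * gamma / (1 + gamma)  # b, c as defined after (51)

rng = np.random.default_rng(3)
drive = rng.standard_normal(15)

def run(xi0):
    """Iterate the (assumed) Xi-hat recursion from the initial value xi0."""
    xi = xi0
    for d in drive:
        xi = b * xi + c * d
    return xi

delta0 = 7.0
gap = run(1.0 + delta0) - run(1.0)
assert abs(gap - b ** len(drive) * delta0) < 1e-12  # exact geometric forgetting
```

Since the recursion is affine in \(\widehat{\Xi }\), the gap between the two runs equals \(b^j\) times the initial gap, regardless of the driving sequence.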

About this article

Chigansky, P., Kutoyants, Y.A. Estimation in threshold autoregressive models with correlated innovations. Ann Inst Stat Math 65, 959–992 (2013). https://doi.org/10.1007/s10463-013-0402-4