
Mean targeting estimator for the integer-valued GARCH(1, 1) model

Abstract

The integer-valued GARCH model is commonly used in modeling time series of counts. Its unknown parameters are typically estimated by maximum likelihood estimation (MLE), but the numerical results of MLE are sensitive to the choice of initial values, a difficulty that also arises in estimating the GARCH model. To alleviate this numerical difficulty, we propose an alternative to MLE, which we call mean targeting estimation (MTE), an analogue of the variance targeting estimation used in the GARCH model. Consistency and asymptotic normality of the MTE are established. Comparisons with the standard MLE are provided and the merits of the mean targeting method are discussed. In particular, it is shown that the MTE can be superior to the MLE, for parameter estimation as well as prediction, whether the model is well specified or misspecified. We conduct numerical studies to confirm our theoretical findings and illustrate the practical utility of our proposals.

References

  • Ahmad A, Francq C (2016) Poisson QMLE of count time series models. J Time Ser Anal 37:291–314
  • Chen CWS, Lee S (2016) Generalized Poisson autoregressive models for time series of counts. Comput Stat Data Anal 99:51–67
  • Christou V, Fokianos K (2014) Quasi-likelihood inference for negative binomial time series models. J Time Ser Anal 35:55–78
  • Cui Y, Wu R (2016) On conditional maximum likelihood estimation for INGARCH\((p, q)\) models. Stat Probab Lett 118:1–7
  • Cui Y, Zheng Q (2017) Conditional maximum likelihood estimation for a class of observation-driven time series models for count data. Stat Probab Lett 123:193–201
  • Davis RA, Liu H (2016) Theory and inference for a class of nonlinear models with application to time series of counts. Stat Sin 26:1673–1707
  • Davis RA, Holan SH, Lund R, Ravishanker N (2016) Handbook of discrete-valued time series. Chapman and Hall/CRC, Boca Raton
  • Engle RF, Mezrich J (1996) GARCH for groups. Risk 6:36–40
  • Ferland R, Latour A, Oraichi D (2006) Integer-valued GARCH process. J Time Ser Anal 27:923–942
  • Fokianos K, Rahbek A, Tjøstheim D (2009) Poisson autoregression. J Am Stat Assoc 104:1430–1439
  • Francq C, Zakoïan J-M (2004) Maximum likelihood estimation of pure GARCH and ARMA-GARCH processes. Bernoulli 10:605–637
  • Francq C, Horváth L, Zakoïan J-M (2011) Merits and drawbacks of variance targeting in GARCH models. J Financ Econom 9:619–656
  • Francq C, Horváth L, Zakoïan J-M (2016) Variance targeting estimation of multivariate GARCH models. J Financ Econom 14:353–382
  • Gonçalves E, Mendes-Lopes N, Silva F (2015) Infinitely divisible distributions in integer-valued GARCH models. J Time Ser Anal 36:503–527
  • Gonçalves E, Mendes-Lopes N, Silva F (2016) Zero-inflated compound Poisson distributions in integer-valued GARCH models. Statistics 50:558–578
  • Huber PJ (1967) The behavior of maximum likelihood estimates under nonstandard conditions. In: Le Cam LM, Neyman J (eds) Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, vol 1. University of California Press, Berkeley, pp 221–233
  • Khoo WC, Ong SH, Biswas A (2017) Modeling time series of counts with a new class of INAR(1) model. Stat Pap 58:393–416
  • Kristensen D, Linton O (2004) Consistent standard errors for target variance approach to GARCH estimation. Econom Theory 20:990–993
  • Kristensen D, Rahbek A (2005) Asymptotics of the QMLE for a class of ARCH\((q)\) models. Econom Theory 21:946–961
  • Lee S, Lee Y, Chen CWS (2016) Parameter change test for zero-inflated generalized Poisson autoregressive models. Statistics 50:540–557
  • Li Q, Lian H, Zhu F (2016) Robust closed-form estimators for the integer-valued GARCH(1,1) model. Comput Stat Data Anal 101:209–225
  • Liu T, Yuan X (2013) Random rounded integer-valued autoregressive conditional heteroskedastic process. Stat Pap 54:645–683
  • Neumann MH (2011) Absolute regularity and ergodicity of Poisson count processes. Bernoulli 17:1268–1284
  • Pedersen RS (2016) Targeting estimation of CCC-GARCH models with infinite fourth moments. Econom Theory 32:498–531
  • Pedersen RS, Rahbek A (2014) Multivariate variance targeting in the BEKK-GARCH model. Econom J 17:24–55
  • Scotto MG, Weiß CH, Gouveia S (2015) Thinning-based models in the analysis of integer-valued time series: a review. Stat Model 15:590–618
  • Shmueli G, Minka TP, Kadane JB, Borle S, Boatwright P (2005) A useful distribution for fitting discrete data: revival of the Conway–Maxwell–Poisson distribution. J R Stat Soc Ser C 54:127–142
  • Vaynman I, Beare BK (2014) Stable limit theory for the variance targeting estimator. Adv Econom 33:639–672
  • Wang X, Wang D, Zhang H (2017) Poisson autoregressive process modeling via the penalized conditional maximum likelihood procedure. Stat Pap, forthcoming. doi:10.1007/s00362-017-0938-0
  • Weiß CH, Homburg A, Puig P (2017) Testing for zero inflation and overdispersion in INAR(1) models. Stat Pap, forthcoming. doi:10.1007/s00362-016-0851-y
  • Yang K, Wang D, Jia B, Li H (2017) An integer-valued threshold autoregressive process based on negative binomial thinning. Stat Pap, forthcoming. doi:10.1007/s00362-016-0808-1
  • Zhu F (2011) A negative binomial integer-valued GARCH model. J Time Ser Anal 32:54–67
  • Zhu F (2012a) Modeling overdispersed or underdispersed count data with generalized Poisson integer-valued GARCH models. J Math Anal Appl 389:58–71
  • Zhu F (2012b) Zero-inflated Poisson and negative binomial integer-valued GARCH models. J Stat Plan Inference 142:826–839
  • Zhu F (2012c) Modeling time series of counts with COM-Poisson INGARCH models. Math Comput Model 56:191–203
  • Zhu F, Wang D (2011) Estimation and testing for a Poisson autoregressive model. Metrika 73:211–230
  • Zhu F, Wang D (2015) Empirical likelihood for linear and log-linear INGARCH models. J Korean Stat Soc 44:150–160
  • Zucchini W, MacDonald IL (2009) Hidden Markov models for time series. Chapman and Hall/CRC, Boca Raton

Acknowledgements

The authors sincerely thank the Editor and two anonymous referees for their comprehensive reviews, which led to substantial improvement of the paper. This work is supported by the National Natural Science Foundation of China (Nos. 11731015, 11371168), the Science and Technology Developing Plan of Jilin Province (No. 20170101057JC), the Science and Technology Program of Jilin Educational Department during the “13th Five-Year” Plan Period (No. 2016-399), and the Cultivation Plan for Excellent Young Scholar Candidates of Jilin University.

Author information

Corresponding author

Correspondence to Fukang Zhu.

Appendix

Define

$$\begin{aligned} L_n(\phi )=\frac{1}{n}\sum _{t=1}^nl_t(\mu _0,\phi ),~~~l_t(\mu ,\phi )=l_t(\theta )=X_t\ln \lambda _t-\lambda _t. \end{aligned}$$

For \(t\ge 1\), define

$$\begin{aligned} \tilde{l}_t=\tilde{l}_t(\theta )=X_t\ln \tilde{\lambda }_t-\tilde{\lambda }_t, \end{aligned}$$

where \(\tilde{\lambda }_t=\tilde{\lambda }_t(\theta )=\mu (1-\alpha -\beta )+\alpha X_{t-1}+\beta \tilde{\lambda }_{t-1}\) with fixed initial values \(X_0\) and \(\tilde{\lambda }_0\). Then we know that \(\lambda _{t,n}(\phi )=\tilde{\lambda }_t(\hat{\mu }_n,\phi )\).
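
To make the criterion concrete, here is a minimal numerical sketch of MTE under the Poisson INGARCH(1, 1) specification: \(\mu \) is estimated by the sample mean, and the profile criterion built from \(\tilde{l}_t(\hat{\mu }_n,\phi )\) is maximized over \(\phi =(\alpha ,\beta )\). The function names, the initialization \(\tilde{\lambda }_0=\hat{\mu }_n\), and the penalty enforcing \(\alpha +\beta <1\) are our illustrative choices, not the paper's implementation.

```python
import numpy as np
from scipy.optimize import minimize

def neg_profile_criterion(phi, x, mu_hat):
    """-sum_t l~_t(mu_hat, phi), where l~_t = X_t ln(lam~_t) - lam~_t and
    lam~_t = mu_hat*(1 - alpha - beta) + alpha*X_{t-1} + beta*lam~_{t-1},
    started from the fixed initial value lam~_0 = mu_hat (our choice)."""
    alpha, beta = phi
    if alpha + beta >= 1.0:        # outside the stationarity region
        return 1e12                # large finite penalty for the optimizer
    lam = mu_hat                   # fixed initial value lam~_0
    omega = mu_hat * (1.0 - alpha - beta)
    crit = 0.0
    for t in range(1, len(x)):
        lam = omega + alpha * x[t - 1] + beta * lam
        crit += x[t] * np.log(lam) - lam
    return -crit

def mte_fit(x):
    """Mean targeting estimation: mu_hat = sample mean, then maximize the
    profile Poisson likelihood over phi = (alpha, beta)."""
    mu_hat = x.mean()
    res = minimize(neg_profile_criterion, x0=np.array([0.2, 0.3]),
                   args=(x, mu_hat), method="L-BFGS-B",
                   bounds=[(1e-6, 1 - 1e-6), (1e-6, 1 - 1e-6)])
    return mu_hat, res.x[0], res.x[1]

# Illustration on a simulated Poisson INGARCH(1,1) path
# with (mu, alpha, beta) = (2, 0.3, 0.4).
rng = np.random.default_rng(0)
n, mu0, a0, b0 = 5000, 2.0, 0.3, 0.4
x, lam = np.empty(n), mu0
for t in range(n):
    x[t] = rng.poisson(lam)
    lam = mu0 * (1 - a0 - b0) + a0 * x[t] + b0 * lam
print(mte_fit(x))   # should be close to (2, 0.3, 0.4)
```

Only the two-dimensional criterion in \(\phi \) is optimized numerically here, which is what makes the procedure less sensitive to initial values than a full three-dimensional MLE.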

In this appendix, the letters K and \(\rho \) denote generic constants whose values may vary from line to line but always satisfy \(K>0\) and \(0<\rho <1\). Note from Lemma 1 in Ferland et al. (2006) that \(E(X_t^k)\) is finite for all k, so, unlike in the classical GARCH model, there is no difficulty with the existence of moments. We use arguments similar to those of Francq and Zakoïan (2004) and Francq et al. (2011) to prove Theorem 1.

Proof of consistency in Theorem 1

The almost sure convergence of \(\hat{\mu }_n\) to \(\mu _0\) is a direct consequence of the ergodic theorem. To show the strong consistency of \(\hat{\phi }_n\), it suffices to establish the following results:

(i) \(\displaystyle \lim _{n\rightarrow \infty }\sup _{\phi \in \Phi }|L_n(\phi )-L_n^*(\phi )|=0\) a.s.;

(ii) if \(\lambda _t(\mu _0,\phi )=\lambda _t(\mu _0,\phi _0)\) a.s., then \(\phi =\phi _0\);

(iii) if \(\phi \ne \phi _0\), then \(El_t(\mu _0,\phi )<El_t(\mu _0,\phi _0)\);

(iv) any \(\phi \ne \phi _0\) has a neighbourhood \(V(\phi )\) such that \(\displaystyle \limsup _{n\rightarrow \infty } \sup _{\phi ^*\in V(\phi )}L_n^*(\phi ^*)<El_1(\mu _0,\phi _0)\) a.s.

We first show (i). Note that

$$\begin{aligned} \lambda _{t,n}(\phi )-\lambda _t(\mu _0,\phi )&=(1-\alpha -\beta )(\hat{\mu }_n-\mu _0)+\beta (\lambda _{t-1,n}(\phi )-\lambda _{t-1}(\mu _0,\phi ))\\&=(1-\alpha -\beta )\frac{1-\beta ^t}{1-\beta }(\hat{\mu }_n-\mu _0)+\beta ^t(\tilde{\lambda }_0-\lambda _0), \end{aligned}$$

so \(\displaystyle \sup _{\phi \in \Phi }|\lambda _{t,n}(\phi )-\lambda _t(\mu _0,\phi )|\le K\rho ^t+o(1)\) a.s., since \(\hat{\mu }_n\) converges to \(\mu _0\) a.s. Here K is a measurable function of \(\{X_k, k<0\}\); for almost sure convergence the trajectory is fixed on a set of probability one while n tends to infinity, so K can be treated as a constant that does not vary with n (see Francq et al. (2011) for similar arguments). Since \(\ln x\le x-1\) for all \(x>0\), we have

$$\begin{aligned} \sup _{\phi \in \Phi }|L_n(\phi )-L_n^*(\phi )|&\le \frac{1}{n}\sum _{t=1}^n\sup _{\phi \in \Phi } \left\{ X_t\left| \ln \left( 1+\frac{\lambda _t(\mu _0,\phi )-\lambda _{t,n}(\phi )}{\lambda _{t,n}(\phi )}\right) \right| +|\lambda _{t,n}(\phi )-\lambda _t(\mu _0,\phi )|\right\} \\&\le \frac{K}{\hat{\mu }_n(1-\alpha _U-\beta _U)}\frac{1}{n}\sum _{t=1}^n\rho ^tX_t+K\frac{1}{n}\sum _{t=1}^n\rho ^t+o(1), \end{aligned}$$

then (i) holds using arguments similar to Francq and Zakoïan (2004, p. 616).

Note that using the parametrization

$$\begin{aligned} \lambda _t(\theta )=\mu (1-\alpha -\beta )\frac{1}{1-\beta }+\alpha \sum _{k=1}^\infty \beta ^{k-1}X_{t-k} \end{aligned}$$

and the technique in the proof of Theorem 3 of Davis and Liu (2016), it can be shown that if \(\lambda _t(\theta )=\lambda _t(\theta _0)\) a.s., then \(\theta =\theta _0\), which proves (ii).

Next we show (iii). Let \(a^*>0\) be a constant. For any \(x>0\), the function \(f(x)\equiv a^*\ln x-x\) attains its maximum at \(x=a^*\) (see the display below). Note that \(El_t(\theta )=E(X_t\ln \lambda _t(\theta )-\lambda _t(\theta ))=E(\lambda _t(\theta _0)\ln \lambda _t(\theta )-\lambda _t(\theta ))\). Thus \(El_t(\theta )\) attains its maximum at \(\lambda _t(\theta )=\lambda _t(\theta _0)\), which by (ii) holds if and only if \(\theta =\theta _0\); this proves (iii).
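
For completeness, the first-order condition behind this claim:

$$\begin{aligned} f'(x)=\frac{a^*}{x}-1=0\iff x=a^*,\qquad f''(x)=-\frac{a^*}{x^2}<0\quad \text {for all } x>0, \end{aligned}$$

so f is strictly concave on \((0,\infty )\) with unique maximizer \(x=a^*\).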

To show (iv), let \(V_k(\phi )\) be the open ball with center \(\phi \) and radius \(1/k\). Using (i), the ergodicity of the process, the monotone convergence theorem and (iii), we obtain

$$\begin{aligned}&\limsup _{n\rightarrow \infty }\sup _{\phi ^*\in V_k(\phi )\cap \Phi }L_n^*(\phi ^*)\\&\quad \le \limsup _{n\rightarrow \infty }\sup _{\phi ^*\in V_k(\phi )\cap \Phi }L_n(\phi ^*)+\limsup _{n\rightarrow \infty }\sup _{\phi \in \Phi }|L_n(\phi )-L_n^*(\phi )|\\&\quad \le \limsup _{n\rightarrow \infty }\frac{1}{n}\sum _{t=1}^n\sup _{\phi ^*\in V_k(\phi )\cap \Phi }l_t(\mu _0,\phi ^*)\\&\quad =E\left( \sup _{\phi ^*\in V_k(\phi )\cap \Phi }l_1(\mu _0,\phi ^*)\right) \\&\quad <El_1(\mu _0,\phi _0) \end{aligned}$$

for k large enough, when \(\phi \ne \phi _0\).\(\square \)

Proof of asymptotic normality in Theorem 1

Write \(\phi =(\phi _1,\phi _2)^\top ,\theta =(\theta _1,\theta _2,\theta _3)^\top \). Noting that \(l_{t,n}(\phi )=\tilde{l}_t(\hat{\mu }_n,\phi )\), we have

$$\begin{aligned} (0,0)^{\top }&=\frac{1}{\sqrt{n}}\sum _{t=1}^n\frac{\partial }{\partial \phi }l_{t,n}(\hat{\phi }_n) =\frac{1}{\sqrt{n}}\sum _{t=1}^n\frac{\partial }{\partial \phi }\tilde{l}_t(\hat{\theta }_n)\\&=\frac{1}{\sqrt{n}}\sum _{t=1}^n\frac{\partial }{\partial \phi }\tilde{l}_t(\theta _0)+J_n\sqrt{n}(\hat{\phi }_n-\phi _0)+K_n\sqrt{n}(\hat{\mu }_n-\mu _0), \end{aligned}$$

where

$$\begin{aligned} J_n= & {} \left( \frac{1}{n}\sum _{t=1}^n\frac{\partial ^2}{\partial \phi _i\partial \phi _j}\tilde{l}_t(\hat{\theta }_{i+1,j+1}^*)\right) _{2\times 2},\\ K_n= & {} \left( \frac{1}{n}\sum _{t=1}^n\frac{\partial ^2}{\partial \mu \partial \phi _1}\tilde{l}_t(\hat{\theta }_{12}^*), \frac{1}{n}\sum _{t=1}^n\frac{\partial ^2}{\partial \mu \partial \phi _2}\tilde{l}_t(\hat{\theta }_{13}^*)\right) ^\top \end{aligned}$$

with \(\Vert \hat{\theta }_{ij}^*-\theta _0\Vert \le \Vert \hat{\theta }_n-\theta _0\Vert \) for \(i=1,2\), \(j=1,2,3\).

We will show that

(i) \(\displaystyle E\sup _{\theta \in \Theta }\left\| \frac{\partial l_t(\theta )}{\partial \theta } \frac{\partial l_t(\theta )}{\partial \theta ^\top }\right\|<\infty \), \(\displaystyle E\sup _{\theta \in \Theta }\left\| \frac{\partial ^2l_t(\theta )}{\partial \theta \partial \theta ^\top }\right\|<\infty \), and \(\displaystyle E\sup _{\theta \in \Theta }\left| \frac{\partial ^3l_t(\theta )}{\partial \theta _i\partial \theta _j\partial \theta _k}\right| <\infty \) for \(i,j,k=1,2,3\);

(ii) \(\displaystyle A\equiv \mathrm{Var}\left( \frac{\partial l_t(\theta _0)}{\partial \theta }\right) =E\left( \frac{1}{\lambda _t(\theta _0)} \frac{\partial \lambda _t(\theta _0)}{\partial \theta }\frac{\partial \lambda _t(\theta _0)}{\partial \theta ^\top }\right) \) is nonsingular;

(iii) there exists a neighbourhood \(\mathcal {V}(\theta _0)\) of \(\theta _0\) such that \(\displaystyle \left\| \frac{1}{\sqrt{n}}\sum _{t=1}^n\left\{ \frac{\partial l_t(\theta _0)}{\partial \theta } -\frac{\partial \tilde{l}_t(\theta _0)}{\partial \theta }\right\} \right\| \rightarrow 0\) and \(\displaystyle \sup _{\theta \in \mathcal {V}(\theta _0)}\left\| \frac{1}{n}\sum _{t=1}^n\left\{ \frac{\partial ^2l_t(\theta )}{\partial \theta \partial \theta ^\top } -\frac{\partial ^2\tilde{l}_t(\theta )}{\partial \theta \partial \theta ^\top }\right\} \right\| \rightarrow 0\) in probability as \(n\rightarrow \infty \);

(iv) \(\displaystyle \frac{1}{n}\sum _{t=1}^n\frac{\partial ^2l_t(\theta _{ij}^*)}{\partial \theta _i\partial \theta _j}\rightarrow -A(i,j)\) a.s.;

(v) \(\displaystyle \left( \begin{array}{c}\sqrt{n}(\hat{\mu }_n-\mu _0)\\ \displaystyle \frac{1}{\sqrt{n}} \sum _{t=1}^n\frac{\partial l_t(\theta _0)}{\partial \phi }\end{array}\right) {\mathop {\longrightarrow }\limits ^{d}}\mathcal {N}\left( 0,\left( \begin{array}{cc}b&{}0\\ 0&{}J\end{array}\right) \right) \).

First, we will prove (i). Note that

$$\begin{aligned} \frac{\partial l_t(\theta )}{\partial \theta }&=\left( \frac{X_t}{\lambda _t(\theta )}-1\right) \frac{\partial \lambda _t(\theta )}{\partial \theta }, \quad \frac{\partial ^2l_t(\theta )}{\partial \theta \partial \theta ^\top }=\left( \frac{X_t}{\lambda _t(\theta )}-1\right) \frac{\partial ^2\lambda _t(\theta )}{\partial \theta \partial \theta ^\top }\\&\quad -\,\frac{X_t}{\lambda _t^2(\theta )}\frac{\partial \lambda _t(\theta )}{\partial \theta }\frac{\partial \lambda _t(\theta )}{\partial \theta ^\top },\\ \frac{\partial ^3l_t(\theta )}{\partial \theta _i\partial \theta _j\partial \theta _k}&= \frac{-X_t}{\lambda _t^2(\theta )}\left( \frac{\partial ^2\lambda _t(\theta )}{\partial \theta _i\partial \theta _j} \frac{\partial \lambda _t(\theta )}{\partial \theta _k} +\frac{\partial ^2\lambda _t(\theta )}{\partial \theta _i\partial \theta _k}\frac{\partial \lambda _t(\theta )}{\partial \theta _j} +\frac{\partial ^2\lambda _t(\theta )}{\partial \theta _j\partial \theta _k}\frac{\partial \lambda _t(\theta )}{\partial \theta _i}\right) \\&\quad +\,\frac{2X_t}{\lambda _t^3(\theta )}\frac{\partial \lambda _t(\theta )}{\partial \theta _i}\frac{\partial \lambda _t(\theta )}{\partial \theta _j} \frac{\partial \lambda _t(\theta )}{\partial \theta _k} +\left( \frac{X_t}{\lambda _t(\theta )}-1\right) \frac{\partial ^3\lambda _t(\theta )}{\partial \theta _i\partial \theta _j\partial \theta _k}. \end{aligned}$$

It can be shown that

$$\begin{aligned} \frac{X_t}{\lambda _t^2(\theta )}\le \frac{X_t}{\omega _L^2},~~~\frac{X_t}{\lambda _t^3(\theta )}\le \frac{X_t}{\omega _L^3}, \quad \left| \frac{X_t}{\lambda _t(\theta )}-1\right| \le \frac{X_t}{\omega _L}+1, \end{aligned}$$

where \(\omega _L=\mu _L(1-\alpha _U-\beta _U)>0\). The most involved derivatives are those with respect to \(\theta _3=\beta \). Using arguments similar to those in the proof of Lemma 3.4 of Fokianos et al. (2009), together with the fact that \(E(X_t^k)\) is finite for all k, we obtain (i).

If A is singular, there exists \(x=(x_1,x_2,x_3)^\top \ne 0\) such that \(x^\top \dfrac{\partial \lambda _t(\theta _0)}{\partial \theta }=0\). Since

$$\begin{aligned} \frac{\partial \lambda _t(\theta )}{\partial \theta }=\frac{\partial \mu (1-\alpha -\beta )}{\partial \theta } +X_{t-1}\frac{\partial \alpha }{\partial \theta }+\lambda _{t-1}(\theta )\frac{\partial \beta }{\partial \theta } +\beta \frac{\partial \lambda _{t-1}(\theta )}{\partial \theta }, \end{aligned}$$

the strict stationarity of \(\lambda _t(\theta _0)\) implies \(x_1(1-\alpha _0-\beta _0)-\mu _0(x_2+x_3)+x_2X_{t-1}+x_3\lambda _{t-1}(\theta _0)=0\). This means that \(x_2X_{t-1}\) is a function of \(\{X_{t-i},i>1\}\), which is impossible unless \(x_2=0\). Then \(x_3=0\) because \(\lambda _{t-1}(\theta _0)\) is not almost surely constant, and in turn \(x_1=0\). This contradiction shows that (ii) holds.

Note that \(\lambda _t(\theta )-\tilde{\lambda }_t(\theta )=\beta ^t(\lambda _0-\tilde{\lambda }_0)\). Choosing \(\mathcal {V}(\theta _0)\) such that \(\phi \in \Phi \) for all \(\theta \in \mathcal {V}(\theta _0)\), we have

$$\begin{aligned}&\sup _{\theta \in \mathcal {V}(\theta _0)}|\lambda _t(\theta )-\tilde{\lambda }_t(\theta )|\le K\rho ^t, \quad \sup _{\theta \in \mathcal {V}(\theta _0)}\left\| \frac{\partial \lambda _t(\theta )}{\partial \theta } -\frac{\partial \tilde{\lambda }_t(\theta )}{\partial \theta }\right\| \le K\rho ^t, \\&\sup _{\theta \in \mathcal {V}(\theta _0)}\left| \dfrac{1}{\lambda _t(\theta )}-\dfrac{1}{\tilde{\lambda }_t(\theta )}\right| \le K\rho ^t. \end{aligned}$$

Thus we obtain

$$\begin{aligned}&\left\| \frac{\partial l_t(\theta )}{\partial \theta }-\frac{\partial \tilde{l}_t(\theta )}{\partial \theta }\right\| \\&\quad =\left\| \left( \frac{X_t}{\lambda _t(\theta )}-1\right) \frac{\partial \lambda _t(\theta )}{\partial \theta } -\left( \frac{X_t}{\tilde{\lambda }_t(\theta )}-1\right) \frac{\partial \tilde{\lambda }_t(\theta )}{\partial \theta }\right\| \\&\quad =\left\| \left( \frac{X_t}{\lambda _t(\theta )}-\frac{X_t}{\tilde{\lambda }_t(\theta )}\right) \frac{\partial \lambda _t(\theta )}{\partial \theta } -\left( \frac{X_t}{\tilde{\lambda }_t(\theta )}-1\right) \left( \frac{\partial \tilde{\lambda }_t(\theta )}{\partial \theta } -\frac{\partial \lambda _t(\theta )}{\partial \theta }\right) \right\| \\&\quad \le K\rho ^t(1+X_t)\left( 1+\left\| \frac{\partial \lambda _t(\theta )}{\partial \theta }\right\| \right) , \end{aligned}$$
$$\begin{aligned} \sup _{\theta \in \mathcal {V}(\theta _0)}\left\| \frac{1}{\sqrt{n}}\sum _{t=1}^n\left\{ \frac{\partial l_t(\theta )}{\partial \theta } -\frac{\partial \tilde{l}_t(\theta )}{\partial \theta }\right\} \right\| \le \frac{K}{\sqrt{n}}\sum _{t=1}^n\rho ^t(1+X_t)\sup _{\theta \in \mathcal {V}(\theta _0)}\left( 1+\left\| \frac{\partial \lambda _t(\theta )}{\partial \theta }\right\| \right) , \end{aligned}$$

and the right-hand side converges in probability to 0, which can be shown using arguments similar to Francq and Zakoïan (2004, p. 616); this technique was already used in the consistency proof of this theorem. The same arguments, with second derivatives in place of first derivatives, establish (iii).

As in the proof of (4.36) in Francq and Zakoïan (2004), (iv) follows from the Taylor expansion of \(\displaystyle \frac{1}{n}\sum _{t=1}^n\frac{\partial ^2l_t(\theta _{ij}^*)}{\partial \theta _i\partial \theta _j}\) around \(\theta _0\), the convergence of \(\theta _{ij}^*\) to \(\theta _0\), the ergodic theorem and (i).

Note that \(\lambda _t(\theta _0)=\mu _0(1-\alpha _0-\beta _0)+\alpha _0X_{t-1}+\beta _0\lambda _{t-1}(\theta _0)\). Then we have

$$\begin{aligned} \frac{1}{n}\sum _{t=1}^n\lambda _t(\theta _0)&=\mu _0(1-\alpha _0-\beta _0)+\alpha _0\frac{1}{n}\sum _{t=1}^nX_{t-1}+\beta _0 \frac{1}{n}\sum _{t=1}^n\lambda _{t-1}(\theta _0)\\&=\mu _0(1-\alpha _0-\beta _0)+\alpha _0\hat{\mu }_n+\beta _0\frac{1}{n}\sum _{t=1}^n\lambda _t(\theta _0)+o_p(n^{-{1\over 2}}),\\ \frac{1}{n}\sum _{t=1}^n\lambda _t(\theta _0)&=\frac{\mu _0(1-\alpha _0-\beta _0)}{1-\beta _0}+\frac{\alpha _0}{1-\beta _0}\hat{\mu }_n+o_p(n^{-{1\over 2}}),\\ \hat{\mu }_n&=\frac{\mu _0(1-\alpha _0-\beta _0)}{1-\beta _0} +\frac{1}{n}\sum _{t=1}^n(X_t-\lambda _t(\theta _0))+\frac{\alpha _0}{1-\beta _0}\hat{\mu }_n+o_p(n^{-{1\over 2}}), \end{aligned}$$

thus we obtain

$$\begin{aligned} \hat{\mu }_n=\mu _0+\frac{1-\beta _0}{1-\alpha _0-\beta _0}\frac{1}{n}\sum _{t=1}^n(X_t-\lambda _t(\theta _0))+o_p(n^{-{1\over 2}}). \end{aligned}$$

Based on the above results, we have the representation

$$\begin{aligned} \left( \begin{array}{c}\sqrt{n}(\hat{\mu }_n-\mu _0)\\ \displaystyle \frac{1}{\sqrt{n}}\sum _{t=1}^n\frac{\partial l_t(\theta _0)}{\partial \phi }\end{array}\right) =\frac{1}{\sqrt{n}}\sum _{t=1}^n(X_t-\lambda _t(\theta _0))Z_t+o_p(1), \quad Z_t=\left( \begin{array}{c}\displaystyle \frac{1-\beta _0}{1-\alpha _0-\beta _0}\\ \displaystyle \frac{1}{\lambda _t(\theta _0)} \frac{\partial \lambda _t(\theta _0)}{\partial \phi }\end{array}\right) . \end{aligned}$$

Note that \(E((X_t-\lambda _t(\theta _0))Z_t|\mathcal {F}_{t-1})=0\). Moreover,

$$\begin{aligned} \frac{\partial \lambda _t(\theta _0)}{\partial \alpha }&=X_{t-1}-\mu _0+\beta _0\frac{\partial \lambda _{t-1}(\theta _0)}{\partial \alpha } =\sum _{i=0}^\infty \beta _0^i(X_{t-1-i}-\mu _0),\\ \frac{\partial \lambda _t(\theta _0)}{\partial \beta }&=\sum _{i=0}^\infty \beta _0^i(\lambda _{t-1-i}(\theta _0)-\mu _0), \end{aligned}$$

thus \(E\displaystyle \left( \frac{\partial \lambda _t(\theta _0)}{\partial \phi }\right) =0\). It follows that \(\mathrm{Var}((X_t-\lambda _t(\theta _0))Z_t)=\left( \begin{array}{cc}b&{}0\\ 0&{}J\end{array}\right) \). Using the martingale difference central limit theorem and the Cramér–Wold device, we obtain the asymptotic normality in (v).

Now we complete the proof of the theorem. From (ii)–(iv) it follows that the matrix \(J_n\) is a.s. invertible for sufficiently large n, and then

$$\begin{aligned} \sqrt{n}(\hat{\phi }_n-\phi _0)=-J_n^{-1}\left\{ \frac{1}{\sqrt{n}}\sum _{t=1}^n\frac{\partial }{\partial \phi }\tilde{l}_t(\theta _0) +K_n\sqrt{n}(\hat{\mu }_n-\mu _0)\right\} . \end{aligned}$$

Using (iii), we obtain that

$$\begin{aligned} \sqrt{n}\left( \begin{array}{c}\hat{\mu }_n-\mu _0\\ \hat{\phi }_n-\phi _0\end{array}\right) =\left( \begin{array}{cc} 1&{}0\\ -J_n^{-1}K_n&{}-J_n^{-1}\end{array}\right) \left( \begin{array}{c}\sqrt{n}(\hat{\mu }_n-\mu _0)\\ \displaystyle \frac{1}{\sqrt{n}} \sum _{t=1}^n\frac{\partial l_t(\theta _0)}{\partial \phi }\end{array}\right) +o_p(1). \end{aligned}$$

By (iv), (v) and Slutsky’s lemma, \(\sqrt{n}(\hat{\theta }_n-\theta _0)\) is asymptotically \(\mathcal {N}(0,\Sigma )\) distributed, with

$$\begin{aligned} \Sigma =\left( \begin{array}{cc} 1&{}0\\ -J^{-1}K&{}-J^{-1}\end{array}\right) \left( \begin{array}{cc} b&{}0\\ 0&{}J\end{array}\right) \left( \begin{array}{cc}1&{}-K^\top J^{-1}\\ 0&{}-J^{-1}\end{array}\right) . \end{aligned}$$

\(\square \)
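
For practical use, \(\Sigma \) can be estimated by plugging \(\hat{\theta }_n\) into the ingredients above. The sketch below runs the derivative recursions for \(\partial \lambda _t/\partial \theta \), forms a plug-in estimate of A (whose blocks give J and K), and assembles the sandwich matrix; the closed form for b uses the Poisson property \(\mathrm{Var}(X_t\mid \mathcal {F}_{t-1})=\lambda _t\) together with the representation above, giving \(b=\mu _0((1-\beta _0)/(1-\alpha _0-\beta _0))^2\). The function name (mte_acov) and the plug-in choices are ours, for illustration only.

```python
import numpy as np

def mte_acov(x, mu_hat, alpha_hat, beta_hat):
    """Plug-in sketch of the sandwich covariance Sigma in Theorem 1:
    Sigma = M diag(b, J) M^T with M = [[1, 0], [-J^{-1}K, -J^{-1}]],
    where J and K are the phi-phi and phi-mu blocks of
    A = E[(1/lambda_t)(d lambda_t/d theta)(d lambda_t/d theta)^T]."""
    n = len(x)
    omega = mu_hat * (1 - alpha_hat - beta_hat)
    lam = mu_hat                    # lambda~_0, a fixed initial value
    dlam = np.zeros(3)              # d lambda_t / d(mu, alpha, beta)
    A = np.zeros((3, 3))
    for t in range(1, n):
        # derivative recursions, cf. the proof of (ii) above
        dlam = np.array([
            (1 - alpha_hat - beta_hat) + beta_hat * dlam[0],
            x[t - 1] - mu_hat + beta_hat * dlam[1],
            lam - mu_hat + beta_hat * dlam[2],
        ])
        lam = omega + alpha_hat * x[t - 1] + beta_hat * lam
        A += np.outer(dlam, dlam) / lam
    A /= n - 1
    J = A[1:, 1:]                   # phi-phi block
    K = A[1:, 0]                    # phi-mu block
    b = mu_hat * ((1 - beta_hat) / (1 - alpha_hat - beta_hat)) ** 2
    Jinv = np.linalg.inv(J)
    M = np.zeros((3, 3))
    M[0, 0] = 1.0
    M[1:, 0] = -Jinv @ K
    M[1:, 1:] = -Jinv
    D = np.diag([b, 0.0, 0.0])
    D[1:, 1:] = J
    return M @ D @ M.T   # asymptotic covariance of sqrt(n)(theta_hat - theta_0)
```

Standard errors for \(\hat{\theta }_n\) are then the square roots of the diagonal of mte_acov(x, *mte_fit(x)) / n, reusing the mte_fit sketch above.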

Proof of Corollary 2

Let

$$\begin{aligned} S_t=(X_t-\lambda _t(\theta _0))\left( \frac{1}{\lambda _t(\theta _0)}\frac{\partial \lambda _t(\theta _0)}{\partial \mu }, \frac{1}{\lambda _t(\theta _0)}\frac{\partial \lambda _t(\theta _0)}{\partial \phi },\frac{1-\beta _0}{1-\alpha _0-\beta _0}\right) ^\top , \end{aligned}$$

then we observe that

$$\begin{aligned} E(S_tS_t^\top )=\left( \begin{array}{ccc} d&{}K^\top &{}1\\ K&{}J&{}0\\ 1&{}0&{}b\end{array}\right) . \end{aligned}$$

First, by the partitioned-inverse formula we have

$$\begin{aligned} \displaystyle \Sigma ^*=\left( \begin{array}{cc}d&{}K^\top \\ K&{}J\end{array}\right) ^{-1}= \left( \begin{array}{cc}a&{}-aK^\top J^{-1}\\ -aJ^{-1}K&{}J^{-1}+aJ^{-1}KK^\top J^{-1}\end{array}\right) , \end{aligned}$$

where \(a=(d-K^\top J^{-1}K)^{-1}\); thus \(\Sigma -\Sigma ^*=(b-a)CC^\top \) with \(C=(1,-K^\top J^{-1})^\top \).

Second, observe that \(\Sigma ^*=[E(GS_tS_t^\top G^\top )]^{-1},\Sigma =E(HS_tS_t^\top H^\top )\), where

$$\begin{aligned} G=(I_3~~0)_{3\times 4},~~~H=\left( \begin{array}{ccc}0&{}0&{}1\\ 0&{}J^{-1}&{}-J^{-1}K\end{array}\right) _{3\times 4} \end{aligned}$$

with \(I_k\) denoting the identity matrix of size k. Note that \(GE(S_tS_t^\top )H^\top =I_3\). Let \(D_t=\Sigma ^*GS_t-HS_t\), then we have

$$\begin{aligned} E(D_tD_t^\top )=\Sigma ^*+\Sigma -\Sigma ^*GE(S_tS_t^\top )H^\top -HE(S_tS_t^\top )G^\top \Sigma ^*=\Sigma -\Sigma ^*, \end{aligned}$$

which shows that \(\Sigma -\Sigma ^*\) is positive semidefinite. However, \(\Sigma -\Sigma ^*\) is not positive definite, because the matrix \(E(D_tD_t^\top ) =(\Sigma ^*G-H)E(S_tS_t^\top )(\Sigma ^*G-H)^\top \) has rank at most one, since

$$\begin{aligned} \Sigma ^*G-H=\left( \begin{array}{ccc}a&{}\quad -aK^\top J^{-1}&{}\quad -1\\ -aJ^{-1}K&{}\quad aJ^{-1}KK^\top J^{-1}&{}\quad J^{-1}K\end{array}\right) =(aCC^\top ~-C) \end{aligned}$$

is of rank 1. \(\square \)
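
A quick numerical sanity check of this block algebra (not part of the proof): with randomly generated ingredients of the right shapes, \(\Sigma -\Sigma ^*=(b-a)CC^\top \) and the difference has rank one. The construction below is ours; it simply instantiates the formulas for \(\Sigma \), \(\Sigma ^*\) and C displayed above.

```python
import numpy as np

rng = np.random.default_rng(1)
J = rng.normal(size=(2, 2)); J = J @ J.T + 2 * np.eye(2)  # SPD 2x2 block
K = rng.normal(size=(2, 1))
Jinv = np.linalg.inv(J)
d = (K.T @ Jinv @ K).item() + 0.7                         # ensures a > 0
b = 1.9
a = 1.0 / (d - (K.T @ Jinv @ K).item())

# Sigma* via the partitioned-inverse formula
Sigma_star = np.block([[np.array([[a]]), -a * K.T @ Jinv],
                       [-a * Jinv @ K, Jinv + a * Jinv @ K @ K.T @ Jinv]])
# check that it really inverts [[d, K^T], [K, J]]
print(np.allclose(Sigma_star @ np.block([[np.array([[d]]), K.T], [K, J]]),
                  np.eye(3)))                             # True

# Sigma = M diag(b, J) M^T with M = [[1, 0], [-J^{-1}K, -J^{-1}]]
M = np.block([[np.array([[1.0]]), np.zeros((1, 2))],
              [-Jinv @ K, -Jinv]])
D = np.block([[np.array([[b]]), np.zeros((1, 2))],
              [np.zeros((2, 1)), J]])
Sigma = M @ D @ M.T

C = np.vstack([np.array([[1.0]]), -Jinv @ K])             # C = (1, -K^T J^{-1})^T
print(np.allclose(Sigma - Sigma_star, (b - a) * C @ C.T)) # True
print(np.linalg.matrix_rank(Sigma - Sigma_star))          # 1
```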

Proof of Corollary 3

The first convergence in distribution is a direct consequence of Theorem 1 and the delta method. In view of Corollary 2 and the delta method, we have \(\sqrt{n}(\varphi (\tilde{\theta }_n)-\varphi (\theta _0)){\mathop {\longrightarrow }\limits ^{d}}\mathcal {N}(0,s^{*2})\), where

$$\begin{aligned} s^{*2}=\frac{\partial \varphi (\theta _0)}{\partial \theta ^\top }\Sigma ^*\frac{\partial \varphi (\theta _0)}{\partial \theta } =s^2-(b-a)\frac{\partial \varphi (\theta _0)}{\partial \theta ^\top }CC^\top \frac{\partial \varphi (\theta _0)}{\partial \theta }=s^2. \end{aligned}$$

\(\square \)

Cite this article

Li, Q., Zhu, F. Mean targeting estimator for the integer-valued GARCH(1, 1) model. Stat Papers 61, 659–679 (2020). https://doi.org/10.1007/s00362-017-0958-9
