Abstract
The integer-valued GARCH model is commonly used in modeling time series of counts. Maximum likelihood estimation (MLE) is used to estimate the unknown parameters, but its numerical results are sensitive to the choice of initial values, a difficulty that also arises in estimating the GARCH model. To alleviate this numerical difficulty, we propose an alternative to MLE, which we call mean targeting estimation (MTE), an analogue of the variance targeting estimation used for the GARCH model. Consistency and asymptotic normality of MTE are established. Comparisons with the standard MLE are provided and the merits of the mean targeting method are discussed. In particular, it is shown that MTE can be superior to MLE for parameter estimation and prediction, whether the model is correctly specified or misspecified. We conduct numerical studies to confirm our theoretical findings and illustrate the practical utility of our proposals.
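To fix ideas, the following minimal sketch simulates a Poisson INGARCH(1,1) process and applies the two-step mean targeting idea: estimate \(\mu \) by the sample mean, then maximize the conditional likelihood over \((\alpha ,\beta )\) only. The function names, the simulation setup, and the coarse grid search are our own illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def simulate_ingarch(n, mu, alpha, beta, seed=0):
    # Poisson INGARCH(1,1): X_t | F_{t-1} ~ Poisson(lambda_t) with
    # lambda_t = mu*(1 - alpha - beta) + alpha*X_{t-1} + beta*lambda_{t-1},
    # parametrized by the stationary mean mu = E(X_t).
    rng = np.random.default_rng(seed)
    lam, x_prev = mu, rng.poisson(mu)
    x = np.empty(n, dtype=np.int64)
    for t in range(n):
        lam = mu * (1 - alpha - beta) + alpha * x_prev + beta * lam
        x[t] = x_prev = rng.poisson(lam)
    return x

def mte(x, grid=np.linspace(0.05, 0.9, 18)):
    # Step 1 (mean targeting): estimate mu by the sample mean.
    mu_hat = x.mean()
    # Step 2: maximize the Poisson conditional log-likelihood
    # sum_t [X_t * log(lambda_t) - lambda_t] over (alpha, beta) only;
    # a coarse grid search stands in for a proper numerical optimizer.
    a, b = (g.ravel() for g in np.meshgrid(grid, grid))
    keep = a + b < 0.95                  # stay inside the stationarity region
    a, b = a[keep], b[keep]
    lam = np.full(a.shape, mu_hat)
    loglik = np.zeros(a.shape)
    for t in range(1, len(x)):
        lam = mu_hat * (1 - a - b) + a * x[t - 1] + b * lam
        loglik += x[t] * np.log(lam) - lam
    k = loglik.argmax()
    return mu_hat, a[k], b[k]

x = simulate_ingarch(10000, mu=2.0, alpha=0.3, beta=0.4)
mu_hat, a_hat, b_hat = mte(x)
```

Because \(\mu \) is pinned down by the sample mean, the likelihood surface is searched over two parameters instead of three, which is what makes MTE less sensitive to initial values than full MLE.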
References
Ahmad A, Francq C (2016) Poisson QMLE of count time series models. J Time Ser Anal 37:291–314
Chen CWS, Lee S (2016) Generalized Poisson autoregressive models for time series of counts. Comput Stat Data Anal 99:51–67
Christou V, Fokianos K (2014) Quasi-likelihood inference for negative binomial time series models. J Time Ser Anal 35:55–78
Cui Y, Wu R (2016) On conditional maximum likelihood estimation for INGARCH\((p, q)\) models. Stat Probab Lett 118:1–7
Cui Y, Zheng Q (2017) Conditional maximum likelihood estimation for a class of observation-driven time series models for count data. Stat Probab Lett 123:193–201
Davis RA, Liu H (2016) Theory and inference for a class of nonlinear models with application to time series of counts. Stat Sin 26:1673–1707
Davis RA, Holan SH, Lund R, Ravishanker N (2016) Handbook of discrete-valued time series. Chapman and Hall/CRC, Boca Raton
Engle RF, Mezrich J (1996) GARCH for groups. Risk 6:36–40
Ferland R, Latour A, Oraichi D (2006) Integer-valued GARCH process. J Time Ser Anal 27:923–942
Fokianos K, Rahbek A, Tjøstheim D (2009) Poisson autoregression. J Am Stat Assoc 104:1430–1439
Francq C, Zakoïan J-M (2004) Maximum likelihood estimation of pure GARCH and ARMA-GARCH processes. Bernoulli 10:605–637
Francq C, Horváth L, Zakoïan J-M (2011) Merits and drawbacks of variance targeting in GARCH models. J Financ Econom 9:619–656
Francq C, Horváth L, Zakoïan J-M (2016) Variance targeting estimation of multivariate GARCH models. J Financ Econom 14:353–382
Gonçalves E, Mendes-Lopes N, Silva F (2015) Infinitely divisible distributions in integer-valued GARCH models. J Time Ser Anal 36:503–527
Gonçalves E, Mendes-Lopes N, Silva F (2016) Zero-inflated compound Poisson distributions in integer-valued GARCH models. Statistics 50:558–578
Huber PJ (1967) The behavior of maximum likelihood estimates under nonstandard conditions. In: Le Cam LM, Neyman J (eds) Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, vol. 1. University of California Press, Berkeley, pp 221–233
Khoo WC, Ong SH, Biswas A (2017) Modeling time series of counts with a new class of INAR(1) model. Stat Pap 58:393–416
Kristensen D, Linton O (2004) Consistent standard errors for target variance approach to GARCH estimation. Econom Theory 20:990–993
Kristensen D, Rahbek A (2005) Asymptotics of the QMLE for a class of ARCH\((q)\) models. Econom Theory 21:946–961
Lee S, Lee Y, Chen CWS (2016) Parameter change test for zero-inflated generalized Poisson autoregressive models. Statistics 50:540–557
Li Q, Lian H, Zhu F (2016) Robust closed-form estimators for the integer-valued GARCH(1,1) model. Comput Stat Data Anal 101:209–225
Liu T, Yuan X (2013) Random rounded integer-valued autoregressive conditional heteroskedastic process. Stat Pap 54:645–683
Neumann MH (2011) Absolute regularity and ergodicity of Poisson count processes. Bernoulli 17:1268–1284
Pedersen RS (2016) Targeting estimation of CCC-GARCH models with infinite fourth moments. Econom Theory 32:498–531
Pedersen RS, Rahbek A (2014) Multivariate variance targeting in the BEKK-GARCH model. Econom J 17:24–55
Scotto MG, Weiß CH, Gouveia S (2015) Thinning-based models in the analysis of integer-valued time series: a review. Stat Model 15:590–618
Shmueli G, Minka TP, Kadane JB, Borle S, Boatwright P (2005) A useful distribution for fitting discrete data: revival of the Conway-Maxwell-Poisson distribution. J R Stat Soc Ser C 54:127–142
Vaynman I, Beare BK (2014) Stable limit theory for the variance targeting estimator. Adv Econom 33:639–672
Wang X, Wang D, Zhang H (2017) Poisson autoregressive process modeling via the penalized conditional maximum likelihood procedure. Stat Pap, forthcoming. https://doi.org/10.1007/s00362-017-0938-0
Weiß CH, Homburg A, Puig P (2017) Testing for zero inflation and overdispersion in INAR(1) models. Stat Pap, forthcoming. https://doi.org/10.1007/s00362-016-0851-y
Yang K, Wang D, Jia B, Li H (2017) An integer-valued threshold autoregressive process based on negative binomial thinning. Stat Pap, forthcoming. https://doi.org/10.1007/s00362-016-0808-1
Zhu F (2011) A negative binomial integer-valued GARCH model. J Time Ser Anal 32:54–67
Zhu F (2012a) Modeling overdispersed or underdispersed count data with generalized Poisson integer-valued GARCH models. J Math Anal Appl 389:58–71
Zhu F (2012b) Zero-inflated Poisson and negative binomial integer-valued GARCH models. J Stat Plan Inference 142:826–839
Zhu F (2012c) Modeling time series of counts with COM-Poisson INGARCH models. Math Comput Model 56:191–203
Zhu F, Wang D (2011) Estimation and testing for a Poisson autoregressive model. Metrika 73:211–230
Zhu F, Wang D (2015) Empirical likelihood for linear and log-linear INGARCH models. J Korean Stat Soc 44:150–160
Zucchini W, MacDonald IL (2009) Hidden Markov models for time series. Chapman and Hall/CRC, Boca Raton
Acknowledgements
The authors sincerely thank the Editor and two anonymous referees for their comprehensive reviews, which led to substantial improvement of the paper. This work was supported by the National Natural Science Foundation of China (Nos. 11731015, 11371168), the Science and Technology Developing Plan of Jilin Province (No. 20170101057JC), the Science and Technology Program of Jilin Educational Department during the “13th Five-Year” Plan Period (No. 2016-399), and the Cultivation Plan for Excellent Young Scholar Candidates of Jilin University.
Appendix
Define
For \(t\ge 1\), define
where \(\tilde{\lambda }_t=\tilde{\lambda }_t(\theta )=\mu (1-\alpha -\beta )+\alpha X_{t-1}+\beta \tilde{\lambda }_{t-1}\) with fixed initial values \(X_0\) and \(\tilde{\lambda }_0\). Then we know that \(\lambda _{t,n}(\phi )=\tilde{\lambda }_t(\hat{\mu }_n,\phi )\).
In this appendix, the letters K and \(\rho \) denote generic constants whose values can vary but always satisfy \(K>0\) and \(0<\rho <1\). Note that from Lemma 1 in Ferland et al. (2006) we know that \(E(X_t^k)\) is finite for all k, so, unlike in the classical GARCH model, there is no difficulty concerning the existence of moments. We use arguments similar to Francq and Zakoïan (2004) and Francq et al. (2011) to prove Theorem 1.
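For reference, the criterion functions take the standard Poisson conditional log-likelihood form. This is a sketch, ignoring the constant \(-\ln X_t!\), consistent with the expression \(El_t(\theta )=E(X_t\ln \lambda _t(\theta )-\lambda _t(\theta ))\) used below, where the starred version is evaluated at the stationary intensity:

```latex
l_t(\theta) = X_t \ln \lambda_t(\theta) - \lambda_t(\theta),
\qquad
L_n(\phi) = \frac{1}{n}\sum_{t=1}^{n} l_{t,n}(\phi),
\qquad
L_n^{*}(\phi) = \frac{1}{n}\sum_{t=1}^{n} l_t(\mu_0,\phi).
```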
Proof of consistency in Theorem 1
The almost sure convergence of \(\hat{\mu }_n\) to \(\mu _0\) is a direct consequence of the ergodic theorem. To show the strong consistency of \(\hat{\phi }_n\), it suffices to establish the following results:
- (i)
\(\displaystyle \lim _{n\rightarrow \infty }\sup _{\phi \in \Phi }|L_n(\phi )-L_n^*(\phi )|=0\) a.s.;
- (ii)
If \(\lambda _t(\mu _0,\phi )=\lambda _t(\mu _0,\phi _0)\) a.s., then \(\phi =\phi _0\);
- (iii)
If \(\phi \ne \phi _0\), then \(El_t(\mu _0,\phi )<El_t(\mu _0,\phi _0)\);
- (iv)
Any \(\phi \ne \phi _0\) has a neighbourhood \(V(\phi )\) such that \(\displaystyle \limsup _{n\rightarrow \infty } \sup _{\phi ^*\in V(\phi )}L_n^*(\phi ^*)<El_1(\mu _0,\phi _0)\) a.s.
We first show (i). Note that
then we have \(\displaystyle \sup _{\phi \in \Phi }|\lambda _{t,n}(\phi )-\lambda _t(\mu _0,\phi )|\le K\rho ^t+o(1)\) a.s. since \(\hat{\mu }_n\) converges to \(\mu _0\) a.s. Note that K is a measurable function of \(\{X_k, k<0\}\). For the almost sure convergence, the trajectory is fixed in a set of probability one while n tends to infinity; thus K can be treated as a constant that does not depend on n, see Francq et al. (2011) for similar arguments. Since \(\ln x\le x-1\) for all \(x>0\), we have
then (i) holds using arguments similar to Francq and Zakoïan (2004, p. 616).
Note that using the parametrization
and the technique in the proof of Theorem 3 of Davis and Liu (2016), it can be shown that if \(\lambda _t(\theta )=\lambda _t(\theta _0)\) a.s., then \(\theta =\theta _0\), which proves (ii).
Let \(a^*\) be a positive constant. For any \(x>0\), \(f(x)\equiv a^*\ln x-x\) reaches its maximum at \(x=a^*\). Note that \(El_t(\theta )=E(X_t\ln \lambda _t(\theta )-\lambda _t(\theta ))=E(\lambda _t(\theta _0)\ln \lambda _t(\theta )-\lambda _t(\theta ))\). Thus \(El_t(\theta )\) reaches its maximum at \(\lambda _t(\theta )=\lambda _t(\theta _0)\), which is equivalent to \(\theta =\theta _0\) in terms of (ii).
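To make this step explicit: since \(f'(x)=a^*/x-1\) and \(f''(x)=-a^*/x^2<0\), f is strictly concave with unique maximizer \(x=a^*\); applying this conditionally with \(a^*=\lambda _t(\theta _0)\) gives

```latex
El_t(\theta_0) - El_t(\theta)
= E\!\left[\lambda_t(\theta_0)\ln\frac{\lambda_t(\theta_0)}{\lambda_t(\theta)}
  - \bigl(\lambda_t(\theta_0) - \lambda_t(\theta)\bigr)\right] \ge 0,
```

with equality if and only if \(\lambda _t(\theta )=\lambda _t(\theta _0)\) a.s.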
To show (iv), let \(V_k(\phi )\) be the open ball with center \(\phi \) and radius 1 / k. Using (i), the ergodicity of the process, the monotone convergence theorem and (iii), we obtain
for k large enough, when \(\phi \ne \phi _0\).\(\square \)
Proof of asymptotic normality in Theorem 1
Write \(\phi =(\phi _1,\phi _2)^\top ,\theta =(\theta _1,\theta _2,\theta _3)^\top \). Noting that \(l_{t,n}(\phi )=\tilde{l}_t(\hat{\mu }_n,\phi )\), we have
where
with \(\Vert \hat{\theta }_{ij}^*-\theta _0\Vert \le \Vert \hat{\theta }-\theta _0\Vert ,i=1,2,j=1,2,3.\)
We will show that
- (i)
\(\displaystyle E\sup _{\theta \in \Theta }\left\| \frac{\partial l_t(\theta )}{\partial \theta } \frac{\partial l_t(\theta )}{\partial \theta ^\top }\right\|<\infty , \quad E\sup _{\theta \in \Theta }\left\| \frac{\partial ^2l_t(\theta )}{\partial \theta \partial \theta ^\top }\right\|<\infty , \quad E\sup _{\theta \in \Theta }\left| \frac{\partial ^3l_t(\theta )}{\partial \theta _i\partial \theta _j\partial \theta _k}\right| <\infty ,i,j,k=1,2,3\);
- (ii)
\(\displaystyle A\equiv \mathrm{Var}\left( \frac{\partial l_t(\theta _0)}{\partial \theta }\right) =E\left( \frac{1}{\lambda _t(\theta _0)} \frac{\partial \lambda _t(\theta _0)}{\partial \theta }\frac{\partial \lambda _t(\theta _0)}{\partial \theta ^\top }\right) \) is nonsingular;
- (iii)
There exists a neighbourhood \(\mathcal {V}(\theta _0)\) of \(\theta _0\) such that \(\displaystyle \left\| \frac{1}{\sqrt{n}}\sum _{t=1}^n\left\{ \frac{\partial l_t(\theta _0)}{\partial \theta } -\frac{\partial \tilde{l}_t(\theta _0)}{\partial \theta }\right\} \right\| \rightarrow 0\) and \(\displaystyle \sup _{\theta \in \mathcal {V}(\theta _0)}\left\| \frac{1}{n}\sum _{t=1}^n\left\{ \frac{\partial ^2l_t(\theta )}{\partial \theta \partial \theta ^\top } -\frac{\partial ^2\tilde{l}_t(\theta )}{\partial \theta \partial \theta ^\top }\right\} \right\| \rightarrow 0\) in probability when \(n\rightarrow \infty \);
- (iv)
\(\displaystyle \frac{1}{n}\sum _{t=1}^n\frac{\partial ^2l_t(\theta _{ij}^*)}{\partial \theta _i\partial \theta _j}\rightarrow A(i,j)\) a.s.;
- (v)
\(\displaystyle \left( \begin{array}{c}\sqrt{n}(\hat{\mu }_n-\mu _0)\\ \displaystyle \frac{1}{\sqrt{n}} \sum _{t=1}^n\frac{\partial l_t(\theta _0)}{\partial \phi }\end{array}\right) {\mathop {\longrightarrow }\limits ^{d}}\mathcal {N}\left( 0,\left( \begin{array}{cc}b&{}0\\ 0&{}J\end{array}\right) \right) \).
First, we will prove (i). Note that
It can be shown that
where \(\omega _L=\mu _L(1-\alpha _U-\beta _U)>0\). The most involved derivatives are those with respect to \(\theta _3=\beta \). Using arguments similar to the proof of Lemma 3.4 in Fokianos et al. (2009) and the fact that \(E(X_t^k)\) is finite for all k we obtain (i).
If A is singular, there exists \(x=(x_1,x_2,x_3)^\top \ne 0\) such that \(x^\top \dfrac{\partial \lambda _t(\theta _0)}{\partial \theta }=0\). Since
the strict stationarity of \(\lambda _t(\theta _0)\) implies \(x_1(1-\alpha _0-\beta _0)-\mu _0(x_2+x_3)+x_2X_{t-1}+x_3\lambda _{t-1}(\theta _0)=0\). This means that \(x_2X_{t-1}\) is a function of \(\{X_{t-i},i>1\}\), which is impossible unless \(x_2=0\). Then \(x_3=0\) because \(\lambda _{t-1}(\theta )\) is not almost surely constant, and hence \(x_1=0\). This contradiction shows that (ii) holds.
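Concretely, differentiating \(\lambda _t(\theta )=\mu (1-\alpha -\beta )+\alpha X_{t-1}+\beta \lambda _{t-1}(\theta )\) term by term gives the recursions

```latex
\frac{\partial\lambda_t}{\partial\mu} = (1-\alpha-\beta) + \beta\,\frac{\partial\lambda_{t-1}}{\partial\mu},
\qquad
\frac{\partial\lambda_t}{\partial\alpha} = -\mu + X_{t-1} + \beta\,\frac{\partial\lambda_{t-1}}{\partial\alpha},
\qquad
\frac{\partial\lambda_t}{\partial\beta} = -\mu + \lambda_{t-1} + \beta\,\frac{\partial\lambda_{t-1}}{\partial\beta},
```

so \(x^\top \partial \lambda _t/\partial \theta =x_1(1-\alpha -\beta )-\mu (x_2+x_3)+x_2X_{t-1}+x_3\lambda _{t-1}+\beta \,x^\top \partial \lambda _{t-1}/\partial \theta \), which, together with stationarity, yields the linear relation used in the singularity argument.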
Note that \(\lambda _t(\theta )-\tilde{\lambda }_t(\theta )=\beta ^t(\lambda _0-\tilde{\lambda }_0)\). Choosing \(\mathcal {V}(\theta _0)\) such that \(\phi \in \Phi \) for all \(\theta \in \mathcal {V}(\theta _0)\), we have
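This geometric contraction of the initialization effect is easy to verify numerically. The sketch below uses arbitrary parameter values and an arbitrary count series; it runs the intensity recursion from two different initial values and checks that the gap shrinks exactly like \(\beta ^t\):

```python
import numpy as np

mu, alpha, beta = 2.0, 0.3, 0.4
rng = np.random.default_rng(1)
x = rng.poisson(mu, size=50)            # any count series works here

# Run the same recursion from two different initial values lambda_0.
lam_a, lam_b = 5.0, 1.0
gaps = []
for t in range(len(x)):
    lam_a = mu * (1 - alpha - beta) + alpha * x[t] + beta * lam_a
    lam_b = mu * (1 - alpha - beta) + alpha * x[t] + beta * lam_b
    gaps.append(lam_a - lam_b)

# After t updates the gap equals beta**t times the initial gap (5.0 - 1.0),
# so the effect of the arbitrary initialization vanishes geometrically.
expected = [beta ** t * (5.0 - 1.0) for t in range(1, len(x) + 1)]
```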
Thus we obtain
The right-hand side converges in probability to 0, which can be proved using arguments similar to Francq and Zakoïan (2004, p. 616); this technique has already been used in the consistency proof of this theorem. Using the same arguments with first derivatives replaced by second derivatives, (iii) follows.
Similar to the proof of (4.36) in Francq and Zakoïan (2004), (iv) follows from the Taylor expansion of \(\displaystyle \frac{1}{n}\sum _{t=1}^n\frac{\partial ^2l_t(\theta _{ij}^*)}{\partial \theta \partial \theta ^\top }\) around \(\theta _0\), the convergence of \(\theta _{ij}^*\) to \(\theta _0\), the ergodic theorem and (i).
Note that \(\lambda _t(\theta _0)=\mu _0(1-\alpha _0-\beta _0)+\alpha _0X_{t-1}+\beta _0\lambda _{t-1}(\theta _0)\), then we have
thus we obtain
Based on the above results, we have the representation
Note that \(E((X_t-\lambda _t(\theta _0))Z_t|\mathcal {F}_{t-1})=0\). Moreover,
thus \(E\displaystyle \left( \frac{\partial \lambda _t(\theta _0)}{\partial \phi }\right) =0\). It follows that \(\mathrm{Var}((X_t-\lambda _t(\theta _0))Z_t)=\left( \begin{array}{cc}b&{}0\\ 0&{}J\end{array}\right) \). Using the martingale difference central limit theorem and the Cramér-Wold device, we obtain the asymptotic normality in (v).
Now we complete the proof of the theorem. From (ii) to (iv) it follows that the matrix \(J_n\) is a.s. invertible for sufficiently large n; then
Using (iii), we obtain that
In terms of (iv), (v) and Slutsky’s lemma, \(\sqrt{n}(\hat{\theta }-\theta _0)\) is asymptotically \(\mathcal {N}(0,\Sigma )\) distributed, with
\(\square \)
Proof of Corollary 2
Let
then we observe that
First, we have
thus \(\Sigma -\Sigma ^*=(b-a)CC^\top \).
Second, observe that \(\Sigma ^*=[E(GS_tS_t^\top G^\top )]^{-1},\Sigma =E(HS_tS_t^\top H^\top )\), where
with \(I_k\) denoting the identity matrix of size k. Note that \(GE(S_tS_t^\top )H^\top =I_3\). Let \(D_t=\Sigma ^*GS_t-HS_t\), then we have
which shows that \(\Sigma -\Sigma ^*\) is positive semidefinite. The matrix \(E(D_tD_t^\top ) =(\Sigma ^*G-H)E(S_tS_t^\top )(\Sigma ^*G-H)^\top \) is not positive definite because
is of rank 1.
Proof of Corollary 3
The first convergence in distribution is a direct consequence of Theorem 1 and the delta method. In view of Corollary 3 and the delta method, we have \(\sqrt{n}(\varphi (\tilde{\theta }_n)-\varphi (\theta _0)){\mathop {\longrightarrow }\limits ^{d}}\mathcal {N}(0,s^{*2})\), where
Li, Q., Zhu, F. Mean targeting estimator for the integer-valued GARCH(1, 1) model. Stat Papers 61, 659–679 (2020). https://doi.org/10.1007/s00362-017-0958-9