Abstract
We consider the problems of robust estimation and testing for a log-linear model with feedback for the analysis of count time series. We study inference for contaminated data with transient shifts, level shifts and additive outliers. It turns out that the case of additive outliers deserves special attention. We propose a robust method for estimating the regression coefficients in the presence of interventions. The resulting robust estimators are asymptotically normally distributed under some regularity conditions. A robust score type test statistic is also examined. The methodology is applied to real and simulated data.
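The model under study is the log-linear Poisson autoregression with feedback, \(\nu_t = d + a\nu_{t-1} + b\log(Y_{t-1}+1)\) with \(Y_t \mid \mathcal{F}_{t-1} \sim \mathrm{Poisson}(\exp(\nu_t))\) (Fokianos and Tjøstheim 2011). A minimal simulation sketch of a clean path and a contaminated path with a single additive outlier; all parameter values, the outlier time and its size are illustrative assumptions, not values from the paper:

```python
import numpy as np

def simulate_loglinear(n, d=0.5, a=0.3, b=0.2,
                       outlier_time=None, outlier_size=0, seed=0):
    """Simulate nu_t = d + a*nu_{t-1} + b*log(Y_{t-1}+1),
    Y_t | F_{t-1} ~ Poisson(exp(nu_t)).  An additive outlier of the
    given size contaminates the *observed* series only (illustrative
    contamination scheme)."""
    rng = np.random.default_rng(seed)
    nu = np.zeros(n)
    y = np.zeros(n, dtype=int)
    prev_nu, prev_y = 0.0, 0
    for t in range(n):
        nu[t] = d + a * prev_nu + b * np.log(prev_y + 1)
        y[t] = rng.poisson(np.exp(nu[t]))
        prev_nu, prev_y = nu[t], y[t]
    y_obs = y.copy()
    if outlier_time is not None:
        y_obs[outlier_time] += outlier_size  # additive outlier
    return y, y_obs, nu

y, y_obs, nu = simulate_loglinear(500, outlier_time=250, outlier_size=30)
```

The chosen coefficients satisfy \(|a+b|<1\), the stationarity-type condition assumed throughout the appendix.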
References
Barczy M, Ispány M, Pap G, Scotto M, Silva ME (2012) Additive outliers in INAR(1) models. Stat Papers 53:935–949
Breslow N (1990) Tests of hypotheses in overdispersed Poisson regression and other quasi-likelihood models. J Am Stat Assoc 85:565–571
Brockwell PJ, Davis RA (1991) Time series: theory and methods, 2nd edn. Springer, New York
Cantoni E, Ronchetti E (2001) Robust inference for generalized linear models. J Am Stat Assoc 96:1022–1030
Chen C, Liu L (1993) Joint estimation of model parameters and outlier effects in time series. J Am Stat Assoc 88:284–297
Chow YS (1967) On a strong law of large numbers for martingales. Ann Math Stat 38:610
Christou V, Fokianos K (2015) Estimation and testing linearity for mixed Poisson autoregressions. Electron J Stat 9:1357–1377
Douc R, Doukhan P, Moulines E (2013) Ergodicity of observation-driven time series models and consistency of the maximum likelihood estimator. Stoch Process Appl 123:2620–2647
El Saied H (2012) Robust modelling of count time series: applications in medicine. Ph.D. thesis, TU Dortmund University, Germany
El Saied H, Fried R (2014) Robust fitting of INARCH models. J Time Ser Anal 35:517–535
Ferland R, Latour A, Oraichi D (2006) Integer-valued GARCH process. J Time Ser Anal 27:923–942
Fokianos K, Fried R (2010) Interventions in INGARCH processes. J Time Ser Anal 31:210–225
Fokianos K, Fried R (2012) Interventions in log-linear Poisson autoregression. Stat Model 12:299–322
Fokianos K, Rahbek A, Tjøstheim D (2009) Poisson autoregression. J Am Stat Assoc 104:1430–1439
Fokianos K, Tjøstheim D (2011) Log-linear Poisson autoregression. J Multivar Anal 102:563–578
Francq C, Zakoïan J-M (2010) GARCH models: structure, statistical inference and financial applications. Wiley, Hoboken
Fried R, Elsaied H, Liboschik T, Fokianos K, Kitromilidou S (2014) On outliers and interventions in count time series following GLMs. Austrian J Stat 43:181–193
Hall P, Heyde CC (1980) Martingale limit theory and its application. Academic Press, New York
Harvey AC (1990) The econometric analysis of time series, 2nd edn. MIT Press, Cambridge
Heritier S, Ronchetti E (1994) Robust bounded-influence tests in general parametric models. J Am Stat Assoc 89:897–904
Huber PJ, Ronchetti E (2009) Robust statistics, 2nd edn. Wiley, New York
Kedem B, Fokianos K (2002) Regression models for time series analysis. Wiley, New York
Kitromilidou S, Fokianos K (2016) Robust estimation methods for a class of count time series log-linear models. J Stat Comput Simul 86:740–755
Klimko LA, Nelson PI (1978) On conditional least squares estimation for stochastic processes. Ann Stat 6:629–642
Künsch HR, Stefanski LA, Carroll RJ (1989) Conditionally unbiased bounded-influence estimation in general regression models, with applications to generalized linear models. J Am Stat Assoc 84:460–466
Lô SN, Ronchetti E (2009) Robust and accurate inference for generalized linear models. J Multivar Anal 100:2126–2136
Maronna RA, Martin RD, Yohai VJ (2006) Robust statistics. Wiley, Hoboken
McCullagh P, Nelder JA (1989) Generalized linear models, 2nd edn. Chapman & Hall, London
Mukherjee K (2008) \(M\)-estimation in GARCH models. Econ Theory 24:1530–1553
Muler N, Yohai VJ (2002) Robust estimates for ARCH processes. J Time Ser Anal 23:341–375
Muler N, Yohai VJ (2008) Robust estimates for GARCH models. J Stat Plan Inference 138:2918–2940
Rousseeuw PJ, van Zomeren BC (1990) Unmasking multivariate outliers and leverage points. J Am Stat Assoc 85:633–639
Seber GF, Lee AJ (2003) Linear regression analysis, 2nd edn. Wiley, New York
Taniguchi M, Kakizawa Y (2000) Asymptotic theory of statistical inference for time series. Springer, New York
Tjøstheim D (2012) Some recent theory for autoregressive count time series. TEST 21:413–438 (with discussion)
Valdora M, Yohai VJ (2014) Robust estimators for generalized linear models. J Stat Plan Inference 146:31–48
van der Vaart AW (1998) Asymptotic statistics. Cambridge University Press, Cambridge
Woodard DW, Matteson DS, Henderson SG (2011) Stationarity of count-valued and nonlinear time series models. Electron J Stat 5:800–828
Acknowledgments
We cordially thank two anonymous reviewers for several useful comments that improved the article considerably. The authors acknowledge the project eMammoth - Compute and Store on Grids and Clouds infrastructure (ANABATHMISI/06609/09), which is co-funded by the Republic of Cyprus and the European Regional Development Fund of the EU. This work was also supported by the Cyprus Research Promotion Foundation (TEXNOLOGIA/THEPIS/0609(BE)/02).
Appendix
In the following, the symbol C denotes a constant which depends upon the context. Define also \(d_M=\max (|d_L|,|d_U|)\), \(a_M=\max (|a_L|,|a_U|)\) and \(b_M=\max (|b_L|,|b_U|)\). In addition, when a quantity is evaluated at the true value \(\varvec{\theta }_{0}\) of the parameter \(\varvec{\theta }\), the notation is simplified by dropping \(\varvec{\theta }_{0}\); for instance, \(m_{t} \equiv m_{t}(\varvec{\theta }_{0})\), and so on. The following two results are taken from Fokianos and Tjøstheim (2011) and are included for completeness.
Lemma 6.1
Assume model (2) and suppose that \(|a|<1\). In addition, assume that \(|a+b|<1\) when \(b>0\), and that \(|a||a+b|<1\) when \(b<0\). Then the following conclusions hold:

1. The process \(\{\nu _t^m,t \ge 0\}\) is a geometrically ergodic Markov chain with finite moments of order k, for arbitrary k.

2. The process \(\{(Y_t^m,U_t,\nu _t^m),t \ge 0\}\) is a \(V_{(Y,U,\nu )}\)-geometrically ergodic Markov chain with \(V_{(Y,U,\nu )}(Y,U,\nu )=1+\log ^{2k}(1+Y)+\nu ^{2k}+U^{2k}\), where k is a positive integer.
Lemma 6.2
Suppose that \((Y_t,\nu _t)\) and \((Y_t^m,\nu _t^m)\) are defined by (1) and (2), respectively. Assume that \(|a+b|<1\) if a and b have the same sign, and \(a^2+b^2<1\) if they have different signs. Then the following statements are true:

1. \(E|\nu _t^m-\nu _t| \rightarrow 0\) and \(|\nu _t^m-\nu _t|<\delta _{1,m}\) almost surely for large m;

2. \(E(\nu _t^m-\nu _t)^2 \le \delta _{2,m}\);

3. \(E|\lambda _t^m-\lambda _t| \le \delta _{3,m}\);

4. \(E|Y_t^m-Y_t| \le \delta _{4,m}\);

5. \(E(\lambda _t^m-\lambda _t)^2 \le \delta _{5,m}\);

6. \(E(Y_t^m-Y_t)^2 \le \delta _{6,m}\);

where \(\delta _{i,m} \rightarrow 0\) as \(m \rightarrow \infty \) for \(i=1,\ldots ,6\). Furthermore, almost surely, with m large enough
We will also need the following lemma whose proof is given below.
Lemma 6.3
Define the Pearson residuals for the perturbed and unperturbed models by
$$r_t^m=\frac{Y_t^m-\lambda _t^m}{\sqrt{\lambda _t^m}}, \qquad r_t=\frac{Y_t-\lambda _t}{\sqrt{\lambda _t}},$$
respectively. Suppose that \((Y_t,\nu _t)\) and \((Y_t^m,\nu _t^m)\) are defined by (1) and (2), respectively. Assume that \(|a+b|<1\) if a and b have the same sign, and \(a^2+b^2<1\) if they have different signs. Then,

1. \(E|r_t^m-r_t| \rightarrow 0\);

2. \(E(r_t^m-r_t)^2 \le \delta _{7,m}\),

where \(\delta _{7,m} \rightarrow 0\) as \(m \rightarrow \infty \). Furthermore, almost surely, with m large enough
Proof of Lemma 6.3
We have that
for any \(\delta >0\) almost surely and for m large enough by using the results of Lemma 6.2. The claims follow. \(\square \)
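Lemma 6.3 works with the Pearson residuals \(r_t=(Y_t-\lambda_t)/\sqrt{\lambda_t}\), which are conditionally zero-mean with unit variance under the Poisson assumption. A minimal numerical sketch, with illustrative parameter values, simulating the model and checking these two moments empirically:

```python
import numpy as np

rng = np.random.default_rng(42)
d, a, b, n = 0.5, 0.3, 0.2, 5000  # illustrative parameter values
nu = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    nu[t] = d + a * nu[t-1] + b * np.log(y[t-1] + 1)
    y[t] = rng.poisson(np.exp(nu[t]))
lam = np.exp(nu)                   # conditional mean lambda_t
r = (y - lam) / np.sqrt(lam)       # Pearson residuals
print(r[1:].mean(), r[1:].var())   # should be near 0 and 1
```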
Proof of Lemma 2.1
We will show that
and
as \( m \rightarrow \infty \). Consider first (9). Working along the lines of Fokianos and Tjøstheim (2011), we consider the differences of the perturbed and non-perturbed matrices along the diagonal, individually for \(\theta _i=d,a,b\). Then we need to evaluate
with \(Z_t=\psi _c(r_t)w_t e^{\nu _t/2}\) and similarly for \(Z_t^m\). We have,
The first term can become arbitrarily small because it can be shown (following the proof of Fokianos et al. (2009, Lemma 3.1)) that
For the second term, note first that \(E (\partial \nu _t / \partial \theta _i)^{4}\) is bounded by a finite constant for \(i=1,2,3\) since
by using Lemma 6.1. Furthermore
where we have used the boundedness of the function \(\psi (\cdot )\), the mean-value theorem and Lemmas 6.2 and 6.3. Hence (9) follows. To prove (10), consider
The above quantity can be made arbitrarily small because of the finite moments of \(\partial \nu _t / \partial \theta _{i}\), \(\partial \nu _t^{m} / \partial \theta _{i}\), \(\exp (\nu _{t}^{m})\), (11), and the fact that \(E \left| \left( Z_t^m-Z_t \right) \ \partial \nu _t / \partial \theta _{i} \right| \rightarrow 0\) as \(m \rightarrow \infty \), which is proved following the previous arguments. \(\square \)
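The building block of the proof is \(Z_t=\psi _c(r_t)w_t e^{\nu _t/2}\). A minimal sketch of this robust score component, assuming the Huber \(\psi_c\) function and leaving the Mallows-type weights \(w_t\) generic (unit weights are a placeholder assumption here; the tuning constant 1.345 is likewise illustrative):

```python
import numpy as np

def psi_huber(u, c=1.345):
    """Huber psi function: identity on [-c, c], clipped outside."""
    return np.clip(u, -c, c)

def robust_score_component(y, nu, w=None, c=1.345):
    """Z_t = psi_c(r_t) * w_t * exp(nu_t / 2), with Pearson residual
    r_t = (y_t - lambda_t)/sqrt(lambda_t) and lambda_t = exp(nu_t).
    The weights w_t are left generic (Mallows-type in the paper);
    unit weights are used as a placeholder assumption."""
    lam = np.exp(nu)
    r = (y - lam) / np.sqrt(lam)
    if w is None:
        w = np.ones_like(lam)
    return psi_huber(r, c) * w * np.exp(nu / 2.0)
```

Clipping the residual through \(\psi_c\) is what bounds the influence of an additive outlier on the estimating equation.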
Proof of Lemma 2.2
The score function \(S_n^m\) for the perturbed model is a martingale sequence, with \(E(S_n^m \mid \mathcal{F}_{n-1}^m)=S_{n-1}^m\) at the true value \(\varvec{\theta }=\varvec{\theta }_0\), where \(\mathcal{F}_{t}^m\) denotes the \(\sigma \)-field generated by \(\{Y_0^m,\ldots ,Y_{t}^m,\mathcal{U}_0,\ldots ,\mathcal{U}_{t}\}\). We will show that it is square integrable. Proving that \(E||m_t^m|| ^2\) is finite at \(\varvec{\theta }_0=(d_0,a_0,b_0)\) guarantees an application of the strong law of large numbers for martingales (Chow 1967), which gives almost sure convergence of \(S_n^m/n\) to 0 as \(n \rightarrow \infty \). But
and this is finite because of Lemma 6.1 and (12). To show asymptotic normality of the perturbed score function \(S_n^m\) we apply the CLT for martingales (Hall and Heyde 1980, Cor. 3.1). \((S_n^m)_{n \ge 1}\) is a zero-mean, square-integrable martingale sequence with \((s_t^m)_{t \in \mathbb {N}}\) a martingale difference sequence. To prove the conditional Lindeberg condition, note that
since \(E||s_t^m||^4 < \infty \). In addition,
This concludes the second result of the Lemma.
The third result of the Lemma follows from Lemma 2.1 and Brockwell and Davis (1991, Prop. 6.4.9). Consider now the last result of the Lemma.
where \(W_{t}= Z_{t}-E[Z_{t} \mid \mathcal{F}_{t-1}]\) and similarly for the perturbed model. For the first summand in the above representation, we obtain that
as \(m \rightarrow \infty \), for some \(\epsilon _{m}\). For the second summand, note that
and therefore its expected value tends to 0 by Lemma 6.3. The fact that \(E||\partial \nu _t / \partial \varvec{\theta }||^2< \infty \) yields the desired conclusion. \(\square \)
Proof of Lemma 2.3
Because \(S_n(\varvec{\theta })=0\) is an unbiased estimating function, it holds that
where \(l_t(\varvec{\theta })= \nu _{t}(\varvec{\theta })Y_{t}- \exp ( \nu _{t}(\varvec{\theta }))\) is the logarithm of the conditional probability of \(Y_t\) given \(\mathcal{F}_{t-1}\) under the Poisson assumption. Then the matrix \(V_n(\varvec{\theta })\) is rewritten in the form
and the matrix \(V_n^m(\varvec{\theta })\) for the perturbed model is defined analogously. We again examine the difference \(s_t^m ({\partial l_t^m}/{\partial \theta _{i}}) - s_t ({\partial l_t}/{\partial \theta _i})\). Notice that
Then,
For the first summand, (13) shows that it tends to zero. We work similarly for the second summand to obtain the desired result. To show that \(V_n\) is positive definite, it suffices to show that \(z^T \left( {\partial \nu _t}/{\partial \varvec{\theta }} \right) \left( {\partial \nu _t}/{\partial \varvec{\theta }} \right) ^T z >0\) for any non-zero three-dimensional real vector z. If \( z^{T} {\partial \nu _{t}}/{\partial \varvec{\theta }}=0\), then \(z^{T} (1, \nu _{t-1}, \log (Y_{t-1}+1))^{T}=0\). But if the last equation holds, then \(z =0\), because \(\nu _{t}\) is a function of past values of \(\log (Y_{t}+1)\) and \(Y_{t}\) is non-zero for some t. The same reasoning applies to \(V_{n}^{m}\). \(\square \)
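The positive-definiteness argument rests on the gradient recursion \(\partial \nu _t/\partial \varvec{\theta } = (1, \nu _{t-1}, \log (Y_{t-1}+1))^{T} + a\, \partial \nu _{t-1}/\partial \varvec{\theta }\), which follows from differentiating the model equation. A numerical sketch, with illustrative parameters and the robustness weights omitted, forming the sample outer-product matrix and checking its smallest eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(7)
d, a, b, n = 0.5, 0.3, 0.2, 2000   # illustrative parameter values
nu = np.zeros(n)
y = np.zeros(n)
grad = np.zeros((n, 3))            # d nu_t / d theta, theta = (d, a, b)
for t in range(1, n):
    nu[t] = d + a * nu[t-1] + b * np.log(y[t-1] + 1)
    y[t] = rng.poisson(np.exp(nu[t]))
    # direct term plus feedback through a * nu_{t-1}
    grad[t] = np.array([1.0, nu[t-1], np.log(y[t-1] + 1)]) + a * grad[t-1]
# sample analogue of the outer-product matrix (weights omitted)
V = grad[1:].T @ grad[1:] / (n - 1)
print(np.linalg.eigvalsh(V).min())  # positive: V is positive definite
```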
Proof of Lemma 2.4
The first assertion of the Lemma follows from a law of large numbers. For the second, the Hessian matrix \(H_n\) can be represented as
The matrix \(H_n^m\) for the perturbed model is defined analogously. Examining the difference \(H_n^m-H_n\), we obtain that
The second term in the above representation tends to zero as \(m \rightarrow \infty \) because of the previous Lemma and the fact that \(E \left\| \left( {\partial \nu _{t}}/{\partial \varvec{\theta }} \right) \left( {\partial \nu _t}/{\partial \varvec{\theta }} \right) ^T \right\| < \infty \).
For the first term in the representation of \(H_n^m-H_n\), we obtain the following
\(\square \)
Proof of Lemma 2.5
Recall that the components of the MQLE score are given by
The second derivative of the i-th component of the MQLE score \(\partial ^2 s_{ti}(\varvec{\theta })/\partial \theta _k \partial \theta _j\) is given by
where
with
Without loss of generality, we only consider derivatives with respect to a. For the derivatives with respect to d and b we use identical arguments. For the derivatives of \(\nu _t\) with respect to the parameter a we obtain the following bounds
With \(\theta _i=\theta _j=\theta _k=a\),
By defining \(\tilde{M}_{n}^{m}\) analogously and working as before the result of the Lemma follows. \(\square \)
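The bounds for the derivatives of \(\nu _t\) with respect to a rest on the recursion \(\partial \nu _t/\partial a = \nu _{t-1} + a\, \partial \nu _{t-1}/\partial a\), obtained by differentiating the model equation along a fixed observed path. A sketch verifying this recursion against a central finite difference (illustrative parameter values):

```python
import numpy as np

def nu_path(a, y, d=0.5, b=0.2):
    """Recompute nu_t = d + a*nu_{t-1} + b*log(y_{t-1}+1) along a
    fixed observed path y, as in the estimating-equation setting."""
    nu = np.zeros(len(y))
    for t in range(1, len(y)):
        nu[t] = d + a * nu[t-1] + b * np.log(y[t-1] + 1)
    return nu

rng = np.random.default_rng(3)
y = rng.poisson(2.0, size=200).astype(float)
a = 0.3
nu = nu_path(a, y)
# recursion for the derivative with respect to a
dnu = np.zeros(len(y))
for t in range(1, len(y)):
    dnu[t] = nu[t-1] + a * dnu[t-1]
# central finite-difference check of the same derivative
h = 1e-6
fd = (nu_path(a + h, y) - nu_path(a - h, y)) / (2 * h)
print(np.max(np.abs(dnu - fd)))  # negligible discrepancy
```

The same differentiate-the-recursion device yields the second- and third-order derivatives used in the proof.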
Proof of Theorem 3.1
The first assertion of the Theorem follows from arguments given in Francq and Zakoïan (2010, Prop. 8.3). For the second assertion of the Theorem, we consider the difference
The above representation is composed of the following differences: \(W_{22}^m-W_{22}\), \(W_{12}^m-W_{12}\), \(W_{21}^m-W_{21}\), \(W_{11}^m-W_{11}\), \({V_{11}^m}^{-1}V_{12}^m-{V_{11}}^{-1}V_{12}\), \(V_{21}^m{V_{11}^m}^{-1}-V_{21}V_{11}^{-1}\) and \({V_{11}^m}^{-1}V_{12}^m-V_{11}^{-1}V_{12}\), all of which converge to zero by Lemmas 2.1 and 2.3. \(\square \)
Kitromilidou, S., Fokianos, K. Mallows’ quasi-likelihood estimation for log-linear Poisson autoregressions. Stat Inference Stoch Process 19, 337–361 (2016). https://doi.org/10.1007/s11203-015-9131-z
Keywords
- Autocorrelation
- Estimating equations
- Generalized linear models
- Integer valued time series
- Interventions
- Robust estimation