Abstract
In this paper, we consider a penalized estimation procedure for the Poisson autoregressive model with a sparse parameter structure. We study the theoretical properties of the penalized conditional maximum likelihood (PCML) estimator under several different penalties. We show that the penalized estimators perform as well as if the true model were known, and we establish the oracle properties of the PCML estimators. Simulation studies are conducted to assess the proposed procedure, and a real data example is also provided.
References
Al-Osh MA, Alzaid AA (1987) First-order integer-valued autoregressive (INAR(1)) process. J Time Ser Anal 8(3):261–275
Al-Osh MA, Alzaid AA (1988) Integer-valued moving average (INMA) process. Stat Pap 29(1):281–300
Brillinger DR (2001) Time series: data analysis and theory. SIAM
Davis RA, Dunsmuir WTM, Streett SB (2003) Observation-driven models for Poisson counts. Biometrika 90(4):777–790
Dicker L, Huang B, Lin X (2013) Variable selection and estimation with the seamless-L0 penalty. Stat Sin:929–962
Doukhan P, Fokianos K, Tjøstheim D (2012) On weak dependence conditions for Poisson autoregressions. Stat Prob Lett 82:942–948
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96(456):1348–1360
Fan J, Lv J (2010) A selective overview of variable selection in high dimensional feature space. Stat Sin 20:101–148
Fan J, Peng H (2004) On nonconcave penalized likelihood with diverging number of parameters. Ann Stat 32:928–961
Ferland R, Latour A, Oraichi D (2006) Integer-valued GARCH process. J Time Ser Anal 27(6):923–942
Fokianos K, Rahbek A, Tjøstheim D (2009) Poisson autoregression. J Am Stat Assoc 104(488):1430–1439
Hall P, Heyde CC (2014) Martingale limit theory and its application. Academic Press, New York
Kedem B, Fokianos K (2005) Regression models for time series analysis. Wiley, New York
Khoo WC, Ong SH, Biswas A (2017) Modeling time series of counts with a new class of INAR(1) model. Stat Pap 58:393–416
Nardi Y, Rinaldo A (2011) Autoregressive process modeling via the LASSO procedure. J Multivar Anal 102(3):528–549
Steutel FW, Van Harn K (1979) Discrete analogues of self-decomposability and stability. Ann Prob:893–899
Wang H, Li G, Tsai CL (2007a) Regression coefficient and autoregressive order shrinkage and selection via the LASSO. J R Stat Soc 69(1):63–78
Wang H, Li R, Tsai CL (2007b) Tuning parameter selectors for the smoothly clipped absolute deviation method. Biometrika 94(3):553–568
Yang K, Wang D, Jia B, Li H (2016) An integer-valued threshold autoregressive process based on negative binomial thinning. Stat Pap. doi:10.1007/s00362-016-0808-1
Yoon YJ, Park C, Lee T (2013) Penalized regression models with autoregressive error terms. J Stat Comput Simul 83(9):1756–1772
Zhang CH (2010) Nearly unbiased variable selection under minimax concave penalty. Ann Stat:894–942
Zhang H, Wang D, Zhu F (2010) Inference for INAR(p) processes with signed generalized power series thinning operator. J Stat Plan Inference 140(3):667–683
Zheng H, Basawa IV, Datta S (2006) Inference for pth-order random coefficient integer-valued autoregressive processes. J Time Ser Anal 27(3):411–440
Zhu F, Wang D (2011) Estimation and testing for a Poisson autoregressive model. Metrika 73(2):211–230
Zou H (2006) The adaptive LASSO and its oracle properties. J Am Stat Assoc 101(476):1418–1429
Acknowledgements
We thank the Editor and two reviewers for their valuable suggestions and comments, which greatly improved the article. This work is supported by the National Natural Science Foundation of China (No. 11271155, 11371168, J1310022, 11571138, 11501241, 11571051, 11301137, 11301212 and 11401146), the National Social Science Foundation of China (16BTJ020), the Science and Technology Research Program of the Education Department in Jilin Province for the 12th Five-Year Plan (440020031139) and the Jilin Province Natural Science Foundation (20150520053JH).
Appendix
To prove Theorem 1, we need the following lemma.
Lemma A.1
Under condition (C.1), as \(n\rightarrow \infty \) we have
where \(B({\varvec{\theta }}_{\varvec{0}})=\sum ^{n}_{t=1}\frac{\partial l_{t}({\varvec{\theta }}_{\varvec{0}})}{\partial {\varvec{\theta }}}\) and the Fisher information matrix is \({\varvec{\Sigma }}({\varvec{\theta }_{\varvec{0}}})=E\left( \frac{{\varvec{Y}}_{t}{\varvec{Y}}_{t}^{\mathrm {T}}}{\gamma _{t}}\right) \) with \({\varvec{Y}}_{t}=(1,X_{t-p},\ldots ,X_{t-1})^{\mathrm {T}}\).
Proof of Lemma A.1
Let
Through some calculation, we can derive that
which implies that \(\{T_{n1},\mathscr {F}_{n},n\ge 1\}\) is a martingale with \(\mathscr {F}_{n}=\sigma \left( X_{n},X_{n-1},\ldots ,X_{0}\right) \). By \(E|X_{t}|^{4}<\infty \), the strict stationarity of \(\{X_{t}\}\), and the ergodic theorem, we obtain that
Using the martingale central limit theorem (Hall and Heyde 2014), we get that
Similarly, we can prove that \(\{T_{ni},\mathscr {F}_{n},n\ge 1\}\), \(i=2,\ldots ,p+1\), are martingales and
For any \(\mathbf c =(c_{1},\ldots ,c_{p+1})^{\mathrm {T}}\in \mathbb {R}^{p+1}\backslash (0,\ldots ,0)^{\mathrm {T}}\), we get
Thus, by the Cramér–Wold device,
This completes the proof. \(\square \)
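The martingale structure used in the lemma can be checked empirically. The sketch below is illustrative only (the parameter values \(d=1.0\), \(a=0.4\) and the AR(1) order are hypothetical choices, not from the paper): it simulates a Poisson autoregression with intensity \(\gamma _{t}=d+aX_{t-1}\) and verifies two facts underlying the proof, namely that the score increments \((X_{t}-\gamma _{t})/\gamma _{t}\) are mean zero, and that their average square agrees with the \((1,1)\) entry of \({\varvec{\Sigma }}({\varvec{\theta }_{\varvec{0}}})=E({\varvec{Y}}_{t}{\varvec{Y}}_{t}^{\mathrm {T}}/\gamma _{t})\), the information identity behind the CLT variance.

```python
import random

random.seed(7)

def rpois(gamma):
    # Poisson draw via inversion of the CDF (stdlib-only)
    u, k = random.random(), 0
    p = cum = 2.718281828459045 ** (-gamma)
    while u > cum:
        k += 1
        p *= gamma / k
        cum += p
    return k

# Illustrative Poisson AR(1): gamma_t = d + a * X_{t-1}
d, a, n, burn = 1.0, 0.4, 20000, 200
x = [0]
for _ in range(n + burn):
    x.append(rpois(d + a * x[-1]))
x = x[burn:]

mean_score = 0.0  # average intercept score increment (X_t - gamma_t)/gamma_t
emp_info = 0.0    # (1/n) sum 1/gamma_t: (1,1) entry of the information matrix
sq_score = 0.0    # (1/n) sum ((X_t - gamma_t)/gamma_t)^2: outer-product estimate
m = len(x) - 1
for t in range(1, len(x)):
    g = d + a * x[t - 1]
    e = (x[t] - g) / g
    mean_score += e / m
    emp_info += (1 / g) / m
    sq_score += e * e / m

print(round(mean_score, 3), round(emp_info, 3), round(sq_score, 3))
```

Both empirical quantities estimating the same \((1,1)\) information entry should agree closely, which is exactly what makes the martingale CLT deliver the stated limiting covariance.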
Proof of Theorem 1
Let \(\beta _{n}=n^{-1/2}+a_{n}\). Following Fan and Li (2001), it suffices to show that for any \(\varepsilon >0\), there exists a constant \(d>0\) such that
which implies that, with probability at least \(1-\varepsilon \), there exists a local maximum in the ball \(\{\varvec{\theta _{0}}+\beta _{n}\mathbf{u}:\Vert \mathbf{u}\Vert \le d\}\); hence there exists a local maximizer with \(\Vert \varvec{\hat{\theta }}-\varvec{\theta _{0}}\Vert =O_{p}(\beta _{n})\). Note that
By Taylor series expansion, we obtain
where
From Lemma A.1, we know that \(n^{-1/2}B(\varvec{\theta _{0}})=O_{p}(1)\), so \(A_{1}=O_{p}(n^{1/2}\beta _{n})=O_{p}(n\beta ^{2}_{n})\). By ergodicity, \(A_{2}=-n\beta _{n}^{2}\mathbf{u}^{\mathrm {T}}\varvec{\Sigma }(\varvec{\theta _{0}})\mathbf{u}(1+o_{p}(1))\) as \(n\rightarrow \infty \). From conditions (C.2) and (C.3), \(A_{3}\) is bounded by \(\sqrt{s}n\beta _{n}a_{n}\Vert \mathbf{u}\Vert +n\beta _{n}^{2}b_{n}\Vert \mathbf{u}\Vert ^{2}\). By choosing a sufficiently large \(d\), both \(A_{1}\) and \(A_{3}\) are dominated by \(A_{2}\). The proof is completed. \(\square \)
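The objective maximized in this proof can be made concrete. The sketch below is an illustration, not the paper's implementation: it simulates a Poisson AR(1) with hypothetical parameters \((d_{0},a_{0})=(1.0,0.4)\), builds the penalized conditional log-likelihood \(Q_{n}(\varvec{\theta })=l_{n}(\varvec{\theta })-n\sum _{j}P_{\lambda _{n}}(|\theta _{j}|)\) with the SCAD penalty of Fan and Li (2001) (only the autoregressive coefficient is penalized, and \(\lambda _{n}=n^{-1/3}\) is an illustrative tuning choice), and locates a maximizer by a crude grid search.

```python
import math
import random

random.seed(11)

def rpois(gamma):
    # Poisson draw via inversion of the CDF (stdlib-only)
    u, k = random.random(), 0
    p = cum = math.exp(-gamma)
    while u > cum:
        k += 1
        p *= gamma / k
        cum += p
    return k

def scad(theta, lam, a=3.7):
    # SCAD penalty P_lambda(|theta|) of Fan and Li (2001)
    t = abs(theta)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2

# Illustrative Poisson AR(1): gamma_t = d0 + a0 * X_{t-1}
d0, a0, n, burn = 1.0, 0.4, 5000, 200
x = [0]
for _ in range(n + burn):
    x.append(rpois(d0 + a0 * x[-1]))
x = x[burn:]

lam = n ** (-1 / 3)

def q(d, a1):
    # penalized conditional log-likelihood Q_n (intercept left unpenalized)
    ll = 0.0
    for t in range(1, len(x)):
        g = d + a1 * x[t - 1]
        ll += x[t] * math.log(g) - g  # Poisson log-likelihood up to a constant
    return ll - n * scad(a1, lam)

# crude grid search for a local maximizer near the truth
best = max(
    (q(d, a1), d, a1)
    for d in [0.6 + 0.05 * i for i in range(17)]    # 0.60 .. 1.40
    for a1 in [0.1 + 0.05 * j for j in range(11)]   # 0.10 .. 0.60
)
print(best[1], best[2])
```

With this sample size the grid maximizer lands near the true \((d_{0},a_{0})\), consistent with the \(O_{p}(\beta _{n})\) rate of the theorem; since \(a_{0}>3.7\lambda _{n}\), the SCAD slope at the true value is zero, so the nonzero coefficient is essentially unbiased.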
Proof of Lemma 1
We need to prove that, with probability tending to one as \(n\rightarrow \infty \), for any \(\varvec{\theta _{1}}\) satisfying \(\Vert \varvec{\theta _{1}}-\varvec{\theta _{10}}\Vert =O_{p}(n^{-1/2})\), for some small \(\epsilon _{n}=\eta n^{-1/2}\) and for \(j=s+1,\ldots ,p+1\),
To show (9), by Taylor’s expansion,
From Lemma A.1, we know that \(\frac{\partial L_{n}(\varvec{\theta _{0}})}{\partial \theta _{j}}=O_{p}(n^{1/2})\). By law of large numbers, strict stationarity and \(\Vert \varvec{\theta _{1}}-\varvec{\theta _{10}}\Vert =O_{p}(n^{-1/2})\), we have
Thus, \(\frac{\partial Q_{n}(\varvec{\theta })}{\partial \theta _{j}}=n\lambda _{n}\left\{ O_{p}(n^{-1/2}/\lambda _{n})-\lambda _{n}^{-1}\dot{P}_{\lambda _{n}}(|\theta _{j}|)\text {sgn}(\theta _{j})\right\} \). Since \(n^{-1/2}/\lambda _{n}\rightarrow 0\) and \(\lambda _{n}^{-1}\dot{P}_{\lambda _{n}}(|\theta _{j}|)>0\) as \(n\rightarrow \infty \), the sign of (11) is determined by that of \(\theta _{j}\). Hence, (10) follows. This completes the proof. \(\square \)
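The sign argument rests on the penalty slope near zero dominating the \(O_{p}(n^{-1/2})\) score term once \(n^{1/2}\lambda _{n}\rightarrow \infty \). A small numerical sketch with the SCAD derivative makes this visible; the choice \(\lambda _{n}=n^{-1/3}\) below is an illustrative rate satisfying that condition, not one prescribed by the paper.

```python
def scad_deriv(theta, lam, a=3.7):
    # derivative of the SCAD penalty for theta >= 0 (Fan and Li 2001)
    if theta <= lam:
        return lam
    if theta <= a * lam:
        return (a * lam - theta) / (a - 1)
    return 0.0

n = 10000
lam = n ** (-1 / 3)    # n^{1/2} * lam = n^{1/6} -> infinity
score = n ** (-1 / 2)  # magnitude of the O_p(n^{-1/2}) score term
for theta in (1e-4, 1e-3, 1e-2):  # coefficients of order O(n^{-1/2}) or smaller
    # lam^{-1} * P'_lam(theta) equals 1 while theta <= lam,
    # whereas the score term is o(1) relative to lam
    print(theta, scad_deriv(theta, lam) / lam, score / lam)
```

For every candidate value of order \(O(n^{-1/2})\), the normalized penalty slope stays at 1 while the normalized score term is strictly below 1, so the penalty term dictates the sign of the derivative, which is what forces the small coefficients to zero.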
Proof of Theorem 2
Part (\(i\)) holds by Lemma 1; we only need to prove part (\(ii\)). From part (\(i\)), we know that \(\varvec{\hat{\theta }_{2}}=\mathbf 0 \) with probability tending to 1. Thus, there exists a root-\(n\) consistent local maximizer \(\varvec{\hat{\theta }_{1}}\) that satisfies the following equation
By the Taylor expansion, we have
This indicates
where \(B^{s}(\varvec{\theta }_{\varvec{0}})=\sum ^{n}_{t=1}\frac{1}{\gamma _{t}}\left( X_{t}-\gamma _{t}\right) {\varvec{Y}}^{s}_{t}\) and \({\varvec{Y}}^{s}_{t}=(1,X_{t-p},\ldots ,X_{t-p+s-2})^{\mathrm {T}}\). By Slutsky's theorem and the martingale central limit theorem, the proof is complete. \(\square \)
Wang, X., Wang, D. & Zhang, H. Poisson autoregressive process modeling via the penalized conditional maximum likelihood procedure. Stat Papers 61, 245–260 (2020). https://doi.org/10.1007/s00362-017-0938-0